Research Article
Event Driven Duty Cycling with Reinforcement Learning and Monte Carlo Technique for Wireless Network
Algorithm 2
Node evaluation based on RL.
Initialization;
while e is smaller than the number of total episodes do
    while n is smaller than the maximum step do
        Take action with ε-soft greedy;
        while the nodes are in RC phase do
            Wake up the nodes decided from the RL process;
            Generate data packets or receive data packets;
        end while
        Determine the subsequent state;
        n = n + 1;
        Observe the delay and energy consumption;
        Compute reward;
        Store transition and τ in sample space Q;
        update π, V(s), ε;
    end while
    e = e + 1;
end while
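The loop structure of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy environment (`toy_step`, with the queue length as state), the two-action set, the reward shape `-(delay + 0.5 * energy)`, the ε decay factor, and the every-visit Monte Carlo averaging rule are all assumptions made for illustration, because the extracted listing does not preserve the action-selection, reward, or update formulas. The sketch also defers the value update to the end of each episode, since Monte Carlo returns need the full trajectory, and it does not model τ.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical action set: 0 = keep node asleep, 1 = wake node


def epsilon_soft_action(q, state, epsilon, rng):
    """With probability epsilon pick uniformly at random, otherwise pick greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def toy_step(state, action, rng):
    """Hypothetical environment: state is the number of queued packets (0..3)."""
    arrivals = 1 if rng.random() < 0.5 else 0
    if action == 1:
        # Awake: the queue is flushed, which costs energy but adds no delay.
        energy, delay = 1.0, 0.0
        next_state = arrivals
    else:
        # Asleep: no energy is spent, but every queued packet waits one step.
        energy, delay = 0.0, float(state)
        next_state = min(state + arrivals, 3)
    reward = -(delay + 0.5 * energy)  # assumed reward shape, not from the paper
    return next_state, reward


def train(episodes=200, max_steps=20, epsilon=0.3, decay=0.99, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)      # action-value estimates
    visits = defaultdict(int)   # visit counts for incremental averaging
    e = 0
    while e < episodes:
        state, n, samples = 0, 0, []
        while n < max_steps:
            action = epsilon_soft_action(q, state, epsilon, rng)
            next_state, reward = toy_step(state, action, rng)
            samples.append((state, action, reward))  # store the transition
            state = next_state
            n += 1
        # Every-visit Monte Carlo update from the stored episode samples.
        g = 0.0
        for s, a, r in reversed(samples):
            g = r + gamma * g
            visits[(s, a)] += 1
            q[(s, a)] += (g - q[(s, a)]) / visits[(s, a)]
        epsilon *= decay  # assumed exploration schedule
        e += 1
    # Greedy policy derived from the learned action values.
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(4)}
    return q, policy, epsilon
```

Calling `train(seed=0)` returns the action-value table, the greedy wake/sleep decision for each queue length, and the final ε. Fixing the seed makes a run reproducible, which is convenient when comparing reward shapes or decay schedules.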