Research Article
Event Driven Duty Cycling with Reinforcement Learning and Monte Carlo Technique for Wireless Network
Algorithm 2
Node evaluation based on RL.
Initialization;
while e is smaller than the total number of episodes do
    while n is smaller than the maximum number of steps do
        Take an action with the ε-soft greedy policy;
        while the nodes are in the RC phase do
            Wake up the nodes selected by the RL process;
            Generate or receive data packets;
        end while
        Determine the subsequent state;
        n = n + 1;
        Observe the delay and energy consumption;
        Compute the reward;
        Store the transition and τ in sample space Q;
        Update π, V(s), ε;
    end while
    e = e + 1;
end while
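The loop structure of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reward, transition, and update equations are not recoverable from the listing, so everything below is an assumption chosen only to make the control flow concrete. Hypothetical elements are the toy environment `toy_step`, the three-state and three-action sizes, the reward defined as a negative weighted sum of delay and energy, the weights `w_delay` and `w_energy`, the decay schedule for ε, and a one-step Q-learning-style update used as a stand-in for the "update π, V(s), ε" step. The τ stored with each transition is omitted because the listing does not define it.

```python
import random

N_STATES = 3   # hypothetical number of network states
N_ACTIONS = 3  # hypothetical: action = how many nodes to wake up (0..2)


def epsilon_soft_action(action_values, epsilon, rng):
    """Epsilon-greedy choice; every action keeps probability >= epsilon/|A|."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return action_values.index(max(action_values))


def toy_step(action, rng):
    """Hypothetical environment: more awake nodes lower delay but cost energy."""
    delay = max(0.0, 3.0 - action + rng.random())
    energy = 1.0 * action + 0.1
    next_state = rng.randrange(N_STATES)
    return next_state, delay, energy


def run(num_episodes=50, max_steps=20, epsilon=0.3, alpha=0.1, gamma=0.9,
        w_delay=0.5, w_energy=0.5, eps_min=0.05, eps_decay=0.995, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # action values
    v = [0.0] * N_STATES                              # state values V(s)
    samples = []                                      # stored transitions
    e = 0
    while e < num_episodes:
        state, n = 0, 0
        while n < max_steps:
            action = epsilon_soft_action(q[state], epsilon, rng)
            # Waking nodes and exchanging packets is folded into toy_step.
            next_state, delay, energy = toy_step(action, rng)
            n += 1
            # Assumed reward: penalize both delay and energy consumption.
            reward = -(w_delay * delay + w_energy * energy)
            samples.append((state, action, reward, next_state))
            # Stand-in update (one-step Q-learning style), not from the source.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            v[state] = max(q[state])
            epsilon = max(eps_min, epsilon * eps_decay)
            state = next_state
        e += 1
    return q, v, samples, epsilon


if __name__ == "__main__":
    q, v, samples, eps = run()
    print(len(samples), round(eps, 3), [round(x, 2) for x in v])
```

The greedy policy π is implicit here: it is the argmax over `q[state]`, so updating the action values also updates the policy. Decaying ε inside the step loop mirrors the position of the ε update in the listing.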