Mobile Information Systems

Research Article

Event Driven Duty Cycling with Reinforcement Learning and Monte Carlo Technique for Wireless Network

Node evaluation based on RL.

	Initialization;
	while e is smaller than the total number of episodes do
		while n is smaller than the maximum number of steps do
			Take an action with the ε-soft greedy policy;
			while the nodes are in the RC phase do
				Wake up the nodes selected by the RL process;
				Generate data packets or receive data packets;
			end while
			Determine the subsequent state;
			n = n + 1;
			Observe the delay and energy consumption;
			Compute the reward;
			Store the transition and τ in sample space Q;
			Update π, V(s), ε;
		end while
		e = e + 1;
	end while
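
The listing above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it substitutes a standard on-policy, every-visit Monte Carlo control loop that keeps action-value estimates q(s, a) in place of the V(s) update named in the listing, and it does not model the RC phase explicitly. The toy environment is entirely hypothetical. We assume the state is the number of queued packets (0 to 3), the action is sleep or wake, the queue length stands in for delay, and the reward is the negative sum of delay and energy cost. The names `step`, `epsilon_soft_action`, and `train`, the arrival probability, the energy costs, and the ε decay schedule are likewise assumptions chosen for illustration.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical action set: 0 = keep node asleep, 1 = wake node


def step(state, action, rng):
    """Toy environment (assumption): state is the queue length, 0..3."""
    arrivals = 1 if rng.random() < 0.4 else 0  # assumed packet arrival rate
    if action == 1:
        served = min(state, 2)  # an awake node forwards up to 2 packets
        energy = 1.0            # assumed cost of staying awake
    else:
        served = 0
        energy = 0.1            # assumed cost of sleeping
    next_state = min(state - served + arrivals, 3)
    delay = float(next_state)   # queued packets stand in for delay
    reward = -(delay + energy)  # assumed reward: penalize delay and energy
    return next_state, reward


def epsilon_soft_action(q, state, epsilon, rng):
    """With probability epsilon pick uniformly at random, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=500, max_steps=20, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)     # action-value estimates
    counts = defaultdict(int)  # visit counts for incremental averaging
    epsilon = 1.0
    for _ in range(episodes):
        state = rng.randrange(4)
        trajectory = []        # stored transitions for this episode
        for _ in range(max_steps):
            action = epsilon_soft_action(q, state, epsilon, rng)
            next_state, reward = step(state, action, rng)
            trajectory.append((state, action, reward))
            state = next_state
        # Every-visit Monte Carlo update from the discounted returns.
        g = 0.0
        for s, a, r in reversed(trajectory):
            g = r + gamma * g
            counts[(s, a)] += 1
            q[(s, a)] += (g - q[(s, a)]) / counts[(s, a)]
        epsilon = max(0.05, epsilon * 0.99)  # assumed decay schedule
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(4)}
    return q, policy, epsilon
```

Calling `q, policy, epsilon = train()` returns the learned action values, the greedy sleep/wake decision for each queue length, and the final exploration rate. Because the policy is updated only after an episode ends, the sketch applies the update once per episode from the stored trajectory, which is one common way to realize a Monte Carlo update.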