Mobile Information Systems

Research Article

Stabilizing Transmission Capacity in Millimeter Wave Links by Q-Learning-Based Scheme

The Q table training process for UE i.

	Run at the cloud computing facility
	Input: , , the updated reward table for UE i
	Output: the trained Q table for UE i
(1)	Initialize each entry of Q table to 0
(2)	For each episode do
(3)	Randomly select an initial state
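(4)	Set the running maximum Q value to 0
(5)	For each action available in the current state do
(6)	Compute the Q value of the action according to formula (7)
(7)	Update the corresponding entry of Q table
(8)	If the computed Q value exceeds the running maximum then
(9)	Set the running maximum to the computed Q value
(10)	Record the action as the best action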
(11)	End if
(12)	End for
(13)	Determine the exploration probability (e.g., 0.1) based on exploration-exploitation policy
(14)	Generate a random number from 0 to 1
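(15)	If the random number exceeds the exploration probability then
(16)	If the best action can transfer to the next state then
(17)	Set the current state to the next state and go to 4
(18)	End if
(19)	Else
(20)	Randomly select an action from the available actions
(21)	If the selected action can transfer to the next state then
(22)	Set the current state to the next state and go to 4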
(23)	End if
(24)	End if
(25)	End for
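
The listing above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Formula (7) is not reproduced in this listing, so the sketch assumes the common update Q(s, a) = R(s, a) + gamma * max Q(s', a'). The function name, the table layouts (`reward[s][a]`, `transitions[s][a]`), the discount factor `gamma`, the episode count, and the `max_steps` cap are all hypothetical choices made for the example.

```python
import random


def train_q_table(reward, transitions, gamma=0.9, epsilon=0.1,
                  episodes=200, max_steps=100, seed=0):
    """Train a Q table for one UE (illustrative sketch only).

    reward[s][a]      -- reward table entry (assumed layout)
    transitions[s][a] -- index of the next state, or None when the action
                         cannot transfer to a next state (assumed layout)
    """
    rng = random.Random(seed)
    n_states, n_actions = len(reward), len(reward[0])
    q = [[0.0] * n_actions for _ in range(n_states)]       # step 1

    for _ in range(episodes):                               # step 2
        s = rng.randrange(n_states)                         # step 3
        for _ in range(max_steps):   # cap on "go to 4" loops (assumption)
            best_q, best_a = 0.0, None                      # step 4
            for a in range(n_actions):                      # step 5
                nxt = transitions[s][a]
                future = max(q[nxt]) if nxt is not None else 0.0
                # Steps 6-7, using the ASSUMED update rule, not formula (7).
                q[s][a] = reward[s][a] + gamma * future
                if q[s][a] > best_q:                        # steps 8-11
                    best_q, best_a = q[s][a], a
            if rng.random() > epsilon:                      # steps 13-15
                a = best_a                                  # exploit
            else:
                a = rng.randrange(n_actions)                # step 20, explore
            nxt = transitions[s][a] if a is not None else None
            if nxt is None:          # no transfer possible: episode ends
                break
            s = nxt                                         # steps 17 and 22
    return q


# Hypothetical 3-state chain: action 0 moves left, action 1 moves right,
# state 2 is terminal, and entering it from state 1 is rewarded.
example_reward = [[0.0, 0.0], [0.0, 10.0], [0.0, 0.0]]
example_transitions = [[None, 1], [0, 2], [None, None]]
example_q = train_q_table(example_reward, example_transitions)
```

In this sketch an episode ends as soon as the chosen action cannot transfer to a next state, which mirrors the way the listing falls through to the end of the episode loop when neither "go to 4" branch is taken.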