Research Article

Stabilizing Transmission Capacity in Millimeter Wave Links by Q-Learning-Based Scheme

Algorithm 3

The Q table training process for UE i.
Run at the cloud computing facility
Input: , , the updated reward table for UE i
Output: the trained Q table for UE i
(1)Initialize each entry of Q table to 0
(2)For each episode do
(3) Randomly select an initial state
(4) = 0
(5) For each do
(6)  Compute according to formula (7)
(7)  Update the corresponding entry of Q table
(8)  If then
(9)   
(10)   
(11)  End if
(12) End for
(13) Determine the exploration probability (e.g., 0.1) based on the exploration-exploitation policy
(14) Generate a random number from 0 to 1
(15) If then
(16)  If can transfer to the next state (e.g., ) then
(17)    and go to step (4)
(18)  End if
(19) Else
(20)  Randomly select an action from
(21)  If the selected action can transfer to the next state then
(22)    and go to step (4)
(23)  End if
(24) End if
(25)End for
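
The training loop of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Formula (7) and the two input parameters are not reproduced in this listing, so the sketch substitutes the standard Q-learning update with a hypothetical learning rate `alpha` and discount factor `gamma`. It further assumes that taking action a moves the UE to state a, that `reward[s][a]` is `None` when the transition is not feasible, and that each episode is capped at `steps_per_episode` iterations, since the listing does not state a termination condition. The names `train_q_table` and `feasible_actions` are illustrative only.

```python
import random


def feasible_actions(reward, state):
    """Actions whose transition from `state` is allowed (reward is not None)."""
    return [a for a, r in enumerate(reward[state]) if r is not None]


def train_q_table(reward, alpha=0.5, gamma=0.8, epsilon=0.1,
                  episodes=200, steps_per_episode=50, seed=0):
    """Train a Q table from a reward table (sketch of Algorithm 3).

    Assumptions: action a leads to state a; alpha, gamma and the update
    rule stand in for the inputs and formula (7) missing from the listing.
    """
    rng = random.Random(seed)
    n = len(reward)
    q = [[0.0] * n for _ in range(n)]                 # step (1)
    for _ in range(episodes):                         # step (2)
        state = rng.randrange(n)                      # step (3)
        for _ in range(steps_per_episode):            # bounded "go to step (4)"
            actions = feasible_actions(reward, state)
            if not actions:
                break
            # Step (4): reset the running maximum. -inf is used instead of 0
            # so that a best action exists even if all Q values are negative.
            best_q, best_action = float("-inf"), None
            for a in actions:                         # steps (5)-(12)
                next_best = max(
                    (q[a][b] for b in feasible_actions(reward, a)),
                    default=0.0,
                )
                target = reward[state][a] + gamma * next_best
                # Assumed stand-in for formula (7): standard Q-learning update.
                q[state][a] = (1 - alpha) * q[state][a] + alpha * target
                if q[state][a] > best_q:
                    best_q, best_action = q[state][a], a
            # Steps (13)-(24): epsilon-greedy choice of the next state.
            if rng.random() >= epsilon:
                action = best_action                  # exploit
            else:
                action = rng.choice(actions)          # explore
            state = action                            # feasible by construction
    return q


# Toy reward table: 0 -> 1 -> 2, with state 2 rewarding and absorbing.
reward_table = [
    [None, 0, None],
    [0, None, 10],
    [None, None, 10],
]
q_table = train_q_table(reward_table)
```

Because only feasible actions are enumerated, the feasibility checks in steps (16) and (21) are always satisfied in this sketch. Entries of the Q table that correspond to infeasible transitions are never updated and stay at their initial value of 0.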