Research Article

Stabilizing Transmission Capacity in Millimeter Wave Links by Q-Learning-Based Scheme

Algorithm 2

The reward table update process for UE i.
Run at the edge computing facility
Input: the initialized reward table for UE i and the personalized information reported by all the UEs
Output: the updated reward table for UE i
(1)Find the SBS associated with UE i according to the personalized information reported by UE i
(2)If no SBS is associated with UE i then
(3) Determine the set of neighboring UEs according to the personalized information reported by UE i
(4) For each neighboring UE i′ do
(5)  Determine its associating state and working state according to the personalized information reported by UE i′
(6)  If UE i′ is both associated with an SBS and idle then
(7)   Record it as a candidate relaying UE of UE i and store it in the set Ri
(8)  End if
(9) End for
(10) From the set Ri, extract each candidate that is in the same coverage area as UE i and store it in the set SRi
(11) If the set SRi is not empty then
(12)  From the set SRi, select the candidate with the highest energy reserve level; denote it as UE i′ and the SBS it is associated with as SBS j
(13)  For each do
(14)   For each do
(15)    Determine , , and according to and
(16)    
(17)     = 
(18)   End for
(19)  End for
(20) Else if the set Ri is not empty then
(21)  From the set Ri, select the candidate with the highest energy reserve level; denote it as UE i′ and the SBS it is associated with as SBS j
(22)  For each do
(23)   For each do
(24)    Determine , and according to and
(25)    
(26)     = 
(27)   End for
(28)  End for
(29) End if
(30)End if
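
The relay-selection part of Algorithm 2 (steps 1 to 12 and 20 to 21) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record type `UEReport`, its field names, and the function `select_relay` are hypothetical names introduced here, and representing a "coverage area" by an integer identifier is an assumption. The reward-table update inside the nested loops (steps 13 to 19 and 22 to 28) is deliberately left out, because it depends on quantities defined elsewhere in the article.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class UEReport:
    """Hypothetical shape of the personalized information one UE reports."""
    ue_id: int
    sbs_id: Optional[int]            # associated SBS, or None if unassociated
    idle: bool                       # working state: True if the UE is idle
    energy_level: float              # energy reserve level
    coverage_area: int               # assumed integer id of the coverage area
    neighbors: tuple[int, ...] = ()  # ids of neighboring UEs


def select_relay(i: int, reports: dict[int, UEReport]) -> Optional[UEReport]:
    """Return the relaying UE i' chosen for UE i, or None if none is needed or found."""
    me = reports[i]
    # Steps (1)-(2): only a UE without an associated SBS needs a relay.
    if me.sbs_id is not None:
        return None
    # Steps (3)-(9): R_i holds neighbors that are associated with an SBS and idle.
    r_i = [
        reports[n]
        for n in me.neighbors
        if n in reports and reports[n].sbs_id is not None and reports[n].idle
    ]
    # Step (10): SR_i keeps the candidates in the same coverage area as UE i.
    sr_i = [c for c in r_i if c.coverage_area == me.coverage_area]
    # Steps (11) and (20): prefer SR_i, fall back to R_i.
    pool = sr_i if sr_i else r_i
    if not pool:
        return None
    # Steps (12) and (21): pick the candidate with the highest energy reserve.
    return max(pool, key=lambda c: c.energy_level)


# Illustrative data only; the values are made up for this example.
reports = {
    1: UEReport(1, None, False, 0.2, coverage_area=7, neighbors=(2, 3, 4)),
    2: UEReport(2, 10, True, 0.9, coverage_area=8),
    3: UEReport(3, 11, True, 0.5, coverage_area=7),
    4: UEReport(4, 12, False, 0.95, coverage_area=7),
}
relay = select_relay(1, reports)
print(relay.ue_id, relay.sbs_id)  # prints "3 11"
```

In this example UE 3 is selected although UE 2 has more energy, because candidates in the same coverage area as UE i take priority. UE 4 is excluded because it is not idle.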