Research Article
Stabilizing Transmission Capacity in Millimeter Wave Links by Q-Learning-Based Scheme
Algorithm 2
The reward table update process for UE i.
Run at the edge computing facility.
Input: the initialized reward table for UE i and the personalized information reported by all the UEs.
Output: the updated reward table for UE i.

| Step | Operation |
| --- | --- |
| (1) | Find the SBS with which UE i is associated, according to the personalized information reported by UE i |
| (2) | If UE i is not associated with any SBS then |
| (3) | Determine the set of neighboring UEs according to the personalized information reported by UE i |
| (4) | For each neighboring UE i′ do |
| (5) | Determine its association state and working state according to the personalized information reported by UE i′ |
| (6) | If UE i′ is both associated with an SBS and idle then |
| (7) | Record it as a candidate relaying UE of UE i and store it in the set Ri |
| (8) | End if |
| (9) | End for |
| (10) | Extract each candidate from the set Ri that is in the same coverage area as UE i and store it in the set SRi |
| (11) | If the set SRi is not empty then |
| (12) | Select from the set SRi the candidate with the highest energy reserve level, denoted as UE i′ and associated with SBS j |
| (13) | For each […] do |
| (14) | For each […] do |
| (15) | Determine […], […], and […] according to […] and […] |
| (16) | […] |
| (17) | […] = […] |
| (18) | End for |
| (19) | End for |
| (20) | Else if the set Ri is not empty then |
| (21) | Select from the set Ri the candidate with the highest energy reserve level, denoted as UE i′ and associated with SBS j′ |
| (22) | For each […] do |
| (23) | For each […] do |
| (24) | Determine […], and […] according to […] and […] |
| (25) | […] |
| (26) | […] = […] |
| (27) | End for |
| (28) | End for |
| (29) | End if |
| (30) | End if |
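The candidate-relay selection in steps (1) to (12), (20), and (21) of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `UEReport` record, its field names, the integer coverage-area identifier, and the `select_relay` function are all hypothetical stand-ins for the "personalized information" each UE reports. The reward-table updates inside the nested loops are deliberately left out, since the sketch covers only how the relaying UE is chosen.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class UEReport:
    """Hypothetical shape of the personalized information reported by one UE."""
    ue_id: int
    sbs_id: Optional[int]      # None when the UE is not associated with any SBS
    idle: bool                 # working state: True if the UE is idle
    energy_level: float        # energy reserve level (higher is better)
    coverage_area: int         # assumed identifier of the coverage area
    neighbors: Tuple[int, ...] = ()  # IDs of neighboring UEs


def select_relay(
    i: int, reports: Dict[int, UEReport]
) -> Optional[Tuple[UEReport, bool]]:
    """Return (relay, same_area) for UE i, or None if no relay is needed or found.

    same_area is True when the relay was taken from SRi (step 12) and
    False when it was taken from Ri (step 21).
    """
    me = reports[i]

    # Steps (1)-(2): relay selection only runs when UE i has no associated SBS.
    if me.sbs_id is not None:
        return None

    # Steps (3)-(9): candidates are neighbors that are associated with an SBS and idle.
    r_i = [
        reports[n]
        for n in me.neighbors
        if n in reports and reports[n].sbs_id is not None and reports[n].idle
    ]

    # Step (10): keep the candidates in the same coverage area as UE i.
    sr_i = [c for c in r_i if c.coverage_area == me.coverage_area]

    # Steps (11)-(12): prefer SRi; steps (20)-(21): otherwise fall back to Ri.
    if sr_i:
        return max(sr_i, key=lambda c: c.energy_level), True
    if r_i:
        return max(r_i, key=lambda c: c.energy_level), False
    return None
```

In this sketch a same-area candidate is always preferred over a candidate with more energy in another coverage area, which mirrors the order of the two branches in the algorithm. The returned flag tells the caller which of the two reward-update branches would apply.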