Research Article
Resource Allocation in UAV-Assisted Wireless Powered Communication Networks for Urban Monitoring
Algorithm 1
MJDDPG-based resource allocation algorithm.
Initialize the weights of the main networks and the target networks;
Initialize the experience replay buffer, the exploration variance, and the action exploration probability;
for each episode do
    for each time step do
        Update the environment state, then observe and record the current state;
        Select an action based on the current state;
        Execute the action to update the state and calculate the reward;
        Store the experience tuple in the experience buffer;
        if the buffer is full then
            Randomly sample a mini-batch from the experience buffer;
            Compute the target network values;
            Update the critic network by minimizing the critic loss;
            Update the actor network by maximizing the actor network objective;
            Update the target network parameters;
            Update the action exploration parameters;
        end if
    end for
end for
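The control flow of Algorithm 1 can be illustrated with a minimal sketch of a generic DDPG-style training loop. It is not the authors' MJDDPG implementation: the toy environment (`env_step`), the linear critic features (`features`), the single-weight-vector actor, and all hyperparameter values are assumptions introduced only for illustration, and the action exploration probability mentioned in the initialization step is omitted, so exploration relies on Gaussian noise alone.

```python
import random
from collections import deque

import numpy as np

rng = np.random.default_rng(0)
random.seed(0)

# All constants below are illustrative assumptions, not values from the paper.
STATE_DIM, GAMMA, TAU = 2, 0.9, 0.01
LR_ACTOR, LR_CRITIC = 1e-3, 1e-2
BUFFER_SIZE, BATCH = 200, 32


def features(s, a):
    """Hypothetical features phi(s, a) for a linear critic Q = w . phi."""
    return np.array([s[0], s[1], a, a * s[0], a * s[1], a * a, 1.0])


def dfeatures_da(s, a):
    """Derivative of phi(s, a) with respect to the scalar action a."""
    return np.array([0.0, 0.0, 1.0, s[0], s[1], 2.0 * a, 0.0])


def actor(W, s):
    """Deterministic policy: scalar action in (-1, 1)."""
    return float(np.tanh(W @ s))


def env_step(s, a):
    """Toy stand-in environment: reward is highest when a tracks s[0]."""
    reward = -(a - s[0]) ** 2
    next_s = rng.uniform(-1.0, 1.0, STATE_DIM)
    return next_s, reward


def train(episodes=20, steps=50):
    # Initialize main and target network weights.
    W_a = rng.normal(0.0, 0.1, STATE_DIM)
    w_c = np.zeros(7)
    W_a_t, w_c_t = W_a.copy(), w_c.copy()
    # Initialize the replay buffer and the exploration standard deviation.
    buffer = deque(maxlen=BUFFER_SIZE)
    sigma = 0.5
    for _ in range(episodes):
        s = rng.uniform(-1.0, 1.0, STATE_DIM)
        for _ in range(steps):
            # Select a noisy action, execute it, and store the experience.
            a = float(np.clip(actor(W_a, s) + rng.normal(0.0, sigma), -1, 1))
            s2, r = env_step(s, a)
            buffer.append((s, a, r, s2))
            s = s2
            if len(buffer) == BUFFER_SIZE:
                batch = random.sample(list(buffer), BATCH)
                g_c, g_a = np.zeros_like(w_c), np.zeros_like(W_a)
                for bs, ba, br, bs2 in batch:
                    # Target value computed with the target networks.
                    y = br + GAMMA * (w_c_t @ features(bs2, actor(W_a_t, bs2)))
                    phi = features(bs, ba)
                    g_c += (w_c @ phi - y) * phi      # critic loss gradient
                    pa = actor(W_a, bs)
                    dq_da = w_c @ dfeatures_da(bs, pa)
                    g_a += dq_da * (1.0 - pa ** 2) * bs  # policy gradient
                w_c = w_c - LR_CRITIC * g_c / BATCH   # minimize critic loss
                W_a = W_a + LR_ACTOR * g_a / BATCH    # ascend on Q
                # Soft update of the target networks.
                W_a_t = TAU * W_a + (1.0 - TAU) * W_a_t
                w_c_t = TAU * w_c + (1.0 - TAU) * w_c_t
                # Decay the exploration noise.
                sigma = max(0.05, sigma * 0.999)
    return W_a, w_c, W_a_t, w_c_t, sigma, len(buffer)
```

Calling `train()` returns the main and target weights, the final exploration standard deviation, and the buffer size. In practice the linear actor and critic would be replaced by neural networks and the toy environment by the UAV-assisted network model.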