Research Article
Deep Reinforcement Learning-Based UAV Path Planning for Energy-Efficient Multitier Cooperative Computing in Wireless Sensor Networks
Algorithm 1: DQN algorithm for UAV path planning.
1: Randomly initialize the prediction network parameters and the target network parameters.
2: for episode in range(Max_Iteration) do
3:   Initialize the position and energy of the UAV, the position and energy of the nodes, and the task parameters.
4:   while the episode termination condition is not met do
5:     Get the current state from the environment.
6:     Select the UAV action with (27).
7:     Get the reward with (26); the system then transits to the next state.
8:     Store the transition record in the experience pool.
9:     Randomly sample records from the experience pool to form the minibatch.
10:    Compute the target value with (29).
11:    Update the parameters of the prediction Q network by minimizing the loss function (28).
12:    Update the target network parameters with the prediction network parameters after a preset number of steps.
13:  end while
14: end for
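The control flow of Algorithm 1 can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the 4x4 grid, the single goal cell, the reward values, the epsilon-greedy rule standing in for (27), the standard one-step DQN target standing in for (29), and a one-hot lookup table standing in for the Q network. The state is reduced to the UAV position only, whereas the state in the paper is richer.

```python
import random

# All constants below are hypothetical values chosen for this toy example.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # UAV moves on a grid
GRID, GOAL = 4, (3, 3)                          # grid size and target cell
GAMMA, LR, EPS = 0.9, 0.1, 0.2                  # discount, step size, exploration
BATCH, SYNC_EVERY, POOL_CAP = 16, 20, 2000


def state_index(pos):
    """Map a grid position to a row of the Q table."""
    return pos[0] * GRID + pos[1]


def step(pos, energy, action):
    """Toy environment: move inside the grid and spend one energy unit."""
    dx, dy = ACTIONS[action]
    nxt = (min(max(pos[0] + dx, 0), GRID - 1),
           min(max(pos[1] + dy, 0), GRID - 1))
    energy -= 1
    reward = 10.0 if nxt == GOAL else -1.0      # stand-in for reward (26)
    done = nxt == GOAL or energy <= 0
    return nxt, energy, reward, done


def select_action(params, pos, rng):
    """Epsilon-greedy selection, an assumed stand-in for rule (27)."""
    if rng.random() < EPS:
        return rng.randrange(len(ACTIONS))
    q = params[state_index(pos)]
    return q.index(max(q))


def train(max_iteration=300, seed=0):
    rng = random.Random(seed)
    # Step 1: random prediction parameters; the target starts as a copy.
    predict = [[rng.uniform(-0.01, 0.01) for _ in ACTIONS]
               for _ in range(GRID * GRID)]
    target = [row[:] for row in predict]
    pool, steps = [], 0
    for _ in range(max_iteration):                            # step 2
        pos, energy, done = (0, 0), 30, False                 # step 3
        while not done:                                       # step 4
            action = select_action(predict, pos, rng)         # steps 5-6
            nxt, energy, reward, done = step(pos, energy, action)  # step 7
            pool.append((pos, action, reward, nxt, done))     # step 8
            if len(pool) > POOL_CAP:
                pool.pop(0)                                   # drop oldest record
            batch = rng.sample(pool, min(BATCH, len(pool)))   # step 9
            for s, a, r, s2, d in batch:
                # Step 10: one-step target computed with the target parameters.
                y = r if d else r + GAMMA * max(target[state_index(s2)])
                # Step 11: gradient step on the squared error 0.5 * (y - Q)^2.
                predict[state_index(s)][a] += LR * (y - predict[state_index(s)][a])
            steps += 1
            if steps % SYNC_EVERY == 0:                       # step 12
                target = [row[:] for row in predict]
            pos = nxt
    return predict
```

Calling `train()` returns the learned table; `max(train()[state_index((0, 0))])` gives the estimated value of the start cell. Keeping a separate `target` copy that is refreshed only every `SYNC_EVERY` steps mirrors step 12 of the algorithm and keeps the regression target fixed between synchronizations.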