Research Article

Deep Reinforcement Learning-Based UAV Path Planning for Energy-Efficient Multitier Cooperative Computing in Wireless Sensor Networks

Algorithm 1

DQN algorithm for UAV path planning.
1: Randomly initialize the prediction network parameters and the target network parameters.
2: for episode in range(Max_Iteration) do
3: Initialize position and energy of UAV, position and energy of nodes, and task parameters.
4: while the episode has not terminated do
5:  Get the current state from environment.
6:  Select UAV action with (27).
7:  Obtain the reward with (26); the system then transitions to the next state.
8:  Store the record in the experience pool.
9:  Randomly sample records from the experience pool to form the minibatch.
10: Compute the target value with (29).
11: Update the parameters of the prediction Q network by minimizing the loss function (28).
12: Update the target network parameters with the prediction network parameters after a preset number of steps.
13: end while
14: end for
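
The loop structure of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Equations (26) to (29) are not reproduced in this section, so the sketch substitutes common DQN choices and labels them as assumptions: epsilon-greedy action selection in place of (27), a placeholder reward in place of (26), the standard target r + gamma * max Q_target(s', a') in place of (29), and a mean squared error loss in place of (28). The environment `ToyGridEnv`, the linear model `LinearQNet`, and all hyperparameter values are hypothetical stand-ins chosen to keep the example short; the paper's UAV, node, and task models are not represented.

```python
import random
from collections import deque

import numpy as np


class LinearQNet:
    """Tiny linear Q-function Q(s, .) = W s + b (assumed stand-in for a deep network)."""

    def __init__(self, state_dim, n_actions, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_actions, state_dim))
        self.b = np.zeros(n_actions)

    def q_values(self, states):
        return states @ self.W.T + self.b  # shape: (batch, n_actions)

    def copy_from(self, other):
        self.W, self.b = other.W.copy(), other.b.copy()


class ToyGridEnv:
    """Hypothetical environment: a UAV on a grid flying to the far corner."""

    MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def __init__(self, size=5, max_steps=30):
        self.size, self.max_steps = size, max_steps

    def reset(self):
        self.pos, self.steps = np.array([0, 0]), 0
        return self._state()

    def _state(self):
        return self.pos / (self.size - 1)  # normalized position

    def step(self, action):
        self.pos = np.clip(self.pos + self.MOVES[action], 0, self.size - 1)
        self.steps += 1
        reached = bool((self.pos == self.size - 1).all())
        reward = 1.0 if reached else -0.05  # placeholder, NOT the paper's (26)
        done = reached or self.steps >= self.max_steps
        return self._state(), reward, done


def train(episodes=50, gamma=0.95, lr=0.05, eps=0.2, batch_size=16,
          sync_every=20, seed=0):
    rng = np.random.default_rng(seed)
    random.seed(seed)
    env = ToyGridEnv()
    predict = LinearQNet(2, 4, rng)              # step 1: prediction network
    target = LinearQNet(2, 4, rng)               # step 1: target network
    target.copy_from(predict)
    pool = deque(maxlen=1000)                    # experience pool
    step_count = 0
    for _ in range(episodes):                    # step 2
        state, done = env.reset(), False         # steps 3 and 5
        while not done:                          # step 4
            # Step 6: epsilon-greedy selection (assumed form of (27)).
            if rng.random() < eps:
                action = int(rng.integers(4))
            else:
                action = int(np.argmax(predict.q_values(state[None, :])[0]))
            next_state, reward, done = env.step(action)              # step 7
            pool.append((state, action, reward, next_state, done))   # step 8
            state = next_state
            if len(pool) >= batch_size:
                batch = random.sample(list(pool), batch_size)        # step 9
                s, a, r, s2, d = map(np.array, zip(*batch))
                # Step 10: standard DQN target (assumed form of (29)).
                y = r + gamma * (1.0 - d) * target.q_values(s2).max(axis=1)
                # Step 11: one gradient step on half the mean squared TD error
                # (assumed form of (28)).
                err = predict.q_values(s)[np.arange(batch_size), a] - y
                grad_W = np.zeros_like(predict.W)
                grad_b = np.zeros_like(predict.b)
                np.add.at(grad_W, a, err[:, None] * s)
                np.add.at(grad_b, a, err)
                predict.W -= lr * grad_W / batch_size
                predict.b -= lr * grad_b / batch_size
            step_count += 1
            if step_count % sync_every == 0:     # step 12: sync target network
                target.copy_from(predict)
    return predict, target
```

Calling `train()` returns the prediction and target models after the configured number of episodes. Keeping a separate target network that is only refreshed every `sync_every` steps holds the regression target fixed between refreshes, which is the usual motivation for step 12.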