Research Article

Path Planning Algorithm for Unmanned Surface Vessel Based on Multiobjective Reinforcement Learning

Table 1

Related literature.

| Category | Study title | Approach | Merit | Limitations | Ref |
| --- | --- | --- | --- | --- | --- |
| DRL based on value function | An improved algorithm of robot path planning in complex environment based on double DQN | Double DQN | The problem of lacking experiments is solved by redefining the initialization of the robot and the reward function for the free position | Slow convergence of the algorithm | [25] |
| DRL based on value function | The USV path planning of dueling DQN algorithm based on tree sampling mechanism | Dueling DQN | The algorithm can identify and avoid static obstacles and navigate autonomously in complex environments | The internal connection between state-action pairs is not strong enough | [26] |
| DRL based on value function | Tactical UAV path optimization under radar threat using deep reinforcement learning | DQN-PER | Alleviates the sparse reward problem | Overestimation of the state-action value | [27] |
| DRL based on policy gradient | Advanced double layered multi-agent systems based on A3C in real-time path planning | A3C | The correlation between state distribution samples is eliminated, and the sample storage scheme of the experience replay mechanism is replaced | Convergence to a locally optimal policy | [28] |
| DRL based on policy gradient | The path-planning algorithm of unmanned ship based on DDPG | DDPG | The algorithm can be applied to continuous state and action spaces | Sensitive to hyperparameters | [29] |
| DRL based on policy gradient | Hindsight trust region policy optimization | TRPO | The algorithm can choose a more appropriate step length during training | Large environments and policies are prone to large errors | [30] |
| DRL based on policy gradient | PPO-based reinforcement learning for UAV navigation in urban environments | PPO | The algorithm has better data efficiency and robustness | The difference between the old and new policies cannot be too large at each update | [31] |