| Category | Study title | Approach | Merit | Limitations | Ref |
| --- | --- | --- | --- | --- | --- |
| DRL based on value function | An improved algorithm of robot path planning in complex environment based on double DQN | Double DQN (sketch below) | The lack of experience samples is addressed by redefining the robot's initialization and the reward function for free positions | Slow convergence of the algorithm | [25] |
| DRL based on value function | The USV path planning of dueling DQN algorithm based on tree sampling mechanism | Dueling DQN | The algorithm can identify and avoid static obstacles and navigate autonomously in complex environments | The internal connection between state-action pairs is not strong enough | [26] |
| DRL based on value function | Tactical UAV path optimization under radar threat using deep reinforcement learning | DQN-PER | Alleviates the sparse-reward problem | Overestimation of the state-action value | [27] |
| DRL based on policy gradient | Advanced double layered multi-agent systems based on A3C in real-time path planning | A3C | The correlation between state-distribution samples is eliminated, replacing the sample-storage role of the experience-replay mechanism | Convergence to a locally optimal policy | [28] |
| DRL based on policy gradient | The path-planning algorithm of unmanned ship based on DDPG | DDPG | The algorithm can be applied to continuous state and action spaces | Sensitive to hyperparameters | [29] |
| DRL based on policy gradient | Hindsight trust region policy optimization | TRPO | The algorithm can choose a more appropriate step size during training | Large environments and policies are prone to large approximation errors | [30] |
| DRL based on policy gradient | PPO-based reinforcement learning for UAV navigation in urban environments | PPO (sketch below) | The algorithm has better data efficiency and robustness | Each update must keep the new policy close to the old one | [31] |
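
To make the value-based family in the table concrete, the following is a minimal sketch of the Double DQN target computation that underlies [25]: action selection uses the online network while evaluation uses the target network, which is what mitigates the Q-value overestimation listed as a limitation of plain DQN variants such as [27]. The function name, NumPy representation of Q-values, and discount factor here are illustrative assumptions, not details taken from the cited study.

```python
import numpy as np

def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, done=False):
    """Illustrative Double DQN target (hypothetical helper, not from [25]).

    Decouples action selection (online Q) from action evaluation (target Q)
    to reduce overestimation of the state-action value.
    """
    best_action = int(np.argmax(next_q_online))   # select with the online net
    bootstrap = next_q_target[best_action]        # evaluate with the target net
    return reward + (0.0 if done else gamma * bootstrap)

# Example with three actions available in the next state:
y = double_dqn_target(reward=1.0,
                      next_q_online=np.array([0.2, 0.9, 0.4]),
                      next_q_target=np.array([0.3, 0.7, 0.5]))
print(y)  # 1.0 + 0.99 * 0.7 = 1.693
```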
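
For the policy-gradient family, a minimal sketch of PPO's clipped surrogate objective illustrates the limitation noted for [31]: the probability ratio between the new and old policies is clipped so that no single update can move the policy too far. The clipping threshold, log-probability inputs, and example numbers are illustrative assumptions.

```python
import numpy as np

def ppo_clip_objective(log_prob_new, log_prob_old, advantage, epsilon=0.2):
    """Illustrative PPO clipped surrogate objective (sketch, not [31]'s code).

    Clipping the ratio to [1 - epsilon, 1 + epsilon] bounds how much the new
    policy may differ from the old one in each update.
    """
    ratio = np.exp(log_prob_new - log_prob_old)          # pi_new / pi_old
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon)
    # Take the pessimistic (minimum) objective per sample, then average.
    return np.minimum(ratio * advantage, clipped * advantage).mean()

# Example with two sampled state-action pairs:
obj = ppo_clip_objective(log_prob_new=np.array([-0.5, -1.2]),
                         log_prob_old=np.array([-0.7, -1.0]),
                         advantage=np.array([1.0, -0.5]))
print(obj)
```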