Research Article

UAV Path Planning Based on Multicritic-Delayed Deep Deterministic Policy Gradient

Table 1

Results of the algorithms.

           Learning stage                Exploiting stage
Algorithm  Success  Collision  Loss      Success  Collision  Loss
DDPG       73.6%    19.3%      7.1%      80.5%    10.1%      9.4%
TD3        78.5%    17.1%      4.4%      88.4%    5.6%       6.0%
MCDDPG     81.9%    15.8%      2.3%      92.1%    3.4%       4.5%
MCD        89.8%    10.1%      0.1%      94.3%    1.9%       3.8%