Research Article

Diversity Evolutionary Policy Deep Reinforcement Learning

Table 3: The mean and standard deviation of the cumulative return per turn in different MuJoCo tasks.

Task            TD3           Multiactor TD3   CEM          CEM-TD3       DPERL
Hopper-v2       3025 ± 577    3241 ± 363       1054 ± 17    3652 ± 116    3732 ± 106
HalfCheetah-v2  10002 ± 930   10341 ± 578      2298 ± 690   10978 ± 758   11615 ± 464
Ant-v2          3618 ± 425    3881 ± 319       845 ± 52     4037 ± 466    4852 ± 317
Walker2d-v2     4399 ± 238    4470 ± 301       743 ± 225    4612 ± 357    5001 ± 562