Research Article

A Multiphase Semistatic Training Method for Swarm Confrontation Using Multiagent Deep Reinforcement Learning

Table 4

Group reward.

Event                        Reward
Win                          0.6 + 0.2 × (1 − 2 × ResetTimer/MaxEnvironmentSteps) + 0.4 × GroupHP/45
Fail                         −1.0 + ResetTimer/MaxEnvironmentSteps
Destroy an enemy             0.1
Tie (die at the same time)   −0.2
Timeout                      −0.2
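
The entries of Table 4 can be read as a small function of the episode outcome. The following is a minimal sketch, not the authors' implementation: the names `Outcome`, `terminal_group_reward`, and `DESTROY_ENEMY_REWARD` are hypothetical, and it assumes that ResetTimer counts the steps elapsed in the current episode and that GroupHP is the group's total remaining hit points. Under those assumptions, the win reward shrinks as the episode runs longer and grows with remaining hit points, while the fail penalty moves toward zero the longer the group survives.

```python
from enum import Enum


class Outcome(Enum):
    """Terminal outcomes listed in Table 4 (hypothetical names)."""
    WIN = "win"
    FAIL = "fail"
    TIE = "tie"
    TIMEOUT = "timeout"


# Per-event reward from Table 4, granted each time an enemy is destroyed.
DESTROY_ENEMY_REWARD = 0.1


def terminal_group_reward(outcome, reset_timer, max_environment_steps, group_hp):
    """Return the end-of-episode group reward from Table 4.

    Assumptions (not stated in the table itself): reset_timer is the number
    of steps elapsed in the episode, and group_hp is the group's total
    remaining hit points, normalized by the constant 45 from the table.
    """
    elapsed_fraction = reset_timer / max_environment_steps
    if outcome is Outcome.WIN:
        return 0.6 + 0.2 * (1.0 - 2.0 * elapsed_fraction) + 0.4 * group_hp / 45.0
    if outcome is Outcome.FAIL:
        return -1.0 + elapsed_fraction
    if outcome in (Outcome.TIE, Outcome.TIMEOUT):
        return -0.2
    raise ValueError(f"unknown outcome: {outcome!r}")
```

For example, under these assumptions an immediate win with GroupHP = 45 evaluates to 0.6 + 0.2 + 0.4 = 1.2, whereas a win on the final step with no hit points left evaluates to 0.6 − 0.2 = 0.4.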