Research Article
A Multiphase Semistatic Training Method for Swarm Confrontation Using Multiagent Deep Reinforcement Learning
Table 5
The main parameters of training.
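| Group | Parameter | Value |
| --- | --- | --- |
| Hyperparameters | Batch size | 2048 |
| Hyperparameters | Buffer size | 20480 |
| Hyperparameters | Learning rate | 3.0e-05 |
| Hyperparameters | Beta | 0.01 |
| Hyperparameters | Epsilon | 0.2 |
| Hyperparameters | Lambda | 0.95 |
| Hyperparameters | Num epoch | 3 |
| Hyperparameters | Time horizon | 128 |
| Network setting | Hidden units | 512 |
| Network setting | Num layers | 3 |
| Reward signals | Gamma | 0.99 |
| Reward signals | Strength | 1 |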
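To make the relationship between the buffer and batch settings concrete, the following minimal sketch stores the Table 5 values in a plain Python dictionary and derives two quantities from them. It assumes a PPO-style trainer in which each policy update splits a full buffer into minibatches and passes over it Num epoch times; the table itself does not state this, although the parameter names resemble those of a PPO trainer configuration. The dictionary keys and the helper functions `minibatches_per_epoch` and `gradient_steps_per_update` are hypothetical names chosen for illustration.

```python
# Values copied from Table 5. The key names and the grouping are illustrative
# assumptions, not the authors' configuration file.
TRAINING_PARAMS = {
    "hyperparameters": {
        "batch_size": 2048,
        "buffer_size": 20480,
        "learning_rate": 3.0e-05,
        "beta": 0.01,
        "epsilon": 0.2,
        "lambda": 0.95,
        "num_epoch": 3,
        "time_horizon": 128,
    },
    "network_setting": {"hidden_units": 512, "num_layers": 3},
    "reward_signals": {"gamma": 0.99, "strength": 1},
}


def minibatches_per_epoch(params):
    """Minibatches produced by one pass over a full buffer (assumed PPO-style)."""
    hp = params["hyperparameters"]
    return hp["buffer_size"] // hp["batch_size"]


def gradient_steps_per_update(params):
    """Gradient steps per policy update: minibatches per pass times passes."""
    hp = params["hyperparameters"]
    return minibatches_per_epoch(params) * hp["num_epoch"]


if __name__ == "__main__":
    print(minibatches_per_epoch(TRAINING_PARAMS))      # 10
    print(gradient_steps_per_update(TRAINING_PARAMS))  # 30
```

Under this assumption, a buffer of 20480 samples with a batch size of 2048 gives 10 minibatches per pass, and 3 passes give 30 gradient steps per policy update.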