Research Article
Intercept Guidance of Maneuvering Targets with Deep Reinforcement Learning
Table 2
Hyperparameters for the TD3 algorithm.
| Parameter | Value |
| --- | --- |
| Number of hidden layers | 2 |
| BATCH_SIZE | 32 |
| Replay buffer size | 50000 |
| Actor learning rate | 10⁻⁵ |
| Critic learning rate | |
| Policy noise | 0.2 |
| Noise bound | 0.5 |
| Soft update factor | 0.01 |
| Discounting factor γ | 0.95 |
| Delay steps | 5 |
| Gradient optimizer | Adam |
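As a rough illustration of where the values in Table 2 enter a TD3 training loop, the following minimal sketch collects them in a configuration object and shows the three places they are typically used: target policy smoothing (policy noise and noise bound), the soft target update (soft update factor), and the delayed actor update (delay steps). The names `TD3Config`, `smoothed_target_action`, `soft_update`, and `should_update_actor`, as well as the action limit `max_action`, are hypothetical and are not taken from the article. The critic learning rate is left unset because the table does not give a value, and reading "Delay steps" as the actor update interval is an assumption based on the standard TD3 formulation.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TD3Config:
    """Values copied from Table 2; field names are illustrative only."""
    hidden_layers: int = 2
    batch_size: int = 32
    replay_buffer_size: int = 50_000
    actor_lr: float = 1e-5
    critic_lr: Optional[float] = None  # not given in the table, so left unset
    policy_noise: float = 0.2
    noise_bound: float = 0.5
    soft_update_factor: float = 0.01
    gamma: float = 0.95
    delay_steps: int = 5
    optimizer: str = "Adam"


def smoothed_target_action(action: float, cfg: TD3Config,
                           rng: random.Random, max_action: float = 1.0) -> float:
    """Target policy smoothing: add clipped Gaussian noise to the target action.

    max_action is an assumed action limit, not a value from the article.
    """
    noise = rng.gauss(0.0, cfg.policy_noise)
    noise = max(-cfg.noise_bound, min(cfg.noise_bound, noise))
    return max(-max_action, min(max_action, action + noise))


def soft_update(target: List[float], online: List[float],
                cfg: TD3Config) -> List[float]:
    """Polyak averaging of target parameters toward the online parameters."""
    tau = cfg.soft_update_factor
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def should_update_actor(step: int, cfg: TD3Config) -> bool:
    """Delayed policy update: the actor is updated every delay_steps critic steps."""
    return step % cfg.delay_steps == 0
```

With these settings the target networks move 1% of the way toward the online networks at each soft update, and the actor (together with the targets, in standard TD3) would be updated once for every five critic updates.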