Research Article
An Exoatmospheric Homing Guidance Law Based on Deep Q Network
Table 3
DQN algorithm hyperparameters.
| Hyperparameter | Parameter value |
| --- | --- |
| Maximum iterations | 3000 |
| Discount factor | 0.996 |
| Q network learning rate | 0.001 |
| Capacity of experience replay memory | 100000 |
| Minibatch size | 64 |
| Target network update rate | 10 |
| Initial exploration | 0.8 |
| Final exploration | 0.01 |
| Reward coefficient | 0.05 |
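For readers who want to reuse these settings, the values of Table 3 can be collected in a single configuration object, as in the minimal Python sketch below. The class name `DQNHyperparameters`, its field names, and the helper `linear_epsilon` are hypothetical and do not come from the article. The table gives only the initial and final exploration values, so the linear annealing over the maximum number of iterations is an assumption made purely for illustration. The unit of the target network update rate is likewise not stated in the table, and the value is stored as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNHyperparameters:
    """Values copied from Table 3; all field names are hypothetical."""
    max_iterations: int = 3000
    discount_factor: float = 0.996
    q_learning_rate: float = 0.001
    replay_capacity: int = 100_000
    minibatch_size: int = 64
    # Unit (steps or episodes between updates) is not stated in the table.
    target_update_rate: int = 10
    initial_exploration: float = 0.8
    final_exploration: float = 0.01
    reward_coefficient: float = 0.05


def linear_epsilon(hp: DQNHyperparameters, iteration: int) -> float:
    """Exploration rate at a given iteration.

    ASSUMPTION: linear annealing from the initial to the final exploration
    value over max_iterations. The table does not specify the schedule.
    """
    # Clamp progress to [0, 1] so out-of-range iterations stay bounded.
    progress = min(max(iteration / hp.max_iterations, 0.0), 1.0)
    return hp.initial_exploration + progress * (
        hp.final_exploration - hp.initial_exploration
    )


if __name__ == "__main__":
    hp = DQNHyperparameters()
    for it in (0, 1500, 3000):
        print(it, round(linear_epsilon(hp, it), 4))
```

A frozen dataclass keeps the values immutable during a training run. A different annealing rule, such as exponential decay, could replace `linear_epsilon` without changing the configuration object.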