Research Article

An Exoatmospheric Homing Guidance Law Based on Deep Q Network

Table 3

DQN algorithm hyperparameters.

Hyperparameter                                Parameter value

Maximum iterations                            3000
Discount factor                               0.996
Q network learning rate                       0.001
Capacity of experience replay memory          100000
Minibatch size                                64
Target network update rate                    10
Initial exploration                           0.8
Final exploration                             0.01
Reward coefficient                            0.05
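
As an illustration of how the Table 3 settings might be carried into an implementation, the following minimal Python sketch collects them in a single configuration object. The class name, field names, and the `epsilon_at` helper are hypothetical and do not come from the article. In particular, the linear annealing of the exploration rate from its initial to its final value over the maximum number of iterations is an assumption; the table gives only the two endpoints, not the decay schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNHyperparameters:
    """Values copied from Table 3; the field names are illustrative only."""
    max_iterations: int = 3000
    discount_factor: float = 0.996
    learning_rate: float = 0.001
    replay_capacity: int = 100_000
    minibatch_size: int = 64
    target_update_rate: int = 10
    initial_exploration: float = 0.8
    final_exploration: float = 0.01
    reward_coefficient: float = 0.05


def epsilon_at(iteration: int, hp: DQNHyperparameters) -> float:
    """Exploration rate at a given iteration.

    Assumption: linear annealing between the two endpoints in Table 3.
    The table itself does not specify the schedule.
    """
    # Clamp progress to [0, 1] so out-of-range iterations stay at an endpoint.
    fraction = min(max(iteration / hp.max_iterations, 0.0), 1.0)
    span = hp.final_exploration - hp.initial_exploration
    return hp.initial_exploration + fraction * span


if __name__ == "__main__":
    hp = DQNHyperparameters()
    for it in (0, 1500, 3000):
        print(it, epsilon_at(it, hp))
```

Keeping the values in one frozen object makes it harder for a training script to drift from the published settings without the change being visible. A different schedule (for example, exponential decay) would only require replacing `epsilon_at`.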