Research Article
Finite-Horizon Optimal Tracking Guidance for Aircraft Based on Approximate Dynamic Programming
Algorithm 1
Actor-critic learning procedure of tracking guidance.
Input:
    Perturbation equations at every downrange step.
    Cost function along the trajectory and at the final state.
Output:
    Optimal control weights for tracking the reference trajectory.
(1) Randomly select sets of , calculate and , obtain by (23).
(2) for to do
(3)     Initialize , , and actor training step .
(4)     repeat
(5)         Randomly select , apply previous control to calculate .
(6)         Substitute to (32) gives to .
(7)         Get the error of actor network from and .
(8)         Calculate the gradient of weights and update , and by (35).
(9)         Push training step .
(10)    until
(11)    Randomly select sets of , apply actor network to get .
(12)    Calculate , and according to Eqs. (26) and (27).
(13)    Apply least-squares estimate to get .
(14) end for
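The overall flow of Algorithm 1 can be illustrated with a minimal sketch. Equations (23), (26), (27), (32), and (35) are not reproduced in this section, so the sketch substitutes standard choices, all of which are assumptions rather than the article's formulation: a scalar linear perturbation model `x[k+1] = a*x[k] + b*u[k]`, a quadratic stage cost `q*x^2 + r*u^2`, a terminal cost `s*x^2`, a one-term critic `V_k(x) = p[k]*x^2`, and a one-weight linear actor `u_k(x) = w[k]*x`. The function name `train_tracking_guidance` and all parameter names are hypothetical. The step numbers in the comments refer to the listing above.

```python
import numpy as np


def train_tracking_guidance(a, b, q, r, s, n_steps, n_samples=64,
                            lr=1.0, tol=1e-12, max_iters=500, seed=0):
    """Backward-in-time actor-critic sketch on a toy scalar model.

    Assumed (not from the source): dynamics x[k+1] = a*x[k] + b*u[k],
    stage cost q*x^2 + r*u^2, terminal cost s*x^2,
    critic V_k(x) = p[k]*x^2, actor u_k(x) = w[k]*x.
    """
    rng = np.random.default_rng(seed)
    p = np.zeros(n_steps + 1)  # critic weights, one per downrange step
    w = np.zeros(n_steps)      # actor weights, one per downrange step

    # Step (1): fit the terminal critic weight to sampled terminal costs.
    x = rng.uniform(-1.0, 1.0, n_samples)
    basis = (x ** 2)[:, None]
    p[n_steps] = np.linalg.lstsq(basis, s * x ** 2, rcond=None)[0][0]

    # Step (2): sweep backward from the last step to the first.
    for k in range(n_steps - 1, -1, -1):
        w_k = 0.0  # step (3): initialize the actor weight
        for _ in range(max_iters):  # steps (4)-(10): actor training loop
            x = rng.uniform(-1.0, 1.0, n_samples)
            # Step (5): propagate one step with the current actor.
            x_next = a * x + b * (w_k * x)
            # Step (6): target control from the stationarity condition
            # d/du [r*u^2 + p[k+1]*(a*x + b*u)^2] = 0.
            u_target = -(b / r) * p[k + 1] * x_next
            # Step (7): actor error against the target control.
            err = w_k * x - u_target
            # Step (8): gradient step on the mean squared actor error.
            step = lr * np.mean(err * x)
            w_k -= step
            if abs(step) < tol:  # step (10): convergence check
                break
        w[k] = w_k

        # Steps (11)-(13): sample states, apply the trained actor, and
        # fit the critic weight to the cost-to-go by least squares.
        x = rng.uniform(-1.0, 1.0, n_samples)
        u = w_k * x
        x_next = a * x + b * u
        target = q * x ** 2 + r * u ** 2 + p[k + 1] * x_next ** 2
        p[k] = np.linalg.lstsq((x ** 2)[:, None], target, rcond=None)[0][0]
    return w, p
```

A call such as `w, p = train_tracking_guidance(a=1.0, b=0.1, q=1.0, r=1.0, s=1.0, n_steps=10)` returns one actor weight and one critic weight per step. For this toy linear-quadratic case the result can be checked against the discrete-time Riccati recursion; the article's aircraft model would require richer basis functions and its own update equations.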