Research Article
Unbiased Model-Agnostic Metalearning Algorithm for Learning Target-Driven Visual Navigation Policy
Algorithm 2
Global Model: metatraining phase.
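Require: $\alpha$ and $\beta$: step-size hyperparameters
(1) Randomly initialize $\theta$
(2)
(3) while not done do
(4)   Sample batch of tasks $\mathcal{T}_i \sim p(\mathcal{T})$
(5)   for all $\mathcal{T}_i$ do
(6)     Collect trajectories $\mathcal{D}$ using $f_\theta$ in $\mathcal{T}_i$
(7)     Evaluate $\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$ using equation (2)
(8)     Compute adapted parameters with gradient descent: $\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$
(9)     Collect trajectories $\mathcal{D}_i'$ using $f_{\theta_i'}$ in $\mathcal{T}_i$
(10)   end for
(11)   Update $\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$ using equation (2)
(12) end while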
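The control flow of Algorithm 2 (sample a batch of tasks, adapt the parameters per task with one gradient step, then update the shared parameters from data collected with the adapted parameters) can be illustrated with a minimal sketch. It is not the navigation model of this article: the task distribution, the `collect` sampler, and the quadratic surrogate loss are hypothetical stand-ins for trajectory collection and for the loss of equation (2), and the names `alpha`, `beta`, `theta`, `sample_task`, `grad_loss`, and `metatrain` are assumptions chosen for illustration. Unlike trajectory collection with a policy, the toy sampler does not depend on the parameters, so the outer gradient reduces to a plain chain rule.

```python
import random


def sample_task(rng):
    """Hypothetical task distribution: each task is a scalar target value."""
    return rng.uniform(-1.0, 1.0)


def collect(task, rng, k=20):
    """Stand-in for trajectory collection: k noisy observations of the target."""
    return [task + rng.gauss(0.0, 0.1) for _ in range(k)]


def grad_loss(theta, data):
    """Gradient w.r.t. theta of the surrogate loss mean(0.5 * (theta - x)**2)."""
    return theta - sum(data) / len(data)


def metatrain(alpha=0.1, beta=0.05, iterations=200, batch_size=8, seed=0):
    """Toy metatraining loop with an inner and an outer gradient step."""
    rng = random.Random(seed)
    theta = rng.uniform(-5.0, 5.0)                # randomly initialize parameters
    for _ in range(iterations):                   # fixed budget replaces "while ... do"
        tasks = [sample_task(rng) for _ in range(batch_size)]
        meta_grad = 0.0
        for task in tasks:
            data = collect(task, rng)             # data gathered before adaptation
            g = grad_loss(theta, data)            # evaluate the task gradient
            theta_i = theta - alpha * g           # adapted parameters (one step)
            data_new = collect(task, rng)         # fresh data after adaptation
            # Chain rule through the inner step: d(theta_i)/d(theta) = 1 - alpha
            # for this quadratic surrogate.
            meta_grad += (1.0 - alpha) * grad_loss(theta_i, data_new)
        theta -= beta * meta_grad                 # outer update of shared parameters
    return theta


if __name__ == "__main__":
    print(metatrain())
```

Because every task loss here is quadratic, the metatrained parameter settles near the mean of the task targets, which is the initialization from which a single inner step reaches each task most cheaply on average.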