Research Article
Exploration for Countering the Episodic Memory
Algorithm 1
Exploration for countering model-free episodic control.
| (1) | for episode = 1 to do |
| (2) | for t = 1 to do |
| (3) | Obtain observation from the environment |
| (4) | Let |
| (5) | Estimate and Q for each action a via (3) |
| (6) | if Satisfy (4) then |
| (7) | Choose |
| (8) | else |
| (9) | Choose |
| (10) | end if |
| (11) | Execute action , and receive reward |
| (12) | end for |
| (13) | for t = to 1 do |
| (14) | Update and according to (2) |
| (15) | end for |
| (16) | end for |
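The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions. The memory is a plain dictionary keyed by (state, action) rather than the estimator of (3). The update is assumed to be the usual model-free episodic control rule that keeps the highest discounted return seen so far, standing in for (2). The test in (4) is replaced by a hypothetical callable `should_explore`, which is assumed here to trigger a random action. The second quantity estimated in step (5) and updated in step (14) is not modelled. All function and parameter names are illustrative.

```python
import random


def backward_returns(rewards, gamma):
    """Discounted return R_t for every step, computed from the last step back to the first."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def update_memory(memory, trajectory, gamma=0.99):
    """Steps (13)-(15): assumed max-return update for each visited (state, action)."""
    rewards = [r for _, _, r in trajectory]
    for (s, a, _), ret in zip(trajectory, backward_returns(rewards, gamma)):
        memory[(s, a)] = max(memory.get((s, a), ret), ret)


def choose_action(memory, state, actions, should_explore, rng):
    """Steps (5)-(10): look up Q for each action, then branch on the hypothetical test."""
    q = {a: memory.get((state, a), 0.0) for a in actions}
    if should_explore(state, q):      # stand-in for condition (4)
        return rng.choice(actions)    # assumed exploratory choice
    return max(actions, key=lambda a: q[a])  # greedy choice


def run_episode(env_reset, env_step, memory, actions, should_explore, rng,
                gamma=0.99, max_steps=50):
    """Steps (2)-(15): act for one episode, then update the memory backwards."""
    state = env_reset()
    trajectory = []
    for _ in range(max_steps):
        action = choose_action(memory, state, actions, should_explore, rng)
        next_state, reward, done = env_step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    update_memory(memory, trajectory, gamma)
    return trajectory


# Toy usage: a 4-state chain where action 1 moves right and reaching state 3 pays 1.
def chain_reset():
    return 0


def chain_step(state, action):
    nxt = min(state + action, 3)
    return nxt, (1.0 if nxt == 3 else 0.0), nxt == 3


memory = {}
rng = random.Random(0)
for _ in range(20):  # step (1): loop over episodes
    run_episode(chain_reset, chain_step, memory, [0, 1],
                lambda s, q: rng.random() < 0.3, rng)
```

Because the backward pass only ever raises a stored value, a single high-return episode keeps steering the greedy branch afterwards. The exploration branch is the part of the loop that can still select actions the memory does not currently favour.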