Research Article

Exploration for Countering the Episodic Memory

Algorithm 2

Exploration for countering neural episodic control.
(1) Initialize replay memory
(2) Initialize a DND for each action a
(3) Initialize the horizon N of the N-step Q rule
(4) for each episode do
(5)  for each time step t do
(6)   Obtain the observation from the environment, together with its embedding
(7)   Estimate the Q-value of each action a via (2) from its DND
(8)   if (4) is satisfied then
(9)    Choose
(10)   else
(11)    Choose
(12)   end if
(13)   Execute the chosen action and receive the reward
(14)   Append the new experience to the DND of the executed action
(15)   Append the new experience to the replay memory
(16)   Train on a random minibatch from the replay memory
(17)  end for
(18) end for
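
The control flow of Algorithm 2 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the `DND` class, the `chain_step` toy environment, and all parameter names are hypothetical; an inverse-squared-distance kernel stands in for equation (2); a fixed-probability coin flip (`explore_prob`) stands in for condition (4); and the two branches of the `if` are assumed to be a random action and a greedy action, respectively. For simplicity, the memory writes of steps (14)-(15) and the minibatch sampling of step (16) are deferred to the end of each episode so that N-step returns can be computed, and no network is actually trained.

```python
import random
from collections import deque


class DND:
    """Toy per-action memory of (embedding, value) pairs (hypothetical)."""

    def __init__(self, capacity=500, delta=1e-3):
        self.keys = deque(maxlen=capacity)
        self.values = deque(maxlen=capacity)
        self.delta = delta

    def lookup(self, h):
        # Assumed kernel: inverse squared distance, a stand-in for equation (2).
        if not self.keys:
            return 0.0
        w = [1.0 / (sum((x - y) ** 2 for x, y in zip(h, k)) + self.delta)
             for k in self.keys]
        return sum(wi * vi for wi, vi in zip(w, self.values)) / sum(w)

    def append(self, h, q):
        self.keys.append(tuple(h))
        self.values.append(q)


def chain_step(pos, action, goal=4):
    """Hypothetical toy environment: move left (0) or right (1) on a chain."""
    pos = max(0, pos - 1) if action == 0 else pos + 1
    done = pos == goal
    return pos, (1.0 if done else 0.0), done


def run(episodes=30, max_steps=20, n_step=3, gamma=0.9, explore_prob=0.2,
        batch_size=8, seed=0):
    rng = random.Random(seed)
    replay = deque(maxlen=10_000)            # step (1)
    dnds = {a: DND() for a in (0, 1)}        # step (2)
    batch = []
    for _ in range(episodes):                # step (4)
        pos, traj = 0, []
        for _ in range(max_steps):           # step (5)
            h = (float(pos),)                # step (6): identity "embedding"
            q = {a: dnds[a].lookup(h) for a in dnds}      # step (7)
            if rng.random() < explore_prob:  # stand-in for condition (4)
                action = rng.choice(list(dnds))           # assumed exploratory branch
            else:
                action = max(q, key=q.get)                # assumed greedy branch
            pos, reward, done = chain_step(pos, action)   # step (13)
            traj.append((h, action, reward, max(q.values())))
            if done:
                break
        # Steps (14)-(15), deferred to episode end so N-step returns are known.
        for t, (h, a, _, _) in enumerate(traj):
            end = min(t + n_step, len(traj))
            target = sum(gamma ** (j - t) * traj[j][2] for j in range(t, end))
            if end < len(traj):              # bootstrap from the stored max-Q
                target += gamma ** n_step * traj[end][3]
            dnds[a].append(h, target)
            replay.append((h, a, target))
        # Step (16): sample a minibatch; a real agent would take a gradient step here.
        batch = rng.sample(list(replay), min(batch_size, len(replay)))
    return dnds, replay, batch
```

Calling `run()` returns the per-action memories, the replay memory, and the last sampled minibatch. In a full agent, the identity embedding would be replaced by a learned network and the sampled minibatch would drive a gradient update.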