Research Article

Count-Based Exploration via Embedded State Space for Deep Reinforcement Learning

Algorithm 1

Count-based exploration via embedded state space.
Initialize A ∈ R^{k×D} with entries drawn i.i.d. from the standard Gaussian distribution N(0, 1);
Initialize a hash table with values n(·) ≡ 0;
Initialize the policy network π with parameters θ and the embedding network with parameters φ;
for each iteration j do {
 Collect a set of state-action samples {(s_m, a_m)}, m = 0, ..., M, with policy π;
 Add the state samples {s_m} to the replay buffer;
if j mod j_update = 0 then {
  Update the embedding network with the loss function in Eq. (3), using samples drawn from the replay buffer;
}
 Compute g(s_m), the D-dim rounded hash code for s_m learned by the embedding network;
 Update the hash table counts for all m, 0 ≤ m ≤ M, as n(c(s_m)) ← n(c(s_m)) + 1, where c(s_m) = sgn(A·g(s_m));
 Update the policy π using rewards r(s_m, a_m) + β/√(n(c(s_m))) with any RL algorithm;
}
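To make the counting and reward-shaping steps of Algorithm 1 concrete, the following Python sketch implements the hash table, the SimHash-style projection with the Gaussian matrix A, and the exploration bonus β/√(n(c(s))). It assumes the embedding network is trained separately (via the loss in Eq. (3)) and already produces the rounded code g(s); the names SimHashCounter, embed_dim, code_dim, and the default beta = 0.01 are illustrative, not taken from the paper.

import numpy as np

class SimHashCounter:
    # Hash table n(.) over binary codes of embedded states, plus the
    # exploration bonus beta / sqrt(n(.)). The rounded embedding g(s) is
    # assumed to be produced elsewhere and passed in as a NumPy vector.

    def __init__(self, embed_dim, code_dim, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # A in R^{k x D} with i.i.d. standard Gaussian entries.
        self.A = rng.standard_normal((code_dim, embed_dim))
        self.counts = {}   # hash table; all counts implicitly 0
        self.beta = beta   # bonus coefficient (illustrative default)

    def _code(self, g):
        # c(s) = sgn(A g(s)), stored as a hashable tuple of bits.
        return tuple((self.A @ g > 0).astype(np.int8).tolist())

    def update(self, g):
        # n(c(s)) <- n(c(s)) + 1 for one rounded embedding g(s).
        c = self._code(g)
        self.counts[c] = self.counts.get(c, 0) + 1

    def bonus(self, g):
        # Exploration bonus beta / sqrt(n(c(s))).
        n = self.counts.get(self._code(g), 0)
        return self.beta / np.sqrt(max(n, 1))

# Illustrative use: shape the environment reward before the RL update.
counter = SimHashCounter(embed_dim=32, code_dim=16, beta=0.01)
g_s = np.round(np.random.default_rng(1).random(32))  # stand-in for g(s) from the embedding network
counter.update(g_s)
shaped_reward = 1.0 + counter.bonus(g_s)              # r(s, a) + beta / sqrt(n(c(s)))

Storing the binary code as a tuple keeps dictionary lookups constant-time; in practice the projection A·g(s) would be batched over all samples collected in an iteration before the policy update.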