Research Article
Count-Based Exploration via Embedded State Space for Deep Reinforcement Learning
Algorithm 1
Count-based exploration via embedded state space.
1: Initialize A ∈ ℝ^{k×D} with entries drawn i.i.d. from the standard Gaussian distribution N(0, 1)
2: Initialize a hash table with values n(·) ≡ 0
3: Initialize the policy network π with parameter θ and the embedding network b with parameter ω
4: for each iteration j do
5:     Collect a set of state-action samples {(s_m, a_m)} with policy π
6:     Add the state samples {s_m} to the replay buffer
7:     if j mod j_update = 0 then
8:         Update the embedding network with the loss function in Eq. (3), using samples drawn from the replay buffer
9:     end if
10:    Compute g(s_m) = ⌊b(s_m)⌉, the D-dim rounded hash code for s_m learned by the embedding network
11:    Update the hash table counts as n(φ(s_m)) ← n(φ(s_m)) + 1, where φ(s_m) = sgn(A g(s_m))
12:    Update the policy π using rewards {r(s_m, a_m) + β/√(n(φ(s_m)))} with any RL algorithm
13: end for
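The counting machinery in Algorithm 1 can be sketched compactly. Below is a minimal Python illustration of the hash-table update and exploration bonus, assuming the embedding network has already produced a D-dim embedding g(s); the class name `HashCounter`, the bonus coefficient name `beta`, and the use of a sign-based code from a random Gaussian projection are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

class HashCounter:
    """Sketch of count-based exploration over hashed embeddings.

    Assumes embeddings g(s) are supplied externally (in the paper they come
    from a learned embedding network); here only the hashing, counting, and
    bonus computation are shown.
    """

    def __init__(self, embed_dim, code_bits=16, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # A: entries drawn i.i.d. from the standard Gaussian (Algorithm 1, step 1).
        self.A = rng.standard_normal((code_bits, embed_dim))
        # Hash table with all counts n(.) initialized to zero (step 2).
        self.counts = {}
        self.beta = beta  # bonus coefficient (assumed name)

    def code(self, g):
        # Sign-based hash code phi(s) = sgn(A g(s)), stored as a bit tuple.
        return tuple((self.A @ g > 0).astype(int))

    def bonus(self, g):
        # Update n(phi(s)) <- n(phi(s)) + 1, then return beta / sqrt(n).
        key = self.code(g)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.beta / np.sqrt(self.counts[key])
```

In use, the bonus would be added to the environment reward for each collected sample before the policy update, so states whose hash codes have been visited often contribute a shrinking exploration incentive.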