Research Article
Count-Based Exploration via Embedded State Space for Deep Reinforcement Learning
Algorithm 1
Count-based exploration via embedded state space.
1: Initialize A ∈ ℝ^{k×D} with entries drawn i.i.d. from the standard Gaussian distribution N(0, 1)
2: Initialize a hash table with values n(·) ≡ 0
3: Initialize the policy network π with parameter θ and the embedding network b with parameter ω
4: for each iteration j do
5:     Collect a set of state-action samples {(s_m, a_m)} with policy π
6:     Add the state samples {s_m} to the replay buffer
7:     if j mod j_update = 0 then
8:         Update the embedding network with the loss function in Eq. (3), using samples drawn from the replay buffer
9:     end if
10:    Compute g(s_m) = ⌊b(s_m)⌉, the D-dim rounded hash code for s_m learned by the embedding network
11:    Update the hash table counts as n(φ(s_m)) ← n(φ(s_m)) + 1, where φ(s_m) = sgn(A g(s_m))
12:    Update the policy π using rewards {r(s_m, a_m) + β/√(n(φ(s_m)))} with any RL algorithm
13: end for
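The counting machinery in Algorithm 1 can be sketched compactly. Below is a minimal Python illustration of the hash-table update and exploration bonus, assuming the embedding network has already produced a D-dim embedding g(s); the class name `HashCounter`, the bonus coefficient name `beta`, and the use of a sign-based code from a random Gaussian projection are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

class HashCounter:
    """Sketch of count-based exploration over hashed embeddings.

    Assumes embeddings g(s) are supplied externally (in the paper they come
    from a learned embedding network); here only the hashing, counting, and
    bonus computation are shown.
    """

    def __init__(self, embed_dim, code_bits=16, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # A: entries drawn i.i.d. from the standard Gaussian (Algorithm 1, step 1).
        self.A = rng.standard_normal((code_bits, embed_dim))
        # Hash table with all counts n(.) initialized to zero (step 2).
        self.counts = {}
        self.beta = beta  # bonus coefficient (assumed name)

    def code(self, g):
        # Sign-based hash code phi(s) = sgn(A g(s)), stored as a bit tuple.
        return tuple((self.A @ g > 0).astype(int))

    def bonus(self, g):
        # Update n(phi(s)) <- n(phi(s)) + 1, then return beta / sqrt(n).
        key = self.code(g)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.beta / np.sqrt(self.counts[key])
```

In use, the bonus would be added to the environment reward for each collected sample before the policy update, so states whose hash codes have been visited often contribute a shrinking exploration incentive.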