Research Article
Learning Attentional Communication with a Common Network for Multiagent Reinforcement Learning
Table 1
Training parameters of the MAACCN algorithm.
| Parameter | Value | Description |
| --- | --- | --- |
| Lr | 0.0005 | The learning rate |
| Epsilon | 1 | Initial probability of exploration |
| Min_epsilon | 0.05 | Minimum probability of exploration |
| Anneal_steps | 50000 | The number of steps over which exploration is annealed |
| T_max | 2000000 | The total number of training steps |
| N_episodes | 1 | The number of episodes sampled per epoch |
| Evaluate_cycle | 100 | The interval between evaluations |
| Evaluate_epoch | 32 | Frequency of evaluation |
| Batch_size | 32 | The batch size for training |
| Buffer_size | 5000 | The size of the replay buffer |
| Target_update_cycle | 200 | The update interval of the target network |
| hidden_dim | 64 | The dimension of a hidden layer |
| Head | 8 | The number of heads in multihead attention |
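For illustration, the settings in Table 1 can be collected into a single configuration object. The sketch below is not the authors' code: the class name `TrainConfig`, the lowercase field names, and the helper `epsilon_at` are hypothetical, and the linear shape of the exploration schedule is an assumption, since the table gives only the start value, the minimum, and the number of annealing steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values listed in Table 1."""
    lr: float = 0.0005               # learning rate
    epsilon: float = 1.0             # initial exploration probability
    min_epsilon: float = 0.05        # minimum exploration probability
    anneal_steps: int = 50000        # steps over which epsilon is annealed
    t_max: int = 2000000             # total training steps
    n_episodes: int = 1              # episodes sampled per epoch
    evaluate_cycle: int = 100        # interval between evaluations
    evaluate_epoch: int = 32         # "frequency of evaluation" in Table 1
    batch_size: int = 32             # training batch size
    buffer_size: int = 5000          # buffer size
    target_update_cycle: int = 200   # target network update interval
    hidden_dim: int = 64             # hidden layer dimension
    head: int = 8                    # number of attention heads


def epsilon_at(step: int, cfg: TrainConfig) -> float:
    """Exploration probability at a given step.

    Assumes a linear decay from cfg.epsilon to cfg.min_epsilon over
    cfg.anneal_steps steps; the source does not state the schedule shape.
    """
    # Clamp the step into [0, anneal_steps] so epsilon stays within bounds.
    frac = min(max(step, 0), cfg.anneal_steps) / cfg.anneal_steps
    return cfg.epsilon - frac * (cfg.epsilon - cfg.min_epsilon)


cfg = TrainConfig()
halfway = epsilon_at(cfg.anneal_steps // 2, cfg)
```

Under this assumed linear schedule, exploration falls from 1 to 0.05 during the first 50000 steps and stays at 0.05 for the remaining training steps up to T_max.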