Research Article

Learning Attentional Communication with a Common Network for Multiagent Reinforcement Learning

Algorithm 1: Multiagent attentional communication with the common network.
(1)Initialize , , and common network
(2)Initialize experience replay and variable
(3)Initialize
(4)for do
(5) for do
(6)  Choose an action according to the greedy policy or new policy
(7)  Perform joint actions on the environment and then get a collective reward
(8)  Store samples
(9) end for
(10) Calculate the average score per episode during the test.
(11) Replace variable when variable is greater than and then update the common network with
(12) Put samples collected throughout the episode into experience replay
(13) for do
(14)  Batch sampling and calculate consensus information by formula (8)
(15)  Obtain of each agent, after the communication module
(16)  Calculate the target value
(17)  Calculate the loss by formula (6)
(18) end for
(19) Update the estimate network with a gradient descent step
(20) Replace the target network with , every epochs
(21)end for
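
The bookkeeping steps of Algorithm 1 can be illustrated with a minimal Python sketch. It covers only action selection in step (6), sample storage and batch sampling in steps (8) and (12) to (14), the target value in step (16), the common-network update in step (11), and the target-network replacement in step (20). Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the use of an epsilon-greedy rule, the standard one-step Q-learning target, the reading of step (11) as "keep the parameters with the best test score", and the representation of network parameters as plain dictionaries. The communication module, the consensus information of formula (8), and the loss of formula (6) are defined elsewhere and are left out.

```python
import copy
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng):
    """Step (6), assumed epsilon-greedy: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Otherwise pick the action with the largest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma, done):
    """Step (16), assumed to be the standard one-step Q-learning target."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def update_common_if_better(avg_score, best_score, estimate_params, common_params):
    """Step (11), as read here: when the test score improves, record it and
    copy the estimate network into the common network."""
    if avg_score > best_score:
        return avg_score, copy.deepcopy(estimate_params)
    return best_score, common_params


def sync_target(epoch, period, estimate_params, target_params):
    """Step (20): copy the estimate network into the target network
    every `period` epochs (hypothetical parameter name)."""
    if epoch % period == 0:
        return copy.deepcopy(estimate_params)
    return target_params


# Steps (8) and (12)-(14): store one episode of samples, then draw a batch.
# Each sample is (state, action, reward, next_state, done); values are toy data.
rng = random.Random(0)
replay = deque(maxlen=1000)
episode = [(t, rng.randrange(2), 1.0, t + 1, t == 4) for t in range(5)]
replay.extend(episode)
batch = rng.sample(list(replay), 3)

# Toy next-state action values stand in for the target network's output.
targets = [td_target(r, [0.5, 1.5], 0.9, done) for (_, _, r, _, done) in batch]
```

Returning new parameter objects instead of mutating them in place keeps each step easy to test in isolation; a real implementation would copy network weights with the facilities of its deep learning framework.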