Research Article

Joint Optimization of Jamming Link and Power Control in Communication Countermeasures: A Multiagent Deep Reinforcement Learning Approach

Table 3

Main hyperparameters of MASAC.


Parameter                                  Value
Training episodes                          5000
Total steps of each episode                500
Soft updating rate                         0.01
Capacity of CRB                            2^17
Minibatch size                             256
Initial value of entropy coefficient       1
Discount factor                            0.98
Learning rate of policy network            0.001
Learning rate of evaluation network        0.003
Training threshold of CRB                  1024
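
The settings in Table 3 can be gathered into a single configuration object, which makes the relationships between them easy to check. The following is a minimal sketch, not the authors' implementation. All class, field, and function names are hypothetical. It assumes that the CRB is the experience replay buffer, that its capacity reads as 2^17 = 131072 (a capacity of 217 would be smaller than both the minibatch size and the training threshold), and that the soft updating rate is used in the standard Polyak form, target = (1 - rate) * target + rate * online.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MASACConfig:
    """Table 3 values; field names are illustrative, not from the paper."""
    training_episodes: int = 5000
    steps_per_episode: int = 500
    soft_update_rate: float = 0.01
    replay_capacity: int = 2 ** 17      # assumed reading of the CRB capacity
    minibatch_size: int = 256
    initial_entropy_coef: float = 1.0
    discount_factor: float = 0.98
    policy_lr: float = 0.001
    critic_lr: float = 0.003            # "evaluation network" learning rate
    replay_warmup: int = 1024           # training threshold of the CRB

    def __post_init__(self) -> None:
        # Training can only start once a minibatch can be drawn, and the
        # threshold must be reachable before the buffer is full.
        if not (self.minibatch_size <= self.replay_warmup <= self.replay_capacity):
            raise ValueError(
                "expected minibatch_size <= replay_warmup <= replay_capacity"
            )

    @property
    def total_steps(self) -> int:
        """Total environment steps over the whole training run."""
        return self.training_episodes * self.steps_per_episode


def soft_update(target: List[float], online: List[float], rate: float) -> List[float]:
    """Polyak averaging of target parameters toward online parameters."""
    return [(1.0 - rate) * t + rate * o for t, o in zip(target, online)]


if __name__ == "__main__":
    cfg = MASACConfig()
    print(cfg.total_steps)
    print(soft_update([0.0, 1.0], [1.0, 1.0], cfg.soft_update_rate))
```

Under these assumptions the run covers 5000 × 500 = 2,500,000 steps, so the buffer holds only the most recent fraction of the experience, and learning begins after the first 1024 transitions have been stored.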