Research Article
Joint Optimization of Jamming Link and Power Control in Communication Countermeasures: A Multiagent Deep Reinforcement Learning Approach
Table 3
Main hyperparameters of MASAC.
| Parameters | Value |
| --- | --- |
| | -0.25 |
| | 5 |
| | 1 |
| Training episodes | 5000 |
| Total steps of each episode | 500 |
| Soft updating rate | 0.01 |
| Capacity of CRB | 2^17 |
| Minibatch size | 256 |
| Initial value of entropy coefficient | 1 |
| Discount factor | 0.98 |
| Learning rate of policy network | 0.001 |
| Learning rate of evaluation network | 0.003 |
| Training threshold of CRB | 1024 |
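As a rough illustration of how the values in Table 3 might be carried into a training script, the sketch below collects them in a single configuration object. All field and function names are hypothetical and do not come from the article. The three values whose parameter names did not survive (-0.25, 5, 1) are left out. The CRB capacity is read here as 2^17, an assumption based on the fact that a capacity of 217 would be smaller than the 1024-sample training threshold. The soft update follows the usual target-network convention, and the threshold check assumes training starts once the CRB holds at least that many samples; neither detail is confirmed by the table itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MASACConfig:
    """Values copied from Table 3; the field names are hypothetical."""
    training_episodes: int = 5000
    steps_per_episode: int = 500
    soft_update_rate: float = 0.01
    crb_capacity: int = 2 ** 17  # assumption: "217" in the table read as 2^17
    minibatch_size: int = 256
    initial_entropy_coefficient: float = 1.0
    discount_factor: float = 0.98
    policy_learning_rate: float = 0.001
    evaluation_learning_rate: float = 0.003
    crb_training_threshold: int = 1024


def soft_update(target: List[float], online: List[float], rate: float) -> List[float]:
    """Blend online parameters into target parameters (assumed convention)."""
    return [(1.0 - rate) * t + rate * o for t, o in zip(target, online)]


def ready_to_train(stored_samples: int, cfg: MASACConfig) -> bool:
    """Assumed rule: start updates once the CRB holds enough samples."""
    return stored_samples >= cfg.crb_training_threshold


cfg = MASACConfig()
new_target = soft_update([0.0, 2.0], [1.0, 2.0], cfg.soft_update_rate)
```

With a rate of 0.01, each update moves the target parameters only 1% of the way toward the online parameters, which is why the same object also carries the 500-step episode length and 5000-episode budget that determine how many such updates occur.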