Research Article
Joint Optimization of Jamming Link and Power Control in Communication Countermeasures: A Multiagent Deep Reinforcement Learning Approach
Table 3
Main hyperparameters of MASAC.
| Parameters | Value |
| --- | --- |
| | -0.25 |
| | 5 |
| | 1 |
| Training episodes | 5000 |
| Total steps of each episode | 500 |
| Soft updating rate | 0.01 |
| Capacity of CRB | $2^{17}$ |
| Minibatch size | 256 |
| Initial value of entropy coefficient | 1 |
| Discount factor | 0.98 |
| Learning rate of policy network | 0.001 |
| Learning rate of evaluation network | 0.003 |
| Training threshold of CRB | 1024 |
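As an illustration only, the named entries of Table 3 could be collected into a single configuration object like the sketch below. The class name `MASACConfig`, all field names, and the helper `soft_update` are hypothetical and do not come from the article. The three table rows whose parameter names are missing are left out, and the CRB capacity is assumed to be 2^17, since a capacity of 217 would be smaller than both the minibatch size and the training threshold. The soft update shown is the commonly used Polyak averaging form, which is an assumption about how the soft updating rate is applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MASACConfig:
    """Named hyperparameters from Table 3 (field names are hypothetical)."""
    training_episodes: int = 5000
    steps_per_episode: int = 500
    soft_update_rate: float = 0.01
    crb_capacity: int = 2 ** 17      # assumption: the table value read as 2^17
    minibatch_size: int = 256
    initial_entropy_coef: float = 1.0
    discount_factor: float = 0.98
    policy_lr: float = 1e-3          # learning rate of policy network
    evaluation_lr: float = 3e-3      # learning rate of evaluation network
    crb_training_threshold: int = 1024

    def validate(self) -> None:
        """Check basic consistency between the buffer-related settings."""
        if not self.minibatch_size <= self.crb_training_threshold <= self.crb_capacity:
            raise ValueError("expected minibatch <= training threshold <= capacity")
        if not 0.0 < self.soft_update_rate <= 1.0:
            raise ValueError("soft update rate must lie in (0, 1]")
        if not 0.0 <= self.discount_factor <= 1.0:
            raise ValueError("discount factor must lie in [0, 1]")


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Polyak averaging (assumed form): target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]


cfg = MASACConfig()
cfg.validate()
total_steps = cfg.training_episodes * cfg.steps_per_episode
```

With these values, training would start once the buffer holds 1024 transitions, and the full run would cover `total_steps` environment steps.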