Research Article

Flipit Game Deception Strategy Selection Method Based on Deep Reinforcement Learning

Figure 11. Convergence performance of MFD-A2C and MFD-PPO algorithms.