Security and Communication Networks

Research Article

Network Security Defense Decision-Making Method Based on Stochastic Game and Deep Reinforcement Learning

Cyber adaptive defense countermeasures algorithm.

Input: ; learning rate ; reward discount factor ; exploration probability ; convergence accuracy ; stable duration ;
Output: optimal defense strategy .
Begin
(1)	Initialization:
(2)	Solve Bayesian Nash equilibrium:
(3)	Network defense strategy revenue function:
(4)	Get current network status:
(5)	repeat:
(6)	Select defensive actions through algorithm:
(7)	Output
(8)	Get a new network status:
(9)	Update and learn Q according to the phased results:
(10)	Update Bayesian Nash Equilibrium:
(11)
(12)
(13)	Until
(14)	Output
(15)	End
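
The learning loop in steps (4) to (13) can be illustrated with a minimal sketch. The formulas of the listing are not recoverable here, so this is not the authors' implementation: it assumes ordinary tabular Q-learning with epsilon-greedy action selection (not a deep network), and it leaves out the Bayesian Nash equilibrium steps (2) and (10) entirely. The names alpha, gamma, epsilon, tol, and stable_steps are assumed stand-ins for the learning rate, reward discount factor, exploration probability, convergence accuracy, and stable duration; toy_step is a hypothetical two-state game invented for illustration only.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def learn_defense_policy(step, n_states, n_actions, start_state=0,
                         alpha=0.1, gamma=0.9, epsilon=0.1,
                         tol=1e-4, stable_steps=200,
                         max_steps=100_000, seed=0):
    """Tabular Q-learning sketch (assumed; equilibrium steps omitted).

    step(state, action) -> (next_state, reward) plays the role of the
    network environment. Learning stops once every Q update has stayed
    below tol for stable_steps consecutive steps, or after max_steps.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]   # step (1): initialize Q
    state, stable = start_state, 0                     # step (4): current status
    for _ in range(max_steps):                         # step (5): repeat
        action = epsilon_greedy(q[state], epsilon, rng)        # step (6)
        next_state, reward = step(state, action)               # step (8)
        # Step (9): standard temporal-difference update of Q.
        delta = alpha * (reward + gamma * max(q[next_state]) - q[state][action])
        q[state][action] += delta
        # Step (13): count consecutive updates smaller than the accuracy bound.
        stable = stable + 1 if abs(delta) < tol else 0
        if stable >= stable_steps:
            break
        state = next_state
    # Step (14): the greedy action in each state is the learned strategy.
    policy = [max(range(n_actions), key=q[s].__getitem__)
              for s in range(n_states)]
    return policy, q


def toy_step(state, action):
    """Hypothetical deterministic game. States: 0 normal, 1 under attack.
    Actions: 0 monitor, 1 block."""
    if state == 0:
        # An attack always follows; blocking with no attack costs 0.2.
        return 1, (0.0 if action == 0 else -0.2)
    # Under attack: blocking repels it (+1), monitoring lets it persist (-1).
    return (0, 1.0) if action == 1 else (1, -1.0)
```

Calling learn_defense_policy(toy_step, n_states=2, n_actions=2) returns the greedy policy and the Q table. In this toy game the learned policy should be to monitor in the normal state and block under attack. A real network environment would be stochastic and far larger, which is where the function approximation and equilibrium updates of the original method come in.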