Security and Communication Networks

Research Article

Stochastic Adaptive Forwarding Strategy Based on Deep Reinforcement Learning for Secure Mobile Video Communications in NDN

SAF-DRL for secure mobile video communications in NDN.

(1)Randomly initialize critic network , and actor network with random parameters , and respectively;
(2)Initialize target critic network , and target actor network with parameters , and respectively;
(3)Initialize replay buffer ;
(4)Initialize the forwarding probability of the out-interface , and = 0;
(5)Receive the observed initial state ; /Decision Epoch/
(6)for = 1 do
(7)Obtain the action by the current policy and exploration noise ;
(8)Execute action , receive reward and observe next state ;
(9)Store transition sample in ; /Training Transition Sampling/
(10)Sample a mini-batch of transition from ;
(11)Execute action , where the ;
(12)Compute the value of critic network:
(13)Update the critic by minimizing the loss:
(14)if then
(15)Compute the actor update by policy gradient:
	;
	/Target Network Update/
(16)Update the parameters of the target networks:
	;
	;
(17)end if
(18)end for
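
The listing above has lost its mathematical symbols, so the following is only a minimal sketch of the generic building blocks that such an actor-critic procedure typically relies on: a replay buffer (steps 3, 9, 10), a noise-perturbed target action (step 11), a bootstrapped critic target (step 12), and a soft target-network update (step 16). Everything named here is an assumption made for illustration and not taken from the source: the scalar action bounded to [0, 1] (standing in for a forwarding probability), the hyperparameters `gamma`, `tau`, `sigma`, and `noise_clip`, and the use of the minimum over several target critics. The critics and actor are plain Python callables rather than neural networks.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.data = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the buffer size.
        return self.rng.sample(list(self.data), min(batch_size, len(self.data)))


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def smoothed_target_action(target_actor, s_next, rng, sigma=0.2, noise_clip=0.5,
                           a_low=0.0, a_high=1.0):
    """Target-actor output plus clipped Gaussian noise, kept in the action range.

    sigma, noise_clip and the [a_low, a_high] range are assumed values.
    """
    noise = clip(rng.gauss(0.0, sigma), -noise_clip, noise_clip)
    return clip(target_actor(s_next) + noise, a_low, a_high)


def td_target(r, s_next, a_next, target_critics, gamma=0.99):
    """Bootstrapped target using the smallest target-critic estimate (assumed)."""
    return r + gamma * min(q(s_next, a_next) for q in target_critics)


def soft_update(target_params, params, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * p + (1.0 - tau) * tp for p, tp in zip(params, target_params)]


# Small demonstration with toy values.
rng = random.Random(1)
buf = ReplayBuffer(capacity=3)
for t in range(5):
    buf.store(t, 0.5, 1.0, t + 1)      # only the 3 most recent transitions remain
batch = buf.sample(2)

a_next = smoothed_target_action(lambda s: 0.9, 0, rng)
y = td_target(1.0, 0, 0.5, [lambda s, a: 2.0, lambda s, a: 3.0], gamma=0.9)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

In this sketch the critic loss of step 13 would be the mean squared difference between `y` and each critic's estimate over the sampled mini-batch; the gradient steps themselves are omitted because they depend on the network library and on details that the listing no longer shows.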