Research Article

Stochastic Adaptive Forwarding Strategy Based on Deep Reinforcement Learning for Secure Mobile Video Communications in NDN

Algorithm 1

SAF-DRL for secure mobile video communications in NDN.
(1)Randomly initialize critic network , and actor network with random parameters , and respectively;
(2)Initialize target critic network , and target actor network with parameters , and respectively;
(3)Initialize replay buffer ;
(4)Initialize the forwarding probability of the out-interface , and  = 0;
(5)Receive the observed initial state ;
/* Decision Epoch */
(6)for  = 1 do
(7)Obtain the action by the current policy and exploration noise ;
(8)Execute action , receive reward and observe next state ;
(9)Store transition sample in ;
/* Training Transition Sampling */
(10)Sample a mini-batch of transition from ;
(11)Execute action , where the ;
(12)Compute the value of critic network: 
(13)Update the critic by minimizing the loss:
(14)if then
(15)Compute the actor update by policy gradient:
;
/* Target Network Update */
(16)Update the parameters of the target networks:
;
  ;
(17)end if
(18)end for
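
The symbols in the listing above did not survive extraction, so the following is only a minimal sketch of the building blocks its steps appear to describe: a replay buffer (steps 3, 9, 10), a noise-perturbed target action (step 11), a bootstrapped critic target (step 12), and a target-network update (step 16). It assumes the common actor-critic pattern with clipped Gaussian noise on the target action, a minimum taken over the target critics, and Polyak (soft) averaging for the target parameters; the paper's exact formulas may differ. All names and parameters here (`ReplayBuffer`, `smoothed_target_action`, `td_target`, `soft_update`, `sigma`, `noise_clip`, `gamma`, `tau`) are hypothetical and chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest transition when full.
        self.items = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.items.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, capped at the buffer size.
        return rng.sample(list(self.items), min(batch_size, len(self.items)))


def clip(x, low, high):
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def smoothed_target_action(target_actor, next_state, sigma, noise_clip, rng=random):
    """Target-actor output plus clipped Gaussian noise (assumed form of step 11)."""
    return [a + clip(rng.gauss(0.0, sigma), -noise_clip, noise_clip)
            for a in target_actor(next_state)]


def td_target(reward, next_state, next_action, target_critics, gamma):
    """Reward plus discounted minimum target-critic value (assumed form of step 12)."""
    return reward + gamma * min(q(next_state, next_action) for q in target_critics)


def soft_update(target_params, online_params, tau):
    """Polyak averaging (assumed form of step 16): tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt
            for w, wt in zip(online_params, target_params)]
```

In a training loop, `store` would be called once per decision epoch, `sample` would provide the mini-batch, and `soft_update` would run only on the iterations where the condition in step 14 holds, so that the target networks change more slowly than the critic.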