Research Article
Stochastic Adaptive Forwarding Strategy Based on Deep Reinforcement Learning for Secure Mobile Video Communications in NDN
Algorithm 1: SAF-DRL for secure mobile video communications in NDN.
(1) Randomly initialize critic network , and actor network with random parameters , and respectively;
(2) Initialize target critic network , and target actor network with parameters , and respectively;
(3) Initialize replay buffer ;
(4) Initialize the forwarding probability of the out-interface , and = 0;
(5) Receive the observed initial state ;
/* Decision Epoch */
(6) for = 1 do
(7)   Obtain the action by the current policy and exploration noise ;
(8)   Execute action , receive reward and observe next state ;
(9)   Store transition sample in ;
      /* Training Transition Sampling */
(10)  Sample a mini-batch of transition from ;
(11)  Execute action , where the ;
(12)  Compute the value of critic network:
(13)  Update the critic by minimizing the loss:
(14)  if then
(15)    Compute the actor update by policy gradient: ;
        /* Target Network Update */
(16)    Update the parameters of the target networks: ; ;
(17)  end if
(18) end for
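The building blocks named in Algorithm 1 (replay buffer, exploration noise on the forwarding probabilities, bootstrapped critic target, and target-network update) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation. It rests on assumptions the listing does not state: Gaussian exploration noise, a target taken as the minimum over one or more target critics, Polyak (soft) averaging with a hypothetical rate `tau`, and a discount factor `gamma`. All class, function, and parameter names are hypothetical.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Draw a mini-batch uniformly without replacement.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states = map(np.array, zip(*batch))
        return states, actions, rewards, next_states


def noisy_forwarding_probs(raw_action, noise_std, rng):
    """Add Gaussian exploration noise (assumed), then renormalize so the
    result is a valid forwarding-probability vector over out-interfaces."""
    raw = np.asarray(raw_action, dtype=float)
    noisy = raw + rng.normal(0.0, noise_std, size=raw.shape)
    noisy = np.clip(noisy, 1e-6, None)  # keep every interface selectable
    return noisy / noisy.sum()


def td_target(rewards, next_states, target_actor, target_critics, gamma):
    """Bootstrapped target r + gamma * min_i Q'_i(s', pi'(s')) (assumed form)."""
    next_actions = target_actor(next_states)
    q_next = np.min([q(next_states, next_actions) for q in target_critics], axis=0)
    return rewards + gamma * q_next


def soft_update(target_params, online_params, tau):
    """Polyak averaging (assumed): target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online_params, target_params)]
```

In a forwarding loop built on this sketch, an out-interface could be drawn for each Interest with `rng.choice(len(probs), p=probs)`, where `probs` is the vector returned by `noisy_forwarding_probs`. The actor and target updates would then run only on the epochs for which the condition in step (14) holds.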