Research Article
Stochastic Adaptive Forwarding Strategy Based on Deep Reinforcement Learning for Secure Mobile Video Communications in NDN
Algorithm 1
SAF-DRL for secure mobile video communications in NDN.
(1) Randomly initialize the critic network and the actor network with random parameters;
(2) Initialize the target critic network and the target actor network with their respective parameters;
(3) Initialize the replay buffer;
(4) Initialize the forwarding probability of the out-interface, and … = 0;
(5) Receive the observed initial state;
/* Decision Epoch */
(6) for … = 1 … do
(7)     Obtain the action from the current policy and exploration noise;
(8)     Execute the action, receive the reward, and observe the next state;
(9)     Store the transition sample in the replay buffer;
    /* Training Transition Sampling */
(10)    Sample a mini-batch of transitions from the replay buffer;
(11)    Execute action …, where the …;
(12)    Compute the value of the critic network: …
(13)    Update the critic by minimizing the loss: …
(14)    if … then
(15)        Compute the actor update by policy gradient: …
        /* Target Network Update */
(16)        Update the parameters of the target networks: …
(17)    end if
(18) end for
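The building blocks that Algorithm 1 relies on (replay buffer, critic target, critic loss, and target network update) can be illustrated with a minimal sketch. It assumes the standard DDPG forms of these steps, which may differ from the exact expressions used in SAF-DRL. The names `ReplayBuffer`, `td_target`, `critic_loss`, and `soft_update`, as well as the discount factor `gamma` and the update rate `tau`, are hypothetical and do not come from the algorithm listing.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        # The oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement, as in standard DDPG (assumption).
        batch = self.rng.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states = map(np.array, zip(*batch))
        return states, actions, rewards, next_states


def td_target(rewards, next_q_values, gamma):
    """Assumed standard target: y = r + gamma * Q'(s', mu'(s'))."""
    return rewards + gamma * next_q_values


def critic_loss(targets, q_values):
    """Mean squared error between the targets and the critic's estimates."""
    return float(np.mean((targets - q_values) ** 2))


def soft_update(target_params, online_params, tau):
    """Assumed soft update: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

In a full training loop, `next_q_values` would come from the target critic evaluated at the target actor's action, and the gradients of `critic_loss` and of the policy objective would be computed by an automatic differentiation library, which this sketch leaves out.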