The DQN-based cache content update algorithm. |
Input: The feature of the state |
Initialise the parameter and and instant reward =0 |
for step =1, Y do |
for, do |
Receive a content request |
if the content request is cached locally, then |
BS directly delivers the requested content to the user end epoch |
elif |
The cache capacity is not full, then |
BS retrieves the requested content from the core network and delivers the requested content to the user |
The requested content is cached locally end epoch |
elif |
The cache capacity is full, then |
observe the current state |
randomly generate a value |
if < , then |
randomly select an action from the action spaces |
else |
= argmax |
end if |
execute , receive the reward , next state |
store (, , , ) into the experience replay memory |
randomly selects a mini-batch of the experiences |
update the parameter of the evaluation via the minimisation of the backpropagated loss |
update the parameter of the evaluation in several time slots |
end if |
end for |
end for |