Joint Channel Allocation and Power Control Based on Long Short-Term Memory Deep Q Network in Cognitive Radio Networks
Algorithm 1: The joint design algorithm of LSTM-DQN.
(1) Initialization: the capacity O of memory D, the transmit powers of the PUs and SUs, the channel interference matrix, the weights of the estimation LSTM-DQN Q network, and the weights of the target LSTM-DQN
(2) For episode = 1 to E do
(3)     According to the initial state, the SUs select actions at random with the exploration probability; otherwise they choose the actions with the highest estimated Q-value
(4)     For t = 1 to T do
(5)         The PUs update their transmit power according to their own power control strategies
(6)         The SUs select actions at random with the exploration probability; otherwise they select the action with the highest estimated Q-value
(7)         Obtain the rewards and the next state
(8)         Save the experience data to memory D
(9)         If … then
(10)            Select training samples randomly from D
(11)            Calculate the target Q-value
(12)            Use gradient descent to minimize the loss function and update the parameters
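
The inner part of Algorithm 1 (action selection, experience storage, random sampling, target calculation, and the gradient update) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: a small Q table stands in for the LSTM network, the target is assumed to be the standard DQN target (reward plus discounted maximum of the target network's Q-values), and the training trigger is assumed to be "memory holds at least one mini-batch", since the listing does not give the condition. All names (`epsilon_greedy`, `td_target`, `ReplayMemory`, `train_step`) and the parameters `gamma` and `lr` are hypothetical.

```python
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma):
    """Assumed standard DQN target: r + gamma * max_a' Q_target(s', a')."""
    return reward + gamma * max(next_q_values)


class ReplayMemory:
    """Memory D with a fixed capacity; the oldest experience is dropped when full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng):
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def train_step(q_table, target_table, memory, batch_size, gamma, lr, rng):
    """One update on a tabular stand-in for the Q network.

    Returns False when the (assumed) trigger condition is not met.
    """
    if len(memory) < batch_size:  # assumed condition, not taken from the source
        return False
    for s, a, r, s_next in memory.sample(batch_size, rng):
        y = td_target(r, target_table[s_next], gamma)
        # Gradient descent on the squared loss 0.5 * (y - Q(s, a))^2.
        q_table[s][a] += lr * (y - q_table[s][a])
    return True
```

In a full implementation the table lookups would be replaced by forward passes through the estimation and target LSTM-DQN networks, and the update by an optimizer step on the network weights.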