Research Article
Resource Scheduling in URLLC and eMBB Coexistence Based on Dynamic Selection Numerology
Algorithm 1
DQN-based resource allocation algorithm.
| 1 Initialize the replay memory and set its capacity |
| 2 Initialize the action-value function Q and the target Q with random weights |
| 3 For each episode, do |
| 4 repeat |
| 5 With a given exploration probability, select a random action and update the numerology value; otherwise, select the action with the largest Q value |
| 6 Execute the action, observe the reward and the new state |
| 7 Store the transition in the replay memory |
| 8 Randomly sample a batch of transitions from the replay memory |
| 9 Update the action-value function Q with limit |
| 10 Every C steps, reset the target Q to the current Q |
| 11 until state S is a terminal state |
| 12 End For |
| 13 Return the optimal strategy |
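The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the environment (`toy_step`), the state and action spaces, and all hyperparameter names and values (`epsilon`, `gamma`, `lr`, `sync_every`, `batch_size`) are assumptions made for illustration. For brevity, the Q "network" is a table of weights with one entry per state-action pair rather than a neural network, and the non-random branch of step 5 is assumed to be the usual greedy choice.

```python
import random
from collections import deque

NUM_STATES = 4                 # hypothetical: discretized traffic-load levels
NUM_ACTIONS = 3                # hypothetical: candidate numerology indices
TERMINAL = NUM_STATES - 1      # hypothetical terminal state


def toy_step(state, action):
    """Hypothetical environment: reward 1 when the chosen numerology
    index matches state % NUM_ACTIONS, then move to the next state."""
    reward = 1.0 if action == state % NUM_ACTIONS else 0.0
    next_state = state + 1
    return reward, next_state, next_state == TERMINAL


def greedy(q, state):
    """Index of the action with the largest Q value in this state."""
    return max(range(NUM_ACTIONS), key=lambda a: q[state][a])


def train(episodes=500, capacity=500, batch_size=16, gamma=0.9,
          epsilon=0.2, lr=0.1, sync_every=20, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)                        # step 1
    q = [[rng.uniform(-0.01, 0.01) for _ in range(NUM_ACTIONS)]
         for _ in range(NUM_STATES)]                       # step 2
    target_q = [row[:] for row in q]
    steps = 0
    for _ in range(episodes):                              # step 3
        state, done = 0, False
        while not done:                                    # steps 4 and 11
            if rng.random() < epsilon:                     # step 5: explore
                action = rng.randrange(NUM_ACTIONS)
            else:                                          # step 5: exploit
                action = greedy(q, state)
            reward, next_state, done = toy_step(state, action)        # step 6
            memory.append((state, action, reward, next_state, done))  # step 7
            batch = rng.sample(list(memory),
                               min(batch_size, len(memory)))          # step 8
            for s, a, r, s2, d in batch:                   # step 9
                target = r if d else r + gamma * max(target_q[s2])
                q[s][a] += lr * (target - q[s][a])
            steps += 1
            if steps % sync_every == 0:                    # step 10
                target_q = [row[:] for row in q]
            state = next_state
    # step 13: greedy policy for every non-terminal state
    return [greedy(q, s) for s in range(TERMINAL)]


if __name__ == "__main__":
    print(train())
```

In this toy setting the reward depends only on whether the action matches the state, so the learned greedy policy should pick action `s % NUM_ACTIONS` in each non-terminal state `s`. Replacing the table with a neural network and `toy_step` with a scheduler simulation would keep the same loop structure.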