Research Article
A Deep Reinforcement Learning Approach to the Optimization of Data Center Task Scheduling
Table 1
Notations in the scheduling system model.
| Notation | Memo | Type |
| --- | --- | --- |
| | The duration of the task scheduling period | Model parameter |
| | The duration of the resource optimization period | Model parameter |
| | The priority function that estimates the priority of task i | Function |
| | The start time of task scheduling, which also serves as the ID of the task scheduling period (this period is also called a time slot) | Variable |
| | The start time of resource optimization, which also serves as the ID of the resource optimization period | Variable |
| | The state, action, and reward vectors for the task scheduling agent | Variable |
| | The state, action, and reward vectors for the resource optimization agent | Variable |
| , | Calibration parameters to adjust the influence of average task priority and active virtual machine proportion | Model parameter |
| , | Calibration parameters to tune the proportions of active and idle virtual machines | Model parameter |
| , | The sum of the execution time of tasks arriving in period and the sum of the execution time of tasks not executed in period | Variable |
| , | The number of tasks arriving in period and the number of tasks not executed in period | Variable |
| M | The number of virtual machines in the cloud server | Model parameter |
| K | The ratio of to | Model parameter |
| ,,, | The hyperparameters of the A2C algorithm | Hyperparameter |