Research Article
An Efficient Resource Management Optimization Scheme for Internet of Vehicles in Edge Computing Environment
Algorithm 1
Resource management algorithm based on distributed reinforcement learning.
| Input: actor network, actor target network, critic network, and critic target network; learning rate, discount rate, and attenuation factor. |
| Output: computing task offloading policy. |
| Initialize the critic network parameters and the actor network parameters. |
| Initialize the status of the experience replay pool and the task vehicle. |
| For ... do |
| Observe the environment state and select an action based on the current policy. |
| Execute the action, obtain the reward, and transition to the next state. |
| Save the transition tuple to the experience replay pool. |
| If the replay pool is full but the stop condition is not met, randomly sample a mini-batch of tuples from the experience replay pool. |
| Update the critic network parameters, the actor network parameters, and the target network parameters. |
| End |
| End |
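The control flow of Algorithm 1 can be illustrated with a minimal sketch. It is not the authors' implementation: the linear actor and critic stand in for the neural networks, the one-dimensional state, the clipped action (read here as an offloading fraction), and the toy reward are invented for illustration, and the attenuation factor is interpreted as a soft-update coefficient `tau` for the target networks, which is an assumption. All names (`ReplayPool`, `soft_update`, `train`) are hypothetical, and a single loop is used where the listing appears to nest two.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity experience replay pool of (s, a, r, s_next) tuples."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest tuples are evicted first
        self.rng = random.Random(seed)

    def save(self, transition):
        self.buffer.append(transition)

    def is_full(self):
        return len(self.buffer) == self.buffer.maxlen

    def sample(self, batch_size):
        # Uniform random mini-batch without replacement.
        return self.rng.sample(list(self.buffer), batch_size)


def soft_update(target, online, tau):
    """Move each target parameter a fraction tau toward the online parameter."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def critic_q(w, s, a):
    # Toy linear critic Q(s, a) = w0*s + w1*a + w2 (stand-in for a network).
    return w[0] * s + w[1] * a + w[2]


def actor_action(theta, s):
    # Toy linear actor, clipped to [0, 1] (assumed offloading fraction).
    return min(1.0, max(0.0, theta[0] * s + theta[1]))


def train(steps=200, capacity=32, batch_size=8,
          lr=0.01, gamma=0.9, tau=0.05, seed=0):
    rng = random.Random(seed)
    critic, actor = [0.0, 0.0, 0.0], [0.0, 0.5]
    critic_t, actor_t = list(critic), list(actor)  # target copies
    pool = ReplayPool(capacity, seed)
    s = rng.random()
    for _ in range(steps):
        # Select an action from the current policy plus exploration noise.
        a = min(1.0, max(0.0, actor_action(actor, s) + rng.gauss(0.0, 0.1)))
        r = -(a - s) ** 2          # invented reward: best action equals state
        s_next = rng.random()      # invented environment transition
        pool.save((s, a, r, s_next))
        if pool.is_full():
            for bs, ba, br, bs2 in pool.sample(batch_size):
                # Critic: one TD step toward the target-network estimate.
                y = br + gamma * critic_q(critic_t, bs2,
                                          actor_action(actor_t, bs2))
                td = y - critic_q(critic, bs, ba)
                critic = [critic[0] + lr * td * bs,
                          critic[1] + lr * td * ba,
                          critic[2] + lr * td]
                # Actor: ascend Q; dQ/da = critic[1], da/dtheta = (s, 1).
                # The clipping in actor_action is ignored in this gradient.
                actor = [actor[0] + lr * critic[1] * bs,
                         actor[1] + lr * critic[1]]
            critic_t = soft_update(critic_t, critic, tau)
            actor_t = soft_update(actor_t, actor, tau)
        s = s_next
    return actor, critic, actor_t, critic_t, pool
```

Calling `train()` returns the online and target parameters together with the filled pool. In a realistic setting the linear functions would be replaced by neural networks and the toy reward by the system's delay and energy cost model.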