Mobile Information Systems

Research Article

Joint Optimization for MEC Computation Offloading and Resource Allocation in IoV Based on Deep Reinforcement Learning

Decentralized multiagent DDPG optimization method.

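	Randomly initialize the critic network $Q(s, a \mid \theta^{Q})$ and the actor network $\mu(s \mid \theta^{\mu})$ with weights $\theta^{Q}$ and $\theta^{\mu}$
	Initialize the target networks $Q'$ and $\mu'$ with weights $\theta^{Q'} \leftarrow \theta^{Q}$, $\theta^{\mu'} \leftarrow \theta^{\mu}$
	Initialize the replay buffer $R$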
	for episode $= 1, \dots, M$ do
	Initialize a random process $\mathcal{N}$ for action exploration
	Receive the initial observation state $s_1$

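	for $t = 1, \dots, T$ do
	Select action $a_t = \mu(s_t \mid \theta^{\mu}) + \mathcal{N}_t$ according to the current policy and exploration noise
	Execute action $a_t$, then observe the reward $r_t$ and the next state $s_{t+1}$
	Store the transition $(s_t, a_t, r_t, s_{t+1})$ in $R$
	Sample a random mini-batch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$
	Set
		$y_i = r_i + \gamma Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$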
	Update the critic network by minimizing the loss
		$L = \frac{1}{N} \sum_{i} \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2$
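	Update the actor policy by using the sampled policy gradient
		$\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i} \nabla_{a} Q(s, a \mid \theta^{Q}) \big|_{s = s_i,\, a = \mu(s_i)} \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \big|_{s = s_i}$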
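	Update the target networks for each agent:
		$\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau) \theta^{Q'}$
		$\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau) \theta^{\mu'}$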
	end for
	end for
	end for
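
The numerical steps of the listing above (replay buffer, target value, critic loss, and soft target update) can be illustrated with a minimal sketch in plain Python. It assumes the standard DDPG forms of these steps. The names `ReplayBuffer`, `td_targets`, `critic_loss`, and `soft_update`, as well as the default values of `gamma` and `tau`, are hypothetical and are not taken from the article. Network parameters are represented as flat lists of floats, and the actor and critic networks themselves are omitted.

```python
import random


class ReplayBuffer:
    """Fixed-capacity buffer that overwrites the oldest transition when full."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.storage = []
        self.position = 0  # index of the next slot to overwrite once full
        self.rng = random.Random(seed)

    def store(self, transition):
        # A transition is assumed to be a tuple (s, a, r, s_next).
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.position] = transition
        self.position = (self.position + 1) % self.capacity

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return self.rng.sample(self.storage, batch_size)


def td_targets(rewards, next_q_values, gamma=0.99):
    """y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})).

    next_q_values holds the target critic's outputs for the next states.
    """
    return [r + gamma * q for r, q in zip(rewards, next_q_values)]


def critic_loss(targets, q_values):
    """Mean squared error between the targets y_i and Q(s_i, a_i)."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(target_params, online_params, tau=0.01):
    """theta' <- tau * theta + (1 - tau) * theta', element by element."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=2)
    for step in range(3):
        buffer.store((step, 0.0, 1.0, step + 1))  # toy transitions
    batch = buffer.sample(2)
    y = td_targets([t[2] for t in batch], [0.0, 0.0])
    print(len(batch), critic_loss(y, [1.0, 1.0]))
```

In a decentralized setting, each agent would presumably hold its own buffer and parameter lists and call these helpers independently. That reading is an assumption based on the algorithm's title.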