Research Article
Modeling Autonomous Vehicles’ Altruistic Behavior toward Human-Driven Vehicles in Car-Following Events and Impact Analysis
(1) Randomly initialize critic network $Q(s, a \mid \theta^{Q})$ and actor $\mu(s \mid \theta^{\mu})$ with weights $\theta^{Q}$ and $\theta^{\mu}$
(2) Initialize target networks $Q'$ and $\mu'$ with weights $\theta^{Q'} \leftarrow \theta^{Q}$, $\theta^{\mu'} \leftarrow \theta^{\mu}$
(3) Initialize replay buffer $R$
(4) For episode = 1, M do
(5)   Initialize a random process $\mathcal{N}$ for action exploration
(6)   Receive initial observation state $s_1$
(7)   For t = 1, T do
(8)     Select action $a_t = \mu(s_t \mid \theta^{\mu}) + \mathcal{N}_t$ according to the current policy and exploration noise
(9)     Execute action $a_t$ and observe reward $r_t$ and new state $s_{t+1}$
(10)    Store transition $(s_t, a_t, r_t, s_{t+1})$ in $R$
(11)    Sample a random minibatch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$
(12)    Set $y_i = r_i + \gamma\, Q'\!\left(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)$
(13)    Update critic by minimizing the loss: $L = \frac{1}{N}\sum_i \left(y_i - Q(s_i, a_i \mid \theta^{Q})\right)^2$
(14)    Update the actor policy using the sampled policy gradient: $\nabla_{\theta^{\mu}} J \approx \frac{1}{N}\sum_i \nabla_a Q(s, a \mid \theta^{Q})\big|_{s=s_i,\, a=\mu(s_i)} \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\big|_{s_i}$
(15)    Update the target networks: $\theta^{Q'} \leftarrow \tau \theta^{Q} + (1-\tau)\theta^{Q'}$, $\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1-\tau)\theta^{\mu'}$
(16)  End for
(17) End for
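To make the update steps (12)–(15) concrete, the following is a minimal PyTorch sketch of a single DDPG learning step. The network architectures, hyperparameter values, and helper names (Actor, Critic, soft_update, ddpg_update) are illustrative assumptions for exposition, not the implementation used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Actor(nn.Module):
    """Deterministic policy mu(s | theta_mu) -> a, squashed to [-1, 1]."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    """Action-value function Q(s, a | theta_Q)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


def soft_update(target, source, tau):
    """Step (15): theta_target <- tau * theta + (1 - tau) * theta_target."""
    for t_param, param in zip(target.parameters(), source.parameters()):
        t_param.data.copy_(tau * param.data + (1.0 - tau) * t_param.data)


def ddpg_update(batch, actor, critic, actor_t, critic_t,
                actor_opt, critic_opt, gamma=0.99, tau=0.005):
    # batch holds minibatch tensors; reward is expected with shape (N, 1).
    state, action, reward, next_state = batch

    # Step (12): y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1}))
    with torch.no_grad():
        y = reward + gamma * critic_t(next_state, actor_t(next_state))

    # Step (13): update the critic by minimizing the mean-squared loss L.
    critic_loss = F.mse_loss(critic(state, action), y)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Step (14): sampled policy gradient, i.e. ascend Q(s, mu(s)) in theta_mu.
    actor_loss = -critic(state, actor(state)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Step (15): soft-update both target networks.
    soft_update(critic_t, critic, tau)
    soft_update(actor_t, actor, tau)
```

In a full training loop, transitions collected in steps (8)–(10) would be stored in a replay buffer and sampled into `batch` at every step, as in line (11) of the pseudocode above.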