Research Article

Modeling Autonomous Vehicles’ Altruistic Behavior to Human-Driven Vehicles in the Car following Events and Impact Analysis

Algorithm 1

DDPG algorithm.
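(1) Randomly initialize critic network $Q(s, a \mid \theta^{Q})$ and actor $\mu(s \mid \theta^{\mu})$ with weights $\theta^{Q}$ and $\theta^{\mu}$
(2) Initialize target networks $Q'$ and $\mu'$ with weights $\theta^{Q'} \leftarrow \theta^{Q}$, $\theta^{\mu'} \leftarrow \theta^{\mu}$
(3) Initialize replay buffer $R$
(4) For episode = 1, M do
(5)   Initialize a random process $\mathcal{N}$ for action exploration
(6)   Receive initial observation state $s_1$
(7)   For t = 1, T do
(8)     Select action $a_t = \mu(s_t \mid \theta^{\mu}) + \mathcal{N}_t$ according to the current policy and exploration noise
(9)     Execute action $a_t$ and observe reward $r_t$ and new state $s_{t+1}$
(10)    Store transition $(s_t, a_t, r_t, s_{t+1})$ in $R$
(11)    Sample a random minibatch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$
(12)    Set $y_i = r_i + \gamma Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$
(13)    Update critic by minimizing the loss: $L = \frac{1}{N} \sum_i \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2$
(14)    Update actor policy using the sampled policy gradient: $\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_i \nabla_a Q(s, a \mid \theta^{Q}) \big|_{s = s_i,\, a = \mu(s_i)} \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \big|_{s_i}$
(15)    Update target networks: $\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau) \theta^{Q'}$, $\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau) \theta^{\mu'}$
(16)  End for
(17) End for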
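
The bookkeeping parts of Algorithm 1 (the replay buffer in steps 3, 10, and 11, the target values in step 12, the critic loss in step 13, and the target-network update in step 15) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ReplayBuffer`, `td_targets`, `critic_loss`, and `soft_update` are hypothetical, the actor and critic are passed in as ordinary callables, network weights are treated as flat lists of floats, and the discount factor `gamma` and soft-update rate `tau` follow the standard DDPG formulation. The gradient steps themselves (steps 13 and 14) would normally be handled by an automatic-differentiation library and are omitted here.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next) transitions (steps 3, 10, 11)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_actor, target_critic, gamma):
    """Step 12: y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1}))."""
    return [
        r + gamma * target_critic(s_next, target_actor(s_next))
        for (_, _, r, s_next) in batch
    ]


def critic_loss(batch, targets, critic):
    """Step 13: mean squared error between targets y_i and Q(s_i, a_i)."""
    squared_errors = [
        (y - critic(s, a)) ** 2 for (s, a, _, _), y in zip(batch, targets)
    ]
    return sum(squared_errors) / len(batch)


def soft_update(target_weights, online_weights, tau):
    """Step 15: theta' <- tau * theta + (1 - tau) * theta', element-wise."""
    return [
        tau * w + (1.0 - tau) * w_target
        for w, w_target in zip(online_weights, target_weights)
    ]
```

A small `tau` makes the target networks track the online networks slowly, which is what keeps the targets in step 12 from shifting abruptly between updates.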