Research Article

Multilane Spatiotemporal Trajectory Optimization Method (MSTTOM) for Connected Vehicles

Algorithm 1

Q-learning algorithm.
(1) Initialize q(s, a);
(2) While ()
(3) { Select the initial state s_0 and action a_0 according to the ε-greedy strategy;
(4) While ()
(5) { Select the action a_t in state s_t according to the ε-greedy strategy, then observe the reward r_t and the next state s_{t+1};
(6);
(7); } }
(8) Get the optimal strategy
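
The steps of Algorithm 1 can be illustrated with a minimal tabular sketch in Python. It is an illustration under stated assumptions, not the authors' implementation. The loop conditions and the expressions in steps (6) and (7) are not recoverable from the listing above, so the sketch assumes a fixed episode budget for the outer loop, a step cap for the inner loop, and the standard textbook update q(s_t, a_t) ← q(s_t, a_t) + α[r_t + γ max_a q(s_{t+1}, a) − q(s_t, a_t)] followed by s_t ← s_{t+1}. The chain environment (`chain_step`), the fixed start state, and the parameters `alpha`, `gamma`, `epsilon`, `episodes`, and `max_steps` are all hypothetical names and values chosen for the example.

```python
import random


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    best = max(q[state])
    # Break ties at random so that an all-zero table does not favor action 0.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def chain_step(state, action, n_states):
    """Hypothetical toy environment (not from the article).

    Action 1 moves one cell right, action 0 one cell left. Reaching the
    last cell gives reward 1 and ends the episode.
    """
    if action == 1:
        nxt = min(state + 1, n_states - 1)
    else:
        nxt = max(state - 1, 0)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(n_states=5, n_actions=2, episodes=500, max_steps=100,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Step (1): initialize q(s, a) to zero.
    q = [[0.0] * n_actions for _ in range(n_states)]
    # Step (2): outer loop; an episode budget is assumed here.
    for _ in range(episodes):
        # Step (3): initial state (fixed at 0 in this toy example).
        s = 0
        # Step (4): inner loop; a step cap is assumed here.
        for _ in range(max_steps):
            # Step (5): epsilon-greedy action, then reward and next state.
            a = epsilon_greedy(q, s, n_actions, epsilon, rng)
            s_next, r, done = chain_step(s, a, n_states)
            # Assumed step (6): standard Q-learning update.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            # Assumed step (7): advance to the next state.
            s = s_next
            if done:
                break
    # Step (8): greedy strategy read off the learned table.
    policy = [max(range(n_actions), key=lambda a: q[s][a])
              for s in range(n_states)]
    return q, policy


if __name__ == "__main__":
    q_table, greedy_policy = q_learning()
    print(greedy_policy)
```

On this toy chain, the greedy strategy extracted in step (8) should choose "move right" in every non-terminal state; the entry for the terminal state is arbitrary because its values are never updated.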