Research Article
Multilane Spatiotemporal Trajectory Optimization Method (MSTTOM) for Connected Vehicles
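| Step | Q-learning algorithm |
| --- | --- |
| (1) | Initialize $q(s, a)$; |
| (2) | While (training episodes remain) |
| (3) | { Select the initial state $s_0$ and action $a_0$ according to the ε-greedy strategy; |
| (4) | While ($s_t$ is not a terminal state) |
| (5) | { Select the action $a_t$ at the state $s_t$ according to the ε-greedy strategy, and get the reward $r_t$ and the next state $s_{t+1}$; |
| (6) | $q(s_t, a_t) \leftarrow q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a} q(s_{t+1}, a) - q(s_t, a_t) \right]$; |
| (7) | $s_t \leftarrow s_{t+1}$; } } |
| (8) | Get the optimal strategy $\pi^*(s) = \arg\max_{a} q(s, a)$ |

Here $\alpha$ is the learning rate and $\gamma$ is the discount factor.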
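The steps of the algorithm listing can be sketched in a few lines of Python. This is a minimal illustration of standard tabular Q-learning, not the implementation used in this article: the five-state chain environment, the `step` function, the reward values, and the hyperparameters `alpha`, `gamma`, and `epsilon` are all hypothetical choices made only so the example is self-contained.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption, not from the article).

    Returns (reward, next_state): +1 on reaching the terminal state,
    a small penalty of -0.01 for every other move.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else -0.01
    return reward, next_state


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Step (1): initialize q(s, a) to zero for every state-action pair.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):                       # step (2): loop over episodes
        state = 0                                   # step (3): fixed start state in this toy case
        while state != N_STATES - 1:                # step (4): until the terminal state
            # Step (5): choose a_t, observe r_t and s_{t+1}.
            action = epsilon_greedy(q, state, epsilon, rng)
            reward, next_state = step(state, action)
            # Step (6): temporal-difference update of q(s_t, a_t).
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            # Step (7): move to the next state.
            state = next_state
    # Step (8): read off the greedy strategy for the non-terminal states.
    policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
    return q, policy


q_table, policy = q_learning()
print(policy)
```

In this toy setting the greedy strategy read off in step (8) should choose "move right" in every non-terminal state, since that is the shortest path to the only positive reward. In a trajectory optimization setting, the state, action, and reward definitions would instead come from the vehicle model and objective described in the article.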