Research Article
Learning to Drive in the NGSIM Simulator Using Proximal Policy Optimization
| Symbols | Meaning | Values |
| --- | --- | --- |
| | The total training episodes | 200 |
| | The total simulation steps (batch size) | 2048 |
| | Learning rate | 0.0003 |
| | The number of repetitions of PPO training | 10 |
| | Minibatch size | 256 |
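To make the relationship between these settings concrete, the following minimal sketch collects the tabulated values in one place and derives how many gradient updates a single batch would produce. It is an illustration under stated assumptions, not the authors' implementation: the class and field names are hypothetical, the "number of repetitions of PPO training" is assumed to mean the number of passes (epochs) over each collected batch, and each batch is assumed to be split into non-overlapping minibatches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOHyperparameters:
    """Values copied from the table; all field names are hypothetical."""

    total_episodes: int = 200      # total training episodes
    batch_size: int = 2048         # total simulation steps (batch size)
    learning_rate: float = 0.0003  # learning rate
    ppo_epochs: int = 10           # repetitions of PPO training (assumed: passes per batch)
    minibatch_size: int = 256      # minibatch size

    def minibatches_per_epoch(self) -> int:
        # Assumes the batch divides evenly into non-overlapping minibatches.
        if self.batch_size % self.minibatch_size != 0:
            raise ValueError("batch_size must be a multiple of minibatch_size")
        return self.batch_size // self.minibatch_size

    def gradient_steps_per_batch(self) -> int:
        # One gradient step per minibatch, repeated for every epoch.
        return self.ppo_epochs * self.minibatches_per_epoch()


cfg = PPOHyperparameters()
print(cfg.minibatches_per_epoch())     # 8
print(cfg.gradient_steps_per_batch())  # 80
```

Under these assumptions, each batch of 2048 steps is divided into 8 minibatches of 256 samples, and 10 passes over the batch give 80 gradient updates before new experience is collected.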