Research Article

Learning to Drive in the NGSIM Simulator Using Proximal Policy Optimization

Table 2: The hyperparameters.

| Symbols | Meaning | Values |
|---|---|---|
| | Total training episodes | 200 |
| | Total simulation steps (batch size) | 2048 |
| | Learning rate | 0.0003 |
| | Number of repetitions of PPO training | 10 |
| | Minibatch size | 256 |
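
As a rough illustration of how the values in Table 2 relate to one another, the sketch below collects them in a small configuration object and derives the number of gradient steps taken on each collected batch. The class and field names are hypothetical and not taken from the article. The sketch also assumes that "the number of repetitions of PPO training" means the number of optimization passes (epochs) over each batch, and that each pass splits the batch into non-overlapping minibatches; the article's actual training loop may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOHyperparameters:
    """Values copied from Table 2; the field names are illustrative only."""

    total_episodes: int = 200      # total training episodes
    batch_size: int = 2048         # simulation steps collected per batch
    learning_rate: float = 3e-4    # 0.0003 in Table 2
    ppo_epochs: int = 10           # assumed: passes over each collected batch
    minibatch_size: int = 256

    @property
    def minibatches_per_epoch(self) -> int:
        # Assumes the batch is split into non-overlapping minibatches.
        return self.batch_size // self.minibatch_size

    @property
    def gradient_steps_per_batch(self) -> int:
        # One gradient step per minibatch, repeated for every pass.
        return self.ppo_epochs * self.minibatches_per_epoch


hp = PPOHyperparameters()
print(hp.minibatches_per_epoch)     # 8
print(hp.gradient_steps_per_batch)  # 80
```

Under these assumptions, each batch of 2048 steps is divided into 8 minibatches of 256 samples, so 10 passes yield 80 gradient steps per batch.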