Research Article

Learning to Drive in the NGSIM Simulator Using Proximal Policy Optimization

Figure 4: Change in the mean trajectory reward.