Research Article

Adaptive Optimization of Traffic Signal Timing via Deep Reinforcement Learning

Table 3

Simulation environment hyperparameters.

Parameter meaning                                   Value

Discount factor                                     0.99
Learning rate                                       0.001
Clip range                                          0.2
Simulation time per episode                         5000 s
Number of steps per update                          128
Entropy coefficient in the loss function            0.01
Value function coefficient in the loss function     0.5
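
The clip range, entropy coefficient, and value function coefficient in Table 3 are the kind of settings used by a clipped-surrogate policy-gradient objective such as PPO. The table itself does not name the algorithm, so the following is only a minimal sketch under that assumption, showing how these three values could enter a single scalar loss. The class name, field names, and function names are hypothetical and chosen for illustration; only the numeric defaults come from Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from Table 3; the field names are illustrative."""
    discount_factor: float = 0.99
    learning_rate: float = 0.001
    clip_range: float = 0.2
    episode_sim_time_s: int = 5000
    steps_per_update: int = 128
    entropy_coef: float = 0.01
    value_coef: float = 0.5


def clipped_surrogate(ratio: float, advantage: float, clip_range: float) -> float:
    """Clipped objective for one sample (a quantity to be maximized).

    `ratio` is the new-to-old action probability ratio; it is clipped to
    [1 - clip_range, 1 + clip_range] and the more pessimistic term is kept.
    """
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    return min(ratio * advantage, clipped_ratio * advantage)


def total_loss(ratios, advantages, value_errors, entropies, hp: Hyperparameters) -> float:
    """Assumed combined loss: policy term + value term - entropy bonus."""
    n = len(ratios)
    # Negate the surrogate because an optimizer minimizes the loss.
    policy_loss = -sum(
        clipped_surrogate(r, a, hp.clip_range) for r, a in zip(ratios, advantages)
    ) / n
    # Mean squared error between predicted values and return targets.
    value_loss = sum(e * e for e in value_errors) / n
    # Mean policy entropy; subtracting it rewards exploration.
    entropy = sum(entropies) / n
    return policy_loss + hp.value_coef * value_loss - hp.entropy_coef * entropy


hp = Hyperparameters()
loss = total_loss([1.5], [1.0], [2.0], [1.0], hp)
```

With a clip range of 0.2, a probability ratio of 1.5 on a positive advantage is capped at 1.2, so a single update cannot move the policy far from the one that collected the 128 steps of experience.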