Research Article

[Retracted] Gradient Descent Optimization in Deep Learning Model Training Based on Multistage and Method Combination Strategy

Table 5

Execution time per epoch under conflicting adjustment schedules on the learning rate.

Methods     Decay   1R1D   2R2D   3R3D   5R5D

RMSprop     100     108    109    111    113
Adam        109     122    124    126    129
NAdam       124     142    145    146    149
AdaMax      115     139    141    143    147
Adadelta    129     162    165    168    176
Adagrad     128     167    170    171    173

Unit: s/epoch. kRkD: the learning rate is raised k times and decreased k times in each iteration.
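The kRkD notation in the footnote can be sketched as a simple schedule helper. This is a minimal illustration of the idea only, assuming multiplicative adjustments with illustrative raise/decay factors; the function name, signature, and factor values are assumptions, not taken from the paper.

```python
def krkd_learning_rates(base_lr, k, raise_factor=1.1, decay_factor=0.9):
    """Return the learning rate after each of the 2k adjustments in one
    iteration: k raises followed by k decreases (kRkD).

    The multiplicative factors are illustrative assumptions."""
    lrs = []
    lr = base_lr
    # First the k raises...
    for _ in range(k):
        lr *= raise_factor
        lrs.append(lr)
    # ...then the k decreases.
    for _ in range(k):
        lr *= decay_factor
        lrs.append(lr)
    return lrs
```

For example, a 2R2D schedule starting from a base rate of 0.1 produces four adjusted rates per iteration, ending at 0.1 × 1.1² × 0.9².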