Research Article

[Retracted] Gradient Descent Optimization in Deep Learning Model Training Based on Multistage and Method Combination Strategy

Table 4

Execution time under conflicting learning-rate schedules.

Methods       Decay   1R1D   2R2D   3R3D   5R5D

RMSprop         64     69     70     71     72
Adam            70     78     79     81     83
NAdam [29]      79     91     92     93     95
AdaMax [5]      74     89     90     92     94
Adadelta        83    104    106    108    112
Adagrad         82    107    108    109    111

Unit: ms/batch. kRkD: the learning rate is raised k times and decreased k times in each iteration.
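The kRkD notation can be made concrete with a small sketch. The function below is a hypothetical illustration (the name `krkd_schedule` and the multiplicative factors are assumptions, not taken from the paper): within one iteration, the learning rate is raised k times and then decreased k times, producing the "conflicting" adjustment pattern the table benchmarks.

```python
def krkd_schedule(lr, k, raise_factor=1.1, decay_factor=0.9):
    """Apply the kRkD pattern to a learning rate within one iteration:
    raise it k times, then decrease it k times (multiplicatively).

    raise_factor and decay_factor are illustrative values, not from the paper.
    """
    for _ in range(k):
        lr *= raise_factor   # k raises
    for _ in range(k):
        lr *= decay_factor   # k decreases
    return lr
```

For example, with k = 1 an initial rate of 0.1 becomes 0.1 × 1.1 × 0.9 = 0.099; each extra unit of k multiplies in another 1.1 × 0.9 = 0.99 factor, so larger k means more per-iteration adjustment work, consistent with the longer times in the 5R5D column.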