Research Article
[Retracted] Gradient Descent Optimization in Deep Learning Model Training Based on Multistage and Method Combination Strategy
Table 4
Execution time under conflicting learning-rate adjustments.
| Methods | Decay | 1R1D | 2R2D | 3R3D | 5R5D |
| --- | --- | --- | --- | --- | --- |
| RMSprop | 64 | 69 | 70 | 71 | 72 |
| Adam | 70 | 78 | 79 | 81 | 83 |
| NAdam [29] | 79 | 91 | 92 | 93 | 95 |
| AdaMax [5] | 74 | 89 | 90 | 92 | 94 |
| Adadelta | 83 | 104 | 106 | 108 | 112 |
| Adagrad | 82 | 107 | 108 | 109 | 111 |
Unit: ms/batch. kRkD: the learning rate is raised k times and decreased k times in each iteration.
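To illustrate the kRkD notation, the following is a minimal, hypothetical sketch of one training iteration under such a schedule. It assumes that "raise k times and decrease k times" means the learning rate is scaled up and then back down k times within the iteration, with an optimizer step taken at each perturbed rate; the model, data, and scaling factor are illustrative placeholders and are not taken from the paper.

```python
import torch

# Placeholder model and data, for illustration only.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 10), torch.randn(32, 1)

def run_krkd_iteration(k, factor=1.1):
    """Assumed kRkD scheme: scale the learning rate up k times,
    then back down k times, stepping the optimizer each time."""
    base_lr = optimizer.param_groups[0]["lr"]
    scales = [factor ** i for i in range(1, k + 1)]          # k raises
    scales += [factor ** (k - i) for i in range(1, k + 1)]   # k decreases
    for s in scales:
        optimizer.param_groups[0]["lr"] = base_lr * s
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    optimizer.param_groups[0]["lr"] = base_lr  # restore the base rate

run_krkd_iteration(k=2)  # corresponds to the 2R2D column
```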