Research Article

[Retracted] Gradient Descent Optimization in Deep Learning Model Training Based on Multistage and Method Combination Strategy

Table 8

Performance of the proposed method with SGD as the first-stage optimizer.

                        ResNet-20 on CIFAR-10      LSTM on IMDB
Method                  Val-loss    Val-acc        Val-loss    Val-acc

SGD + SGD               1.0178      0.6948         0.6919      0.5570
SGD + (SGD + M)         1.0763      0.7134         0.4408      0.7971
SGD + (SGD + d)         0.9607      0.7168         0.6890      0.5777
SGD + (SGD + M + d)     0.9040      0.7557         0.4353      0.7982
SGD + RMSprop           0.9408      0.7419         0.4287      0.8367
SGD + (RMSprop + d)     1.0131      0.7298         0.4342      0.8237
SGD + Adam              0.8751      0.7641         0.9210      0.8100
SGD + (Adam + d)        1.0692      0.7274         0.8172      0.8130

M: momentum; d: decay the learning rate by 1e-6 every iteration; "( )": methods applied within the same stage. The bold values represent the best results.
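The rows above combine a plain-SGD first stage with a second-stage optimizer (possibly with momentum M and per-iteration decay d). As an illustration only — not the paper's implementation — the sketch below shows this two-stage idea on a toy quadratic objective, using the "SGD + (SGD + M + d)" configuration; all function names and hyperparameter values are assumptions for the example.

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr, momentum=0.0):
    """One SGD update with optional momentum (momentum=0.0 is plain SGD)."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def two_stage_train(w0, grad_fn, stage1_iters=50, stage2_iters=50,
                    lr=0.1, momentum=0.9, decay=1e-6):
    """Illustrative two-stage schedule in the spirit of Table 8:
    stage 1 runs plain SGD, stage 2 switches to SGD with momentum
    and decays the learning rate by `decay` every iteration."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    # Stage 1: plain SGD.
    for _ in range(stage1_iters):
        w, v = sgd_momentum_step(w, grad_fn(w), v, lr, momentum=0.0)
    # Stage 2: switch optimizer; reset the momentum buffer at the boundary.
    v = np.zeros_like(w)
    cur_lr = lr
    for _ in range(stage2_iters):
        w, v = sgd_momentum_step(w, grad_fn(w), v, cur_lr, momentum=momentum)
        cur_lr *= 1.0 - decay  # per-iteration decay, as in the footnote
    return w

# Toy objective f(w) = ||w||^2 / 2, so grad f(w) = w.
w = two_stage_train(np.array([2.0, -3.0]), grad_fn=lambda w: w)
```

In frameworks such as PyTorch or Keras, the same switch is typically realized by constructing a new optimizer object over the same model parameters at the stage boundary.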