Research Article
[Retracted] Gradient Descent Optimization in Deep Learning Model Training Based on Multistage and Method Combination Strategy
Table 15
Performance of the proposed method, Adam with decay.
| Method | ResNet-20 on CIFAR-10 (Val-loss) | ResNet-20 on CIFAR-10 (Val-acc) | LSTM on IMDB (Val-loss) | LSTM on IMDB (Val-acc) |
|---|---|---|---|---|
| (Adam + d) + SGD | 0.5980 | 0.8513 | 0.9323 | 0.8143 |
| (Adam + d) + (SGD + d) | **0.5959** | **0.8528** | **0.9291** | 0.8150 |
| (Adam + d) + (SGD + M) | 0.6745 | 0.8217 | 1.0951 | **0.8172** |
| (Adam + d) + (SGD + M + d) | 0.6960 | 0.8232 | 1.1128 | 0.8137 |
| (Adam + d) + RMSprop | 0.7657 | 0.8078 | 1.3052 | 0.7917 |
| (Adam + d) + (RMSprop + d) | 0.7760 | 0.8100 | 1.0837 | 0.8137 |
| (Adam + d) + Adam | 1.0136 | 0.7399 | 1.1720 | 0.8048 |
| (Adam + d) + (Adam + d) | 0.9641 | 0.7509 | 1.2429 | 0.8060 |
The bold values represent the best results in each column. In the method labels, "d" denotes learning-rate decay and "M" denotes momentum.
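Each row of Table 15 describes a two-stage schedule: training begins with Adam plus learning-rate decay, then the partially trained parameters are handed to a second optimizer. The sketch below, assuming PyTorch, illustrates the "(Adam + d) + SGD" row; the model, switch epoch, learning rates, and decay factor are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of the two-stage "(Adam + d) + SGD" combination.
# Hyperparameters here are hypothetical, not taken from Table 15.
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)           # stand-in for ResNet-20 / LSTM
criterion = nn.CrossEntropyLoss()

SWITCH_EPOCH = 50                  # hypothetical stage boundary
TOTAL_EPOCHS = 100

# Stage 1: Adam with learning-rate decay ("Adam + d").
optimizer = optim.Adam(model.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.97)

for epoch in range(TOTAL_EPOCHS):
    if epoch == SWITCH_EPOCH:
        # Stage 2: continue training the same parameters with plain SGD.
        optimizer = optim.SGD(model.parameters(), lr=1e-2)
        scheduler = None
    # One (dummy) training step; a real pass over the dataset goes here.
    inputs = torch.randn(8, 10)
    targets = torch.randint(0, 2, (8,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    if scheduler is not None:
        scheduler.step()           # apply decay only during the Adam stage
```

The other rows vary only the second stage (e.g., SGD with momentum, RMSprop, or Adam, each with or without decay), which in this sketch amounts to swapping the optimizer constructed at the switch epoch.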