Research Article

AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization

Table 5

Test accuracy of 1-, 2-, and 3-layer LSTMs on the Penn Treebank dataset.

Model          Adam    SGD     AdaBound  Apollo  AdaCN

1-layer LSTM   86.86   86.14   85.68     84.90   83.40
2-layer LSTM   68.81   68.54   68.58     69.71   68.17
3-layer LSTM   64.91   64.50   65.18     66.37   63.26

The best results are shown in bold.