Research Article

AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization

Figure 5

Test perplexity curves on Penn Treebank for 1-, 2-, and 3-layer LSTMs (lower is better). (a) 1-layer LSTM. (b) 2-layer LSTM. (c) 3-layer LSTM.