Research Article
AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
Figure 5
Test perplexity curves on Penn Treebank for 1-, 2-, and 3-layer LSTMs. Lower is better. (a) Test perplexity for 1-layer LSTM. (b) Test perplexity for 2-layer LSTM. (c) Test perplexity for 3-layer LSTM.