Research Article

AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization

Figure 7

Training and test accuracy curves of VGG11 on CIFAR10 for different learning rates. AdaCN is more robust to the choice of learning rate than Apollo. (a) Training accuracy of AdaCN on CIFAR10. (b) Training accuracy of Apollo on CIFAR10. (c) Test accuracy of AdaCN on CIFAR10. (d) Test accuracy of Apollo on CIFAR10.