AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization

<div>Training and test accuracy curves of VGG11 on CIFAR10 for different learning rates. As can be seen, AdaCN is more robust to the choice of learning rate than Apollo. (a) Training accuracy of AdaCN on CIFAR10. (b) Training accuracy of Apollo on CIFAR10. (c) Test accuracy of AdaCN on CIFAR10. (d) Test accuracy of Apollo on CIFAR10.</div>

Computational Intelligence and Neuroscience

Figure 7