Research Article
AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
| | Require: //Mini-batch size |
| | Require: //Stepsize |
| | Require: //Parameters of exponential moving average |
| | Require: //Positive parameters |
| | Require: //Initialize variables as zero vectors or zero matrices |
| | Require: //Initialize timestep |
| (1) | While not converged do |
| (2) | |
| (3) | sample |
| (4) | //Stochastic gradient at timestep k |
| (5) | |
| (6) | |
| (7) | //Update diagonal Hessian |
| (8) | |
| (9) | |
| (10) | |
| (11) | |
| (12) | |
| (13) | |
| (14) | end while |
| (15) | return |
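The listing above names the steps of the loop (mini-batch sampling, a stochastic gradient, exponential moving averages, a diagonal Hessian update) without their formulas. The following is a minimal sketch of a loop with that shape, not the AdaCN update itself. Every rule in it is an assumption: the least-squares objective, the exact batch Hessian diagonal used as the curvature estimate, the Adam-style bias correction, and the per-coordinate cubic-regularized step. The names `cubic_step`, `run_sketch`, `sigma`, `beta1`, and `beta2` are hypothetical.

```python
import numpy as np


def cubic_step(g, h, sigma):
    """Per-coordinate step for the model g*s + 0.5*h*s**2 + (sigma/3)*|s|**3.

    For g != 0 this is the model minimizer: s = -sign(g) * t, where t >= 0
    solves sigma*t**2 + h*t - |g| = 0. For g == 0 the step is 0.
    """
    t = (-h + np.sqrt(h * h + 4.0 * sigma * np.abs(g))) / (2.0 * sigma)
    return -np.sign(g) * t


def run_sketch(A, b, n=32, alpha=1.0, beta1=0.9, beta2=0.99,
               sigma=1.0, steps=300, seed=0):
    """Minimize 0.5 * mean((A @ x - b)**2) with assumed update rules."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    m = np.zeros_like(x)  # moving average of stochastic gradients
    d = np.zeros_like(x)  # moving average of the diagonal Hessian estimate
    for k in range(1, steps + 1):
        # Sample a mini-batch of size n.
        idx = rng.choice(A.shape[0], size=n, replace=False)
        Ab, bb = A[idx], b[idx]
        # Stochastic gradient at timestep k.
        g = Ab.T @ (Ab @ x - bb) / n
        # Assumed stand-in for the diagonal Hessian update: the exact
        # diagonal of the batch Hessian Ab.T @ Ab / n.
        h = np.sum(Ab * Ab, axis=0) / n
        # Exponential moving averages with bias correction (assumed).
        m = beta1 * m + (1.0 - beta1) * g
        d = beta2 * d + (1.0 - beta2) * h
        m_hat = m / (1.0 - beta1 ** k)
        d_hat = d / (1.0 - beta2 ** k)
        # Cubic-regularized step, scaled by the stepsize.
        x = x + alpha * cubic_step(m_hat, d_hat, sigma)
    return x
```

When the gradient is small relative to the curvature, `cubic_step` behaves like a diagonal Newton step, roughly `-g / h`; when the gradient is large, the cubic term caps the step at about `sqrt(|g| / sigma)`. This is the usual motivation for cubic regularization in nonconvex problems.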