Research Article
AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
| Step | Operation |
| --- | --- |
| Require: | // Mini-batch size |
| Require: | // Stepsize |
| Require: | // Parameters of exponential moving average |
| Require: | // Positive parameters |
| Require: | // Initialize variables as zero vectors or zero matrices |
| Require: | // Initialize timestep |
| (1) | While not converged do |
| (2) | |
| (3) | sample |
| (4) | // Stochastic gradient at timestep k |
| (5) | |
| (6) | |
| (7) | // Update diagonal Hessian |
| (8) | |
| (9) | |
| (10) | |
| (11) | |
| (12) | |
| (13) | |
| (14) | end while |
| (15) | return |
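The update formulas of the algorithm listing did not survive in this copy, so the following is only a minimal sketch of the general pattern its comments describe: sample a mini-batch, form a stochastic gradient, keep exponential moving averages of the gradient and of a diagonal Hessian estimate (both initialized at zero), and take a cubic-regularized step. It is not the AdaCN update itself. Every name here (`cubic_step`, `run_sketch`, `g_avg`, `h_avg`, `m`) is hypothetical, the toy quadratic loss and its exact diagonal Hessian are assumptions made for illustration, and the per-coordinate cubic model is a simplification chosen because it has a closed-form minimizer.

```python
import math
import random


def cubic_step(g, h, m):
    """Global minimizer of g*s + 0.5*h*s**2 + (m/6)*abs(s)**3 for scalars, m > 0."""
    # The step length t >= 0 is the non-negative root of (m/2)*t**2 + h*t - abs(g) = 0.
    t = (-h + math.sqrt(h * h + 2.0 * m * abs(g))) / m
    # Move against the gradient sign (copysign treats g == 0.0 as positive).
    return -math.copysign(t, g)


def run_sketch(data, curvature, batch_size=8, lr=0.5, beta1=0.9, beta2=0.9,
               m=1.0, steps=500, seed=0):
    """Illustrative loop only; NOT the paper's update rules (assumed throughout)."""
    rng = random.Random(seed)
    dim = len(curvature)
    x = [0.0] * dim
    g_avg = [0.0] * dim  # EMA of stochastic gradients, initialized at zero
    h_avg = [0.0] * dim  # EMA of diagonal Hessian estimates, initialized at zero
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # mini-batch without replacement
        for j in range(dim):
            # Assumed toy loss: batch mean of 0.5 * a_j * (x_j - c_j)**2.
            g = sum(curvature[j] * (x[j] - c[j]) for c in batch) / batch_size
            h = curvature[j]  # exact diagonal Hessian entry of the toy loss
            g_avg[j] = beta1 * g_avg[j] + (1.0 - beta1) * g
            h_avg[j] = beta2 * h_avg[j] + (1.0 - beta2) * h
            # Per-coordinate cubic-regularized step, scaled by the stepsize.
            x[j] += lr * cubic_step(g_avg[j], h_avg[j], m)
    return x


# Synthetic data: noisy copies of a target point (illustrative values only).
target = [1.5, -2.0, 0.5]
curvature = [0.5, 1.0, 4.0]
data_rng = random.Random(1)
data = [[t + data_rng.gauss(0.0, 0.1) for t in target] for _ in range(64)]

x_final = run_sketch(data, curvature)
print([round(v, 2) for v in x_final])
```

The cubic term caps the step length when the averaged curvature is small or negative, which is the usual motivation for cubic regularization on nonconvex problems. Near a minimizer with positive curvature the step approaches the plain Newton step `-g / h`.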