09/07/2020

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

Yossi Arjevani, Yair Carmon, John Duchi, Dylan Foster, Ayush Sekhari, Karthik Sridharan

Keywords: Non-convex optimization, Stochastic optimization

Abstract: We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed. We prove a lower bound which establishes that this rate is optimal and---surprisingly---that it cannot be improved using stochastic $p$th order methods for any $p\ge 2$, even when the first $p$ derivatives of the objective are Lipschitz. Together, these results characterize the complexity of non-convex stochastic optimization with second-order methods and beyond. Expanding our scope to the oracle complexity of finding $(\epsilon,\gamma)$-approximate second-order stationary points, we establish nearly matching upper and lower bounds for stochastic second-order methods. Our lower bounds here are novel even in the noiseless case.
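To make the oracle model concrete: the central idea the abstract alludes to is that a single stochastic Hessian-vector product along a step can track how the stochastic gradient drifts, replacing the "two gradient queries with the same random seed" trick. Below is a minimal sketch of such an HVP-corrected recursive gradient estimator, assuming hypothetical oracles grad_oracle(x, rng) and hvp_oracle(x, v, rng); the names, the toy objective, and the plain update rule are illustrative assumptions, not the paper's exact algorithm (which needs additional averaging and restarts to attain the $O(\epsilon^{-3})$ rate).

    import numpy as np

    def hvp_corrected_search(grad_oracle, hvp_oracle, x0, eta=0.01, steps=1000, seed=0):
        # Sketch of a recursive gradient estimator corrected with stochastic
        # Hessian-vector products: rather than querying two gradients under
        # the same random seed, one HVP along the step direction updates the
        # running gradient estimate.
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        g = grad_oracle(x, rng)  # one stochastic gradient to initialize
        for _ in range(steps):
            x_new = x - eta * g
            # correct the running estimate with H(x_new; z) @ (x_new - x)
            g = g + hvp_oracle(x_new, x_new - x, rng)
            x = x_new
        return x

    # Toy usage on the smooth non-convex objective F(x) = sum(x^2 / (1 + x^2)),
    # with additive Gaussian noise standing in for the stochastic oracles.
    def grad_oracle(x, rng):
        return 2 * x / (1 + x**2) ** 2 + 0.01 * rng.standard_normal(x.shape)

    def hvp_oracle(x, v, rng):
        h = (2 - 6 * x**2) / (1 + x**2) ** 3  # diagonal Hessian of F
        return h * v + 0.01 * rng.standard_normal(x.shape)

    x_out = hvp_corrected_search(grad_oracle, hvp_oracle, x0=np.ones(5))

Each iteration spends one Hessian-vector product and no fresh gradient query, which is how methods of this kind stay within the stochastic gradient/HVP budget the abstract counts.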
