An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Abstract: Structured non-convex learning problems, for which critical points have favorable statistical properties, arise frequently in statistical machine learning. Algorithmic convergence and statistical estimation rates are well understood for such problems. However, quantifying the uncertainty associated with the underlying training algorithm is not well studied in the non-convex setting. To address this shortcoming, we establish an asymptotic normality result for the constant step size stochastic gradient descent (SGD) algorithm, which is widely used in practice. Specifically, based on the relationship between SGD and Markov chains [DDB19], we show that the average of SGD iterates is asymptotically normally distributed around the expected value of their unique invariant distribution, as long as the non-convex and non-smooth objective function satisfies a dissipativity property. We also characterize the bias between this expected value and the critical points of the objective function under various local regularity conditions. Together, these two results could be leveraged to construct confidence intervals for non-convex problems that are trained using the SGD algorithm.
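The idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' construction: the objective f(x) = x^2/2 + 2cos(x), the Gaussian gradient noise, the burn-in length, and the batch-means interval are all assumptions chosen for illustration. The objective is non-convex but dissipative, with critical points at 0 and roughly ±1.895. The code runs constant step size SGD, averages the iterates, and forms a normal-approximation interval around that average.

```python
import math
import random
import statistics


def grad(x):
    # Gradient of the illustrative objective f(x) = x**2 / 2 + 2 * cos(x).
    # This objective is an assumption of the sketch, not taken from the paper.
    return x - 2.0 * math.sin(x)


def averaged_sgd(step=0.1, n_iters=20000, burn_in=2000, x0=1.9,
                 noise_std=1.0, n_batches=20, seed=0):
    """Run constant step size SGD and return (average, half_width).

    half_width is a batch-means, normal-approximation 95% half-width;
    the batch-means estimator is a stand-in chosen for this sketch.
    """
    rng = random.Random(seed)
    x = x0
    kept = []
    for k in range(n_iters):
        # Stochastic gradient: true gradient plus assumed Gaussian noise.
        noisy_grad = grad(x) + rng.gauss(0.0, noise_std)
        x -= step * noisy_grad
        if k >= burn_in:
            kept.append(x)

    average = statistics.fmean(kept)

    # Split the retained iterates into equal batches and use the spread
    # of the batch means to estimate the variability of the average.
    size = len(kept) // n_batches
    batch_means = [statistics.fmean(kept[i * size:(i + 1) * size])
                   for i in range(n_batches)]
    half_width = 1.96 * statistics.stdev(batch_means) / math.sqrt(n_batches)
    return average, half_width


if __name__ == "__main__":
    avg, hw = averaged_sgd()
    print(f"averaged iterate: {avg:.4f} +/- {hw:.4f}")
```

The interval produced this way is centered on an estimate of the mean of the invariant distribution, which in general differs from the nearby critical point. That gap is the bias the abstract refers to, so an interval for the critical point itself would need a bias correction that this sketch does not attempt.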

13/04/2021

Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat Erdogdu

Similar Papers

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

Generalization Guarantee of SGD for Pairwise Learning

Yunwen Lei, Mingrui Liu, Yiming Ying

Keywords: optimization, machine learning

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

Keywords: Theory, Statistical Learning Theory

Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil

Keywords: Theory, Statistical Learning Theory

Learning Near Optimal Policies with Low Inherent Bellman Error

Andrea Zanette, Alessandro Lazaric, Mykel Kochenderfer, Emma Brunskill

Keywords: Reinforcement Learning - Theory

PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

Generalization Bound of Gradient Descent for Non-Convex Metric Learning

Mingzhi Dong, Xiaochen Yang, Rui Zhu, Yujiang Wang, Jing-Hao Xue

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Gradient descent follows the regularization path for general losses

Ziwei Ji, Miroslav Dudik, Robert Schapire, Matus Telgarsky

Keywords: Loss functions, Classification, Convex optimization

What are the Statistical Limits of Offline RL with Linear Function Approximation?

Ruosong Wang, Dean Foster, Sham M Kakade

Keywords: batch reinforcement learning, representation, function approximation, lower bound

On the Convergence of Step Decay Step-Size for Stochastic Optimization

Xiaoyu Wang, Sindri Magnússon, Mikael Johansson

Keywords: deep learning, optimization, machine learning

A Second look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance

Xiaoyu Li, Zhenxun Zhuang, Francesco Orabona

Keywords: Optimization, Non-Convex Optimization

Modeling the Second Player in Distributionally Robust Optimization

Paul Michel, Tatsunori Hashimoto, Graham Neubig

Keywords: adversarial learning, deep learning, robustness, distributionally robust optimization

ReLU Regression with Massart Noise

Ilias Diakonikolas, Jong Ho Park, Christos Tzamos

Sparse Spectrum Warped Input Measures for Nonstationary Kernel Learning

Anthony Tompkins, Rafael Oliveira, Fabio Ramos

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

Estimating Principal Components under Adversarial Perturbations

Pranjal Awasthi, Xue Chen, Aravindan Vijayaraghavan

Keywords: Unsupervised and semi-supervised learning, Adversarial learning and robustness

Linear Convergence in Federated Learning: Tackling Client Heterogeneity and Sparse Gradients

Aritra Mitra, Rayana Jaafar, George J. Pappas, Hamed Hassani

Keywords: optimization, federated learning
