On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings

12/07/2020

Mahmoud Assran, Michael Rabbat

Keywords: Optimization - Convex

Abstract: We study Nesterov's accelerated gradient method in the stochastic approximation setting (unbiased gradients with bounded variance) and the finite-sum setting (where randomness is due to sampling mini-batches). To build better insight into the behavior of Nesterov's method in stochastic settings, we focus throughout on objectives that are smooth, strongly convex, and twice continuously differentiable. In the stochastic approximation setting, Nesterov's method converges to a neighborhood of the optimal point at the same accelerated rate as in the deterministic setting. Perhaps surprisingly, in the finite-sum setting we prove that Nesterov's method may diverge with the usual choice of step-size and momentum, unless additional conditions on the problem related to conditioning and data coherence are satisfied. Our results shed light on why Nesterov's method may fail to converge or achieve acceleration in the finite-sum setting.
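
The stochastic approximation result in the abstract can be illustrated with a minimal sketch. It is not the authors' code or experiment. It assumes a toy diagonal quadratic objective, Gaussian gradient noise, and the textbook constant step-size 1/L and momentum (sqrt(kappa) - 1)/(sqrt(kappa) + 1), which is one common reading of "the usual choice". The function names `nesterov_sa` and `dist_to_opt` and all numeric values are hypothetical.

```python
import math
import random


def nesterov_sa(h, x0, sigma, steps, seed=0):
    """Nesterov's method on f(x) = 0.5 * sum(h[i] * x[i]**2) with noisy gradients.

    Assumed toy setup: h holds the positive Hessian eigenvalues, so the
    minimizer is the origin, L = max(h) and mu = min(h).
    """
    rng = random.Random(seed)
    L, mu = max(h), min(h)
    alpha = 1.0 / L                      # assumed step-size
    kappa = L / mu                       # condition number
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)  # assumed momentum
    n = len(h)
    x = list(x0)                         # previous iterate
    y = list(x0)                         # look-ahead point
    for _ in range(steps):
        # Unbiased gradient estimate at y: true gradient plus zero-mean noise.
        g = [h[i] * y[i] + rng.gauss(0.0, sigma) for i in range(n)]
        # Gradient step from the look-ahead point.
        x_next = [y[i] - alpha * g[i] for i in range(n)]
        # Momentum extrapolation.
        y = [x_next[i] + beta * (x_next[i] - x[i]) for i in range(n)]
        x = x_next
    return x


def dist_to_opt(x):
    """Euclidean distance to the minimizer (the origin in this toy problem)."""
    return math.sqrt(sum(v * v for v in x))


if __name__ == "__main__":
    h = [1.0, 100.0]                     # kappa = 100
    print("noise-free:", dist_to_opt(nesterov_sa(h, [1.0, 1.0], 0.0, 300)))
    print("noisy:     ", dist_to_opt(nesterov_sa(h, [1.0, 1.0], 0.1, 2000)))
```

With `sigma=0.0` the iterates contract toward the minimizer. With `sigma > 0` they settle into a noise-dominated neighborhood of it instead, which matches the qualitative claim for the stochastic approximation setting. The sketch does not reproduce the finite-sum divergence result, which depends on problem conditions described in the paper.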

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

Similar Papers

02/02/2021

On Convergence of Gradient Expected Sarsa(λ)

Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

11:27

06/12/2020

Stochastic Normalizing Flows

Hao Wu, Jonas Köhler, Frank Noe

3:19

06/12/2020

Quantized Variational Inference

Amir Dib

2:28

18/07/2021

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris Maddison

Keywords: Deep Learning, Generative Models

21:18

06/12/2021

Loss function based second-order Jensen inequality and its application to particle variational inference

Futoshi Futami, Tomoharu Iwata, Naonori Ueda, Issei Sato, Masashi Sugiyama

Keywords: optimization, generative model

14:09

06/12/2021

Entropy-based adaptive Hamiltonian Monte Carlo

Marcel Hirt, Michalis Titsias, Petros Dellaportas

Keywords: generative model

5:40

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

14:51

18/07/2021

Instance-Optimal Compressed Sensing via Posterior Sampling

Ajil Jalal, Sushrut Karmalkar, Alex Dimakis, Eric Price

Keywords: Algorithms, Sparsity and Compressed Sensing

5:26

06/12/2021

Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse

Keywords: optimization, generative model

15:30

06/12/2021

Structured Dropout Variational Inference for Bayesian Neural Networks

Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho

Keywords: deep learning, generative model

11:28

06/12/2021

Time-independent Generalization Bounds for SGLD in Non-convex Settings

Tyler Farghly, Patrick Rebeschini

Keywords: optimization

9:07

18/07/2021

Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models

Zitong Yang, Yu Bai, Song Mei

Keywords: Theory, Deep learning Theory

5:40

06/12/2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

Dominik Stöger, Mahdi Soltanolkotabi

Keywords: optimization

14:11

04/08/2021

Fast Rates for Structured Prediction

Vivien A Cabannes, Francis Bach, Alessandro Rudi

16:17

06/12/2020

Outlier Robust Mean Estimation with Subgaussian Rates via Stability

Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia

3:19

18/07/2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvari, Mengdi Wang

Keywords: Theory, Statistical Learning Theory

5:20

18/07/2021

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

Yaqi Duan, Chi Jin, Zhiyuan Li

Keywords: Theory, RL, Decisions and Control Theory

5:18

18/07/2021

Composing Normalizing Flows for Inverse Problems

Jay Whang, Erik Lindgren, Alex Dimakis

Keywords: Algorithms, Sparsity and Compressed Sensing

5:07

06/12/2021

Exact Privacy Guarantees for Markov Chain Implementations of the Exponential Mechanism with Artificial Atoms

Jeremy Seeman, Matthew Reimherr, Aleksandra Slavković

Keywords: theory, generative model, privacy

13:09

26/08/2020

Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling

Mojmir Mutny, Michal Derezinski, Andreas Krause

9:44

12/07/2020

Batch Stationary Distribution Estimation

Junfeng Wen, Bo Dai, Lihong Li, Dale Schuurmans

Keywords: Probabilistic Inference - Approximate, Monte Carlo, and Spectral Methods

14:47

04/07/2020

A Batch Normalized Inference Network Keeps the KL Vanishing Away

Qile Zhu, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li, Dapeng Wu

Keywords: amortized inference, language modeling, text classification, dialogue generation

11:16

06/12/2020

Stochastic Optimization for Performative Prediction

Celestine Mendler-Dünner, Juan Perdomo, Tijana Zrnic, Moritz Hardt

3:17

26/08/2020

Finite-Time Error Bounds for Biased Stochastic Approximation with Applications to Q-Learning

Gang Wang, Georgios B. Giannakis

14:03

06/12/2021

Continuous Latent Process Flows

Ruizhi Deng, Marcus Brubaker, Greg Mori, Andreas M Lehrmann

Keywords: generative model

14:54

06/12/2021

Slice Sampling Reparameterization Gradients

David M Zoltowski, Diana Cai, Ryan Adams

Keywords: optimization, machine learning, generative model

14:43

26/08/2020

Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Martin Wainwright, Michael Jordan, Bin Yu

15:08

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

06/12/2020

Random Reshuffling: Simple Analysis with Vast Improvements

Konstantin Mishchenko, Ahmed Khaled Ragab Bayoumi, Peter Richtarik

Keywords: Reinforcement Learning and Planning -> Planning; Reinforcement Learning and Planning -> Reinforcement Learning, Reinforcement Learning and Planning

3:08

06/12/2020

Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning

Nathan Kallus, Angela Zhou

4:51

08/07/2020

Space-efficient Query Evaluation over Probabilistic Event Streams

Rajeev Alur, Yu Chen, Kishor Jothimurugan, Sanjeev Khanna

Keywords: Query processing over streams, Streaming algorithms, Probabilistic streams

22:51

06/12/2021

Conformal Bayesian Computation

Edwin Fong, Chris C Holmes

Keywords: machine learning

14:54

06/12/2020

Minimax Estimation of Conditional Moment Models

Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

3:04

26/04/2020

Accelerating SGD with momentum for over-parameterized learning

Chaoyue Liu, Mikhail Belkin

Keywords: SGD, acceleration, momentum, stochastic, over-parameterized, Nesterov

4:50

06/12/2021

Non-asymptotic convergence bounds for Wasserstein approximation using point clouds

Quentin Mérigot, Filippo Santambrogio, Clément Sarrazin

Keywords: optimization, machine learning, optimal transport

14:49

26/08/2020

Linear Convergence of Adaptive Stochastic Gradient Descent

Yuege Xie, Xiaoxia Wu, Rachel Ward

10:02

06/12/2021

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Shuyu Cheng, Guoqiang Wu, Jun Zhu

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

13:49

06/12/2021

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin

Keywords: optimization

14:39

09/07/2020

High probability guarantees for stochastic convex optimization

Damek Davis, Dmitriy Drusvyatskiy

Keywords: Stochastic optimization, Computational complexity, Convex optimization, Excess risk bounds and generalization error bounds

15:10

06/12/2021

Robust Regression Revisited: Acceleration and Improved Estimation Rates

Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian

Keywords: theory, optimization

14:22