On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

09/07/2020

Wenlong Mou, Chris Junchi Li, Martin Wainwright, Peter Bartlett, Michael Jordan

Keywords: Stochastic optimization, Concentration inequalities, Convex optimization, Reinforcement learning

Abstract: We undertake a precise study of the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging for solving a linear system $\bar{A} \theta = \bar{b}$. When the matrix $\bar{A}$ is Hurwitz, we prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity. The CLT characterizes the exact asymptotic covariance matrix, which is the sum of the classical Polyak-Ruppert covariance and a correction term that scales with the step size. Under assumptions on the tail of the noise distribution, we prove a non-asymptotic concentration inequality whose main term matches the covariance in the CLT in any direction, up to universal constants. When the matrix $\bar{A}$ is not Hurwitz and its eigenvalues only have non-negative real parts, we prove that the averaged LSA procedure still achieves an $O(1/T)$ rate in mean-squared error. Our results provide a more refined understanding of linear stochastic approximation in both the asymptotic and non-asymptotic settings. We also show various applications of the main results, including the study of momentum-based stochastic gradient methods as well as temporal difference algorithms in reinforcement learning.
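To make the procedure in the abstract concrete, here is a minimal Python sketch of linear stochastic approximation with a fixed step size and Polyak-Ruppert averaging. It is an illustration under stated assumptions, not the authors' implementation: the update form $\theta_{t+1} = \theta_t - \eta (A_t \theta_t - b_t)$, the i.i.d. Gaussian noise model, the sign convention that the eigenvalues of $\bar{A}$ have positive real parts, and every name in the code (`averaged_lsa`, `noise_std`, the example matrix) are choices made here for illustration.

```python
import numpy as np


def averaged_lsa(A_bar, b_bar, eta, T, noise_std=0.1, seed=0):
    """Run T steps of linear stochastic approximation with a fixed step size.

    Returns the last iterate and the Polyak-Ruppert average of the iterates.
    Assumption (not from the source): the observations are
    A_t = A_bar + noise and b_t = b_bar + noise with i.i.d. Gaussian entries.
    """
    rng = np.random.default_rng(seed)
    d = b_bar.shape[0]
    theta = np.zeros(d)      # current iterate, started at zero
    theta_avg = np.zeros(d)  # running average of the iterates
    for t in range(1, T + 1):
        # Noisy, unbiased observations of (A_bar, b_bar).
        A_t = A_bar + noise_std * rng.standard_normal((d, d))
        b_t = b_bar + noise_std * rng.standard_normal(d)
        # Fixed step-size update toward the solution of A_bar @ theta = b_bar.
        theta = theta - eta * (A_t @ theta - b_t)
        # Incremental form of the average (1/t) * sum of the first t iterates.
        theta_avg += (theta - theta_avg) / t
    return theta, theta_avg


# Hypothetical 2x2 example: non-symmetric, eigenvalues 1 +/- 0.5i.
A_bar = np.array([[1.0, 0.5], [-0.5, 1.0]])
b_bar = np.array([1.0, 2.0])
theta_star = np.linalg.solve(A_bar, b_bar)

last, avg = averaged_lsa(A_bar, b_bar, eta=0.05, T=20000)
print("error of last iterate:    ", np.linalg.norm(last - theta_star))
print("error of averaged iterate:", np.linalg.norm(avg - theta_star))
```

With a fixed step size the last iterate keeps fluctuating around the solution, while the running average smooths those fluctuations out; this averaged quantity is the one the abstract's CLT and concentration results concern.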

The talk and the corresponding paper were published at the COLT 2020 virtual conference.

Similar Papers

06/12/2020

Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations

Kevin Scaman, Cedric Malherbe

3:09

18/07/2021

Optimal Estimation of High Dimensional Smooth Additive Function Based on Noisy Observations

Fan Zhou, Ping Li

Keywords: Theory, Statistical Learning Theory

5:47

06/12/2021

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

Hongjian Wang, Mert Gurbuzbalaban, Lingjiong Zhu, Umut Simsekli, Murat Erdogdu

Keywords: optimization

8:24

06/12/2020

Truncated Linear Regression in High Dimensions

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

3:17

13/04/2021

Asymptotics of ridge(less) regression under general source condition

Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco

3:00

18/07/2021

Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction

Radu Alexandru Dragomir, Mathieu Even, Hadrien Hendrikx

Keywords: Optimization, Convex Optimization

5:22

22/06/2020

On the computability of continuous maximum entropy distributions with applications

Jonathan Leake, Nisheeth K. Vishnoi

Keywords: Optimization, Entropy, Quantum entropy

23:59

06/12/2021

Label Noise SGD Provably Prefers Flat Global Minimizers

Alex Damian, Tengyu Ma, Jason Lee

Keywords: optimization, machine learning

11:31

06/12/2020

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Raphaël Berthier, Francis Bach, Pierre Gaillard

Keywords: Optimization -> Non-Convex Optimization, Deep Learning -> Optimization for Deep Networks

3:05

22/06/2020

Estimating normalizing constants for log-concave distributions: Algorithms and lower bounds

Rong Ge, Holden Lee, Jianfeng Lu

Keywords: sampling algorithm, partition function, normalizing constant, log-concave distribution, multilevel Monte Carlo, Langevin Monte Carlo

26:11

06/12/2021

KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support

Pierre Glaser, Michael Arbel, Arthur Gretton

Keywords: generative model, kernel methods, optimal transport

8:19

06/12/2021

Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD

Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu

Keywords: optimization

9:47

06/12/2021

Landscape analysis of an improved power method for tensor decomposition

Joe Kileel, Timo Klock, João M Pereira

Keywords: optimization, robustness

12:05

09/07/2020

Asymptotic Errors for High-Dimensional Convex Penalized Linear Regression beyond Gaussian Matrices

Alia Abbara, Florent Krzakala, Cedric Gerbelot

Keywords: Statistical physics, Convex optimization, High-dimensional statistics, Regression, Supervised learning

14:59

26/08/2020

Support recovery and sup-norm convergence rates for sparse pivotal estimation

Mathurin Massias, Quentin Bertrand, Alexandre Gramfort, Joseph Salmon

12:50

02/02/2021

On Convergence of Gradient Expected Sarsa(λ)

Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

11:27

06/12/2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

Prashant Khanduri, Siliang Zeng, Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

Keywords: optimization

9:47

04/08/2021

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

Nima Anari, Moses Charikar, Kirankumar Shiragur, Aaron Sidford

18:20

18/07/2021

Principal Bit Analysis: Autoencoding with Schur-Concave Loss

Sourbh Bhadane, Aaron Wagner, Jayadev Acharya

Keywords: Algorithms, Components Analysis (e.g., CCA, ICA, LDA, PCA)

5:11

03/05/2021

Implicit Normalizing Flows

Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, Jun Zhu

Keywords: probabilistic inference, deep generative models, Normalizing flows, implicit functions

8:03

06/12/2020

Sinkhorn Barycenter via Functional Gradient Descent

Zebang Shen, Zhenfu Wang, Alejandro Ribeiro, Hamed Hassani

3:14

06/12/2020

All-or-nothing statistical and computational phase transitions in sparse spiked matrix estimation

Jean Barbier, Nicolas Macris, Cynthia Rush

3:25

06/12/2020

Distributionally Robust Parametric Maximum Likelihood Estimation

Viet Anh Nguyen, Xuhui Zhang, Jose Blanchet, Angelos Georghiou

3:15

12/07/2020

The Implicit Regularization of Stochastic Gradient Flow for Least Squares

Alnur Ali, Edgar Dobriban, Ryan Tibshirani

Keywords: Supervised Learning

16:14

12/07/2020

Sparse Convex Optimization via Adaptively Regularized Hard Thresholding

Kyriakos Axiotis, Maxim Sviridenko

Keywords: Optimization - General

13:44

04/08/2021

Fast Dimension Independent Private AdaGrad on Publicly Estimated Subspaces

Peter Kairouz, Monica Ribero Diaz, Keith Rush, Abhradeep Thakurta

14:52

04/08/2021

On the (asymptotic) convergence of Stochastic Gradient Descent and Stochastic Heavy Ball

Othmane Sebbouh, Robert M Gower, Aaron Defazio

15:29

06/12/2021

Implicit Regularization in Matrix Sensing via Mirror Descent

Fan Wu, Patrick Rebeschini

Keywords: optimization

9:35

03/05/2021

On the Universality of the Double Descent Peak in Ridgeless Regression

David Holzmüller

Keywords: Random Weights Neural Networks, Random Features, Linear Regression, Interpolation Peak, Double Descent

5:25

13/04/2021

On the convergence of gradient descent in GANs: MMD GAN as a gradient flow

Youssef Mroueh, Truyen Nguyen

2:52

12/07/2020

Accelerated Stochastic Gradient-free and Projection-free Methods

Feihu Huang, Lue Tao, Songcan Chen

Keywords: Optimization - Non-convex

13:05

13/04/2021

Efficient statistics for sparse graphical models from truncated samples

Arnab Bhattacharyya, Rathin Desai, Sai Ganesh Nagarajan, Ioannis Panageas

2:56

06/12/2021

De-randomizing MCMC dynamics with the diffusion Stein operator

Zheyang Shen, Markus Heinonen, Samuel Kaski

Keywords: optimization, generative model

13:42

26/08/2020

Langevin Monte Carlo without smoothness

Niladri Chatterji, Jelena Diakonikolas, Michael Jordan, Peter Bartlett

15:02

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

Keywords: optimization

10:44

09/07/2020

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

Yin Tat Lee, Ruoqi Shen, Kevin Tian

Keywords: Sampling algorithms, Bayesian methods

14:57

26/08/2020

Sparse and Low-rank Tensor Estimation via Cubic Sketchings

Botao Hao, Anru R. Zhang, Guang Cheng

13:02

18/07/2021

Consistent regression when oblivious outliers overwhelm

Tommaso d'Orsi, Gleb Novikov, David Steurer

Keywords: Theory, Game Theory and Computational Economics, Theory, Theory, Computational Complexity

4:42

06/12/2020

Phase retrieval in high dimensions: Statistical and computational phase transitions

Antoine Maillard, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

3:36

06/12/2020

Penalized Langevin dynamics with vanishing penalty for smooth and log-concave targets

Avetik Karagulyan, Arnak Dalalyan

2:53