The Implicit Regularization of Stochastic Gradient Flow for Least Squares

12/07/2020

Alnur Ali, Edgar Dobriban, Ryan Tibshirani

Keywords: Supervised Learning

Abstract: We study the implicit regularization of mini-batch stochastic gradient descent, when applied to the fundamental problem of least squares regression. We leverage a continuous-time stochastic differential equation having the same moments as stochastic gradient descent, which we call stochastic gradient flow. We give a bound on the excess risk of stochastic gradient flow at time $t$, over ridge regression with tuning parameter $\lambda = 1/t$. The bound may be computed from explicit constants (e.g., the mini-batch size, step size, number of iterations), revealing precisely how these quantities drive the excess risk. Numerical examples show the bound can be small, indicating a tight relationship between the two estimators. We give a similar result relating the coefficients of stochastic gradient flow and ridge. These results hold under no conditions on the data matrix $X$, and across the entire optimization path (not just at convergence).
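As a rough numerical illustration of the correspondence described in the abstract, the sketch below runs mini-batch SGD on a synthetic least squares problem and compares it with ridge regression at $\lambda = 1/t$. Several things here are assumptions not stated on this page: the loss is scaled as $\frac{1}{2n}\|y - X\beta\|_2^2$, the ridge criterion as $\frac{1}{n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2$, time is taken as $t$ = step size times number of iterations, and all function names and data are hypothetical. The closed form for plain (noiseless) gradient flow is included only as a reference point. This is not the authors' code and it does not compute their bound.

```python
import numpy as np


def ridge(X, y, lam):
    """Ridge solution for (1/n)||y - X b||^2 + lam * ||b||^2 (assumed scaling)."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)


def gradient_flow(X, y, t):
    """Closed-form gradient flow on (1/(2n))||y - X b||^2, started at b(0) = 0."""
    n, p = X.shape
    s, V = np.linalg.eigh(X.T @ X / n)  # eigenvalues/vectors of sample covariance
    s = np.clip(s, 0.0, None)           # guard against tiny negative round-off
    safe = np.where(s > 1e-12, s, 1.0)  # avoid dividing by zero eigenvalues
    # Spectral filter (1 - exp(-t s)) / s, with its limit t as s -> 0.
    g = np.where(s > 1e-12, -np.expm1(-t * s) / safe, t)
    return V @ (g * (V.T @ (X.T @ y / n)))


def minibatch_sgd(X, y, step, batch, iters, rng):
    """Mini-batch SGD on the same loss, sampling each batch without replacement."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)
        residual = y[idx] - X[idx] @ b
        b += (step / batch) * (X[idx].T @ residual)
    return b


# Synthetic data (hypothetical; chosen only to make the sketch runnable).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

step, batch, iters = 0.01, 20, 500
t = step * iters  # assumed mapping from discrete iterations to continuous time

b_sgd = minibatch_sgd(X, y, step, batch, iters, rng)
b_gf = gradient_flow(X, y, t)
b_ridge = ridge(X, y, 1.0 / t)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print("||sgd - ridge||  :", np.linalg.norm(b_sgd - b_ridge))
print("||flow - ridge|| :", np.linalg.norm(b_gf - b_ridge))
print("||ols - ridge||  :", np.linalg.norm(b_ols - b_ridge))
```

Varying `step`, `batch`, and `iters` gives an informal feel for how the quantities named in the abstract move the SGD iterate relative to the ridge solution; the printed distances are specific to this synthetic data and say nothing about the paper's constants.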

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

Similar Papers

06/12/2021

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Jikai Jin, Bohang Zhang, Haiyang Wang, Liwei Wang

Keywords: optimization

14:05

13/04/2021

Asymptotics of ridge(less) regression under general source condition

Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco

3:00

06/12/2020

Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations

Kevin Scaman, Cedric Malherbe

3:09

13/04/2021

SGD for structured nonconvex functions: Learning rates, minibatching and interpolation

Robert Gower, Othmane Sebbouh, Nicolas Loizou

3:07

09/07/2020

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

Wenlong Mou, Chris Junchi Li, Martin Wainwright, Peter Bartlett, Michael Jordan

Keywords: Stochastic optimization, Concentration inequalities, Convex optimization, Reinforcement learning

15:04

06/12/2021

Regret Bounds for Gaussian-Process Optimization in Large Domains

Manuel Wuethrich, Bernhard Schölkopf, Andreas Krause

Keywords: optimization, bandits, kernel methods

13:02

06/12/2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy

Edward Moroshko, Blake Woodworth, Suriya Gunasekar, Jason Lee, Nati Srebro, Daniel Soudry

3:19

06/12/2021

STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization

Kfir Levy, Ali Kavis, Volkan Cevher

Keywords: optimization

12:23

06/12/2021

Landscape analysis of an improved power method for tensor decomposition

Joe Kileel, Timo Klock, João M Pereira

Keywords: optimization, robustness

12:05

13/04/2021

Evading the curse of dimensionality in unconstrained private GLMs

Shuang Song, Thomas Steinke, Om Thakkar, Abhradeep Thakurta

3:05

08/07/2020

On Skolem-hardness and saturation points in Markov decision processes

Jakob Piribauer and Christel Baier

Keywords: Markov decision process, Skolem problem, stochastic shortest path, conditional expectation, conditional value-at-risk, model checking, frequency-LTL

26:13

06/12/2021

High Probability Complexity Bounds for Line Search Based on Stochastic Oracles

Billy Jin, Katya Scheinberg, Miaolan Xie

Keywords: optimization

14:53

02/02/2021

Deep Conservation: A Latent-Dynamics Model for Exact Satisfaction of Physical Conservation Laws

Kookjin Lee, Kevin T. Carlberg

18:10

06/12/2020

One Ring to Rule Them All: Certifiably Robust Geometric Perception with Outliers

Heng Yang, Luca Carlone

3:24

18/07/2021

Regularized Submodular Maximization at Scale

Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi

Keywords: Optimization, Combinatorial Optimization

5:17

12/07/2020

Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions

Prashanth L.A., Krishna Jagannathan, Ravi Kolla

Keywords: Online Learning, Active Learning, and Bandits

12:29

26/08/2020

Revisiting the Landscape of Matrix Factorization

Hossein Valavi, Sulin Liu, Peter Ramadge

12:11

06/12/2021

Submodular + Concave

Siddharth Mitra, Moran Feldman, Amin Karbasi

Keywords: optimization

14:53

18/07/2021

Understanding the Dynamics of Gradient Flow in Overparameterized Linear models

Salma Tarmoun, Guilherme Franca, Benjamin Haeffele, Rene Vidal

Keywords: Deep Learning, Optimization for Deep Networks

4:50

06/12/2020

Second Order Optimality in Decentralized Non-Convex Optimization via Perturbed Gradient Tracking

Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari

Keywords: Applications -> Computer Vision; Deep Learning -> Deep Autoencoders; Deep Learning -> Generative Models; Probabilistic Methods, Applications

3:37

06/12/2020

Sinkhorn Barycenter via Functional Gradient Descent

Zebang Shen, Zhenfu Wang, Alejandro Ribeiro, Hamed Hassani

3:14

09/07/2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham Kakade, Jason Lee, Gaurav Mahajan

Keywords: Reinforcement learning, Non-convex optimization

11:00

13/04/2021

A study of condition numbers for first-order optimization

Charles Guille-Escuret, Manuela Girotti, Baptiste Goujaud, Ioannis Mitliagkas

2:46

26/08/2020

Sparse and Low-rank Tensor Estimation via Cubic Sketchings

Botao Hao, Anru R. Zhang, Guang Cheng

13:02

04/08/2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade

18:27

26/08/2020

Convergence Rates of Gradient Descent and MM Algorithms for Bradley-Terry Models

Milan Vojnovic, Se-Young Yun, Kaifang Zhou

17:33

12/07/2020

Adaptive Gradient Descent without Descent

Konstantin Mishchenko, Yura Malitsky

Keywords: Optimization - Convex

15:40

06/12/2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean Foster, Sham Kakade

Keywords: optimization, machine learning

16:05

06/12/2021

High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails

Ashok Cutkosky, Harsh Mehta

Keywords: deep learning, optimization

20:14

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords: optimization, machine learning, robustness, adversarial robustness and security, vision

11:35

06/12/2020

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang

2:00

13/04/2021

On information gain and regret bounds in Gaussian process bandits

Sattar Vakili, Kia Khezeli, Victor Picheny

2:52

06/12/2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

Prashant Khanduri, Siliang Zeng, Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

Keywords: optimization

9:47

06/12/2020

Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning

Nathan Kallus, Angela Zhou

4:51

13/04/2021

vqSGD: Vector quantized stochastic gradient descent

Venkata Gandikota, Daniel Kane, Raj Kumar Maity, Arya Mazumdar

3:11

06/12/2021

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

Hongjian Wang, Mert Gurbuzbalaban, Lingjiong Zhu, Umut Simsekli, Murat Erdogdu

Keywords: optimization

8:24

12/07/2020

The Complexity of Finding Stationary Points with Stochastic Gradient Descent

Yoel Drori, Ohad Shamir

Keywords: Optimization - Non-convex

9:09

08/07/2020

Fréchet Distance for Uncertain Curves

Kevin Buchin, Chenglin Fan, Maarten Löffler, Aleksandr Popov, Benjamin Raichel, Marcel Roeloffzen

Keywords: Curves, Uncertainty, Fréchet Distance, Hardness

21:45

04/08/2021

Softmax Policy Gradient Methods Can Take Exponential Time to Converge

Gen Li, Yuting Wei, Yuejie Chi, Yuantao Gu, Yuxin Chen

15:15

04/08/2021

SGD Generalizes Better Than GD (And Regularization Doesn't Help)

Idan Amir, Tomer Koren, Roi Livni

15:53