Information-Theoretic Generalization Bounds for Stochastic Gradient Descent

04/08/2021

Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy

Abstract: We study the generalization properties of the popular stochastic optimization method known as stochastic gradient descent (SGD) for optimizing general non-convex loss functions. Our main contribution is providing upper bounds on the generalization error that depend on local statistics of the stochastic gradients evaluated along the path of iterates calculated by SGD. The key factors our bounds depend on are the variance of the gradients (with respect to the data distribution), the local smoothness of the objective function along the SGD path, and the sensitivity of the loss function to perturbations to the final output. Our key technical tool is combining the information-theoretic generalization bounds previously used for analyzing randomized variants of SGD with a perturbation analysis of the iterates.
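
To make one of the quantities named in the abstract concrete, the following is a minimal sketch that records the empirical variance of per-sample gradients at each iterate along an SGD path. It assumes a toy one-dimensional least-squares problem that is not taken from the paper, and all function names (`per_sample_grad`, `gradient_variance`, `sgd_with_path_statistics`, `make_data`) are hypothetical. It does not compute the paper's bound; it only illustrates what "gradient variance along the path of iterates" could look like in practice, with the variance over the training set used as a stand-in for the variance under the data distribution.

```python
import random


def per_sample_grad(w, x, y):
    # Gradient of the toy loss 0.5 * (w * x - y) ** 2 with respect to w.
    return (w * x - y) * x


def gradient_variance(w, data):
    # Empirical variance of the per-sample gradients at the point w.
    # Assumption: the training set stands in for the data distribution.
    grads = [per_sample_grad(w, x, y) for x, y in data]
    mean = sum(grads) / len(grads)
    return sum((g - mean) ** 2 for g in grads) / len(grads)


def sgd_with_path_statistics(data, steps=50, lr=0.05, seed=0):
    # Plain single-sample SGD that also logs the gradient variance
    # at every iterate visited before the update is applied.
    rng = random.Random(seed)
    w = 0.0
    variances = []
    for _ in range(steps):
        variances.append(gradient_variance(w, data))
        x, y = rng.choice(data)
        w -= lr * per_sample_grad(w, x, y)
    return w, variances


def make_data(n=200, seed=1):
    # Hypothetical synthetic data: y = 2 * x plus small Gaussian noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return data


data = make_data()
w_final, variances = sgd_with_path_statistics(data)
print("final iterate:", w_final)
print("gradient variance at first and last logged iterate:", variances[0], variances[-1])
```

The same loop could log other path statistics, such as a local smoothness estimate, but the choice of estimator there is a further assumption and is left out of this sketch.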

The talk and the corresponding paper were published at the COLT 2021 virtual conference.

Similar Papers

03/05/2021

Fast convergence of stochastic subgradient method under interpolation

Huang Fang, Zhenan Fan, Michael Friedlander

Keywords: interpolation, stochastic subgradient method, convergence analysis, Optimization

4:42

09/07/2020

The estimation error of general first order methods

Michael V Celentano, Andrea Montanari, Yuchen Wu

Keywords: High-dimensional statistics, Computational complexity, Matrix/tensor estimation, Regression

14:10

06/12/2020

Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations

Kevin Scaman, Cedric Malherbe

3:09

18/07/2021

Objective Bound Conditional Gaussian Process for Bayesian Optimization

Taewon Jeong, Heeyoung Kim

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

5:12

06/12/2020

Sinkhorn Barycenter via Functional Gradient Descent

Zebang Shen, Zhenfu Wang, Alejandro Ribeiro, Hamed Hassani

3:14

04/08/2021

The Last-Iterate Convergence Rate of Optimistic Mirror Descent in Stochastic Variational Inequalities

Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos

16:03

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

13/04/2021

Kernel distributionally robust optimization: Generalized duality theorem and stochastic approximation

Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

3:33

06/12/2020

A convex optimization formulation for multivariate regression

Yunzhang Zhu

3:23

12/07/2020

Stochastic Optimization for Regularized Wasserstein Estimators

Marin Ballu, Quentin Berthet, Francis Bach

Keywords: Optimization - Convex

15:08

18/07/2021

Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

Kangqiao Liu, Liu Ziyin, Masahito Ueda

Keywords: Theory, Deep learning Theory

5:18

12/07/2020

Optimal Bounds between f-Divergences and Integral Probability Metrics

Rohit Agrawal, Thibaut Horel

Keywords: Learning Theory

13:49

18/07/2021

Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction

Radu Alexandru Dragomir, Mathieu Even, Hadrien Hendrikx

Keywords: Optimization, Convex Optimization

5:22

13/04/2021

Nearest neighbour based estimates of gradients: Sharp nonasymptotic bounds and applications

Guillaume Ausset, Stephan Clémençon, François Portier

2:44

06/12/2021

On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

Yifan Hu, Xin Chen, Niao He

Keywords: theory, optimization, machine learning

14:56

06/12/2020

Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits

Arya Akhavan, Massimiliano Pontil, Alexandre Tsybakov

Keywords: Reinforcement Learning and Planning -> Reinforcement Learning, Applications -> Privacy, Anonymity, and Security

3:00

06/12/2020

The Wasserstein Proximal Gradient Algorithm

Adil Salim, Anna Korba, Giulia Luise

3:14

26/04/2020

Gradient Descent Maximizes the Margin of Homogeneous Neural Networks

Kaifeng Lyu, Jian Li

Keywords: margin, homogeneous, gradient descent

15:02

06/12/2020

Minimax Estimation of Conditional Moment Models

Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

3:04

06/12/2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

Dominik Stöger, Mahdi Soltanolkotabi

Keywords: optimization

14:11

13/04/2021

On the convergence of gradient descent in GANs: MMD GAN as a gradient flow

Youssef Mroueh, Truyen Nguyen

2:52

12/07/2020

High-dimensional Robust Mean Estimation via Gradient Descent

Yu Cheng, Ilias Diakonikolas, Rong Ge, Mahdi Soltanolkotabi

Keywords: Learning Theory

16:15

06/12/2021

Label Noise SGD Provably Prefers Flat Global Minimizers

Alex Damian, Tengyu Ma, Jason Lee

Keywords: optimization, machine learning

11:31

18/07/2021

Fast margin maximization via dual acceleration

Ziwei Ji, Nati Srebro, Matus Telgarsky

Keywords: Optimization, Convex Optimization

4:50

12/07/2020

The Performance Analysis of Generalized Margin Maximizers on Separable Data

Fariborz Salehi, Ehsan Abbasi, Babak Hassibi

Keywords: Learning Theory

12:19

12/07/2020

Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle

Shaocong Ma, Yi Zhou

Keywords: Optimization - Non-convex

13:33

06/12/2021

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Jikai Jin, Bohang Zhang, Haiyang Wang, Liwei Wang

Keywords: optimization

14:05

26/08/2020

Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification

Han Bao, Masashi Sugiyama

15:01

03/05/2021

Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

Chin-Wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville

Keywords: convex optimization, Normalizing flows, universal approximation, optimal transport, invertible neural networks, variational inference, generative models

5:13

06/12/2020

Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping

Eduard Gorbunov, Marina Danilova, Alexander Gasnikov

3:15

12/07/2020

Stochastic Optimization for Non-convex Inf-Projection Problems

Yan Yan, Yi Xu, Lijun Zhang, Wang Xiaoyu, Tianbao Yang

Keywords: Optimization - Non-convex

14:13

13/04/2021

Asymptotics of ridge(less) regression under general source condition

Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco

3:00

06/12/2021

Hessian Eigenspectra of More Realistic Nonlinear Models

Zhenyu Liao, Michael W Mahoney

Keywords: theory, optimization, machine learning

15:49

06/12/2020

Distributionally Robust Parametric Maximum Likelihood Estimation

Viet Anh Nguyen, Xuhui Zhang, Jose Blanchet, Angelos Georghiou

3:15

13/04/2021

Direct loss minimization for sparse Gaussian processes

Yadi Wei, Rishit Sheth, Roni Khardon

3:24

26/08/2020

A Rule for Gradient Estimator Selection, with an Application to Variational Inference

Tomas Geffner, Justin Domke

8:36

06/12/2021

A Note on Sparse Generalized Eigenvalue Problem

Yunfeng Cai, Guanhua Fang, Ping Li

14:13

06/12/2021

Slice Sampling Reparameterization Gradients

David M Zoltowski, Diana Cai, Ryan Adams

Keywords: optimization, machine learning, generative model

14:43

26/08/2020

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

Kenji Kawaguchi, Haihao Lu

14:10

06/12/2020

Nonasymptotic Guarantees for Spiked Matrix Recovery with Generative Priors

Jorio Cocola, Paul Hand, Vlad Voroninski

3:15