On the Universality of the Double Descent Peak in Ridgeless Regression

03/05/2021

David Holzmüller

Keywords: Random Weights Neural Networks, Random Features, Linear Regression, Interpolation Peak, Double Descent

Abstract: We prove a non-asymptotic distribution-independent lower bound for the expected mean squared generalization error caused by label noise in ridgeless linear regression. Our lower bound generalizes a similar known result to the overparameterized (interpolating) regime. In contrast to most previous works, our analysis applies to a broad class of input distributions with almost surely full-rank feature matrices, which allows us to cover various types of deterministic or random feature maps. Our lower bound is asymptotically sharp and implies that in the presence of label noise, ridgeless linear regression does not perform well around the interpolation threshold for any of these feature maps. We analyze the imposed assumptions in detail and provide a theory for analytic (random) feature maps. Using this theory, we can show that our assumptions are satisfied for input distributions with a (Lebesgue) density and feature maps given by random deep neural networks with analytic activation functions like sigmoid, tanh, softplus or GELU. As further examples, we show that feature maps from random Fourier features and polynomial kernels also satisfy our assumptions. We complement our theory with further experimental and analytic results.
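
To make the claim about the interpolation threshold concrete, here is a minimal simulation sketch; it is not the paper's experiments. It assumes the simplest possible setting: i.i.d. standard Gaussian features (so the second-moment matrix of a test point is the identity) and independent label noise with variance sigma^2. The function name `noise_error` and all parameter values are hypothetical choices made for illustration. Under these assumptions, the noise-induced part of the expected squared error of the minimum-norm least-squares solution equals sigma^2 times the squared Frobenius norm of the pseudoinverse of the feature matrix, which the code averages over random draws.

```python
import numpy as np


def noise_error(n, p, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the noise-induced error of ridgeless regression.

    Assumption (not from the paper): features are i.i.d. standard Gaussian,
    so E[x x^T] = I for a test point x. The ridgeless (minimum-norm) estimator
    is beta_hat = pinv(X) @ y, and label noise eps contributes
    x^T pinv(X) eps to the prediction, whose second moment over x and eps is
    sigma^2 * ||pinv(X)||_F^2.
    """
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))   # n samples, p features
        X_pinv = np.linalg.pinv(X)        # shape (p, n)
        values.append(sigma**2 * np.sum(X_pinv**2))
    return float(np.mean(values))


if __name__ == "__main__":
    n = 20
    for p in (5, 10, 18, 20, 22, 40, 80):
        print(f"n={n:3d}  p={p:3d}  noise-induced error ~ {noise_error(n, p):.3f}")
```

With n fixed, the printed averages should be smallest far from p = n and largest (and strongly seed-dependent) near p = n, which is the interpolation peak that the lower bound describes.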

The talk and the respective paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2020

Truncated Linear Regression in High Dimensions

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

06/12/2020

Distributionally Robust Parametric Maximum Likelihood Estimation

Viet Anh Nguyen, Xuhui Zhang, Jose Blanchet, Angelos Georghiou

13/04/2021

Asymptotics of ridge(less) regression under general source condition

Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco

06/12/2020

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

Benjamin Aubin, Florent Krzakala, Yue Lu, Lenka Zdeborová

03/05/2021

When does preconditioning help or hurt generalization?

Shun-ichi Amari, Jimmy Ba, Roger Grosse, Chen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu

Keywords: high-dimensional asymptotics, generalization, second-order optimization, natural gradient descent

18/07/2021

On the difficulty of unbiased alpha divergence minimization

Tomas Geffner, Justin Domke

Keywords: Algorithms, Adversarial Learning, Deep Learning, Adversarial Networks, Probabilistic Methods, Approximate Inference

03/05/2021

Implicit Gradient Regularization

David Barrett, Benoit Dherin

Keywords: regularization, theory, deep learning, implicit regularization, deep learning theory, theoretical issues in deep learning

13/04/2021

On multilevel Monte Carlo unbiased gradient estimation for deep latent variable models

Yuyang Shi, Rob Cornish

04/08/2021

Shape Matters: Understanding the Implicit Bias of the Noise Covariance

Jeff Z. HaoChen, Colin Wei, Jason Lee, Tengyu Ma

06/12/2021

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Phil Chen, Mikhal Itkina, Ransalu Senanayake, Mykel J Kochenderfer

Keywords: deep learning, generative model

18/07/2021

Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models

Zitong Yang, Yu Bai, Song Mei

Keywords: Theory, Deep learning Theory

04/08/2021

Hypothesis testing with low-degree polynomials in the Morris class of exponential families

Dmitriy Kunisky

26/08/2020

Low-rank regularization and solution uniqueness in over-parameterized matrix sensing

Kelly Geyer, Anastasios Kyrillidis, Amir Kalev

06/12/2020

Autoencoders that don't overfit towards the Identity

Harald Steck

06/12/2021

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

Will Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick

Keywords: theory, optimization, interpretability

09/07/2020

Approximation Schemes for ReLU Regression

Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam Klivans, Mahdi Soltanolkotabi

Keywords: PAC learning, Approximation algorithms, Convex optimization, Neural networks/deep learning

18/07/2021

Private Adaptive Gradient Methods for Convex Optimization

Hilal Asi, John Duchi, Alireza Fallah, Omid Javidbakht, Kunal Talwar

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

18/07/2021

Representational aspects of depth and conditioning in normalizing flows

Frederic Koehler, Viraj Mehta, Andrej Risteski

Keywords: Theory, Deep learning Theory

13/04/2021

Fundamental limits of ridge-regularized empirical risk minimization in high dimensions

Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

04/08/2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade

06/12/2021

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

Suhas Kowshik, Dheeraj Nagaraj, Prateek Jain, Praneeth Netrapalli

Keywords: theory

13/04/2021

Direct loss minimization for sparse Gaussian processes

Yadi Wei, Rishit Sheth, Roni Khardon

26/08/2020

Regularized Autoencoders via Relaxed Injective Probability Flow

Abhishek Kumar, Ben Poole, Kevin Murphy

26/08/2020

A Unified Statistically Efficient Estimation Framework for Unnormalized Models

Masatoshi Uehara, Takafumi Kanamori, Takashi Takenouchi, Takeru Matsuda

18/07/2021

Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Spencer Frei, Yuan Cao, Quanquan Gu

Keywords: Theory, Statistical Learning Theory

18/11/2020

Robust deep ordinal regression under label noise

Bhanu Garg, Naresh Manwani

12/07/2020

Uncertainty quantification for nonconvex tensor completion: Confidence intervals, heteroscedasticity and optimality

Changxiao Cai, H. Vincent Poor, Yuxin Chen

Keywords: Optimization - Non-convex

06/12/2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

Dominik Stöger, Mahdi Soltanolkotabi

Keywords: optimization

06/12/2020

Overfitting Can Be Harmless for Basis Pursuit, But Only to a Degree

Peizhong Ju, Xiaojun Lin, Jia Liu

14/09/2020

Weak approximation of transformed stochastic gradient MCMC

Soma Yokoi, Takuma Otsuka, Issei Sato

26/08/2020

Lipschitz Continuous Autoencoders in Application to Anomaly Detection

Young-geun Kim, Yongchan Kwon, Hyunwoong Chang, Myunghee Cho Paik

06/12/2021

Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

Lijun Ding, Liwei Jiang, Yudong Chen, Qing Qu, Zhihui Zhu

18/07/2021

Consistent regression when oblivious outliers overwhelm

Tommaso d'Orsi, Gleb Novikov, David Steurer

Keywords: Theory, Game Theory and Computational Economics, Theory, Theory, Computational Complexity

06/12/2021

Rectangular Flows for Manifold Learning

Anthony Caterini, Gabriel Loaiza-Ganem, Geoff Pleiss, John Cunningham

Keywords: deep learning, optimization, generative model

03/05/2021

Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability

Suraj Srinivas, François Fleuret

Keywords: Interpretability, saliency maps, score-matching

14/06/2020

Revisiting Saliency Metrics: Farthest-Neighbor Area Under Curve

Sen Jia, Neil D. B. Bruce

Keywords: visual saliency, saliency metric, center bias, area under curve

06/12/2021

Misspecified Gaussian Process Bandit Optimization

Ilija Bogunovic, Andreas Krause

Keywords: optimization, bandits, kernel methods

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

Keywords: optimization

18/07/2021

Wasserstein Distributional Normalization For Robust Distributional Certification of Noisy Labeled Data

Sung Woo Park, Junseok Kwon

Keywords: Deep Learning, Generative Models, Algorithms, Representation Learning; Optimization, Submodular Optimization, Probabilistic Methods, Robust statistics

09/07/2020

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

Wenlong Mou, Chris Junchi Li, Martin Wainwright, Peter Bartlett, Michael Jordan

Keywords: Stochastic optimization, Concentration inequalities, Convex optimization, Reinforcement learning
