12/07/2020

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks

Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan

Keywords: General Machine Learning Techniques

Abstract: A fundamental question in modern machine learning is how overparameterized deep neural networks can generalize. We address this question using 1) an equivalence between training infinitely wide neural networks and performing kernel regression with a deterministic kernel called the Neural Tangent Kernel (NTK), and 2) theoretical tools from statistical physics. We derive analytical expressions for learning curves for kernel regression, and use them to evaluate how the test loss of a trained neural network depends on the number of samples. Our approach allows us not only to compute the total test risk but also the decomposition of the risk due to different spectral components of the kernel. Complementary to recent results showing that during gradient descent, neural networks fit low frequency components first, we identify a new type of frequency principle: as the training set size grows, kernel machines and neural networks begin to fit successively higher frequency modes of the target function. We verify our theory with simulations of kernel regression and training wide artificial neural networks.
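A minimal sketch of the kind of kernel-regression learning-curve simulation the abstract describes, not the authors' code: a Gaussian (RBF) kernel stands in for the NTK, the toy target mixing a low- and a high-frequency mode and the small ridge term are illustrative assumptions, and the test error is averaged over random training sets of increasing size.

```python
# Illustrative sketch (assumptions: RBF kernel as a stand-in for the NTK,
# toy target on [0, 2*pi), small ridge for numerical stability).
import numpy as np

def rbf_kernel(x, y, length_scale=0.5):
    # Gaussian kernel k(x, y) = exp(-|x - y|^2 / (2 * length_scale^2))
    d = x[:, None] - y[None, :]
    return np.exp(-d**2 / (2 * length_scale**2))

def target(x):
    # Target with a low-frequency and a higher-frequency component.
    return np.sin(x) + 0.5 * np.sin(4 * x)

def mean_test_error(n_train, n_test=500, ridge=1e-8, n_trials=20, seed=0):
    # Average kernel-regression test MSE over random draws of the training set.
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0, 2 * np.pi, n_test)
    errs = []
    for _ in range(n_trials):
        x_train = rng.uniform(0, 2 * np.pi, n_train)
        K = rbf_kernel(x_train, x_train) + ridge * np.eye(n_train)
        alpha = np.linalg.solve(K, target(x_train))   # regression coefficients
        y_pred = rbf_kernel(x_test, x_train) @ alpha  # predictions on test grid
        errs.append(np.mean((y_pred - target(x_test))**2))
    return np.mean(errs)

# Empirical learning curve: test error as a function of training set size.
for n in [4, 8, 16, 32, 64, 128]:
    print(f"n = {n:4d}   mean test MSE = {mean_test_error(n):.5f}")
```

As the number of samples grows, the low-frequency component of the target is fit first and the higher-frequency component only at larger sample sizes, which is the qualitative behavior the paper's spectral decomposition of the risk formalizes.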

Talk and paper published at the ICML 2020 virtual conference.
