Deep Equals Shallow for ReLU Networks in Kernel Regimes

03/05/2021

Alberto Bietti, Francis Bach

Keywords: approximation, neural tangent kernels, deep learning, kernels

Abstract: Deep networks are often considered to be more expressive than shallow ones in terms of approximation. Indeed, certain functions can be approximated by deep networks provably more efficiently than by shallow ones; however, no tractable algorithms are known for learning such deep models. Separately, a recent line of work has shown that deep networks trained with gradient descent may behave like (tractable) kernel methods in a certain over-parameterized regime, where the kernel is determined by the architecture and initialization. This paper focuses on approximation for such kernels. We show that for ReLU activations, the kernels derived from deep fully-connected networks have essentially the same approximation properties as their shallow two-layer counterparts, namely the same eigenvalue decay for the corresponding integral operator. This highlights the limitations of the kernel framework for understanding the benefits of such deep architectures. Our main theoretical result relies on characterizing such eigenvalue decays through differentiability properties of the kernel function, an approach that also applies readily to the study of other kernels defined on the sphere.
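
The eigenvalue-decay claim in the abstract can be explored numerically in a toy setting. The following is a minimal sketch, not the authors' code. It assumes the standard arc-cosine formulas and layer-wise recursion for the ReLU neural tangent kernel from the NTK literature (neither is stated on this page), and it restricts inputs to the unit circle, where the eigenvalues of the integral operator are the Fourier coefficients of the kernel viewed as a function of the angle. The names `kappa0`, `kappa1`, `relu_ntk`, `circle_spectrum`, and `decay_slope` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def kappa0(u):
    # Arc-cosine kernel of degree 0 (assumed normalization: kappa0(1) = 1).
    u = np.clip(u, -1.0, 1.0)
    return (np.pi - np.arccos(u)) / np.pi


def kappa1(u):
    # Arc-cosine kernel of degree 1 (assumed normalization: kappa1(1) = 1).
    u = np.clip(u, -1.0, 1.0)
    return (u * (np.pi - np.arccos(u)) + np.sqrt(1.0 - u * u)) / np.pi


def relu_ntk(u, depth):
    # ReLU NTK as a function of the cosine u between two unit-norm inputs.
    # depth=2 is the shallow two-layer kernel: u * kappa0(u) + kappa1(u).
    u = np.asarray(u, dtype=float)
    k = u      # covariance kernel of the current layer
    ntk = u    # tangent kernel of the current layer
    for _ in range(depth - 1):
        ntk = ntk * kappa0(k) + kappa1(k)
        k = kappa1(k)
    return ntk


def circle_spectrum(kernel_fn, n_grid=2**14):
    # Fourier coefficients of theta -> kernel_fn(cos(theta)) on a uniform grid.
    # On the circle these are proportional to the integral operator's eigenvalues.
    theta = 2.0 * np.pi * np.arange(n_grid) / n_grid
    values = kernel_fn(np.cos(theta))
    return np.fft.rfft(values).real / n_grid


def decay_slope(coeffs, k_min=10, k_max=200):
    # Least-squares slope of log(coefficient) against log(frequency).
    # Only even frequencies are used, because the two-layer kernel has
    # vanishing coefficients at odd frequencies k >= 3.
    k = np.arange(k_min, k_max + 1, 2)
    return np.polyfit(np.log(k), np.log(coeffs[k]), 1)[0]


if __name__ == "__main__":
    for depth in (2, 3, 5):
        spectrum = circle_spectrum(lambda u, d=depth: relu_ntk(u, d))
        print(f"depth={depth}  fitted log-log slope: {decay_slope(spectrum):.2f}")
```

If the abstract's claim carries over to this toy setting, the fitted slopes should be close to one another across depths, with depth changing constants rather than the polynomial rate. The frequency range and grid size are arbitrary choices, so the fit is only a rough diagnostic.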

The talk and the accompanying paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2020

Monotone operator equilibrium networks

Ezra Winston, J. Zico Kolter

3:29

26/04/2020

On Universal Equivariant Set Networks

Nimrod Segol, Yaron Lipman

Keywords: deep learning, universality, set functions, equivariance

5:02

06/12/2020

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? --- A Neural Tangent Kernel Perspective

Kaixuan Huang, Yuqing Wang, Molei Tao, Tuo Zhao

Keywords: Algorithms -> Uncertainty Estimation; Theory -> Frequentist Statistics; Theory -> Large Deviations and Asymptotic Analysis; The, Algorithms -> Kernel Methods

2:59

13/04/2021

A dynamical view on optimization algorithms of overparameterized neural networks

Zhiqi Bu, Shiyun Xu, Kan Chen

3:05

12/07/2020

Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses

Pierre Laforgue, Alex Lambert, Luc Brogat-Motte, Florence d'Alche-Buc

Keywords: General Machine Learning Techniques

14:36

26/04/2020

On Robustness of Neural Ordinary Differential Equations

Hanshu Yan, Jiawei Du, Vincent Tan, Jiashi Feng

Keywords: Neural ODE

5:09

06/12/2020

Non-Euclidean Universal Approximation

Anastasis Kratsios, Eugene Bilokopytov

3:34

04/08/2021

The Connection Between Approximation, Depth Separation and Learnability in Neural Networks

Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

16:50

14/09/2020

Off-the-grid: fast and effective hyperparameter search for kernel clustering

Bruno Ordozgoiti, Lluís Belanche

Keywords: clustering, kernels, kernel k-means, hyperparameter tuning, grid search

15:47

18/07/2021

Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation

Aurick Zhou, Sergey Levine

Keywords: Deep Learning, Bayesian Deep Learning

5:05

06/12/2021

Representation Learning Beyond Linear Prediction Functions

Ziping Xu, Ambuj Tewari

Keywords: theory, deep learning, optimization, representation learning, few shot learning

11:00

12/07/2020

A Mean Field Analysis Of Deep ResNet And Beyond: Towards Provably Optimization Via Overparameterization From Depth

Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying

Keywords: Deep Learning - Theory

4:37

02/02/2021

Avoiding Kernel Fixed Points: Computing with ELU and GELU Infinite Networks

Russell Tsuchida, Tim Pearce, Chris van der Heide, Fred Roosta, Marcus Gallagher

13:47

06/12/2020

Semialgebraic Optimization for Lipschitz Constants of ReLU Networks

Tong Chen, Jean Lasserre, Victor Magron, Edouard Pauwels

3:22

06/12/2021

Rectangular Flows for Manifold Learning

Anthony Caterini, Gabriel Loaiza-Ganem, Geoff Pleiss, John Cunningham

Keywords: deep learning, optimization, generative model

12:26

18/07/2021

Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborova

Keywords: Theory, Models of Learning and Generalization

4:24

26/04/2020

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Keywords: activation functions, deep learning theory, neural networks

5:13

09/07/2020

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords: Neural networks/deep learning, Non-convex optimization

14:41

06/12/2020

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

Miao Zhang, Huiqi Li, Shirui Pan, Xiaojun Chang, Zongyuan Ge, Steven Su

3:22

02/02/2021

A Recipe for Global Convergence Guarantee in Deep Neural Networks

Kenji Kawaguchi, Qingyun Sun

17:15

06/12/2021

Explicit loss asymptotics in the gradient descent training of neural networks

Maksim Velikanov, Dmitry Yarotsky

Keywords: theory, deep learning, optimization

9:54

14/06/2020

Generating Accurate Pseudo-Labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations

Vishnu Suresh Lokhande, Songwong Tasneeyapant, Abhay Venkatesh, Sathya N. Ravi, Vikas Singh

Keywords: hermite polynomials, activation functions, relu, pseudo-labels, semi-supervised learning, faster convergence, noise tolerance, smoothness, polynomial networks, resnet

1:01

14/06/2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan

Keywords: computer vision, image super-resolution, dual regression scheme, closed-loop

1:01

26/04/2020

Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem

Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas, Xiao Wang

Keywords: Depth-Width trade-offs, ReLU networks, chaos theory, Sharkovsky Theorem, dynamical systems

5:02

06/12/2020

Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators

Takeshi Teshima, Isao Ishikawa, Koichi Tojo, Kenta Oono, Masahiro Ikeda, Masashi Sugiyama

3:14

18/07/2021

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu

Keywords: Theory, Deep learning Theory

16:53

06/12/2021

Adversarial Examples in Multi-Layer Random ReLU Networks

Peter Bartlett, Sebastien Bubeck, Yeshwanth Cherapanamjeri

Keywords: theory, adversarial robustness and security

10:49

02/02/2021

Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients

Zijian Lei, Liang Lan

12:29

06/12/2021

Pure Exploration in Kernel and Neural Bandits

Yinglun Zhu, Dongruo Zhou, Ruoxi Jiang, Quanquan Gu, Rebecca Willett, Robert Nowak

Keywords: theory, deep learning, reinforcement learning and planning, bandits, representation learning

14:47

18/07/2021

Scaling Properties of Deep Residual Networks

Alain-Sam Cohen, Rama Cont, Alain Rossier, Renyuan Xu

Keywords: Theory, Deep learning Theory

5:20

03/05/2021

Multiplicative Filter Networks

Rizal Fathony, Anit Kumar Sahu, Devin Willmott, Zico Kolter

Keywords: Fourier Features, Implicit Neural Representations, Deep Architectures

6:06

18/07/2021

Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

Hao Liu, Minshuo Chen, Tuo Zhao, Wenjing Liao

Keywords: Applications, Computer Vision, Theory, Deep learning Theory

5:14

12/07/2020

Efficient proximal mapping of the path-norm regularizer of shallow networks

Fabian Latorre, Paul Rolland, Shaul Nadav Hallak, Volkan Cevher

Keywords: Deep Learning - Algorithms

11:32

06/12/2021

Going Beyond Linear RL: Sample Efficient Neural Function Approximation

Baihe Huang, Kaixuan Huang, Sham Kakade, Jason Lee, Qi Lei, Runzhe Wang, Jiaqi Yang

Keywords: theory, deep learning, reinforcement learning and planning, generative model

12:17

06/12/2021

Kernel Functional Optimisation

Arun Kumar Anjanapura Venkatesh, Alistair Shilton, Santu Rana, Sunil Gupta, Svetha Venkatesh

Keywords: machine learning, kernel methods

12:48

06/12/2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Fan Bao, Guoqiang Wu, Chongxuan Li, Jun Zhu, Bo Zhang

Keywords: optimization

8:58

18/07/2021

Self Normalizing Flows

T. Anderson Keller, Jorn Peters, Priyank Jaini, Emiel Hoogeboom, Patrick Forré, Max Welling

Keywords: Deep Learning, Generative Models

4:24

14/06/2020

Generalized Zero-Shot Learning via Over-Complete Distribution

Rohit Keshari, Richa Singh, Mayank Vatsa

Keywords: deep learning, zero-shot learning, cvae, triplet loss, center loss

0:50

02/02/2021

Multi-Proxy Wasserstein Classifier for Image Classification

Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh

12:05

06/12/2021

What can linearized neural networks actually say about generalization?

Guillermo Ortiz-Jimenez, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard

Keywords: theory, deep learning

9:46