26/04/2020

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu

Keywords: small data, neural tangent kernel, UCI database, few-shot learning, kernel SVMs, deep learning theory, kernel design

Abstract: Recent research shows that the following two models are equivalent: (a) infinitely wide neural networks (NNs) trained under ℓ2 loss by gradient descent with an infinitesimally small learning rate, and (b) kernel regression with respect to so-called Neural Tangent Kernels (NTKs) (Jacot et al., 2018). An efficient algorithm to compute the NTK, as well as its convolutional counterparts, appears in Arora et al. (2019a), which made it possible to study the performance of infinitely wide nets on datasets like CIFAR-10. However, the super-quadratic running time of kernel methods makes them best suited for small-data tasks. We report results suggesting that neural tangent kernels perform strongly on low-data tasks. 1. On a standard testbed of classification/regression tasks from the UCI database, NTK SVM beats the previous gold standard, Random Forests (RF), and also the corresponding finite nets. 2. On CIFAR-10 with 10–640 training samples, Convolutional NTK consistently beats ResNet-34 by 1%–3%. 3. On the VOC07 testbed for few-shot image classification with transfer learning from ImageNet (Goyal et al., 2019), replacing the linear SVM currently used with a Convolutional NTK SVM consistently improves performance. 4. Comparing the performance of NTK with the finite-width net it was derived from, NTK behavior starts at lower net widths than suggested by theoretical analysis (Arora et al., 2019a). NTK's efficacy may trace to the lower variance of its output.
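For readers unfamiliar with the NTK recursion, the sketch below computes the fully connected ReLU NTK in the style of Jacot et al. (2018) and Arora et al. (2019a) and feeds the resulting Gram matrix into a kernel SVM, loosely mirroring result 1 above. It is a minimal illustration under stated assumptions, not the authors' released code: the network depth, the regularization constant C, and the toy data are placeholders, and the convolutional NTK used for results 2 and 3 requires the more involved CNTK recursion from Arora et al. (2019a).

```python
# Minimal NumPy sketch: NTK of a fully connected ReLU network, used as a
# precomputed kernel in an SVM (scikit-learn). Depth, C, and the toy data
# below are illustrative assumptions, not the paper's exact setup.
import numpy as np
from sklearn.svm import SVC


def ntk_gram(X1, X2, depth=3):
    """NTK Gram matrix between rows of X1 and X2 for a depth-layer ReLU net."""
    sigma = X1 @ X2.T                    # Sigma^(0)(x, x'): plain inner products
    diag1 = np.sum(X1 * X1, axis=1)      # Sigma^(0)(x, x)
    diag2 = np.sum(X2 * X2, axis=1)      # Sigma^(0)(x', x')
    theta = sigma.copy()                 # running NTK, Theta^(0) = Sigma^(0)

    for _ in range(depth):
        norms = np.sqrt(np.outer(diag1, diag2))
        # Correlation coefficient, clipped before arccos for numerical safety.
        lam = np.clip(sigma / np.maximum(norms, 1e-12), -1.0, 1.0)
        # Closed-form Gaussian expectations for ReLU (arc-cosine kernel),
        # with the usual c_sigma = 2 normalization; under it the diagonal
        # entries stay fixed, so diag1/diag2 need no update.
        sigma = norms * (np.sqrt(1.0 - lam ** 2) + (np.pi - np.arccos(lam)) * lam) / np.pi
        sigma_dot = (np.pi - np.arccos(lam)) / np.pi
        # Theta^(h) = Theta^(h-1) * Sigma_dot^(h) + Sigma^(h)
        theta = theta * sigma_dot + sigma

    return theta


# Toy usage on random data, standing in for a small UCI-style task.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))
y_train = rng.integers(0, 2, size=100)
X_test = rng.normal(size=(30, 20))

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(ntk_gram(X_train, X_train), y_train)
preds = clf.predict(ntk_gram(X_test, X_train))
```

On real small-data tasks one would tune the depth and C by cross-validation; the paper's UCI experiments compare this kind of NTK SVM against Random Forests and the corresponding finite-width nets.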
