The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers

03/05/2021

Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi

Keywords: empirical investigation, online learning, optimization, generalization, understanding deep learning

Abstract: We propose a new framework for reasoning about generalization in deep learning. The core idea is to couple the Real World, where optimizers take stochastic gradient steps on the empirical loss, to an Ideal World, where optimizers take steps on the population loss. This leads to an alternate decomposition of test error into: (1) the Ideal World test error plus (2) the gap between the two worlds. If the gap (2) is universally small, this reduces the problem of generalization in offline learning to the problem of optimization in online learning. We then give empirical evidence that this gap between worlds can be small in realistic deep learning settings, in particular supervised image classification. For example, CNNs generalize better than MLPs on image distributions in the Real World, but this is "because" they optimize faster on the population loss in the Ideal World. This suggests our framework is a useful tool for understanding generalization in deep learning, and lays the foundation for future research in this direction.

The talk and the respective paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2020

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain, Sid Banerjee, Christina Yu

3:12

07/09/2020

Few-Shot Learning with Complex-valued Neural Networks

Zhen Liu, Baochang Zhang, Guodong Guo

Keywords: few-shot learning, complex-valued network, metric-learning, image classification

7:15

06/12/2021

Towards Sample-efficient Overparameterized Meta-learning

Yue Sun, Adhyyan Narang, Ibrahim Gulluk, Samet Oymak, Maryam Fazel

Keywords: theory, machine learning, meta learning, representation learning, few shot learning

13:54

03/05/2021

Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit

Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

Keywords: Deep Learning, Bayesian Neural Networks, Neural Network Gaussian Process, Infinite-Width Limit, Uncertainty, Gaussian Process

4:34

06/12/2020

Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties

Jakob Lindinger, David Reeb, Christoph Lippert, Barbara Rakitsch

3:21

18/07/2021

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama

Keywords: Optimization, Stochastic Optimization

5:17

06/12/2021

Self-Supervised Learning with Kernel Dependence Maximization

Yazhe Li, Roman Pogodin, [deadname] J Sutherland, Arthur Gretton

Keywords: machine learning, self-supervised learning, vision, representation learning, kernel methods, semi-supervised learning

11:48

18/07/2021

Understanding self-supervised learning dynamics without contrastive pairs

Yuandong Tian, Xinlei Chen, Surya Ganguli

Keywords: Deep Learning, Optimization for Deep Networks

18:16

06/12/2021

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

Zhenhuan Yang, Yunwen Lei, Puyu Wang, Tianbao Yang, Yiming Ying

Keywords: optimization, machine learning, privacy

14:40

05/01/2021

Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition

Theo Ayral, Marco Pedersoli, Simon Bacon, Eric Granger

5:00

14/06/2020

What Deep CNNs Benefit From Global Covariance Pooling: An Optimization Perspective

Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, Peihua Li, Wangmeng Zuo, Qinghua Hu

Keywords: global covariance pooling, deep cnns, loss lipschitzness, gradient predictiveness, second-order optimization, faster convergence, stronger robustness, generalization

1:01

06/12/2021

SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning

Talip Ucar, Ehsan Hajiramezanali, Lindsay Edwards

Keywords: self-supervised learning, contrastive learning, representation learning

13:28

14/06/2020

Few-Shot Class-Incremental Learning

Xiaoyu Tao, Xiaopeng Hong, Xinyuan Chang, Songlin Dong, Xing Wei, Yihong Gong

Keywords: few-shot class-incremental learning, fscil, neural gas, ng, anchor loss, min-max loss

5:01

06/12/2021

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

Sakshi Varshney, Vinay Kumar Verma, P. K. Srijith, Lawrence Carin, Piyush Rai

Keywords: generative model, representation learning, continual learning

14:50

18/07/2021

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Haoxiang Wang, Han Zhao, Bo Li

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:01

06/12/2021

Boost Neural Networks by Checkpoints

Feng Wang, Guoyizhe Wei, Qiao Liu, Jinxiang Ou, Xian Wei, Hairong Lv

Keywords: deep learning

4:45

03/05/2021

Generalized Variational Continual Learning

Noel Loo, Siddharth Swaroop, Rich E Turner

5:30

12/07/2020

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective

Baifeng Shi, Dinghuai Zhang, Qi Dai, Jingdong Wang, Zhanxing Zhu, Yadong Mu

Keywords: Accountability, Transparency and Interpretability

14:58

14/06/2020

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Jin Gao, Weiming Hu, Yan Lu

Keywords: online learning, visual tracking, continual learning, recursive least-squares estimation, deep learning, memory retention, recursive learning, mini-batch sgd, normal equation, mlp layer

5:01

18/07/2021

LAMDA: Label Matching Deep Domain Adaptation

Trung Le, Tuan Nguyen, Nhat Ho, Hung Bui, Dinh Phung

Keywords: Theory, Deep learning Theory

5:14

06/12/2021

Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

Yifan Zhang, Bryan Hooi, Dapeng Hu, Jian Liang, Jiashi Feng

Keywords: optimization, machine learning, self-supervised learning, vision, contrastive learning, representation learning, transfer learning

14:34

06/12/2021

Sparse Flows: Pruning Continuous-depth Models

Lucas Liebenwein, Ramin Hasani, Alexander Amini, Daniela Rus

Keywords: deep learning, generative model

12:51

06/12/2021

A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han, Youngchul Sung

Keywords: optimization, reinforcement learning and planning

14:35

22/11/2021

Siamese Prototypical Contrastive Learning

Shentong Mo, Zhun Sun, Chao Li

Keywords: self-supervised learning, contrastive learning, representation learning

2:50

12/07/2020

Supervised learning: no loss no cry

Richard Nock, Aditya Menon

Keywords: Learning Theory

15:18

06/12/2021

Contrastively Disentangled Sequential Variational Autoencoder

Junwen Bai, Weiran Wang, Carla Gomes

Keywords: self-supervised learning, generative model, contrastive learning, representation learning, interpretability

12:53

12/07/2020

Analyzing the effect of neural network architecture on training performance

Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein

Keywords: Deep Learning - Theory

14:03

02/02/2021

Deep Frequency Principle Towards Understanding Why Deeper Learning Is Faster

Zhiqin John Xu, Hanxu Zhou

19:40

02/02/2021

Adversarial Training Reduces Information and Improves Transferability

Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

19:54

12/07/2020

Meta-learning with Stochastic Linear Bandits

Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Keywords: Transfer, Multitask and Meta-learning

13:17

18/07/2021

DriftSurf: Stable-State / Reactive-State Learning under Concept Drift

Ashraf Tahmasbi, Ellango Jothimurugesan, Srikanta Tirthapura, Phil Gibbons

Keywords: Algorithms, Online Learning Algorithms

5:07

06/12/2021

Perturb-and-max-product: Sampling and learning in discrete energy-based models

Miguel Lazaro-Gredilla, Antoine Dedieu, Dileep George

Keywords: generative model, graph learning

14:16

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao Lin, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords: Deep Learning - Algorithms

13:21

03/05/2021

Distance-Based Regularisation of Deep Networks for Fine-Tuning

Henry Gouk, Timothy Hospedales, Massimiliano Pontil

Keywords: Statistical Learning Theory, Transfer Learning, Deep Learning

4:57

26/08/2020

Gain with no Pain: Efficiency of Kernel-PCA by Nyström Sampling

Nicholas Sterge, Bharath Sriperumbudur, Lorenzo Rosasco, Alessandro Rudi

14:44

06/12/2021

Shift Invariance Can Reduce Adversarial Robustness

Vasu Singla, Songwei Ge, Basri Ronen, David Jacobs

Keywords: deep learning, machine learning, robustness, adversarial robustness and security

8:28

06/12/2021

Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time

Ferran Alet, Maria Bauza, Kenji Kawaguchi, Nurullah Giray Kuru, Tomás Lozano-Pérez, Leslie Kaelbling

Keywords: deep learning, optimization, machine learning, self-supervised learning, meta learning

15:05

06/12/2021

See More for Scene: Pairwise Consistency Learning for Scene Classification

Gongwei Chen, Xinhang Song, Bohan Wang, Shuqiang Jiang

Keywords: deep learning, machine learning

9:15

13/04/2021

Regularized ERM on random subspaces

Andrea Della Vecchia, Jaouad Mourtada, Ernesto De Vito, Lorenzo Rosasco

2:57

06/12/2020

Gradient-EM Bayesian Meta-Learning

Yayi Zou, Xiaoqi Lu

3:23