Implicit Regularization in Deep Learning May Not Be Explainable by Norms

06/12/2020

Noam Razin, Nadav Cohen

Abstract: Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.
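As a hedged illustration of the test-bed the abstract describes, the sketch below runs gradient descent on a depth-2 matrix factorization for a 2x2 matrix completion problem. The observed entries, the initialization scale `alpha`, the step size, the step count, and the helper name `run_factorization` are all assumptions chosen for illustration and are not taken from the text above. With this symmetric initialization the product matrix stays positive semidefinite, so fitting the observed entries more closely forces the unobserved entry, and with it the nuclear norm, to grow, while the ratio of the second to the first singular value shrinks.

```python
import numpy as np


def run_factorization(alpha=0.1, lr=0.01, steps=20000):
    """Gradient descent on W = W2 @ W1 for a toy 2x2 matrix completion task.

    Returns a list of (step, loss, unobserved_entry, nuclear_norm, sv_ratio).
    All problem settings here are illustrative assumptions.
    """
    # Hypothetical observations: entry (0, 0) is unobserved,
    # entries (0, 1) and (1, 0) equal 1, entry (1, 1) equals 0.
    target = np.array([[0.0, 1.0], [1.0, 0.0]])
    mask = np.array([[0.0, 1.0], [1.0, 1.0]])

    # Small scaled-identity initialization for both factors.
    w1 = alpha * np.eye(2)
    w2 = alpha * np.eye(2)

    log_every = max(1, steps // 10)
    history = []
    for step in range(steps + 1):
        product = w2 @ w1
        residual = mask * (product - target)  # error on observed entries only
        loss = 0.5 * np.sum(residual ** 2)

        if step % log_every == 0:
            sv = np.linalg.svd(product, compute_uv=False)
            history.append(
                (step, loss, product[0, 0], sv.sum(), sv[1] / sv[0])
            )

        # Gradients of the squared loss with respect to each factor.
        grad_w2 = residual @ w1.T
        grad_w1 = w2.T @ residual
        w1 = w1 - lr * grad_w1
        w2 = w2 - lr * grad_w2
    return history


if __name__ == "__main__":
    for step, loss, entry, nuc, ratio in run_factorization():
        print(f"step={step:6d} loss={loss:.5f} unobserved={entry:.3f} "
              f"nuclear_norm={nuc:.3f} sv2/sv1={ratio:.5f}")
```

Running the script prints the loss, the unobserved entry, the nuclear norm, and the singular-value ratio at eleven checkpoints, which makes it easy to watch the norm rise as the loss falls in this toy setting.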

The talk and the respective paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

18/07/2021

Implicit Regularization in Tensor Factorization

Noam Razin, Asaf Maman, Nadav Cohen

Keywords: Theory, Deep learning Theory

5:11

18/07/2021

The Heavy-Tail Phenomenon in SGD

Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu

Keywords: Optimization, Stochastic Optimization

5:37

18/07/2021

Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

Tolga Ergen, Mert Pilanci

Keywords: Optimization, Convex Optimization

5:40

13/04/2021

Efficient methods for structured nonconvex-nonconcave min-max optimization

Jelena Diakonikolas, Constantinos Daskalakis, Michael Jordan

3:33

06/12/2021

Continuous vs. Discrete Optimization of Deep Neural Networks

Omer Elkabetz, Nadav Cohen

Keywords: theory, deep learning, optimization

9:51

06/12/2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Fan Bao, Guoqiang Wu, Chongxuan Li, Jun Zhu, Bo Zhang

Keywords: optimization

8:58

18/07/2021

Towards Understanding Learning in Neural Networks with Linear Teachers

Roei Sarussi, Alon Brutzkus, Amir Globerson

Keywords: Probabilistic Methods, Theory, Probabilistic Methods, MCMC

5:22

06/12/2021

Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent

Spencer Frei, Quanquan Gu

Keywords: deep learning, optimization

10:33

06/12/2020

Learning with Operator-valued Kernels in Reproducing Kernel Krein Spaces

Akash Saha, Balamurugan Palaniappan

3:22

06/12/2021

Generalization Guarantee of SGD for Pairwise Learning

Yunwen Lei, Mingrui Liu, Yiming Ying

Keywords: optimization, machine learning

14:30

12/07/2020

On the Global Optimality of Model-Agnostic Meta-Learning

Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

Keywords: Transfer, Multitask and Meta-learning

15:14

06/12/2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

Alexander Camuto, George Deligiannidis, Murat Erdogdu, Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu

Keywords: theory, deep learning, optimization

14:36

13/04/2021

Alternating direction method of multipliers for quantization

Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra, Meisam Razaviyayn

2:43

13/04/2021

A dynamical view on optimization algorithms of overparameterized neural networks

Zhiqi Bu, Shiyun Xu, Kan Chen

3:05

02/02/2021

Characterizing the Loss Landscape in Non-Negative Matrix Factorization

Johan Bjorck, Anmol Kabra, Kilian Q. Weinberger, Carla Gomes

20:00

06/12/2021

Representation Learning Beyond Linear Prediction Functions

Ziping Xu, Ambuj Tewari

Keywords: theory, deep learning, optimization, representation learning, few shot learning

11:00

06/12/2020

Non-Euclidean Universal Approximation

Anastasis Kratsios, Eugene Bilokopytov

3:34

06/12/2021

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations

Jonas Zehnder, Yue Li, Stelian Coros, Bernhard Thomaszewski

Keywords: deep learning, optimization, machine learning, self-supervised learning, representation learning

9:24

06/12/2020

Learning Optimal Representations with the Decodable Information Bottleneck

Yann Dubois, Douwe Kiela, David Schwab, Ramakrishna Vedantam

3:13

12/07/2020

Expert Learning through Generalized Inverse Multiobjective Optimization: Models, Insights and Algorithms

Chaosheng Dong, Bo Zeng

Keywords: Learning Theory

12:11

18/07/2021

Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

Wes Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Keywords: Reinforcement Learning and Planning

5:13

06/12/2021

Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels

Hao Wang, Yizhe Huang, Rui Gao, Flavio Calmon

Keywords: theory, optimization, machine learning

12:27

06/12/2020

Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study

Assaf Dauber, Meir Feder, Tomer Koren, Roi Livni

2:58

06/12/2021

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan Adams, Sergey Levine

Keywords: reinforcement learning and planning

15:17

26/08/2020

Uncertainty Quantification for Sparse Deep Learning

Yuexi Wang, Veronika Rockova

15:12

06/12/2021

Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning

Sheng Zhang, Zhe Zhang, Siva Theja Maguluri

Keywords: theory, reinforcement learning and planning

10:20

18/07/2021

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

Keywords: Theory, Statistical Learning Theory

16:24

18/07/2021

Train simultaneously, generalize better: Stability of gradient-based minimax learners

Farzan Farnia, Asuman Ozdaglar

Keywords: Theory, Statistical Learning Theory

5:16

06/12/2020

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Kenta Oono, Taiji Suzuki

3:22

04/08/2021

SGD Generalizes Better Than GD (And Regularization Doesn't Help)

Idan Amir, Tomer Koren, Roi Livni

15:53

26/04/2020

GLAD: Learning Sparse Graph Recovery

Harsh Shrivastava, Xinshi Chen, Binghong Chen, Guanghui Lan, Srinivas Aluru, Han Liu, Le Song

Keywords: Meta learning, automated algorithm design, learning structure recovery, Gaussian graphical models

5:31

12/07/2020

Optimistic bounds for multi-output learning

Henry Reeve, Ata Kaban

Keywords: Supervised Learning

14:41

06/12/2021

Meta-Learning for Relative Density-Ratio Estimation

Atsutoshi Kumagai, Tomoharu Iwata, Yasuhiro Fujiwara

Keywords: deep learning, machine learning, meta learning

8:56

26/04/2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

Wei Hu, Lechao Xiao, Jeffrey Pennington

Keywords: deep learning theory, non-convex optimization, orthogonal initialization

5:10

26/04/2020

Implicit Bias of Gradient Descent based Adversarial Training on Separable Data

Yan Li, Ethan X. Fang, Huan Xu, Tuo Zhao

Keywords: implicit bias, adversarial training, robustness, gradient descent

4:53

03/05/2021

Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability

Suraj Srinivas, François Fleuret

Keywords: Interpretability, saliency maps, score-matching

15:08

03/05/2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime

Huy Tuan Pham, Phan-Minh Nguyen

Keywords: deep learning theory

15:41

18/07/2021

Low-Rank Sinkhorn Factorization

Meyer Scetbon, Marco Cuturi, Gabriel Peyré

Keywords: Algorithms, Optimal Transport

5:22

18/07/2021

Batch Value-function Approximation with Only Realizability

Tengyang Xie, Nan Jiang

Keywords: Algorithms, Multitask and Transfer Learning, Algorithms, Unsupervised Learning; Applications, Image Segmentation, Theory, RL, Decisions and Control Theory

5:05

06/12/2020

Lipschitz Bounds and Provably Robust Training by Laplacian Smoothing

Vishaal Krishnan, Abed AlRahman Al Makdah, Fabio Pasqualetti

3:48