Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning

03/05/2021

Zhiyuan Li, Yuping Luo, Kaifeng Lyu

Keywords: implicit regularization, gradient descent, matrix factorization, implicit bias

Abstract: Matrix factorization is a simple and natural test-bed to investigate the implicit regularization of gradient descent. Gunasekar et al. (2017) conjectured that gradient flow with infinitesimal initialization converges to the solution that minimizes the nuclear norm, but a series of recent papers argued that the language of norm minimization is not sufficient to give a full characterization of the implicit regularization. In this work, we provide theoretical and empirical evidence that for depth-2 matrix factorization, gradient flow with infinitesimal initialization is mathematically equivalent to a simple heuristic rank minimization algorithm, Greedy Low-Rank Learning, under some reasonable assumptions. This generalizes the rank minimization view from previous works to a much broader setting and enables us to construct counter-examples to refute the conjecture from Gunasekar et al. (2017). We also extend the results to the case where depth >= 3, and we show that the benefit of being deeper is that the above convergence has a much weaker dependence on the initialization magnitude, so that this rank minimization is more likely to take effect for initializations of practical scale.
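
As a rough illustration of the greedy behaviour described in the abstract, the NumPy sketch below runs plain gradient descent on a depth-2 symmetric factorization W = U U^T with a very small initialization. It is not the authors' Greedy Low-Rank Learning algorithm, nor one of their experiments. The fully observed toy target M, the loss 1/2 ||U U^T - M||_F^2, the function name gd_rank_trajectory, the rank threshold, and all hyperparameters are assumptions made for this example.

```python
import numpy as np


def gd_rank_trajectory(M, alpha=1e-4, lr=0.01, steps=6000, tol=0.05,
                       checkpoints=(200, 600, 2000, 6000)):
    """Gradient descent on f(U) = 0.5 * ||U U^T - M||_F^2 from U = alpha * I.

    Returns the final W = U U^T and a dict mapping each checkpoint step to
    the numerical rank of W (number of eigenvalues above `tol`).
    All defaults are illustrative assumptions, not values from the paper.
    """
    d = M.shape[0]
    U = alpha * np.eye(d)  # small ("near-infinitesimal") initialization
    ranks = {}
    for k in range(1, steps + 1):
        residual = U @ U.T - M              # gradient of the loss w.r.t. W
        U = U - lr * 2.0 * residual @ U     # chain rule: grad w.r.t. U is 2 * residual @ U
        if k in checkpoints:
            eigvals = np.linalg.eigvalsh(U @ U.T)
            ranks[k] = int(np.sum(eigvals > tol))
    return U @ U.T, ranks


# Toy rank-3 PSD target with well-separated eigenvalues (hypothetical data).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
M = Q @ np.diag([1.0, 0.5, 0.1, 0.0, 0.0]) @ Q.T

W, ranks = gd_rank_trajectory(M)
print(ranks)
print(np.linalg.norm(W - M))
```

In this toy setting each eigen-direction of M follows its own logistic growth curve, so the larger eigenvalues are fitted first and the numerical rank of W should step through 0, 1, 2, 3 at the chosen checkpoints. A fully observed target only illustrates the sequential, rank-by-rank learning order; it does not exercise the underdetermined setting in which the choice between rank minimization and nuclear norm minimization actually matters.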

Talk and the respective paper are published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Fan Bao, Guoqiang Wu, Chongxuan LI, Jun Zhu, Bo Zhang

Keywords: optimization

8:58

12/07/2020

Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization

Hien Le, Nicolas Gillis, Panagiotis Patrinos

Keywords: Optimization - General

15:31

06/12/2021

Fine-grained Generalization Analysis of Inductive Matrix Completion

Antoine Ledent, Rodrigo Alves, Yunwen Lei, Marius Kloft

Keywords: theory

14:33

06/12/2020

Random Reshuffling: Simple Analysis with Vast Improvements

Konstantin Mishchenko, Ahmed Khaled Ragab Bayoumi, Peter Richtarik

Keywords: Reinforcement Learning and Planning -> Planning; Reinforcement Learning and Planning -> Reinforcement Learning, Reinforcement Learning and Planning

3:08

06/12/2020

Demystifying Orthogonal Monte Carlo and Beyond

Han Lin, Haoxian Chen, Krzysztof M Choromanski, Tianyi Zhang, Clement Laroche

3:19

06/12/2020

Projection Robust Wasserstein Distance and Riemannian Optimization

Darren Lin, Chenyou Fan, Nhat Ho, Marco Cuturi, Michael Jordan

Keywords: Optimization -> Non-Convex Optimization; Optimization -> Stochastic Optimization, Deep Learning -> Optimization for Deep Networks

3:01

06/12/2020

A novel variational form of the Schatten-$p$ quasi-norm

Paris Giampouras, Rene Vidal, Athanasios Rontogiannis, Benjamin Haeffele

3:14

13/04/2021

Explicit regularization of stochastic gradient methods through duality

Anant Raj, Francis Bach

2:53

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia LIN

Keywords: optimization, machine learning

14:51

12/07/2020

A simpler approach to accelerated optimization: iterative averaging meets optimism

Pooria Joulani, Anant Raj, András György, Csaba Szepesvari

Keywords: Online Learning, Active Learning, and Bandits

16:17

26/08/2020

One Sample Stochastic Frank-Wolfe

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

6:05

26/04/2020

Stable Rank Normalization for Improved Generalization in Neural Networks and GANs

Amartya Sanyal, Philip H. Torr, Puneet K. Dokania

Keywords: Generalization, regularization, empirical Lipschitz

5:25

26/08/2020

On the Convergence Theory of Gradient-Based Model-Agnostic Meta-Learning Algorithms

Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar

15:02

26/04/2020

Accelerating SGD with momentum for over-parameterized learning

Chaoyue Liu, Mikhail Belkin

Keywords: SGD, acceleration, momentum, stochastic, over-parameterized, Nesterov

4:50

22/06/2020

Positive semidefinite programming: Mixed, parallel, and width-independent

Arun Jambulapati, Yin Tat Lee, Jerry Li, Swati Padmanabhan, Kevin Tian

Keywords: semidefinite programming, approximation algorithm, mixed packing and covering, width-independent algorithm, parallel algorithm

18:12

02/02/2021

Robust Model Compression Using Deep Hypotheses

Omri Armstrong, Ran Gilad-Bachrach

17:26

26/04/2020

Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin

Colin Wei, Tengyu Ma

Keywords: deep learning theory, generalization bounds, adversarially robust generalization, data-dependent generalization bounds

5:30

18/07/2021

Active Slices for Sliced Stein Discrepancy

Wenbo Gong, Kaibo Zhang, Yingzhen Li, Jose Miguel Hernandez-Lobato

Keywords: Deep Learning, Efficient Inference Methods, Algorithms, Kernel Methods

5:47

04/08/2021

Convergence rates and approximation results for SGD and its continuous-time counterpart

Xavier Fontaine, Valentin De Bortoli, Alain Durmus

17:35

18/07/2021

Three Operator Splitting with a Nonconvex Loss Function

Alp Yurtsever, Varun Mangalick, Suvrit Sra

Keywords: Optimization, Non-Convex Optimization

5:03

18/07/2021

On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients

Difan Zou, Quanquan Gu

Keywords: Probabilistic Methods, Monte Carlo Methods

5:39

06/12/2021

Structured Dropout Variational Inference for Bayesian Neural Networks

Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho

Keywords: deep learning, generative model

11:28

26/08/2020

Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization

Dongruo Zhou, Yuan Cao, Quanquan Gu

15:38

03/05/2021

Distance-Based Regularisation of Deep Networks for Fine-Tuning

Henry Gouk, Timothy Hospedales, Massimiliano Pontil

Keywords: Statistical Learning Theory, Transfer Learning, Deep Learning

4:57

06/12/2021

Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation

Robert A Vandermeulen, Antoine Ledent

Keywords: theory

12:58

06/12/2020

Tensor Completion Made Practical

Allen Liu, Ankur Moitra

Keywords: Neuroscience and Cognitive Science -> Neuroscience; Neuroscience and Cognitive Science -> Plasticity and Adaptation; Neuroscien, Neuroscience and Cognitive Science

3:05

18/07/2021

Bilevel Optimization: Convergence Analysis and Enhanced Design

Kaiyi Ji, Junjie Yang, Yingbin LIANG

Keywords: Optimization, Non-Convex Optimization

5:02

09/07/2020

How to trap a gradient flow

Dan Mikulincer, Sebastien Bubeck

Keywords: Non-convex optimization

15:01

06/12/2020

Asymptotic Guarantees for Generative Modeling Based on the Smooth Wasserstein Distance

Ziv Goldfeld, Kristjan Greenewald, Kengo Kato

3:16

06/12/2021

Smooth Bilevel Programming for Sparse Regularization

Clarice Poon, Gabriel Peyré

Keywords: machine learning

13:06

12/07/2020

A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton

Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

Keywords: Optimization - Non-convex

13:51

06/12/2021

An Online Riemannian PCA for Stochastic Canonical Correlation Analysis

Zihang Meng, Rudrasis Chakraborty, Vikas Singh

Keywords: optimization, fairness

14:14

09/07/2020

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang

Keywords: High-dimensional statistics, Adversarial learning and robustness

15:00

06/12/2021

Sparse Quadratic Optimisation over the Stiefel Manifold with Application to Permutation Synchronisation

Florian Bernard, Daniel Cremers, Johan Thunberg

Keywords: vision, graph learning, clustering

11:13

22/06/2020

Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning

Nai-Hui Chia, András Gilyén, Tongyang Li, Han-Hsuan Lin, Ewin Tang, Chunhao Wang

Keywords: sampling, quantum-inspired algorithms, quantum machine learning, low-rank approximation, dequantization

20:48

05/01/2021

Constrained Weight Optimization for Learning Without Activation Normalization

Daiki Ikami, Go Irie, Takashi Shibata

3:27

12/07/2020

Variance Reduction in Stochastic Particle-Optimization Sampling

Jianyi Zhang, Yang Zhao, Changyou Chen

Keywords: Deep Learning - General

16:12

04/08/2021

SGD Generalizes Better Than GD (And Regularization Doesn't Help)

Idan Amir, Tomer Koren, Roi Livni

15:53

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

3:20

06/12/2020

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Noam Razin, Nadav Cohen

3:14