04/08/2021

When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations?

Niladri S. Chatterji, Philip M. Long, Peter Bartlett

Abstract: We establish conditions under which gradient descent applied to fixed-width deep networks drives the logistic loss to zero, and prove bounds on the rate of convergence. Our analysis applies to smoothed approximations of the ReLU, such as Swish and the Huberized ReLU, proposed in previous applied work. We provide two sufficient conditions for convergence. The first is simply a bound on the loss at initialization. The second is a data separation condition used in prior analyses.
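As a loose illustration only (not the authors' construction or experiments), the Python sketch below runs full-batch gradient descent on the logistic loss for a small fixed-width network with a Huberized-ReLU activation. The smoothing parameter h, the width, depth, step size, and toy data are all hypothetical choices, and the Huberization constants used here are one standard smoothing that may differ from the paper's exact definition.

import torch
import torch.nn as nn

# Huberized ReLU: 0 for z <= 0, quadratic on (0, h], linear (slope 1)
# beyond h -- a continuously differentiable smoothing of the ReLU.
# These constants are an assumption; the paper's definition may differ.
class HuberizedReLU(nn.Module):
    def __init__(self, h=0.1):
        super().__init__()
        self.h = h

    def forward(self, z):
        quad = z ** 2 / (2 * self.h)
        lin = z - self.h / 2
        return torch.where(z <= 0, torch.zeros_like(z),
                           torch.where(z <= self.h, quad, lin))

# A fixed-width deep network; width and depth are illustrative choices.
# nn.SiLU() (Swish, x * sigmoid(x)) is the other smoothed activation
# named in the abstract and can be swapped in for HuberizedReLU.
def make_net(d_in, width=64, depth=3, act=None):
    act = act if act is not None else HuberizedReLU(h=0.1)
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), act]
        d = width
    layers.append(nn.Linear(d, 1))
    return nn.Sequential(*layers)

# Toy linearly separable data with +/-1 labels (hypothetical).
torch.manual_seed(0)
X = torch.randn(32, 10)
y = 2.0 * (X[:, 0] > 0).float() - 1.0

net = make_net(10)
opt = torch.optim.SGD(net.parameters(), lr=0.1)  # full-batch GD

for step in range(2000):
    f = net(X).squeeze(1)
    # Logistic loss for +/-1 labels: log(1 + exp(-y * f))
    loss = nn.functional.softplus(-y * f).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final logistic loss: {loss.item():.3e}")

On separable toy data like this, the loss typically decreases toward zero over the run, which is the interpolation behavior the abstract refers to.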

Talk and paper published at the COLT 2021 virtual conference.
