26/04/2020

Implicit Bias of Gradient Descent based Adversarial Training on Separable Data

Yan Li, Ethan X. Fang, Huan Xu, Tuo Zhao

Keywords: implicit bias, adversarial training, robustness, gradient descent

Abstract: Adversarial training is a principled approach for training robust neural networks. Despite its tremendous success in practice, its theoretical properties remain largely unexplored. In this paper, we provide new theoretical insights into gradient descent based adversarial training by studying its computational properties, specifically its implicit bias. We take binary classification on linearly separable data as an illustrative example, where the loss asymptotically attains its infimum as the parameter diverges to infinity along certain directions. Specifically, we show that for any fixed iteration $T$, when the adversarial perturbation during training has a properly bounded L2 norm, the classifier learned by gradient descent based adversarial training converges in direction to the maximum L2 norm margin classifier at the rate of $O(1/\sqrt{T})$, significantly faster than the rate $O(1/\log T)$ of training with clean data. In addition, when the adversarial perturbation during training has a bounded Lq norm, the resulting classifier converges in direction to a maximum mixed-norm margin classifier, which has a natural interpretation of robustness: it is the maximum L2 norm margin classifier under worst-case bounded Lq norm perturbation of the data. Our findings provide theoretical backing for adversarial training, showing that it indeed promotes robustness against adversarial perturbation.
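The setup described in the abstract (gradient descent on the logistic loss of adversarially perturbed, linearly separable data) can be illustrated with a short sketch. The code below is not taken from the paper; the toy data, step size, iteration count, and perturbation radius `eps` are arbitrary choices for demonstration. It relies on the fact that, for a linear classifier, the worst-case L2-bounded perturbation of an example has a closed form aligned with the weight vector.

```python
# Minimal sketch of gradient-descent-based adversarial training on linearly
# separable data, assuming logistic loss and L2-bounded perturbations.
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable toy data with labels y in {-1, +1}.
n, d = 200, 2
w_star = np.array([1.0, -1.0]) / np.sqrt(2.0)
X = rng.normal(size=(n, d))
y = np.where(X @ w_star >= 0, 1.0, -1.0)
X += 2.0 * y[:, None] * w_star  # push the classes apart so a positive margin exists

def adv_train(X, y, eps=0.1, lr=0.1, T=5000):
    """Gradient descent on the logistic loss of adversarially perturbed data.

    For a linear model w, the worst-case perturbation with ||delta||_2 <= eps
    is delta = -eps * y * w / ||w||_2, so the inner maximization is exact.
    """
    w = np.zeros(X.shape[1])
    for _ in range(T):
        norm = np.linalg.norm(w)
        if norm > 0:
            X_adv = X - eps * y[:, None] * w / norm  # worst-case L2 perturbation
        else:
            X_adv = X
        margins = y * (X_adv @ w)
        # Gradient of the average logistic loss log(1 + exp(-margin)).
        coeff = -y / (1.0 + np.exp(margins))
        grad = (coeff[:, None] * X_adv).mean(axis=0)
        w -= lr * grad
    return w

w = adv_train(X, y)
print("direction of learned classifier:", w / np.linalg.norm(w))
```

For an Lq-bounded perturbation the inner maximization instead aligns with the dual-norm direction of w (its value is eps times the dual norm of w), which is what gives rise to the mixed-norm margin classifier mentioned in the abstract.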

Talk and paper published at the ICLR 2020 virtual conference.

