Picking Winning Tickets Before Training by Preserving Gradient Flow

26/04/2020

Picking Winning Tickets Before Training by Preserving Gradient Flow

Chaoqi Wang, Guodong Zhang, Roger Grosse

Keywords: neural network, pruning before training, weight pruning

Abstract Paper Similar Papers

Abstract: Overparameterization has been shown to benefit both the optimization and generalization of neural networks, but large networks are resource hungry at both training and test time. Network pruning can reduce test-time resource requirements, but is typically applied to trained networks and therefore cannot avoid the expensive training process. We aim to prune networks at initialization, thereby saving resources at training time as well. Specifically, we argue that efficient training requires preserving the gradient flow through the network. This leads to a simple but effective pruning criterion we term Gradient Signal Preservation (GraSP). We empirically investigate the effectiveness of the proposed method with extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and ImageNet, using VGGNet and ResNet architectures. Our method can prune 80% of the weights of a VGG-16 network on ImageNet at initialization, with only a 1.6% drop in top-1 accuracy. Moreover, our method achieves significantly better performance than the baseline at extreme sparsity levels. Our code is made public at: https://github.com/alecwangcq/GraSP.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICLR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2020

Top-KAST: Top-K Always Sparse Training

Sid Jayakumar, Razvan Pascanu, Jack Rae and
Simon Osindero, Erich Elsen

Keywords Paper

0

0

0

0

3:18

06/12/2021

Boost Neural Networks by Checkpoints

Feng Wang, Guoyizhe Wei, Qiao Liu and
Jinxiang Ou, xian wei, Hairong Lv

Keywords Paper

deep learning

1

0

0

0

4:45

26/04/2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

Runtian Zhai, Chen Dan, Di He and
Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang

Keywords Paper

Adversarial Robustness, Provable Adversarial Defense, Randomized Smoothing, Robustness Certification

0

0

0

0

5:10

06/12/2021

Powerpropagation: A sparsity inducing weight reparameterisation

Jonathan Schwarz, Siddhant M Jayakumar, Razvan Pascanu and
Peter E Latham, Yee Teh

Keywords Paper

deep learning, optimization, continual learning

0

0

0

1

9:08

02/02/2021

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

Xin Chen, Lingxi Xie, Jun Wu and
Longhui Wei, Yuhui Xu, Qi Tian

Keywords Paper

0

0

0

0

15:02

07/09/2020

Towards Convolutional Neural Networks Compression via Global&Progressive Product Quantization

Weihan Chen, Peisong Wang, Jian Cheng

Keywords Paper

convolutional neural network compression, product quantization

0

0

0

0

5:03

06/12/2020

Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

Houwen Peng, Hao Du, Hongyuan Yu and
QI LI, Jing Liao, Jianlong Fu

Keywords Paper

0

0

0

0

3:12

07/09/2020

Paying more Attention to Snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation

Duong Le, Nhan Vo, Nam Thoai

Keywords Paper

network pruning, knowledge distillation, ensemble learning

0

0

0

0

8:30

02/02/2021

Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks

Xiangyu Chang, Yingcong Li, Samet Oymak, Christos Thrampoulidis

Keywords Paper

0

0

0

0

18:14

22/11/2021

Parameter Efficient Dynamic Convolution via Tensor Decomposition

Zejiang Hou, Sun-Yuan Kung

Keywords Paper

dynamic convolution, input-dependent reparameterization, parameter efficiency, tensor decomposition

0

0

0

0

3:58

14/06/2020

GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet

Shan You, Tao Huang, Mingmin Yang and
Fei Wang, Chen Qian, Changshui Zhang

Keywords Paper

neural architecture search, supernet, one-shot nas, single path, greedy algorithm, exploration and exploitation, searching efficiency

0

0

0

0

1:01

05/01/2021

Spike-Thrift: Towards Energy-Efficient Deep Spiking Neural Networks by Limiting Spiking Activity via Attention-Guided Compression

Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

Keywords Paper

0

0

0

0

5:22

19/08/2021

EventDrop: Data Augmentation for Event-based Learning

Fuqiang Gu, Weicong Sng, Xuke Hu, Fangwen Yu

Keywords Paper

Computer Vision, Recognition, Classification

0

0

0

0

8:48

26/04/2020

Don't Use Large Mini-batches, Use Local SGD

Tao Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi

Keywords Paper

0

0

0

0

4:36

06/12/2021

Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks

Hassan Dbouk, Naresh Shanbhag

Keywords Paper

deep learning, robustness, adversarial robustness and security

0

0

0

0

15:01

22/11/2021

DISCO: accurate Discrete Scale Convolutions

Ivan Sosnovik, Artem Moskalev, Arnold W.M. Smeulders

Keywords Paper

equivariance, symmetry, invariance, scale, convolutions, dilation, tracking, image classification

0

0

0

0

8:38

12/07/2020

Overfitting in adversarially robust deep learning

Eric Wong, Leslie Rice, Zico Kolter

Keywords Paper

Adversarial Examples

0

0

0

0

14:44

06/12/2021

Sparse Flows: Pruning Continuous-depth Models

Lucas Liebenwein, Ramin Hasani, Alexander Amini, Daniela Rus

Keywords Paper

deep learning, generative model

0

0

0

0

12:51

18/07/2021

Globally-Robust Neural Networks

Klas Leino, Zifan Wang, Matt Fredrikson

Keywords Paper

Social Aspects of Machine Learning, AI Safety

0

0

0

0

7:55

26/04/2020

Adversarially robust transfer learning

Ali Shafahi, Parsa Saadatpanah, Chen Zhu and
Amin Ghiasi, Christoph Studer, David Jacobs, Tom Goldstein

Keywords Paper

0

0

0

0

4:58

06/12/2020

Differentiable Augmentation for Data-Efficient GAN Training

Shengyu Zhao, Zhijian Liu, Ji Lin and
Jun-Yan Zhu, Song Han

Keywords Paper

0

0

0

0

3:22

02/02/2021

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

An Xu, Zhouyuan Huo, Heng Huang

Keywords Paper

0

0

0

0

18:26

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

13:21

26/04/2020

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

Dan Hendrycks, Norman Mu, Ekin Dogus Cubuk and
Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan

Keywords Paper

robustness, uncertainty

0

0

0

0

4:38

13/04/2021

Fractional moment-preserving initialization schemes for training deep neural networks

Mert Gurbuzbalaban, Yuanhan Hu

Keywords Paper

0

0

0

0

3:05

07/09/2020

Neural Network Quantization with Scale-Adjusted Training

Qing Jin, Linjie Yang, Zhenyu Liao, Xiaoning Qian

Keywords Paper

neural network quantization, over-fitting, regularization

0

0

0

0

5:02

06/12/2020

Improving Neural Network Training in Low Dimensional Random Bases

Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi

Keywords Paper

0

0

0

0

3:01

02/02/2021

Near Lossless Transfer Learning for Spiking Neural Networks

Zhanglu Yan, Jun Zhou, Weng-Fai Wong

Keywords Paper

0

0

0

0

16:34

26/04/2020

Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks

Leopold Cambier, Anahita Bhiwandiwalla, Ting Gong and
Oguz H. Elibol, Mehran Nekuii, Hanlin Tang

Keywords Paper

Low-precision training, numerics, deep learning

0

0

0

0

4:46

26/04/2020

Towards Stable and Efficient Training of Verifiably Robust Neural Networks

Huan Zhang, Hongge Chen, Chaowei Xiao and
Sven Gowal, Robert Stanforth, Bo Li, Duane Boning, Cho-Jui Hsieh

Keywords Paper

Robust Neural Networks, Verifiable Training, Certified Adversarial Defense

0

0

0

0

5:05

03/05/2021

Growing Efficient Deep Networks by Structured Continuous Sparsification

Xin Yuan, Pedro Savarese, Michael Maire

Keywords Paper

network pruning, computer vision, deep learning, neural architecture search

0

0

0

0

16:52

03/05/2021

Distance-Based Regularisation of Deep Networks for Fine-Tuning

Henry Gouk, Timothy Hospedales, massimiliano pontil

Keywords Paper

Statistical Learning Theory, Transfer Learning, Deep Learning

0

0

0

0

4:57

12/07/2020

Analyzing the effect of neural network architecture on training performance

Karthik Abinav Sankararaman, Soham De, Zheng Xu and
W. Ronny Huang, Tom Goldstein

Keywords Paper

Deep Learning - Theory

0

0

0

0

14:03

26/04/2020

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Keywords Paper

activation functions, deep learning theory, neural networks

0

0

0

0

5:13

03/05/2021

Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control

Zhuang Liu, Xuanlin Li, Bingyi Kang, trevor darrell

Keywords Paper

Deep Reinforcement Learning, Regularization, Continuous Control, Policy Optimization

0

0

0

0

8:45

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

Keywords Paper

0

0

0

0

14:30

12/07/2020

Towards Accurate Post-training Network Quantization via Bit-Split and Stitching

Peisong Wang, Qiang Chen, Xiangyu He, Jian Cheng

Keywords Paper

Applications - Computer Vision

0

0

0

0

7:58

14/06/2020

Augment Your Batch: Improving Generalization Through Instance Repetition

Elad Hoffer, Tal Ben-Nun, Itay Hubara and
Niv Giladi, Torsten Hoefler, Daniel Soudry

Keywords Paper

generalization, augmentation, regularization, large-batch, deep-learning, convolutional-networks

0

0

0

0

1:00

06/12/2021

EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization

Ondrej Bohdal, Yongxin Yang, Timothy Hospedales

Keywords Paper

deep learning, optimization, graph learning, meta learning, few shot learning

0

0

0

0

14:09