The Generalization-Stability Tradeoff In Neural Network Pruning

Abstract: Pruning neural network parameters is often viewed as a means to compress models, but pruning has also been motivated by the desire to prevent overfitting. This motivation is particularly relevant given the perhaps surprising observation that a wide variety of pruning approaches increase test accuracy despite sometimes massive reductions in parameter counts. To better understand this phenomenon, we analyze the behavior of pruning over the course of training, finding that pruning's benefit to generalization increases with pruning's instability (defined as the drop in test accuracy immediately following pruning). We demonstrate that this "generalization-stability tradeoff'' is present across a wide variety of pruning settings and propose a mechanism for its cause: pruning regularizes similarly to noise injection. Supporting this, we find less pruning stability leads to more model flatness and the benefits of pruning do not depend on permanent parameter removal. These results explain the compatibility of pruning-based generalization improvements and the high generalization recently observed in overparameterized networks.

12/07/2020

The Generalization-Stability Tradeoff In Neural Network Pruning

Brian Bartoldson, Ari Morcos, Adrian Barbu, Gordon Erlebacher

Comments

Similar Papers

Adversarial Filters of Dataset Biases

Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula and Rowan Zellers, Matthew Peters, Ashish Sabharwal, Yejin Choi

Keywords Abstract Paper

Deep Learning - Algorithms

Overparameterisation and worst-case generalisation: friend or foe?

Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

Keywords Abstract Paper

worst-case generalisation, overparameterisation

Regularizing Class-Wise Predictions via Self-Knowledge Distillation

Sukmin Yun, Jongjin Park, Kimin Lee, Jinwoo Shin

Keywords Abstract Paper

image classification, regularization, self-knowledge distillation, generalization, calibration

Towards a better understanding of label smoothing in neural machine translation

Yingbo Gao, Weiyue Wang, Christian Herold and Zijian Yang, Hermann Ney

Keywords Abstract Paper

Analyzing the effect of neural network architecture on training performance

Karthik Abinav Sankararaman, Soham De, Zheng Xu and W. Ronny Huang, Tom Goldstein

Keywords Abstract Paper

Deep Learning - Theory

Nondeterminism and Instability in Neural Network Optimization

Cecilia Summers, Michael J Dinneen

Keywords Abstract Paper

Deep Learning, Optimization for Deep Networks

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Abstract Paper

Deep Learning - Algorithms

A Gradient Flow Framework For Analyzing Network Pruning

Ekdeep Lubana, Robert Dick

Keywords Abstract Paper

Early pruning, Gradient flow, Network pruning

On Warm-Starting Neural Network Training

Jordan Ash, Ryan Adams

Keywords Abstract Paper

Consistency Regularization for Certified Robustness of Smoothed Classifiers

Jongheon Jeong, Jinwoo Shin

Keywords Abstract Paper

Transferring Pretrained Networks to Small Data via Category Decorrelation

Ying Jin, Zhangjie Cao, Mingsheng Long, Jianmin Wang

Keywords Abstract Paper

Category Decorrelation, Under Transfer

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization

Neha Wadia, Daniel Duckworth, Samuel Schoenholz and Ethan Dyer, Jascha Sohl-Dickstein

Keywords Abstract Paper

Optimization, Probabilistic Methods, Topic Models, Probabilistic Methods, Latent Variable Models

Early Stopping in Deep Networks: Double Descent and How to Eliminate it

Reinhard Heckel, Fatih Furkan Yilmaz

Keywords Abstract Paper

early stopping, double descent

Regularization via Structural Label Smoothing

Weizhi Li, Gautam Dasarathy, Visar Berisha

Keywords Abstract Paper

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

Keywords Abstract Paper

Learning to Forget for Meta-Learning

Sungyong Baik, Seokil Hong, Kyoung Mu Lee

Keywords Abstract Paper

meta learning, few-shot learning, reinforcement learning

When Are Solutions Connected in Deep Networks?

Quynh Nguyen, Pierre Bréchet, Marco Mondelli

Keywords Abstract Paper

theory, deep learning, optimization

Time-Consistent Self-Supervision for Semi-Supervised Learning

Tianyi Zhou, Shengjie Wang, Jeff Bilmes

Keywords Abstract Paper

Unsupervised and Semi-Supervised Learning

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords Abstract Paper

Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Chen Liu, Mathieu Salzmann, Tao Lin and Ryota Tomioka, Sabine Süsstrunk

Keywords Abstract Paper

Algorithms -> Representation Learning, Applications -> Dialog- or Communication-Based Learning

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks

Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula and
Rowan Zellers, Matthew Peters, Ashish Sabharwal, Yejin Choi

Keywords Paper

Keywords Paper

Keywords Paper

Yingbo Gao, Weiyue Wang, Christian Herold and
Zijian Yang, Hermann Ney

Keywords Paper

Karthik Abinav Sankararaman, Soham De, Zheng Xu and
W. Ronny Huang, Tom Goldstein

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Neha Wadia, Daniel Duckworth, Samuel Schoenholz and
Ethan Dyer, Jascha Sohl-Dickstein

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Chen Liu, Mathieu Salzmann, Tao Lin and
Ryota Tomioka, Sabine Süsstrunk

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Hao Cheng, Zhaowei Zhu, Xingyu Li and
Yifei Gong, Xing Sun, Yang Liu

Keywords Paper

Keywords Paper

Junhyun Nam, Hyuntak Cha, Sungsoo Ahn and
Jaeho Lee, Jinwoo Shin

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Jeet Mohapatra, Ching-Yun Ko, Lily Weng and
Pin-Yu Chen, Sijia Liu, Luca Daniel

Keywords Paper

Nicolas Papernot, Abhradeep Thakurta, Shuang Song and
Steve Chien, Úlfar Erlingsson

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper