07/09/2020

Paying more Attention to Snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation

Duong Le, Nhan Vo, Nam Thoai

Keywords: network pruning, knowledge distillation, ensemble learning

Abstract: Network pruning is one of the most dominant methods for reducing the heavy inference cost of deep neural networks. Existing methods often iteratively prune networks to attain a high compression ratio without incurring a significant loss in performance. However, we argue that conventional methods for retraining pruned networks (i.e., using a small, fixed learning rate) are inadequate, as they completely ignore the benefits of the snapshots produced by iterative pruning. In this work, we show that strong ensembles can be constructed from snapshots of iterative pruning, which achieve competitive performance and vary in network structure. Furthermore, we present a simple, general, and effective pipeline that generates strong ensembles of networks during pruning with \textit{large learning rate restarting}, and utilizes knowledge distillation with those ensembles to improve the predictive power of compact models. On standard image classification benchmarks such as CIFAR and Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured pruning by integrating simple $\ell_1$-norm filter pruning into our pipeline. Specifically, we reduce the total parameters of numerous ResNet variants by 75-80% and their MACs by 65-70% while achieving performance comparable to or better than that of the original networks.
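The following is a minimal PyTorch-style sketch (not the authors' released code) of the pipeline the abstract describes: iterative $\ell_1$-norm filter pruning, retraining each pruned snapshot with a large restarted learning rate, and distilling the snapshot ensemble into the final compact model. All hyperparameters (pruning amount, epochs, temperature, loss weighting) and helper names are illustrative assumptions; note that torch.nn.utils.prune masks filters rather than physically removing them.

```python
# Sketch of: iterative l1-norm filter pruning + large-LR restarting + ensemble distillation.
# Hyperparameters and function names below are illustrative, not taken from the paper.
import copy
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune
from torch.optim.lr_scheduler import CosineAnnealingLR


def prune_l1_filters(model, amount=0.2):
    """Mask out `amount` of the conv filters in each layer by l1-norm (structured pruning)."""
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            prune.ln_structured(module, name="weight", amount=amount, n=1, dim=0)


def retrain_with_restart(model, train_loader, epochs, max_lr=0.1, device="cuda"):
    """Retrain a pruned network from a large restarted learning rate with cosine decay."""
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=max_lr, momentum=0.9, weight_decay=5e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            F.cross_entropy(model(x), y).backward()
            optimizer.step()
        scheduler.step()


def ensemble_logits(snapshots, x):
    """Average the logits of all pruning snapshots (the teacher ensemble)."""
    with torch.no_grad():
        return torch.stack([m(x) for m in snapshots]).mean(dim=0)


def distill(student, snapshots, train_loader, epochs, T=4.0, alpha=0.9, device="cuda"):
    """Knowledge distillation from the snapshot ensemble into the compact model."""
    student.to(device).train()
    optimizer = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            teacher = ensemble_logits(snapshots, x)
            logits = student(x)
            soft = F.kl_div(F.log_softmax(logits / T, dim=1),
                            F.softmax(teacher / T, dim=1),
                            reduction="batchmean") * T * T
            hard = F.cross_entropy(logits, y)
            loss = alpha * soft + (1 - alpha) * hard
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()


def pipeline(model, train_loader, rounds=4, amount_per_round=0.2):
    """Iteratively prune, retrain with LR restarting, keep each snapshot, then distill."""
    snapshots = []
    for _ in range(rounds):
        prune_l1_filters(model, amount=amount_per_round)
        retrain_with_restart(model, train_loader, epochs=60)
        snapshots.append(copy.deepcopy(model).eval())
    # The final (most compact) network is the student; the snapshots form its teacher ensemble.
    distill(model, snapshots, train_loader, epochs=60)
    return model
```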

Talk and paper published at the BMVC 2020 virtual conference.
