Structured Multi-Hashing for Model Compression

14/06/2020


Elad Eban, Yair Movshovitz-Attias, Hao Wu, Mark Sandler, Andrew Poon, Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán

Keywords: compression, weight hashing, on device


Abstract: Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or on common server configurations in which multiple models are held in memory. Model compression methods address this limitation by reducing the memory footprint, latency, or energy consumption of a model with minimal impact on accuracy. We focus on the task of reducing the number of learnable variables in the model. In this work we combine ideas from weight hashing and dimensionality reduction, resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of the model size of any deep network and is trained end-to-end. We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families. Our method allows us to drastically decrease the number of variables while maintaining high accuracy. For instance, by applying our approach to EfficientNet-B4 (16M parameters) we reduce it to the size of B0 (5M parameters), while gaining over 3% in accuracy over the B0 baseline. On the commonly used benchmark CIFAR10 we reduce the ResNet32 model by 75% with no loss in quality, and are able to do a 10x compression while still achieving above 90% accuracy.
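The idea of generating a network's weights from a matrix product can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that all weights of a model are laid out in one square "virtual" matrix, that this matrix is the product of two small trainable factors, and that the inner dimension (here called `rank`) is the knob that sets the stored model size. The names `make_factors`, `virtual_weight`, and `layer_weights` are hypothetical.

```python
import math
import random


def make_factors(num_weights, rank, seed=0):
    """Create two small factors U (m x rank) and V (rank x m), with m*m >= num_weights."""
    m = math.isqrt(num_weights - 1) + 1  # ceil(sqrt(num_weights))
    rng = random.Random(seed)
    u = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(m)]
    v = [[rng.gauss(0.0, 0.1) for _ in range(m)] for _ in range(rank)]
    return u, v


def virtual_weight(u, v, index):
    """Weight number `index` is entry (row, col) of the product U @ V."""
    m = len(u)
    row, col = divmod(index, m)  # the two "hashes" of a weight: its row and its column
    return sum(u[row][k] * v[k][col] for k in range(len(v)))


def layer_weights(u, v, offset, count):
    """Read `count` consecutive virtual weights for one layer, starting at `offset`."""
    return [virtual_weight(u, v, offset + i) for i in range(count)]


num_weights = 1_000_000  # size of the uncompressed model (assumed for illustration)
rank = 125               # inner dimension; smaller rank means fewer stored values
u, v = make_factors(num_weights, rank)

stored = 2 * len(u) * rank    # values actually kept and trained
ratio = num_weights / stored  # compression factor
print(stored, ratio)          # 250000 4.0
```

With these assumed numbers, one million virtual weights are produced from 250,000 stored values, a 4x (75%) reduction. Changing `rank` changes the stored size directly, which is the sense in which such a scheme gives direct control over model size.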


The talk and the corresponding paper were published at the CVPR 2020 virtual conference.


Similar Papers

19/08/2021

Decomposable-Net: Scalable Low-Rank Compression for Neural Networks

Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa

Keywords: Machine Learning, Deep Learning, Statistical Methods and Machine Learning, Recognition, 2D and 3D Computer Vision

10:40

05/01/2021

Spike-Thrift: Towards Energy-Efficient Deep Spiking Neural Networks by Limiting Spiking Activity via Attention-Guided Compression

Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

5:22

12/07/2020

Boosting Deep Neural Network Efficiency with Dual-Module Inference

Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie

Keywords: Deep Learning - General

8:04

06/12/2021

Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices

Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Keywords: transformers

13:06

12/07/2020

DropNet: Reducing Neural Network Complexity via Iterative Pruning

Chong Min John Tan, Mehul Motani

Keywords: Deep Learning - General

15:13

05/01/2021

OverNet: Lightweight Multi-Scale Super-Resolution With Overscaling Network

Parichehr Behjati, Pau Rodriguez, Armin Mehri, Isabelle Hupont, Carles Fernandez Tena, Jordi Gonzalez

4:24

06/12/2021

Memory-efficient Patch-based Inference for Tiny Deep Learning

Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han

Keywords: deep learning, machine learning, vision

11:14

06/12/2020

Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough

Mao Ye, lemon woo, Qiang Liu

3:14

06/12/2021

Aligned Structured Sparsity Learning for Efficient Image Super-Resolution

Yulun Zhang, Huan Wang, Can Qin, Yun Fu

Keywords: deep learning

13:23

22/11/2021

Learning to Sparsify Differences of Synaptic Signal for Efficient Event Processing

Yusuke Sekikawa, Keisuke Uto

Keywords: event-based processing, sigma-delta neuron, temporally sparse processing, learning to sparsify

9:46

02/02/2021

OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization

Peng Hu, Xi Peng, Hongyuan Zhu, Mohamed M. Sabry Aly, Jie Lin

13:26

23/08/2020

Rethinking pruning for accelerating deep inference at the edge

Dawei Gao, Xiaoxi He, Zimu Zhou, Yongxin Tong, Ke Xu, Lothar Thiele

Keywords: automatic speech recognition, deep learning, named entity recognition, network pruning, sequence labelling

13:43

07/09/2020

Paying more Attention to Snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation

Duong Le, Nhan Vo, Nam Thoai

Keywords: network pruning, knowledge distillation, ensemble learning

8:30

12/07/2020

Network Pruning by Greedy Subnetwork Selection

Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu

Keywords: Deep Learning - General

10:01

12/07/2020

Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks

Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh

Keywords: Deep Learning - Algorithms

14:41

05/04/2021

Doping: A technique for Extreme Compression of LSTM Models using Sparse Structured Additive Matrices

Urmish Thakker, Paul Whatmough, Zhigang Liu, Matthew Mattina, Jesse Beu

19:07

05/04/2021

Doping: A technique for Extreme Compression of LSTM Models using Sparse Structured Additive Matrices

Urmish Thakker, Paul Whatmough, Zhigang Liu, Matthew Mattina, Jesse Beu

4:22

14/06/2020

Resolution Adaptive Networks for Efficient Inference

Le Yang, Yizeng Han, Xi Chen, Shiji Song, Jifeng Dai, Gao Huang

Keywords: adaptive inference, efficient deep learning, multi-scale feature learning, budgeted batch classification

0:59

03/05/2021

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Keywords: Deep learning, Locally supervised training

5:03

07/09/2020

STQ-Nets: Unifying Network Binarization and Structured Pruning

Aurobindo Munagala, Ameya Prabhu, Anoop Namboodiri

Keywords: quantization, binary networks, binarization, pruning, compression, inference

5:19

06/12/2020

Top-KAST: Top-K Always Sparse Training

Sid Jayakumar, Razvan Pascanu, Jack Rae, Simon Osindero, Erich Elsen

3:18

06/12/2020

RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference

Oindrila Saha, Aditya Kusupati, Harsha Simhadri, Manik Varma, Prateek Jain

3:30

14/06/2020

Adaptive Loss-Aware Quantization for Multi-Bit Networks

Zhongnan Qu, Zimu Zhou, Yun Cheng, Lothar Thiele

Keywords: quantization, binary neural networks, adaptive bitwidth, loss-aware

1:01

06/12/2021

AC-GC: Lossy Activation Compression with Guaranteed Convergence

R David Evans, Tor Aamodt

Keywords: deep learning, optimization, graph learning

14:39

06/12/2021

Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks

Melih Barsbey, Milad Sefidgaran, Murat Erdogdu, Gaël Richard, Umut Simsekli

Keywords: theory, deep learning, optimization

14:25

03/05/2021

WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic

Renkun Ni, Hong-Min Chu, Oscar Castaneda, Ping-yeh Chiang, Christoph Studer, Tom Goldstein

Keywords: efficient inference, quantization

5:11

05/01/2021

A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition

Ayush Srivastava, Oshin Dutta, Jigyasa Gupta, Sumeet Agarwal, Prathosh AP

4:29

03/05/2021

Learnable Embedding sizes for Recommender Systems

Siyi Liu, Chen Gao, Yihong Chen, Depeng Jin, Yong Li

Keywords: Deep Learning, Embedding Size, Recommender Systems

5:29

05/04/2021

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Steve Dai, Rangha Venkatesan, Mark Ren, Brian Zimmer, William Dally, Brucek Khailany

Keywords: Deep Learning -> Generative Models, Algorithms -> Similarity and Distance Learning

19:08

05/04/2021

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Steve Dai, Rangha Venkatesan, Mark Ren, Brian Zimmer, William Dally, Brucek Khailany

Keywords: Deep Learning -> Generative Models, Algorithms -> Similarity and Distance Learning

5:01

14/06/2020

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu

Keywords: channel attention, efficient, adaptive 1d convolution, deep cnns, image classification, object detection, instance segmentation

0:57

14/06/2020

Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer

Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán

Keywords: low-rank compression, rank selection, optimization, discrete-continuous optimization

1:00

30/11/2020

Lightweight Single-Image Super-Resolution Network with Attentive Auxiliary Feature Learning

Xuehui Wang, Qing Wang, Yuzhi Zhao, Junchi Yan, Lei Fan, Long Chen

9:27

26/04/2020

Minimizing FLOPs to Learn Efficient Sparse Representations

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords: sparse embeddings, deep representations, metric learning, regularization

4:41

06/12/2021

Heavy Ball Neural Ordinary Differential Equations

Hedi Xia, Vai Suliafu, Hangjie Ji, Tan Nguyen, Andrea Bertozzi, Stanley Osher, Bao Wang

Keywords: deep learning, optimization, machine learning, vision

4:08

14/06/2020

HRank: Filter Pruning Using High-Rank Feature Map

Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao

Keywords: network pruning, neural network compression and acceleration, high-rank feature map, efficient deep learning computing

4:57

06/12/2021

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search

Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, Jingdong Wang

Keywords: clustering

14:54

30/11/2020

To filter prune, or to layer prune, that is the question

Sara Elkerdawy, Mostafa Elhoushi, Abhineet Singh, Hong Zhang, Nilanjan Ray

9:32

06/12/2021

Diversity Matters When Learning From Ensembles

Giung Nam, Jongmin Yoon, Yoonho Lee, Juho Lee

Keywords: machine learning, vision

2:54

13/04/2021

Associative convolutional layers

Hamed Omidvar, Vahideh Akhlaghi, Hao Su, Massimo Franceschetti, Rajesh Gupta

3:09