Abstract:
Quantization has long been studied as a compression and acceleration technique for deep neural networks due to its potential for reducing model size and computational cost, both on general-purpose hardware such as DSPs, CPUs, and GPUs, and on customized devices with flexible bit-width configurations, including FPGAs and ASICs. However, previous works generally achieve network quantization by sacrificing prediction accuracy relative to their full-precision counterparts. In this paper, we investigate the underlying mechanism of this performance degradation, building on the earlier work on parameterized clipping activation (PACT). We find that the key factor is the weight scale in the last layer. Rather than a mismatch between the weight distributions of quantized and full-precision models, as generally suggested in the literature, the main issue is that a large weight scale causes over-fitting. We propose a technique called scale-adjusted training (SAT), which directly scales down the weights in the last layer to alleviate this over-fitting. With the proposed technique, quantized networks can outperform their full-precision counterparts, and we achieve state-of-the-art accuracy with consistent improvements over previous quantization methods for lightweight models, including MobileNet V1/V2, on ImageNet classification.
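To make the core idea concrete, the sketch below shows one way the last-layer scaling described above could look in PyTorch: the classifier's effective weights are multiplied by a small constant before use. The `ScaledLinear` module, the `scale=0.1` factor, and the layer sizes are illustrative assumptions for this sketch; the paper itself defines the exact scaling rule used by SAT.

```python
import torch
import torch.nn as nn

class ScaledLinear(nn.Module):
    """Minimal sketch of the scale-adjusted training (SAT) idea from the
    abstract: scale down the last-layer weights to curb over-fitting.
    The constant `scale` is an assumed placeholder, not the paper's rule."""

    def __init__(self, in_features: int, num_classes: int, scale: float = 0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, num_classes, bias=False)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply the classifier with weights scaled down by a constant factor,
        # keeping the effective weight scale of the last layer small.
        return nn.functional.linear(x, self.linear.weight * self.scale)

# Hypothetical usage: replace the final classifier of a backbone such as
# MobileNet with this scaled head.
head = ScaledLinear(in_features=1024, num_classes=1000, scale=0.1)
logits = head(torch.randn(8, 1024))
print(logits.shape)  # torch.Size([8, 1000])
```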