
Automatic Mixed-Precision Quantization Search of BERT

Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin

Keywords: Machine Learning, Deep Learning, NLP Applications and Tools, Text Classification

Abstract: Pre-trained language models such as BERT have shown remarkable effectiveness in various natural language processing tasks. However, these models usually contain millions of parameters, which prevents their practical deployment on resource-constrained devices. Knowledge distillation, weight pruning, and quantization are the main directions in model compression. However, compact models obtained through knowledge distillation may suffer from a significant accuracy drop even at relatively small compression ratios. On the other hand, there are only a few quantization approaches designed for natural language processing tasks, and they usually require manual setting of hyper-parameters. In this paper, we propose an automatic mixed-precision quantization framework for BERT that conducts quantization and pruning simultaneously. Specifically, our method leverages Differentiable Neural Architecture Search to automatically assign a scale and precision to the parameters in each sub-group, while at the same time pruning redundant groups of parameters. Extensive evaluations on BERT downstream tasks show that our method beats the baselines, matching their performance with a much smaller model size. We also show that an extremely lightweight model can be obtained by combining our solution with orthogonal methods such as DistilBERT.
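
To make the idea concrete, the sketch below illustrates one way a DNAS-style mixed-precision search over a single weight group could look in PyTorch: each candidate bit-width (including 0 bits, i.e. pruning the group) produces a fake-quantized copy of the weights, and a softmax over learnable architecture parameters mixes them during the search. The names (MixedPrecisionGroup, fake_quantize) and the candidate bit-width set are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    # Uniform symmetric fake-quantization of w to the given bit-width.
    if bits == 0:                                  # 0 bits = prune the whole group
        return torch.zeros_like(w)
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    q = torch.round(w / scale).clamp(-qmax, qmax)
    # Straight-through estimator: quantized value in the forward pass,
    # identity gradient in the backward pass.
    return (q * scale - w).detach() + w

class MixedPrecisionGroup(nn.Module):
    # One weight sub-group whose precision is chosen by differentiable search.
    # During the search, a softmax over `alpha` mixes the candidate quantizations;
    # afterwards only the argmax candidate is kept (the group is dropped if it is 0 bits).
    def __init__(self, weight: torch.Tensor, candidates=(0, 2, 4, 8)):
        super().__init__()
        self.weight = nn.Parameter(weight.detach().clone())
        self.candidates = candidates
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))  # architecture parameters

    def forward(self) -> torch.Tensor:
        probs = F.softmax(self.alpha, dim=0)
        return sum(p * fake_quantize(self.weight, b)
                   for p, b in zip(probs, self.candidates))

    def expected_bits(self) -> torch.Tensor:
        # Differentiable model-size proxy for this group.
        probs = F.softmax(self.alpha, dim=0)
        bits = torch.tensor(self.candidates, dtype=probs.dtype, device=probs.device)
        return (probs * bits).sum()

In such a setup, a size penalty built from expected_bits() would typically be added to the task loss, so gradient descent trades accuracy against compression when assigning a precision (or pruning decision) to each sub-group.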

Talk and paper published at IJCAI 2021 (virtual conference).
