Compressing pre-trained language models by matrix decomposition

05/12/2020

Matan Ben Noach, Yoav Goldberg


Abstract: Large pre-trained language models reach state-of-the-art results on many different NLP tasks when fine-tuned individually; they also come with significant memory and computational requirements, calling for methods to reduce model sizes (green AI). We propose a two-stage model-compression method to reduce a model’s inference time cost. We first decompose the matrices in the model into smaller matrices and then perform feature distillation on the internal representation to recover from the decomposition. This approach has the benefit of reducing the number of parameters while preserving much of the information within the model. We experimented on the BERT-base model with the GLUE benchmark dataset and show that we can reduce the number of parameters by a factor of 0.4x, and increase inference speed by a factor of 1.45x, while maintaining a minimal loss in metric performance.
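The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not say which decomposition or distillation loss is used, so the truncated SVD, the mean-squared-error loss, the rank of 128, the random weights, and the function names below are all assumptions made for illustration only. The only grounded number is 768, the hidden size of BERT-base.

```python
import numpy as np


def low_rank_factorize(W, rank):
    """Split W (d_out x d_in) into A (d_out x rank) and B (rank x d_in).

    Assumption: truncated SVD is used as the decomposition.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # scale each kept column by its singular value
    B = Vt[:rank, :]
    return A, B


def feature_distillation_loss(student_features, teacher_features):
    """Assumption: MSE between internal representations of student and teacher."""
    return float(np.mean((student_features - teacher_features) ** 2))


rng = np.random.default_rng(0)
W = rng.standard_normal((768, 768))  # stand-in for one weight matrix
A, B = low_rank_factorize(W, rank=128)

# Stage 1: one large matrix product becomes two smaller ones.
x = rng.standard_normal((4, 768))
teacher_out = x @ W.T
student_out = (x @ B.T) @ A.T

# Stage 2 would minimize this loss by training A and B (no training is done here).
loss = feature_distillation_loss(student_out, teacher_out)

params_before = W.size
params_after = A.size + B.size
print(params_before, params_after)
```

With a rank-r factorization of an m x n matrix, the parameter count drops from m*n to r*(m + n), so savings only appear when r is below m*n / (m + n). The random matrix here is far from low rank, so the initial loss is large; real weight matrices and a recovery stage are what the abstract relies on.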


The talk and the respective paper were published at the AACL 2020 virtual conference.


Similar Papers

04/07/2020

DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering

Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian

Keywords: Faster Answering, question-independent processing, DeFormer, Decomposing Transformers

06/12/2021

Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices

Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Keywords: transformers

03/05/2021

Learnable Embedding sizes for Recommender Systems

Siyi Liu, Chen Gao, Yihong Chen, Depeng Jin, Yong Li

Keywords: Deep Learning, Embedding Size, Recommender Systems

06/12/2020

SMYRF - Efficient Attention using Asymmetric Clustering

Giannis Daras, Nikita Kitaev, Augustus Odena, Alex Dimakis


14/06/2020

Structured Multi-Hashing for Model Compression

Elad Eban, Yair Movshovitz-Attias, Hao Wu, Mark Sandler, Andrew Poon, Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán

Keywords: compression, weight hashing, on device

18/07/2021

Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations

Patrick Emami, Pan He, Sanjay Ranka, Anand Rangarajan

Keywords: Deep Learning, Embedding and Representation learning

23/08/2020

Compositional embeddings using complementary partitions for memory-efficient recommendation systems

Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, Jiyan Yang

Keywords: embeddings, model compression, recommendation systems

26/04/2020

Minimizing FLOPs to Learn Efficient Sparse Representations

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords: sparse embeddings, deep representations, metric learning, regularization

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

03/05/2021

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Weizhu Chen, Jiawei Han

Keywords: consistency training, contrastive learning, data augmentation, natural language understanding

06/12/2021

Efficient Combination of Rematerialization and Offloading for Training DNNs

Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova

Keywords: deep learning, optimization

06/12/2021

Breaking the Linear Iteration Cost Barrier for Some Well-known Conditional Gradient Methods Using MaxIP Data-structures

Zhaozhuo Xu, Zhao Song, Anshumali Shrivastava

Keywords: optimization, machine learning

06/12/2020

Modular Meta-Learning with Shrinkage

Yutian Chen, Abe Friesen, Feryal Behbahani, Arnaud Doucet, David Budden, Matthew Hoffman, Nando de Freitas

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi


16/11/2020

Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications

Matthew Khoury, Rumen Dangovski, Longwu Ou, Preslav Nakov, Yichen Shen, Li Jing

Keywords: natural applications, neural translation, neural nmt, neural

26/04/2020

DeFINE: Deep Factorized Input Token Embeddings for Neural Sequence Modeling

Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi

Keywords: sequence modeling, input representations, language modeling, word embedding

03/08/2020

Improved Vector Pruning in Exact Algorithms for Solving POMDPs

Eric Hansen, Thomas Bowman


19/08/2021

Decomposable-Net: Scalable Low-Rank Compression for Neural Networks

Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa

Keywords: Machine Learning, Deep Learning, Statistical Methods and Machine Learning, Recognition, 2D and 3D Computer Vision

26/08/2020

On casting importance weighted autoencoder to an EM algorithm to learn deep generative models

Dongha Kim, Jaesung Hwang, Yongdai Kim


06/12/2021

Low-Rank Constraints for Fast Inference in Structured Models

Justin Chiu, Yuntian Deng, Alexander Rush

Keywords: generative model, graph learning

06/12/2021

Searching for Efficient Transformers for Language Modeling

David So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc V Le

Keywords: transformers, language

30/11/2020

Horizontal Flipping Assisted Disentangled Feature Learning for Semi-Supervised Person Re-Identification

Gehan Hao, Yang Yang, Xue Zhou, Guanan Wang, Zhen Lei

18/07/2021

Exact Optimization of Conformal Predictors via Incremental and Decremental Learning

Giovanni Cherubin, Konstantinos Chatzikokolakis, Martin Jaggi

Keywords: Probabilistic Methods

06/12/2021

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

Keywords: optimization

04/07/2020

Efficient Contextual Representation Learning With Continuous Outputs

Liunian Harold Li, Patrick H. Chen, Cho-Jui Hsieh, Kai-Wei Chang

Keywords: natural tasks, Contextual Learning, Contextual models, language-model-based encoders

06/12/2020

GCN meets GPU: Decoupling “When to Sample” from “How to Sample”

Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Anand Sivasubramaniam, Mahmut Kandemir

05/04/2021

Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick

Isak Edo Vivancos, Sayeh Sharify, Daniel Ly-Ma, Ameer Abdelhadi, Ciaran Bannon, Milos Nikolic, Mostafa Mahmoud, Alberto Delmas Lascorz, Gennady Pekhimenko, Andreas Moshovos


02/02/2021

TRQ: Ternary Neural Networks With Residual Quantization

Yue Li, Wenrui Ding, Chunlei Liu, Baochang Zhang, Guodong Guo

26/04/2020

Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation

Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou

Keywords: binary softmax, discrete variables, policy gradient, pseudo actions, reinforcement learning, variance reduction

06/12/2020

All Word Embeddings from One Embedding

Sho Takase, Sosuke Kobayashi


12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

16/11/2020

Compressive Summarization with Plausibility and Salience Modeling

Shrey Desai, Jiacheng Xu, Greg Durrett

Keywords: compressive systems, compressions, rouge, pre-trained model

02/02/2021

Adaptive Beam Search Decoding for Discrete Keyphrase Generation

Xiaoli Huang, Tongge Xu, Lvan Jiao, Yueran Zu, Youmin Zhang

16/11/2020

Masking as an Efficient Alternative to Finetuning for Pretrained Language Models

Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze

Keywords: masking bert, nlp tasks, downstream tasks, masking

18/07/2021

ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

Wonjae Kim, Bokyung Son, Ildoo Kim

Keywords: Algorithms, Multimodal Learning

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords: multitask learning, deeplearning, pre-training, gradient decomposition

05/04/2021

Doping: A technique for Extreme Compression of LSTM Models using Sparse Structured Additive Matrices

Urmish Thakker, Paul Whatmough, Zhigang Liu, Matthew Mattina, Jesse Beu
