Adaptive Compression of Word Embeddings

04/07/2020

Yeachan Kim, Kang-Min Kim, SangKeun Lee

Keywords: Adaptive Embeddings, Distributed words, natural tasks, downstream tasks

Abstract: Distributed representations of words have been an indispensable component of natural language processing (NLP) tasks. However, the large memory footprint of word embeddings makes it challenging to deploy NLP models on memory-constrained devices (e.g., self-driving cars, mobile devices). In this paper, we propose a novel method to adaptively compress word embeddings. We fundamentally follow a code-book approach that represents words as discrete codes such as (8, 5, 2, 4). However, unlike prior works that assign codes of the same length to all words, we adaptively assign codes of different lengths to each word by learning from downstream tasks. The proposed method works in two steps. First, each word directly learns to select its code length in an end-to-end manner by applying the Gumbel-softmax trick. After selecting the code length, each word learns discrete codes through a neural network with a binary constraint. To showcase the general applicability of the proposed method, we evaluate its performance on four different downstream tasks. Comprehensive evaluation results show that our method is effective and produces highly compressed word embeddings without hurting task accuracy. Moreover, we show that our model assigns words to each code-book by considering their significance for the task.
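
The two ideas named in the abstract, discrete codes that index code-books and a Gumbel-softmax choice of code length, can be illustrated with a small sketch. This is not the authors' implementation. The code-book values, the candidate lengths, the logits, and the function names `gumbel_softmax` and `decode` are all hypothetical, and two design points are assumptions made only for illustration: an embedding is rebuilt by summing one vector per code-book, and a shorter code is obtained by truncating a longer one.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over the choices scored by `logits`."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def decode(code, codebooks):
    """Rebuild an embedding by summing one vector per code-book (assumed scheme)."""
    dim = len(codebooks[0][0])
    vector = [0.0] * dim
    # zip stops at the shorter input, so a short code uses fewer code-books.
    for book, index in zip(codebooks, code):
        for d in range(dim):
            vector[d] += book[index][d]
    return vector


# Hypothetical toy setup: 4 code-books, 3 vectors each, embedding size 2.
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]],
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
]
CANDIDATE_LENGTHS = [1, 2, 4]  # hypothetical code lengths a word may choose

length_logits = [0.2, 1.5, -0.3]  # hypothetical learned scores for one word
weights = gumbel_softmax(length_logits, temperature=0.5, rng=random.Random(0))
length = CANDIDATE_LENGTHS[weights.index(max(weights))]

full_code = (2, 1, 0, 1)  # hypothetical discrete code for the word
print(length, decode(full_code[:length], CODEBOOKS))
print(decode((2, 1), CODEBOOKS))  # [3.0, 1.0]
```

Storing a few small integers per word plus the shared code-books is what saves memory relative to a dense vector per word. In training, the relaxed weights would stay differentiable so that the length choice can be learned end-to-end; the `max` step here only shows the discrete choice made at inference time.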

The talk and the respective paper are published at the ACL 2020 virtual conference.

Similar Papers

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

14:20

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

16/11/2020

Train No Evil: Selective Masking for Task-Guided Pre-Training

Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

Keywords: pre-training stage, fine-tuning stage, general pre-training, sentiment tasks

7:02

12/07/2020

PoKED: A Semi-Supervised System for Word Sense Disambiguation

Feng Wei

Keywords: Deep Learning - Generative Models and Autoencoders

15:39

06/12/2020

All Word Embeddings from One Embedding

Sho Takase, Sosuke Kobayashi

3:11

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

11:49

06/12/2021

Sequence-to-Sequence Learning with Latent Neural Grammars

Yoon Kim

Keywords: deep learning

14:31

02/02/2021

Adaptive Beam Search Decoding for Discrete Keyphrase Generation

Xiaoli Huang, Tongge Xu, Lvan Jiao, Yueran Zu, Youmin Zhang

14:36

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

06/12/2021

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

Yufei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang

Keywords: transformers

10:13

02/02/2021

Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet

Xingwei He, Victor O.K. Li

16:47

19/04/2021

Generating syntactically controlled paraphrases without using annotated parallel pairs

Kuan-Hao Huang, Kai-Wei Chang

10:41

06/12/2021

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Jixuan Wang, Kuan-Chieh Wang, Frank Rudzicz, Michael Brudno

Keywords: machine learning, transformers, meta learning, language, transfer learning

14:45

14/06/2020

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Keywords: few-shot learning, semi-supervised learning, meta-learning

1:01

06/12/2020

Learning Sparse Prototypes for Text Generation

Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

3:22

02/02/2021

LRSC: Learning Representations for Subspace Clustering

Changsheng Li, Chen Yang, Bo Liu, Ye Yuan, Guoren Wang

15:09

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords: Neural training, NMT training, document-level training, NMT objective

6:48

02/02/2021

Decision-Guided Weighted Automata Extraction from Recurrent Neural Networks

Xiyue Zhang, Xiaoning Du, Xiaofei Xie, Lei Ma, Yang Liu, Meng Sun

16:44

19/04/2021

Progressively pretrained dense corpus index for open-domain question answering

Wenhan Xiong, Hong Wang, William Yang Wang

12:15

16/11/2020

Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Zehui Lin, Xiao Pan, Mingxuan Wang, Xipeng Qiu, Jiangtao Feng, Hao Zhou, Lei Li

Keywords: machine mt, mt, rich mt, universal model

12:00

19/08/2021

A Sequence-to-Set Network for Nested Named Entity Recognition

Zeqi Tan, Yongliang Shen, Shuai Zhang, Weiming Lu, Yueting Zhuang

Keywords: Natural Language Processing, Information Extraction, Named Entities

10:38

26/04/2020

Decoupling Representation and Classifier for Long-Tailed Recognition

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

Keywords: long-tailed recognition, classification

5:00

04/07/2020

Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling

Zihan Liu, Genta Indra Winata, Peng Xu, Pascale Fung

Keywords: Cross-domain Filling, task-oriented systems, slot filling, data problem

6:59

03/05/2021

IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning

Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang

Keywords: self-supervised learning, few-shot learning, episode-level pretext task

5:03

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

08/12/2020

Transliteration of Judeo-Arabic Texts into Arabic Script Using Recurrent Neural Networks

Ori Terner, Kfir Bar, Nachum Dershowitz

9:50

26/04/2020

word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement

Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Keywords: word embeddings, natural language processing, model reduction

4:28

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords: deep learning, interpretability

14:05

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu, Hao-Ran Wei, Boxing Chen, Enhong Chen

3:17

06/12/2020

DynaBERT: Dynamic BERT with Adaptive Width and Depth

Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

2:59

12/07/2020

Sparse Sinkhorn Attention

Yi Tay, Dara Bahri, Liu Yang, Don Metzler, Da-Cheng Juan

Keywords: Deep Learning - Algorithms

12:12

19/04/2021

Retrieval, re-ranking and multi-task learning for knowledge-base question answering

Zhiguo Wang, Patrick Ng, Ramesh Nallapati, Bing Xiang

11:12

16/11/2020

POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training

Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan

Keywords: language learning, free-form generation, hard-constrained generation, hard-constrained tasks

10:09

02/02/2021

LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding

Hao Fu, Shaojun Zhou, Qihong Yang, Junjie Tang, Guiquan Liu, Kaikui Liu, Xiaolong Li

15:25

04/07/2020

Tabula nearly Rasa: Probing the linguistic knowledge of character-level neural language models trained on unsegmented text

Michael Hahn, Marco Baroni

Keywords: natural tasks, morphological tasks, language usage, Tabula

14:40

04/07/2020

Exploiting the Syntax-Model Consistency for Neural Relation Extraction

Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen

Keywords: Neural Extraction, Relation Extraction, RE, syntactic injection

11:03

04/07/2020

Efficient Dialogue State Tracking by Selectively Overwriting Memory

Sungdong Kim, Sohee Yang, Gyuwan Kim, Sang-Woo Lee

Keywords: Dialogue Tracking, predicting operation, training, open setting

11:12

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

15:08

19/08/2021

Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention

Wei Suo, MengYang Sun, Peng Wang, Qi Wu

Keywords: Computer Vision, Language and Vision, Structural and Model-Based Approaches, Knowledge Representation and Reasoning

17:31