Deconstructing word embedding algorithms

16/11/2020

Kian Kenyon-Dean, Edward Newell, Jackie Chi Kit Cheung

Keywords: nlp applications, nlp tasks, word embeddings, feature words

Abstract: Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models.

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

02/02/2021

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Yi-Syuan Chen, Hong-Han Shuai

19:10

04/07/2020

Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge

Keqing He, Yuanmeng Yan, Weiran Xu

Keywords: slot tagging, Contextual Representation, Neural-based models, knowledge-enhanced model

6:05

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

11:49

26/04/2020

Learning Robust Representations via Multi-View Information Bottleneck

Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata

Keywords: Information Bottleneck, Multi-View Learning, Representation Learning, Information Theory

4:56

04/07/2020

Contextualized Weak Supervision for Text Classification

Dheeraj Mekala, Jingbo Shang

Keywords: Text Classification, Weakly classification, string matching, Contextualized Supervision

11:26

06/12/2021

STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data

Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, Shiliang Pu

Keywords: optimization, semi-supervised learning

11:24

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords: Source Summarization, summarization, ablation studies, Transformer-based Approach

6:14

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

14:48

03/05/2021

Neural Topic Model via Optimal Transport

He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine

Keywords: optimal transport, document analysis, topic modelling

9:29

16/11/2020

Grounded Compositional Outputs for Adaptive Language Modeling

Nikolaos Pappas, Phoebe Mulcaire, Noah A. Smith

Keywords: language modeling, cross-domain settings, language models, finetuning

10:28

26/04/2020

A Probabilistic Formulation of Unsupervised Text Style Transfer

Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords: unsupervised text style transfer, deep latent sequence model

5:02

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu, Hao-Ran Wei, Boxing Chen, Enhong Chen

3:17

22/11/2021

From Seq2Seq Recognition to Handwritten Word Embeddings

George Retsinas, Giorgos Sfikas, Christophoros Nikou, Petros Maragos

Keywords: keyword spotting, handwritten text recognition, sequence-to-sequence

2:59

08/12/2020

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

15:01

16/11/2020

Short Text Topic Modeling with Topic Distribution Quantization and Negative Sampling Decoder

Xiaobao Wu, Chunping Li, Yan Zhu, Yishu Miao

Keywords: decoding, short modeling, topic models, neural model

10:30

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

16/11/2020

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Keywords: document-level translation, document-level systems, context-aware architecture, transformer

6:36

04/07/2020

SenseBERT: Driving Some Sense into BERT

Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham

Keywords: natural understanding, lexical understanding, SemEval Disambiguation, task

10:53

16/11/2020

Coreferential Reasoning Learning for Language Representation

Deming Ye, Yankai Lin, Jiaju Du, Zhenghao Liu, Peng Li, Maosong Sun, Zhiyuan Liu

Keywords: downstream tasks, coreferential reasoning, common tasks, language models

7:30

16/11/2020

Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”

Michele Bevilacqua, Marco Maru, Roberto Navigli

Keywords: generative modeling, definition modeling, discriminative tasks, word disambiguation

11:49

19/04/2021

Zero-shot neural passage retrieval via domain-targeted synthetic question generation

Ji Ma, Ivan Korotkov, Yinfei Yang, Keith Hall, Ryan McDonald

12:47

14/06/2020

On Vocabulary Reliance in Scene Text Recognition

Zhaoyi Wan, Jielei Zhang, Liang Zhang, Jiebo Luo, Cong Yao

Keywords: scene text recognition, text spotting, document analysis, ocr, scene text detection, sequence recognition, language and vision

1:00

19/08/2021

ALaSca: an Automated approach for Large-Scale Lexical Substitution

Caterina Lacerra, Tommaso Pasini, Rocco Tripodi, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

14:27

05/12/2020

Knowledge-enhanced named entity disambiguation for short text

Zhifan Feng, Qi Wang, Wenbin Jiang, Yajuan Lyu, Yong Zhu

14:40

07/06/2020

Learning Cross-Lingual Word Embeddings from Twitter via Distant Supervision

Jose Camacho-Collados, Yerai Doval Mosquera, Eugenio Martínez-Cámara, Luis Espinosa-Anke, Francesco Barbieri, Steven Schockaert

Keywords: embedding spaces, embeddings, languages, learning, performance, representations, shared, spaces, texts, twitter, word embeddings, words

10:39

16/11/2020

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael, Jan A. Botha, Ian Tenney

Keywords: pretrained encoders, elmo, bert, latent learning

12:45

02/02/2021

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu

17:52

06/12/2021

On Calibration and Out-of-Domain Generalization

Yoav Wald, Amir Feder, Daniel Greenfeld, Uri Shalit

Keywords: machine learning, domain adaptation, causality

11:00

19/08/2021

Guided Attention Network for Concept Extraction

Songtao Fang, Zhenya Huang, Ming He, Shiwei Tong, Xiaoqing Huang, Ye Liu, Jie Huang, Qi Liu

Keywords: Data Mining, Information Retrieval, Mining Text, Web, Social Media

14:26

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

13:01

19/04/2021

WiC-TSV: An evaluation benchmark for target sense verification of words in context

Anna Breit, Artem Revenko, Kiamehr Rezaee, Mohammad Taher Pilehvar, Jose Camacho-Collados

9:54

03/05/2021

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords: adversarial training, QA, NLI, BERT, information theory, adversarial robustness

5:21

19/10/2020

Enhance prototypical network with text descriptions for few-shot relation classification

Kaijia Yang, Nantao Zheng, Xinyu Dai, Liang He, Shujian Huang, Jiajun Chen

Keywords: text description, relation extraction, few shot

6:55

22/11/2021

Mode-Guided Feature Augmentation for Domain Generalization

Muhammad Haris Khan, Syed Muhammad Talha Zaidi, Salman Khan, Fahad Shahbaz Khan

Keywords: out-of-domain robustness, domain generalization, domain adaptation, convolutional neural networks, data augmentation, feature augmentation, subspace similarity, covariate shift, in-domain generalization, robust objective function

2:56

04/07/2020

Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling

David Harbecke, Christoph Alt

Keywords: NLP, NLP Explanations, Language Modeling, NLP models

12:01

04/07/2020

Paraphrase Generation by Learning How to Edit from Samples

Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah

Keywords: Paraphrase Generation, Neural sequence, sequence generation, retrieval-based method

12:20

16/11/2020

Towards Better Context-aware Lexical Semantics: Adjusting Contextualized Representations through Static Anchors

Qianchu Liu, Diana McCarthy, Anna Korhonen

Keywords: transformation, contextualized models, dynamic embeddings, post-processing technique

6:53

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

05/12/2020

Towards non-task-specific distillation of BERT via sentence representation approximation

Bowen Wu, Huan Zhang, MengYuan Li, Zongsheng Wang, Qihang Feng, Junhong Huang, Baoxun Wang

10:51

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, Quoc V Le

3:29