BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

Abstract: Pretraining deep language models has led to large performance gains in NLP. Despite this success, Schick and Schütze (2020) recently showed that these models struggle to understand rare words. For static word embeddings, this problem has been addressed by separately learning representations for rare words. In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring high-quality embeddings for rare words that are suitable as input representations for deep language models. This is achieved by enabling the surface form and contexts of a word to interact with each other in a deep architecture. Integrating BERTRAM into BERT leads to large performance increases due to improved representations of rare and medium frequency words on both a rare word probing task and three downstream tasks.

06/12/2021

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

Timo Schick, Hinrich Schütze

Comments

Similar Papers

TriBERT: Human-centric Audio-visual Representation Learning

Tanzila Rahman, Mengyu Yang, Leonid Sigal

Keywords Abstract Paper

transformers, representation learning

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng and Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords Abstract Paper

adversarial training, QA, NLI, BERT, information theory, adversarial robustness

DagoBERT: Generating Derivational Morphology with a Pretrained Language Model

Valentin Hofmann, Janet Pierrehumbert, Hinrich Schütze

Keywords Abstract Paper

full finetuning, derivation generation, pretrained models, plms

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Jixuan Wang, Kuan-Chieh Wang, Frank Rudzicz, Michael Brudno

Keywords Abstract Paper

machine learning, transformers, meta learning, language, transfer learning

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Abstract Paper

bert pretraining, structured tasks, natural understanding, textual learners

CxGBERT: BERT meets Construction Grammar

Harish Tayyar Madabushi, Laurence Romain, Dagmar Divjak, Petar Milin

Keywords Abstract Paper

A pairwise probe for understanding BERT fine-tuning on machine reading comprehension

Jie Cai, Zhengzhou Zhu, Ping Nie, Qian Liu

Keywords Abstract Paper

machine reading comprehension, pairwise, fine-tune, BERT

Incorporating BERT into Neural Machine Translation

Jinhua Zhu, Yingce Xia, Lijun Wu and Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tieyan Liu

Keywords Abstract Paper

BERT, Neural Machine Translation

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie and Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Keywords Abstract Paper

Integrating Multimodal Information in Large Pretrained Transformers

Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee and AmirAli Bagher Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque

Keywords Abstract Paper

NLP, lexical applications, modeling communication, multimodal analysis

Federated Learning for Spoken Language Understanding

Zhiqi Huang, Fenglin Liu, Yuexian Zou

Keywords Abstract Paper

Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

Na Li, Zied Bouraoui, Jose Camacho-Collados and Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords Abstract Paper

Natural Language Processing, Natural Language Semantics, Natural Language Processing

Syntactically Aware Cross-Domain Aspect and Opinion Terms Extraction

Oren Pereg, Daniel Korat, Moshe Wasserblat

Keywords Abstract Paper

On the Sentence Embeddings from Pre-trained Language Models

Bohan Li, Hao Zhou, Junxian He and Mingxuan Wang, Yiming Yang, Lei Li

Keywords Abstract Paper

natural processing, semantic task, semantic tasks, pre-trained representations

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and Anna Korhonen, Goran Glavaš

Keywords Abstract Paper

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael, Jan A. Botha, Ian Tenney

Keywords Abstract Paper

pretrained encoders, elmo, bert, latent learning

A matter of framing: The impact of linguistic formalism on probing results

Ilia Kuznetsov, Iryna Gurevych

Keywords Abstract Paper

downstream tasks, pre-training, probing, linguistic tasks

A Simple Yet Strong Pipeline for HotpotQA

Dirk Groeneveld, Tushar Khot, Mausam, Ashish Sabharwal

Keywords Abstract Paper

multi-hop answering, named recognition, graph-based reasoning, question decomposition

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

Allyson Ettinger

Keywords Abstract Paper

Pre-training, NLP tasks, inference, role-based prediction

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

Keywords Abstract Paper

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin and Jianfeng Lu, Tie-Yan Liu

Keywords Paper

Boxin Wang, Shuohang Wang, Yu Cheng and
Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords Paper

Keywords Paper

Keywords Paper

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and
Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Paper

Keywords Paper

Keywords Paper

Jinhua Zhu, Yingce Xia, Lijun Wu and
Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tieyan Liu

Keywords Paper

Wenqiang Lei, Yisong Miao, Runpeng Xie and
Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Keywords Paper

Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee and
AmirAli Bagher Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque

Keywords Paper

Keywords Paper

Na Li, Zied Bouraoui, Jose Camacho-Collados and
Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords Paper

Keywords Paper

Bohan Li, Hao Zhou, Junxian He and
Mingxuan Wang, Yiming Yang, Lei Li

Keywords Paper

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and
Anna Korhonen, Goran Glavaš

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Kaitao Song, Xu Tan, Tao Qin and
Jianfeng Lu, Tie-Yan Liu

Keywords Paper

Keywords Paper

Linyang Li, Ruotian Ma, Qipeng Guo and
Xiangyang Xue, Xipeng Qiu

Keywords Paper

Keywords Paper

Yada Pruksachatkun, Jason Phang, Haokun Liu and
Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

Keywords Paper

Junghyun Min, R. Thomas McCoy, Dipanjan Das and
Emily Pitler, Tal Linzen

Keywords Paper

Keywords Paper

Junliang Guo, Zhirui Zhang, Linli Xu and
Hao-Ran Wei, Boxing Chen, Enhong Chen

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Ji Xin, Raphael Tang, Jaejun Lee and
Yaoliang Yu, Jimmy Lin

Keywords Paper

Bowen Wu, Huan Zhang, MengYuan Li and
Zongsheng Wang, Qihang Feng, Junhong Huang, Baoxun Wang

Keywords Paper

Xinshuai Dong, Anh Tuan Luu, Min Lin and
Shuicheng Yan, Hanwang Zhang

Keywords Paper

Pengyu Cheng, Weituo Hao, Siyang Yuan and
Shijing Si, Lawrence Carin

Keywords Paper

He Zhao, Dinh Phung, Viet Huynh and
Trung Le, Wray Buntine

Keywords Paper