PatchBERT: Just-in-Time, Out-of-Vocabulary Patching

16/11/2020

PatchBERT: Just-in-Time, Out-of-Vocabulary Patching

Sangwhan Moon, Naoaki Okazaki

Keywords: natural processing, downstream tasks, mitigation, large models

Abstract Paper Similar Papers

Abstract: Large scale pre-trained language models have shown groundbreaking performance improvements for transfer learning in the domain of natural language processing. In our paper, we study a pre-trained multilingual BERT model and analyze the OOV rate on downstream tasks, how it introduces information loss, and as a side-effect, obstructs the potential of the underlying model. We then propose multiple approaches for mitigation and demonstrate that it improves performance with the same parameter count when combined with fine-tuning.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at EMNLP 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

03/05/2021

Improving Transformation Invariance in Contrastive Representation Learning

Adam Foster, Rattana Pukdee, Tom Rainforth

Keywords Paper

transformation invariance, contrastive learning, representation learning

0

0

0

0

5:23

06/12/2020

Multi-Stage Influence Function

Hongge Chen, Si Si, Yang Li and
Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

Keywords Paper

0

0

0

0

3:23

19/04/2021

Cross-lingual visual pre-training for multimodal machine translation

Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac and
Pranava Madhyastha, Erkut Erdem, Aykut Erdem, Lucia Specia

Keywords Paper

0

0

0

0

6:16

06/12/2020

Do Adversarially Robust ImageNet Models Transfer Better?

Hadi Salman, Andrew Ilyas, Logan Engstrom and
Ashish Kapoor, Aleksander Madry

Keywords Paper

0

0

0

0

4:16

26/04/2020

Compositional languages emerge in a neural iterated learning model

Yi Ren, Shangmin Guo, Matthieu Labeau and
Shay B. Cohen, Simon Kirby

Keywords Paper

Compositionality, Multi-agent, Emergent language, Iterated learning

0

0

0

0

5:07

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords Paper

nlp tasks, fine-tuning, learning process, multi-domain tasks

0

0

0

0

9:58

26/04/2020

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

Keywords Paper

0

0

0

0

5:00

06/12/2021

DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

Marie-Anne Lachaux, Baptiste Roziere, Marc Szafraniec, Guillaume Lample

Keywords Paper

self-supervised learning

0

0

0

0

13:09

16/11/2020

Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples

Lihao Wang, Xiaoqing Zheng

Keywords Paper

grammatical correction, sequence-to-sequence learning, neural networks, gec

0

0

0

0

11:40

03/05/2021

Rethinking Embedding Coupling in Pre-trained Language Models

Hyung Won Chung, Thibault Fevry, Henry Tsai and
Melvin Johnson, Sebastian Ruder

Keywords Paper

transfer learning, pre-training, natural language processing, efficiency

0

0

0

0

5:12

18/07/2021

Improved OOD Generalization via Adversarial Training and Pretraing

Mingyang Yi, Lu Hou, Jiacheng Sun and
Lifeng Shang, Xin Jiang, Qun Liu, Zhiming Ma

Keywords Paper

Theory, Deep learning Theory

0

0

0

0

4:11

16/11/2020

Exploring and Predicting Transferability across NLP Tasks

Tu Vu, Tong Wang, Tsendsuren Munkhdalai and
Alessandro Sordoni, Adam Trischler, Andrew Mattarella-Micke, Subhransu Maji, Mohit Iyyer

Keywords Paper

language modeling, nlp tasks, text classification, question answering

0

0

0

0

10:55

16/11/2020

Dynamic Data Selection and Weighting for Iterative Back-Translation

Zi-Yi Dou, Antonios Anastasopoulos, Graham Neubig

Keywords Paper

neural translation, neural nmt, nmt, domain adaptation

0

0

0

0

11:30

16/11/2020

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

Sanyuan Chen, Yutai Hou, Yiming Cui and
Wanxiang Che, Ting Liu, Xiangzhan Yu

Keywords Paper

pretraining, pretraining tasks, learning tasks, fine-tuning bert-large

0

0

0

1

10:52

16/11/2020

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan and
Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avi Sil, Todd Ward

Keywords Paper

nlp tasks, fine-tuning, auxiliary tasks, lm transfer

0

0

0

0

6:56

16/11/2020

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

Keywords Paper

nlp, sentence-level modeling, discourse representation, pre-training methods

0

0

0

0

9:21

16/11/2020

Cross-Thought for Sentence Encoder Pre-training

Shuohang Wang, Yuwei Fang, Siqi Sun and
Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang

Keywords Paper

pre-training encoder, large-scale tasks, question answering, predicting words

0

0

0

0

12:06

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu and
Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

0

0

0

0

13:53

04/07/2020

Roles and Utilization of Attention Heads in Transformer-based Neural Language Models

Jae-young Jo, Sung-Hyon Myaeng

Keywords Paper

Transformer-based Models, natural tasks, downstream tasks, probing tasks

0

0

0

0

12:17

04/07/2020

To Pretrain or Not to Pretrain: Examining the Benefits of Pretrainng on Resource Rich Tasks

Sinong Wang, Madian Khabsa, Hao Ma

Keywords Paper

text tasks, Pretrainng, Pretraining models, NLP models

0

0

0

0

6:10

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

22/11/2021

Meta-learning the Learning Trends Shared Across Tasks

Jathushan Rajasegaran, Salman Khan, Munawar Hayat and
Fahad Shahbaz Khan, Mubarak Shah

Keywords Paper

Meta-learning, Few-shot learning

0

0

0

0

2:38

19/04/2021

Does the order of training samples matter? Improving neural data-to-text generation with curriculum learning

Ernie Chang, Hui-Syuan Yeh, Vera Demberg

Keywords Paper

0

0

0

0

5:42

05/01/2021

Analyzing Deep Neural Network's Transferability via Frechet Distance

Yifan Ding, Liqiang Wang, Boqing Gong

Keywords Paper

0

0

0

0

4:59

26/04/2020

Understanding Knowledge Distillation in Non-autoregressive Machine Translation

Chunting Zhou, Jiatao Gu, Graham Neubig

Keywords Paper

knowledge distillation, non-autoregressive neural machine translation

0

0

0

0

4:55

08/12/2020

Incremental Neural Lexical Coherence Modeling

Sungho Jeon, Michael Strube

Keywords Paper

0

0

0

0

9:08

16/11/2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Minki Kang, Moonsu Han, Sung Ju Hwang

Keywords Paper

self-supervised pre-training, question answering, task, reinforcement learning

0

0

0

0

12:00

05/12/2020

Investigating learning dynamics of BERT fine-tuning

Yaru Hao, Li Dong, Furu Wei, Ke Xu

Keywords Paper

0

0

0

0

7:10

19/04/2021

Effects of pre- and post-processing on type-based embeddings in lexical semantic change detection

Jens Kaiser, Sinan Kurtyigit, Serge Kotchourko, Dominik Schlechtweg

Keywords Paper

0

0

0

0

12:03

16/11/2020

Self-Paced Learning for Neural Machine Translation

Yu Wan, Baosong Yang, Derek F. Wong and
Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Keywords Paper

neural, curriculum learning, translation tasks, nmt

0

0

0

0

6:03

08/12/2020

Joint Training for Learning Cross-lingual Embeddings with Sub-word Information without Parallel Corpora

Ali Hakimi Parizi, Paul Cook

Keywords Paper

0

0

0

0

12:38

22/11/2021

Cross-Modal Generative Augmentation for Visual Question Answering

Zixu Wang, Yishu Miao, Lucia Specia

Keywords Paper

visual question answering, data augmentation, generative model, multimodal machine learning

0

0

0

0

2:49

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords Paper

Unsupervised Translation, machine translation, transfer learning, word task

0

0

0

0

14:56

16/11/2020

Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies

Chris Kedzie, Kathleen McKeown

Keywords Paper

natural generation, training, data augmentation, neural models

0

0

0

0

11:16

04/07/2020

How does BERT's attention change when you fine-tune? An analysis methodology and a case study in negation scope

Yiyun Zhao, Steven Bethard

Keywords Paper

downstream task, NLP problems, knowledge-related tasks, downstream tasks

0

0

0

0

11:43

16/11/2020

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Keywords Paper

document-level translation, translations, document-level model, selection module

0

0

0

0

11:36

26/04/2020

Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang

Keywords Paper

regularization, finetuning, dropout, dropconnect, adaptive L2-penalty, BERT, pretrained language model

0

0

0

0

5:04

02/02/2021

A Unified Pretraining Framework for Passage Ranking and Expansion

Ming Yan, Chenliang Li, Bin Bi and
Wei Wang, Songfang Huang

Keywords Paper

0

0

0

0

16:33

03/05/2021

Deberta: Decoding-Enhanced Bert With Disentangled Attention

Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen

Keywords Paper

Position Encoding, Attention, Natural Language Processing, Language Model Pre-training, Transformer

0

0

0

0

6:06

26/08/2020

Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer

Yanshuai Cao, Peng Xu

Keywords Paper

0

0

0

0

15:00