CharBERT: Character-aware Pre-trained Language Model

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

Keywords:

Abstract Paper Similar Papers

Abstract: Most pre-trained language models (PLMs) construct word representations at subword level with Byte-Pair Encoding (BPE) or its variations, by which OOV (out-of-vocab) words are almost avoidable. However, those methods split a word into subword units and make the representation incomplete and fragile.In this paper, we propose a character-aware pre-trained language model named CharBERT improving on the previous methods (such as BERT, RoBERTa) to tackle these problems. We first construct the contextual word embedding for each token from the sequential character representations, then fuse the representations of characters and the subword representations by a novel heterogeneous interaction module. We also propose a new pre-training task named NLM (Noisy LM) for unsupervised character representation learning. We evaluate our method on question answering, sequence labeling, and text classification tasks, both on the original datasets and adversarial misspelling test sets. The experimental results show that our method can significantly improve the performance and robustness of PLMs simultaneously.

The video of this talk cannot be embedded. You can watch it here:

https://underline.io/lecture/6199-charbert-character-aware-pre-trained-language-model

(Link will open in new window)

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at COLING 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

16/11/2020

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

Bin Bi, Chenliang Li, Chen Wu and
Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

Keywords Paper

natural generation, language tasks, generative answering, conversational generation

0

0

0

0

11:02

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords Paper

Natural Language Processing, Pre-training

0

0

0

0

4:49

04/07/2020

Adaptive Compression of Word Embeddings

Yeachan Kim, Kang-Min Kim, SangKeun Lee

Keywords Paper

Adaptive Embeddings, Distributed words, natural tasks, downstream tasks

0

0

0

0

12:13

19/08/2021

Improving Text Generation with Dynamic Masking and Recovering

Zhidong Liu, Junhui Li, Muhua Zhu

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation

0

0

0

0

13:44

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords Paper

nlp applications, fine-tuning, meta-learning problem, supervised tasks

0

0

0

0

11:49

12/07/2020

Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao, Li Dong, Furu Wei and
Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

Keywords Paper

Applications - Language, Speech and Dialog

0

0

0

0

13:55

16/11/2020

Cross-Thought for Sentence Encoder Pre-training

Shuohang Wang, Yuwei Fang, Siqi Sun and
Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang

Keywords Paper

pre-training encoder, large-scale tasks, question answering, predicting words

0

0

0

0

12:06

04/07/2020

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Mike Lewis, Yinhan Liu, Naman Goyal and
Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer

Keywords Paper

Natural Generation, Translation, Comprehension, pretraining models

0

0

0

0

11:22

01/07/2020

DeepMet: A Reading Comprehension Paradigm for Token-level Metaphor Detection

Chuandong Su, Fumiyo Fukumoto, Xiaoxi Huang and
Jiyi Li, Rongbo Wang, Zhiqun Chen

Keywords Paper

0

0

0

0

10:37

14/06/2020

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen and
Shai Mazor, Roee Litman

Keywords Paper

gan, semi-supervised, domain-adaptation, handwriting, generative, unlabeled, transfer learning, ocr, text, augmentation

0

0

0

0

1:01

06/12/2020

Language Through a Prism: A Spectral Approach for Multiscale Language Representations

Alex Tamkin, Dan Jurafsky, Noah Goodman

Keywords Paper

0

0

0

0

3:34

05/12/2020

Named entity recognition in multi-level contexts

Yubo Chen, Chuhan Wu, Tao Qi and
Zhigang Yuan, Yongfeng Huang

Keywords Paper

0

0

0

0

14:10

19/04/2021

On the evolution of syntactic information encoded by BERT’s contextualized representations

Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

Keywords Paper

0

0

0

0

6:50

16/11/2020

Assessing Phrasal Representation and Composition in Transformers

Lang Yu, Allyson Ettinger

Keywords Paper

nlp tasks, systematic representations, deep models, phrasal representations

0

0

0

0

11:33

04/07/2020

Contextualized Weak Supervision for Text Classification

Dheeraj Mekala, Jingbo Shang

Keywords Paper

Text Classification, Weakly classification, string matching, Contextualized Supervision

0

0

0

0

11:26

16/11/2020

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

Keywords Paper

nlp, sentence-level modeling, discourse representation, pre-training methods

0

0

0

0

9:21

08/12/2020

A Semantically Consistent and Syntactically Variational Encoder-Decoder Framework for Paraphrase Generation

Wenqing Chen, Jidong Tian, Liqiang Xiao and
Hao He, Yaohui Jin

Keywords Paper

0

0

0

0

14:50

16/11/2020

Train No Evil: Selective Masking for Task-Guided Pre-Training

Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang and
Zhiyuan Liu, Maosong Sun

Keywords Paper

pre-training stage, fine-tuning stage, general pre-training, sentiment tasks

0

0

0

0

7:02

22/11/2021

From Seq2Seq Recognition to Handwritten Word Embeddings

George Retsinas, Giorgos Sfikas, Christophoros Nikou, Petros Maragos

Keywords Paper

keyword spotting, handwritten text recognition, sequence-to-sequence

0

0

0

0

2:59

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords Paper

Cross-Lingual Classification, sentiment classification, unsupervised system, classification

0

0

0

0

12:23

06/12/2021

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

Yu Meng, Chenyan Xiong, Payal Bajaj and
saurabh tiwary, Paul Bennett, Jiawei Han, XIA SONG

Keywords Paper

self-supervised learning, contrastive learning

0

0

0

0

14:23

16/11/2020

Local Additivity Based Data Augmentation for Semi-supervised NER

Jiaao Chen, Zhenghui Wang, Ran Tian and
Zichao Yang, Diyi Yang

Keywords Paper

named recognition, deep understanding, semi-supervised ner, entity learning

0

0

0

0

11:18

16/11/2020

Pre-training Entity Relation Encoder with Intra-span and Inter-span Information

Yijun Wang, Changzhi Sun, Yuanbin Wu and
Junchi Yan, Peng Gao, Guotong Xie

Keywords Paper

entity task, pre-trained encoder, general-purpose encoder, universal models

0

0

0

0

11:19

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords Paper

Deep Learning - Generative Models and Autoencoders

0

0

0

0

17:06

12/07/2020

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao

Keywords Paper

Representation Learning

0

0

0

0

14:51

16/11/2020

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Keywords Paper

document-level translation, translations, document-level model, selection module

0

0

0

0

11:36

16/11/2020

Generating Dialogue Responses from a Semantic Latent Space

Wei-Jen Ko, Avik Ray, Yilin Shen, Hongxia Jin

Keywords Paper

generation responses, regression task, open-domain models, end-to-end classification

0

0

0

0

11:26

04/07/2020

BPE-Dropout: Simple and Effective Subword Regularization

Ivan Provilkov, Dmitrii Emelianenko, Elena Voita

Keywords Paper

open problem, machine translation, subword segmentation, training

0

0

0

0

9:33

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and
Anna Korhonen, Goran Glavaš

Keywords Paper

0

0

0

0

13:01

02/02/2021

Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks

Maurício Gruppi, Pin-Yu Chen, Sibel Adali

Keywords Paper

0

0

0

0

19:35

25/07/2020

Leveraging adversarial training in self-learning for cross-lingual text classification

Xin Dong, Yaxin Zhu, Yupeng Zhang and
Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard Melo

Keywords Paper

multilingual, semantics, text classification, cross-lingual

0

0

0

0

9:19

02/02/2021

Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Xuancheng Ren, Xu Sun, Houfeng Wang, Qun Liu

Keywords Paper

0

0

0

0

16:04

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords Paper

transformers, vision

0

0

0

0

7:54

19/04/2021

Generating syntactically controlled paraphrases without using annotated parallel pairs

Kuan-Hao Huang, Kai-Wei Chang

Keywords Paper

0

0

0

1

10:41

19/08/2021

Unsupervised Hashing with Contrastive Information Bottleneck

Zexuan Qiu, Qinliang Su, Zijing Ou and
Jianxing Yu, Changyou Chen

Keywords Paper

Computer Vision, Recognition, Statistical Methods and Machine Learning

0

0

0

0

12:40

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu and
Luxi Xing, Heng Yu, Weihua Luo

Keywords Paper

hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

0

0

0

0

3:51

16/11/2020

CSP:Code-Switching Pre-training for Neural Machine Translation

Zhen Yang, Bojie Hu, Ambyera Han and
Shen Huang, Qi Ju

Keywords Paper

neural nmt, lexicon induction, unsupervised nmt, pre-training method

0

0

0

0

10:10

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu and
Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

0

0

0

0

13:53

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che and
Cheng Chen, Ting Liu

Keywords Paper

0

0

0

0

15:01