Pseudo-Masked Language Models for Unified Language Model Pre-Training

12/07/2020

Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

Keywords: Applications - Language, Speech and Dialog

Abstract: We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
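
The layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `[M]`/`[P]` token strings, the `(token, position, step, role)` tuples, and the exact visibility rules below are assumptions made for illustration. The sketch appends pseudo masks and the original tokens of each masked span after the corrupted text, reuses the position ids of the masked tokens, and derives a self-attention mask in which the context and conventional masks are encoded once and shared by every prediction step.

```python
from typing import List, Sequence, Tuple

MASK, PSEUDO = "[M]", "[P]"  # assumed token strings for conventional and pseudo masks
Entry = Tuple[str, int, int, str]  # (token, position id, step, role)


def build_pmlm_input(tokens: Sequence[str], spans: Sequence[Sequence[int]]) -> List[Entry]:
    """Lay out one training sequence; `spans` lists masked indices in prediction order."""
    masked = {i for span in spans for i in span}
    # Step 0: the corrupted text, with conventional masks at the masked positions.
    entries: List[Entry] = [
        (MASK if i in masked else tok, i, 0, "ae") for i, tok in enumerate(tokens)
    ]
    # Steps 1..k: pseudo masks and original tokens, sharing the masked position ids.
    for step, span in enumerate(spans, start=1):
        entries += [(PSEUDO, i, step, "pseudo") for i in span]
        entries += [(tokens[i], i, step, "gold") for i in span]
    return entries


def can_attend(query: Entry, key: Entry) -> bool:
    """Assumed visibility rules (a simplification, not taken from the paper)."""
    _, _, q_step, q_role = query
    _, _, k_step, k_role = key
    if k_role == "ae":
        return True  # context and conventional masks are visible to everything
    if q_role == "ae":
        return False  # the autoencoding part never sees appended tokens
    if k_role == "gold":
        # Pseudo masks see only earlier spans; original tokens also see their own span.
        return k_step < q_step if q_role == "pseudo" else k_step <= q_step
    # Key is a pseudo mask: visible only to pseudo masks of the same span.
    return q_role == "pseudo" and k_step == q_step


def attention_mask(entries: Sequence[Entry]) -> List[List[bool]]:
    """Row q, column k is True when entry q may attend to entry k."""
    return [[can_attend(q, k) for k in entries] for q in entries]


if __name__ == "__main__":
    example = build_pmlm_input(["x1", "x2", "x3", "x4", "x5", "x6"], [[3, 4], [1]])
    for entry, row in zip(example, attention_mask(example)):
        print(f"{entry[0]:>4} pos={entry[1]} ", "".join("1" if v else "." for v in row))
```

Because the rows for the context and `[M]` tokens never depend on the appended tokens, their encodings can be computed once and reused across steps, which is the saving the abstract refers to.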

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

Similar Papers

16/11/2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Minki Kang, Moonsu Han, Sung Ju Hwang

Keywords: self-supervised pre-training, question answering, task, reinforcement learning

12:00

16/11/2020

Cross-Thought for Sentence Encoder Pre-training

Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang

Keywords: pre-training encoder, large-scale tasks, question answering, predicting words

12:06

03/05/2021

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords: adversarial training, QA, NLI, BERT, information theory, adversarial robustness

5:21

03/05/2021

Structured Prediction as Translation between Augmented Natural Languages

Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

Keywords: sequence to sequence, structured prediction, language models, transfer learning, few-shot learning, multi-task learning, generative modeling

12:16

04/07/2020

Emerging Cross-lingual Structure in Pretrained Language Models

Alexis Conneau, Shijie Wu, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov

Keywords: multilingual modeling, cross-lingual transfer, transfer, Cross-lingual Models

11:49

22/11/2021

From Seq2Seq Recognition to Handwritten Word Embeddings

George Retsinas, Giorgos Sfikas, Christophoros Nikou, Petros Maragos

Keywords: keyword spotting, handwritten text recognition, sequence-to-sequence

2:59

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu

14:20

16/11/2020

CSP: Code-Switching Pre-training for Neural Machine Translation

Zhen Yang, Bojie Hu, Ambyera Han, Shen Huang, Qi Ju

Keywords: neural nmt, lexicon induction, unsupervised nmt, pre-training method

10:10

16/11/2020

LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto

Keywords: natural tasks, pretraining task, transformer, entity-related tasks

11:15

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords: deep learning, interpretability

14:05

08/12/2020

A Mixture-of-Experts Model for Learning Multi-Facet Entity Embeddings

Rana Alshaikh, Zied Bouraoui, Shelan Jeawak, Steven Schockaert

14:13

19/08/2021

Progressive Open-Domain Response Generation with Multiple Controllable Attributes

Haiqin Yang, Xiaoyuan Yao, Yiqun Duan, Jianping Shen, Jie Zhong, Kun Zhang

Keywords: Machine Learning, Learning Generative Models, Dialogue

14:43

14/06/2020

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo

Keywords: facial image manipulation, face segmentation, image synthesis, generative adversarial network

1:00

06/12/2021

Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity

Yan Liu, Zhijie Zhang, Li Niu, Junjie Chen, Liqing Zhang

Keywords: vision, transfer learning

9:11

16/11/2020

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

Keywords: natural generation, language tasks, generative answering, conversational generation

11:02

19/08/2021

Exemplification Modeling: Can You Give Me an Example, Please?

Edoardo Barba, Luigi Procopio, Caterina Lacerra, Tommaso Pasini, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

14:47

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

4:49

06/12/2021

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

Keywords: transformers, generative model

9:28

16/11/2020

Local Additivity Based Data Augmentation for Semi-supervised NER

Jiaao Chen, Zhenghui Wang, Ran Tian, Zichao Yang, Diyi Yang

Keywords: named recognition, deep understanding, semi-supervised ner, entity learning

11:18

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords: data augmentation, text recognition, joint training

0:59

04/07/2020

Generalizing Natural Language Analysis through Span-relation Representations

Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig

Keywords: Natural Analysis, Natural processing, dependency parsing, semantic labeling

8:30

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

3:51

23/06/2021

SPPL: Probabilistic Programming with Fast Exact Symbolic Inference

Feras A. Saad, Martin C. Rinard, Vikash K. Mansinghka

Keywords: probabilistic programming, symbolic execution

23:41

05/01/2021

Automatic Object Recoloring Using Adversarial Learning

Siavash Khodadadeh, Saeid Motiian, Zhe Lin, Ladislau Boloni, Shabnam Ghadar

4:43

08/12/2020

Probing Multimodal Embeddings for Linguistic Properties: the Visual-Semantic Case

Adam Dahlgren Lindström, Johanna Björklund, Suna Bensch, Frank Drewes

14:20

19/08/2021

Disentangled Face Attribute Editing via Instance-Aware Latent Space Search

Yuxuan Han, Jiaolong Yang, Ying Fu

Keywords: Computer Vision, 2D and 3D Computer Vision, Explainable/Interpretable Machine Learning

12:51

05/12/2020

Named entity recognition in multi-level contexts

Yubo Chen, Chuhan Wu, Tao Qi, Zhigang Yuan, Yongfeng Huang

14:10

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords: Unsupervised Translation, machine translation, transfer learning, word task

14:56

02/02/2021

Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Xuancheng Ren, Xu Sun, Houfeng Wang, Qun Liu

16:04

06/12/2021

Integrating Tree Path in Transformer for Code Representation

Han Peng, Ge Li, Wenhan Wang, YunFei Zhao, Zhi Jin

Keywords: machine learning, transformers

4:42

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

19/04/2021

Generating syntactically controlled paraphrases without using annotated parallel pairs

Kuan-Hao Huang, Kai-Wei Chang

10:41

16/11/2020

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael, Jan A. Botha, Ian Tenney

Keywords: pretrained encoders, ELMo, BERT, latent learning

12:45

26/04/2020

Learning Robust Representations via Multi-View Information Bottleneck

Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata

Keywords: Information Bottleneck, Multi-View Learning, Representation Learning, Information Theory

4:56

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords: nlp tasks, fine-tuning, learning process, multi-domain tasks

9:58

30/11/2020

Show, Conceive and Tell: Image Captioning with Prospective Linguistic Information

Yiqing Huang, Jiansheng Chen

7:08

03/05/2021

Pre-training Text-to-Text Transformers for Concept-centric Common Sense

Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Xiang Ren

Keywords: Self-supervised Learning, Commonsense Reasoning, Language Model Pre-training

4:56

14/06/2020

Gait Recognition via Semi-supervised Disentangled Representation Learning to Identity and Covariate Features

Xiang Li, Yasushi Makihara, Chi Xu, Yasushi Yagi, Mingwu Ren

Keywords: gait recognition, semi-supervised disentangled representation learning, covariate

1:01

16/11/2020

Learning to Represent Image and Text with Denotation Graph

Bowen Zhang, Hexiang Hu, Vihan Jain, Eugene Ie, Fei Sha

Keywords: cross-modal retrieval, referring expression, compositional recognition, pre-training

10:59