BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

04/07/2020

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer

Keywords: Natural Generation, Translation, Comprehension, pretraining models

Abstract Paper Similar Papers

Abstract: We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Tranformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and other recent pretraining schemes. We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when fine tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 3.5 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also replicate other pretraining schemes within the BART framework, to understand their effect on end-task performance.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ACL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

08/12/2020

CharBERT: Character-aware Pre-trained Language Model

Wentao Ma, Yiming Cui, Chenglei Si and
Ting Liu, Shijin Wang, Guoping Hu

Keywords Paper

0

0

0

0

14:20

06/12/2021

BARTScore: Evaluating Generated Text as Text Generation

Weizhe Yuan, Graham Neubig, Pengfei Liu

Keywords Paper

0

0

0

0

13:47

16/11/2020

Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

Brielen Madureira, David Schlangen

Keywords Paper

nlp, interactive systems, language encoders, bidirectional lstms

0

0

0

0

10:04

06/12/2021

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

Yu Meng, Chenyan Xiong, Payal Bajaj and
saurabh tiwary, Paul Bennett, Jiawei Han, XIA SONG

Keywords Paper

self-supervised learning, contrastive learning

0

0

0

0

14:23

16/11/2020

Simulated multiple reference training improves low-resource machine translation

Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Keywords Paper

machine mt, mt, simulated training, simulated

0

0

0

0

6:56

08/12/2020

Transliteration of Judeo-Arabic Texts into Arabic Script Using Recurrent Neural Networks

Ori Terner, Kfir Bar, Nachum Dershowitz

Keywords Paper

0

0

0

0

9:50

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu and
Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

Keywords Paper

0

0

0

0

13:44

14/06/2020

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Runfa Chen, Wenbing Huang, Binghui Huang and
Fuchun Sun, Bin Fang

Keywords Paper

nice-gan, reusing discriminators for encoding, unsupervised image-to-image translation, decoupled training, multi-scale discriminators, adversarial loss, no independent component for encoding, shared layers, residual attention, cyclegan

0

0

0

0

1:01

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu and
Hao-Ran Wei, Boxing Chen, Enhong Chen

Keywords Paper

0

0

0

0

3:17

22/11/2021

From Seq2Seq Recognition to Handwritten Word Embeddings

George Retsinas, Giorgos Sfikas, Christophoros Nikou, Petros Maragos

Keywords Paper

keyword spotting, handwritten text recognition, sequence-to-sequence

0

0

0

0

2:59

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and
Anna Korhonen, Goran Glavaš

Keywords Paper

0

0

0

0

13:01

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords Paper

Deep Learning - Generative Models and Autoencoders

0

0

0

0

17:06

02/02/2021

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

Qi Zhu, Chenyu Gao, Peng Wang, Qi Wu

Keywords Paper

0

0

0

0

15:58

02/02/2021

Non-Autoregressive Coarse-to-Fine Video Captioning

Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

Keywords Paper

0

0

0

0

18:21

14/06/2020

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold

Mohamed Yousef, Tom E. Bishop

Keywords Paper

text recognition, weakly supervised, handwriting recognition, convolutional neural network fully convolutional, ctc

0

0

0

0

1:00

14/06/2020

SCATTER: Selective Context Attentional Scene Text Recognizer

Ron Litman, Oron Anschel, Shahar Tsiper and
Roee Litman, Shai Mazor, R. Manmatha

Keywords Paper

text recognition, scene text, irregular text, stacked decoders, repetitive processing, intermediate supervision, attention decoder, two-step attention, sequence modeling, deep lstm

0

0

0

0

0:55

08/12/2020

Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism

Pan Xie, Zhi Cui, Xiuying Chen and
XiaoHui Hu, Jianwei Cui, Bin Wang

Keywords Paper

0

0

0

0

6:43

30/11/2020

Show, Conceive and Tell: Image Captioning with Prospective Linguistic Information

Yiqing Huang, Jiansheng Chen

Keywords Paper

0

0

0

0

7:08

04/07/2020

BPE-Dropout: Simple and Effective Subword Regularization

Ivan Provilkov, Dmitrii Emelianenko, Elena Voita

Keywords Paper

open problem, machine translation, subword segmentation, training

0

0

0

0

9:33

08/12/2020

Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages

Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu and
Mona Diab, Kathleen McKeown

Keywords Paper

0

0

0

0

14:37

18/07/2021

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

Keywords Paper

Applications, Natural Language Processing

0

0

0

0

5:21

08/12/2020

Attentively Embracing Noise for Robust Latent Representation in BERT

Gwenaelle Cunha Sergio, Dennis Singh Moirangthem, Minho Lee

Keywords Paper

0

0

0

0

12:55

16/11/2020

Simultaneous Machine Translation with Visual Context

Ozan Caglayan, Julia Ive, Veneta Haralampieva and
Pranava Madhyastha, Loïc Barrault, Lucia Specia

Keywords Paper

simt, multimodal approaches, simt frameworks, visually-grounded models

0

0

0

0

12:34

16/11/2020

oLMpics - On what Language Model Pre-training Captures

Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant

Keywords Paper

symbolic tasks, reasoning tasks, zero-shot evaluation, pre-training

0

0

0

0

10:01

12/07/2020

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

Keywords Paper

Applications - Language, Speech and Dialog

0

0

0

0

14:43

19/04/2021

Disfluency correction using unsupervised and semi-supervised learning

Nikhil Saini, Drumil Trivedi, Shreya Khare and
Tejas Dhamecha, Preethi Jyothi, Samarth Bharadwaj, Pushpak Bhattacharyya

Keywords Paper

0

0

0

0

7:13

04/07/2020

Adaptive Compression of Word Embeddings

Yeachan Kim, Kang-Min Kim, SangKeun Lee

Keywords Paper

Adaptive Embeddings, Distributed words, natural tasks, downstream tasks

0

0

0

0

12:13

16/11/2020

Q-learning with Language Model for Edit-based Unsupervised Summarization

Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana

Keywords Paper

abstractive textsummarization, unsupervised summarization, unsupervised summarizers, unsupervised methods

0

0

0

0

12:32

06/12/2020

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin and
Jianfeng Lu, Tie-Yan Liu

Keywords Paper

0

0

0

0

3:23

04/07/2020

Learning Spoken Language Representations with Neural Lattice Language Modeling

Chao-Wei Huang, Yun-Nung Chen

Keywords Paper

NLP tasks, spoken tasks, intent detection, Spoken Representations

0

0

0

0

6:39

05/01/2021

Multimodal Prototypical Networks for Few-Shot Learning

Frederik Pahde, Mihai Puscas, Tassilo Klein, Moin Nabi

Keywords Paper

0

0

0

0

4:56

12/07/2020

Non-autoregressive Translation with Disentangled Context Transformer

Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

Keywords Paper

Applications - Language, Speech and Dialog

0

0

0

0

15:00

19/08/2021

Improving Text Generation with Dynamic Masking and Recovering

Zhidong Liu, Junhui Li, Muhua Zhu

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation

0

0

0

0

13:44

14/06/2020

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen and
Shai Mazor, Roee Litman

Keywords Paper

gan, semi-supervised, domain-adaptation, handwriting, generative, unlabeled, transfer learning, ocr, text, augmentation

0

0

0

0

1:01

16/11/2020

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Linyang Li, Ruotian Ma, Qipeng Guo and
Xiangyang Xue, Xipeng Qiu

Keywords Paper

adversarial attacks, downstream tasks, calculation, gradient-based methods

0

0

0

0

11:36

04/07/2020

Paraphrase Generation by Learning How to Edit from Samples

Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah

Keywords Paper

Paraphrase Generation, Neural sequence, sequence generation, retrieval-based method

0

0

0

0

12:20

02/02/2021

Adaptive Beam Search Decoding for Discrete Keyphrase Generation

Xiaoli Huang, Tongge Xu, Lvan Jiao and
Yueran Zu, Youmin Zhang

Keywords Paper

0

0

0

0

14:36

16/11/2020

Sound Natural: Content Rephrasing in Dialog Systems

Arash Einolghozati, Anchit Gupta, Keith Diedrick, Sonal Gupta

Keywords Paper

rephrasing, messaging, virtual assistants, intent-slot tagging

0

0

0

0

5:49

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords Paper

Cross-Lingual Classification, sentiment classification, unsupervised system, classification

0

0

0

0

12:23

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che and
Cheng Chen, Ting Liu

Keywords Paper

0

0

0

0

15:01