16/11/2020

Multilingual Denoising Pre-training for Neural Machine Translation

Jiatao Gu, Yinhan Liu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

Keywords: machine translation, pre-training, multilingual pre-training, mbart

Abstract: This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART -- a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches have focused only on the encoder, the decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine translation, with no task-specific modifications. We demonstrate that adding mBART initialization produces performance gains in all but the highest-resource settings, including up to 12 BLEU points for low-resource MT and over 5 BLEU points for many document-level and unsupervised models. We also show that it enables new types of transfer to language pairs with no bi-text or that were not in the pre-training corpus, and present extensive analysis of which factors contribute the most to effective pre-training.
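To make the denoising objective concrete, below is a minimal Python sketch of the two noise functions the abstract refers to: text infilling (random spans replaced by a single mask token, with span lengths drawn from a Poisson distribution) and sentence permutation. The 35% masking ratio and λ = 3.5 follow the settings reported in the paper, but the function names and the toy document are purely illustrative and are not the authors' implementation; in the actual model, the noised document is fed to the encoder and the decoder is trained to reconstruct the original text, with a language-id token so one model covers all languages.

```python
import math
import random

MASK = "<mask>"

def _poisson(lam):
    # Knuth's method; adequate for small lambda (the paper reports lambda = 3.5).
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= random.random()
        if p <= threshold:
            return k - 1

def text_infilling(tokens, mask_ratio=0.35, poisson_lambda=3.5):
    """Replace random spans with a single <mask> token until roughly
    mask_ratio of the original tokens have been masked (text infilling)."""
    tokens = list(tokens)
    budget = int(round(len(tokens) * mask_ratio))
    while budget > 0 and len(tokens) > 1:
        span = max(1, min(_poisson(poisson_lambda), budget, len(tokens) - 1))
        start = random.randrange(len(tokens) - span + 1)
        tokens[start:start + span] = [MASK]  # whole span collapses to one mask
        budget -= span
    return tokens

def sentence_permutation(sentences):
    """Shuffle sentence order within a document (the second noise type)."""
    sentences = list(sentences)
    random.shuffle(sentences)
    return sentences

# Toy usage: the model sees the noised document and learns to reconstruct the original.
doc = [["Machine", "translation", "is", "hard", "."],
       ["mBART", "is", "pre-trained", "on", "monolingual", "text", "."]]
noised = [text_infilling(s) for s in sentence_permutation(doc)]
print(noised)
```

Because pre-training already produces a full encoder-decoder, fine-tuning for translation simply continues training the same architecture on parallel bi-text, which is why no task-specific modifications are needed.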

The talk and the respective paper are published at the EMNLP 2020 virtual conference.
