Multilingual Transfer Learning for QA using Translation as Data Augmentation

02/02/2021

Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil

Abstract: Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space. Our first strategy augments the original English training data with machine translation-generated data. This results in a corpus of multilingual silver-labeled QA pairs that is 14 times larger than the original training set. In addition, we propose two novel strategies, language adversarial training and language arbitration framework, which significantly improve the (zero-resource) cross-lingual transfer performance and result in LM embeddings that are less language-variant. Empirically, we show that the proposed models outperform the previous zero-shot baseline on the recently introduced multilingual MLQA and TyDiQA datasets.

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38949320

The talk and the respective paper were published at the AAAI 2021 virtual conference.

Similar Papers

01/07/2020

Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task

Jindřich Libovický, Zdeněk Kasner, Jindřich Helcl, Ondřej Dušek

4:59

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords: language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

12:07

16/11/2020

Zero-Shot Cross-Lingual Transfer with Meta Learning

Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

Keywords: strategic knowledge, downstream task, multilingual applications, natural tasks

11:42

16/11/2020

Zero-Shot Crosslingual Sentence Simplification

Jonathan Mallinson, Rico Sennrich, Mirella Lapata

Keywords: sentence simplification, translation, simplification, encoder-decoder models

10:34

04/07/2020

Unsupervised Cross-lingual Representation Learning at Scale

Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

Keywords: cross-lingual tasks, XNLI, MLQA, NER

12:15

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

3:51

04/07/2020

Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences

Xiangyu Duan, Baijun Ji, Hao Jia, Min Tan, Min Zhang, Boxing Chen, Weihua Luo, Yue Zhang

Keywords: Bilingual Translation, machine MT, MT, dictionary-based translation

14:08

19/04/2021

El volumen louder por favor: Code-switching in task-oriented semantic parsing

Arash Einolghozati, Abhinav Arora, Lorena Sainz-Maza Lecanda, Anuj Kumar, Sonal Gupta

11:39

19/04/2021

PPT: Parsimonious parser transfer for unsupervised cross-lingual adaptation

Kemal Kurniawan, Lea Frermann, Philip Schulz, Trevor Cohn

11:52

05/12/2020

Mixed-lingual pre-training for cross-lingual summarization

Ruochen Xu, Chenguang Zhu, Yu Shi, Michael Zeng, Xuedong Huang

11:49

16/11/2020

Simulated multiple reference training improves low-resource machine translation

Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Keywords: machine mt, mt, simulated training, simulated

6:56

16/11/2020

Multi-task Learning for Multilingual Neural Machine Translation

Yiren Wang, ChengXiang Zhai, Hany Hassan

Keywords: bilingual nmt, bilingual, multilingual systems, translation task

10:48

16/11/2020

Language Model Prior for Low-Resource Neural Machine Translation

Christos Baziotis, Barry Haddow, Alexandra Birch

Keywords: neural translation, neural tm, knowledge distillation, training time

11:16

16/11/2020

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords: machine learning, generalization, low-resource tasks, named recognition

11:09

16/11/2020

LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space

Tasnim Mohiuddin, M Saiful Bari, Shafiq Joty

Keywords: bilingual induction, bilingual, bli, semi-supervised method

12:09

16/11/2020

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

Weijia Xu, Batool Haider, Saab Mansour

Keywords: natural understanding, natural, nlu, goal-oriented systems

9:46

04/07/2020

Hypernymy Detection for Low-Resource Languages via Meta Learning

Changlong Yu, Jialong Han, Haisong Zhang, Wilfred Ng

Keywords: Hypernymy Detection, lexical entailment, natural tasks, monolingual detection

6:53

04/07/2020

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

Keywords: Multilingual Comprehension, multilingual MRC, MRC, sentence tasks

8:30

04/07/2020

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

Yue Cao, Hui Liu, Xiaojun Wan

Keywords: Neural Summarization, Cross-lingual summarization, cross-lingual training, pipeline methods

9:30

06/12/2020

Cross-lingual Retrieval for Iterative Self-Supervised Training

Chau Tran, Yuqing Tang, Xian Li, Jiatao Gu

3:11

18/07/2021

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

Keywords: Applications, Natural Language Processing

5:21

04/07/2020

On the Cross-lingual Transferability of Monolingual Representations

Mikel Artetxe, Sebastian Ruder, Dani Yogatama

Keywords: zero-shot setting, Cross-lingual Representations, unsupervised models, joint training

11:28

16/11/2020

Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models

Isabel Papadimitriou, Dan Jurafsky

Keywords: analyzing structure, encoding structure, natural acquisition, transfer learning

11:44

16/11/2020

Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”

Michele Bevilacqua, Marco Maru, Roberto Navigli

Keywords: generative modeling, definition modeling, discriminative tasks, word disambiguation

11:49

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords: Unsupervised Translation, machine translation, transfer learning, word task

14:56

26/04/2020

Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework

Zirui Wang, Jiateng Xie, Ruochen Xu, Yiming Yang, Graham Neubig, Jaime G. Carbonell

Keywords: Cross-lingual Representation

4:53

02/02/2021

XL-WSD: An Extra-Large and Cross-Lingual Evaluation Framework for Word Sense Disambiguation

Tommaso Pasini, Alessandro Raganato, Roberto Navigli

19:06

16/11/2020

X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset

Angel Daza, Anette Frank

Keywords: generalization learning, multilingual learning, high-quality translation, srl

9:24

04/07/2020

Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation

Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Chen, Sneha Kudugunta, Naveen Arivazhagan, Yonghui Wu

Keywords: Multilingual Translation, Multilingual, low-resource translation, low-resource NMT

6:51

04/07/2020

A Simple and Effective Unified Encoder for Document-Level Machine Translation

Shuming Ma, Dongdong Zhang, Ming Zhou

Keywords: Document-Level Translation, Unified Encoder, encoders, pre-training models

7:04

16/11/2020

X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig

Keywords: factual retrieval, language models, lms, probing methods

9:45

08/12/2020

SentiX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis

Jie Zhou, Junfeng Tian, Rui Wang, Yuanbin Wu, Wenming Xiao, Liang He

12:42

19/04/2021

Alignment verification to improve NMT translation towards highly inflectional languages with limited resources

George Tambouratzis

12:02

04/07/2020

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich

Keywords: Massively Translation, Zero-Shot Translation, neural translation, NMT

11:47

08/12/2020

Automatic Learning of Modality Exclusivity Norms with Crosslingual Word Embeddings

Emmanuele Chersoni, Rong Xiang, Qin Lu, Chu-Ren Huang

9:53

16/11/2020

MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale

Andreas Rücklé, Jonas Pfeiffer, Iryna Gurevych

Keywords: answer tasks, zero-shot transfer, text models, self-supervised training

10:07

16/11/2020

Visually Grounded Compound PCFGs

Yanpeng Zhao, Ivan Titov

Keywords: exploiting groundings, language understanding, gradient estimates, fully-differentiable learning

12:24

04/07/2020

Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya

Abrhalei Frezghi Tela, Abraham Woubie Zewoudie, Ville Hautamäki

Keywords: natural tasks, NLP, downstream task, pre-training

8:47

04/07/2020

Semantic Parsing for English as a Second Language

Yuanyuan Zhao, Weiwei Sun, Junjie Cao, Xiaojun Wan

Keywords: semantic parsing, second acquisition, Semantic Parsing, ESL

11:04

26/04/2020

Generalization through Memorization: Nearest Neighbor Language Models

Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Keywords: language models, k-nearest neighbors

4:56