LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords: language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

Abstract Paper Similar Papers

Abstract: We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Unlike previous cross-lingual tasks, LAReQA tests for ``strong″ cross-lingual alignment, requiring semantically related \textitcross-language pairs to be closer in representation space than unrelated \textitsame-language pairs. This level of alignment is important for the practical task of cross-lingual information retrieval. Building on multilingual BERT (mBERT), we study different strategies for achieving strong alignment. We find that augmenting training data via machine translation is effective, and improves significantly over using mBERT out-of-the-box. Interestingly, model performance on zero-shot variants of our task that only target ``weak″ alignment is not predictive of performance on LAReQA. This finding underscores our claim that language-agnostic retrieval is a substantively new kind of cross-lingual evaluation, and suggests that measuring both weak and strong alignment will be important for improving cross-lingual systems going forward. We release our dataset and evaluation code at r̆lhttps://github.com/google-research-datasets/lareqa.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at EMNLP 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

08/12/2020

ContraCAT: Contrastive Coreference Analytical Templates for Machine Translation

Dario Stojanovski, Benno Krojer, Denis Peskov, Alexander Fraser

Keywords Paper

0

0

0

0

14:09

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam and
Lidong Bing, Zhiyuan Liu

Keywords Paper

data-to-text task, generation task, dataset problem, over-generation problem

0

0

0

0

11:58

04/07/2020

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting

Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann

Keywords Paper

Unsupervised Translation, Unsupervised MT, MT, alignment

0

0

0

0

12:17

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu and
Luxi Xing, Heng Yu, Weihua Luo

Keywords Paper

hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

0

0

0

0

3:51

04/07/2020

Better Document-level Machine Translation with Bayes' Rule

Lei Yu, Laurent Sartran, Wojciech Stokowiec and
Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

Keywords Paper

Document-level Translation, inference, Bayes Rule, document models

0

0

0

0

10:57

16/11/2020

Simulated multiple reference training improves low-resource machine translation

Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Keywords Paper

machine mt, mt, simulated training, simulated

0

0

0

0

6:56

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords Paper

Unsupervised Translation, machine translation, transfer learning, word task

0

0

0

0

14:56

16/11/2020

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

Weijia Xu, Batool Haider, Saab Mansour

Keywords Paper

natural understanding, natural, nlu, goal-oriented systems

0

0

0

0

9:46

16/11/2020

X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki and
Haibo Ding, Graham Neubig

Keywords Paper

factual retrieval, language models, lms, probing methods

0

0

0

0

9:45

08/12/2020

A Deep Metric Learning Method for Biomedical Passage Retrieval

Andrés Rosso-Mateus, Fabio A. González, Manuel Montes-y-Gómez

Keywords Paper

0

0

0

0

14:58

16/11/2020

Visually Grounded Compound PCFGs

Yanpeng Zhao, Ivan Titov

Keywords Paper

exploiting groundings, language understanding, gradient estimates, fully-differentiable learning

0

0

0

0

12:24

16/11/2020

A Bilingual Generative Transformer for Semantic Sentence Embedding

John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords Paper

source separation, semantic encoding, data distributions, unsupervised evaluations

0

0

0

0

14:32

26/04/2020

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang and
Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords Paper

Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

0

0

0

0

4:50

05/12/2020

An exploratory study on multilingual quality estimation

Shuo Sun, Marina Fomicheva, Frédéric Blain and
Vishrav Chaudhary, Ahmed El-Kishky, Adithya Renduchintala, Francisco Guzmán, Lucia Specia

Keywords Paper

0

0

0

0

14:31

02/02/2021

Multilingual Transfer Learning for QA using Translation as Data Augmentation

Mihaela Bornea, Lin Pan, Sara Rosenthal and
Radu Florian, Avirup Sil

Keywords Paper

0

0

0

0

15:44

06/12/2020

Unsupervised Text Generation by Learning from Search

Jingjing Li, Zichao Li, Lili Mou and
Xin Jiang, Michael Lyu, Irwin King

Keywords Paper

0

0

0

0

3:24

04/07/2020

Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

Hao Fei, Meishan Zhang, Donghong Ji

Keywords Paper

Cross-Lingual Labeling, semantic labeling, natural understanding, model transferring

0

0

0

0

10:32

04/07/2020

Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences

Xiangyu Duan, Baijun Ji, Hao Jia and
Min Tan, Min Zhang, Boxing Chen, Weihua Luo, Yue Zhang

Keywords Paper

Bilingual Translation, machine MT, MT, dictionary-based translation

0

0

0

0

14:08

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords Paper

Deep Learning - Generative Models and Autoencoders

0

0

0

0

17:06

16/11/2020

With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation

Bianca Scarlini, Tommaso Pasini, Roberto Navigli

Keywords Paper

natural processing, english task, word-in-context task, contextualized embeddings

0

0

0

0

12:11

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu and
Hao-Ran Wei, Boxing Chen, Enhong Chen

Keywords Paper

0

0

0

0

3:17

04/07/2020

Self-Attention with Cross-Lingual Position Representation

Liang Ding, Longyue Wang, Dacheng Tao

Keywords Paper

natural tasks, WMT'17 tasks, Cross-Lingual Representation, Position encoding

0

0

0

0

7:46

25/07/2020

Learning discriminative joint embeddings for efficient face and voice association

Rui Wang, Xin Liu, Yiu-ming Cheung and
Kai Cheng, Nannan Wang, Wentao Fan

Keywords Paper

bi-directional ranking constraint, face-voice association, cross-modal verification, discriminative joint embedding

0

0

0

0

8:33

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords Paper

Cross-Lingual Classification, sentiment classification, unsupervised system, classification

0

0

0

0

12:23

04/07/2020

Learning Source Phrase Representations for Neural Machine Translation

Hongfei Xu, Josef van Genabith, Deyi Xiong and
Qiuhui Liu, Jingyi Zhang

Keywords Paper

Neural Translation, WMT tasks, Learning Representations, Transformer model

0

0

0

0

7:18

04/07/2020

Unsupervised Cross-lingual Representation Learning at Scale

Alexis Conneau, Kartikay Khandelwal, Naman Goyal and
Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

Keywords Paper

cross-lingual tasks, XNLI, MLQA, NER

0

0

0

0

12:15

04/07/2020

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi and
Hassan Sajjad, James Glass

Keywords Paper

Linguistic Models, natural processing, artificial intelligence, translating languages

0

0

0

0

19:17

01/07/2020

Linguistic Features for Readability Assessment

Tovly Deutsch, Masoud Jasbi, Stuart Shieber

Keywords Paper

0

0

0

0

12:06

30/11/2020

Scale-Aware Polar Representation for Arbitrarily-Shaped Text Detection

Yanguang Bi, Zhiqiang Hu

Keywords Paper

0

0

0

0

9:56

04/07/2020

Towards Robustifying NLI Models Against Lexical Dataset Biases

Xiang Zhou, Mohit Bansal

Keywords Paper

Natural Inference, data augmentation, Robustifying Models, deep models

0

0

0

0

11:34

19/04/2021

Enriching non-autoregressive transformer with syntactic and semantic structures for neural machine translation

Ye Liu, Yao Wan, Jianguo Zhang and
Wenting Zhao, Philip Yu

Keywords Paper

0

0

0

0

10:18

19/08/2021

A Sequence-to-Set Network for Nested Named Entity Recognition

Zeqi Tan, Yongliang Shen, Shuai Zhang and
Weiming Lu, Yueting Zhuang

Keywords Paper

Natural Language Processing, Information Extraction, Named Entities

0

0

0

0

10:38

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and
Anna Korhonen, Goran Glavaš

Keywords Paper

0

0

0

0

13:01

16/11/2020

Language Model Prior for Low-Resource Neural Machine Translation

Christos Baziotis, Barry Haddow, Alexandra Birch

Keywords Paper

neural translation, neural tm, knowledge distillation, training time

0

0

0

0

11:16

16/11/2020

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

Jianguo Zhang, Kazuma Hashimoto, Wenhao Liu and
Chien-Sheng Wu, Yao Wan, Philip Yu, Richard Socher, Caiming Xiong

Keywords Paper

intent detection, detecting intents, oos detection, large-scale task

0

0

0

0

11:43

16/11/2020

Uncertainty-Aware Semantic Augmentation for Neural Machine Translation

Xiangpeng Wei, Heng Yu, Yue Hu and
Rongxiang Weng, Luxi Xing, Weihua Luo

Keywords Paper

sequence-to-sequence task, nmt, inference, translation tasks

0

0

0

0

11:11

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords Paper

Neural training, NMT training, document-level training, NMT objective

0

0

0

0

6:48

03/05/2021

Towards Robustness Against Natural Language Word Substitutions

Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu

Keywords Paper

Adversarial Defense, Natural Language Processing

0

0

0

0

6:06

02/02/2021

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie and
Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Keywords Paper

0

0

0

0

18:55

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che and
Cheng Chen, Ting Liu

Keywords Paper

0

0

0

0

15:01