CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task

04/07/2020

CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task

Shuo Sun, Suzanna Sia, Kevin Duh

Keywords: Machine Translation, Cross-Lingual Task, machine MT, MT

Abstract Paper Similar Papers

Abstract: We present CLIReval, an easy-to-use toolkit for evaluating machine translation (MT) with the proxy task of cross-lingual information retrieval (CLIR). Contrary to what the project name might suggest, CLIReval does not actually require any annotated CLIR dataset. Instead, it automatically transforms translations and references used in MT evaluations into a synthetic CLIR dataset; it then sets up a standard search engine (Elasticsearch) and computes various information retrieval metrics (e.g., mean average precision) by treating the translations as documents to be retrieved. The idea is to gauge the quality of MT by its impact on the document translation approach to CLIR. As a case study, we run CLIReval on the "metrics shared task" of WMT2019; while this extrinsic metric is not intended to replace popular intrinsic metrics such as BLEU, results suggest CLIReval is competitive in many language pairs in terms of correlation to human judgments of quality. CLIReval is publicly available at https://github.com/ssun32/CLIReval.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ACL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou and
Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords Paper

language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

0

0

0

0

12:07

19/04/2021

Towards a decomposable metric for explainable evaluation of text generation from AMR

Juri Opitz, Anette Frank

Keywords Paper

0

0

0

0

11:02

04/07/2020

A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

Raúl Vázquez, Alessandro Raganato, Mathias Creutz, Jörg Tiedemann

Keywords Paper

Multilingual Translation, Neural translation, transfer learning, translation

0

0

0

0

14:05

04/07/2020

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Arman Cohan, Sergey Feldman, Iz Beltagy and
Doug Downey, Daniel Weld

Keywords Paper

Document-level Learning, Representation learning, natural systems, classification

0

0

0

0

13:07

19/08/2021

Keep the Structure: A Latent Shift-Reduce Parser for Semantic Parsing

Yuntao Li, Bei Chen, Qian Liu and
Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang

Keywords Paper

Natural Language Processing, Natural Language Semantics

0

0

0

0

12:37

06/12/2020

Unsupervised Text Generation by Learning from Search

Jingjing Li, Zichao Li, Lili Mou and
Xin Jiang, Michael Lyu, Irwin King

Keywords Paper

0

0

0

0

3:24

04/07/2020

Multilingual Universal Sentence Encoder for Semantic Retrieval

Yinfei Yang, Daniel Cer, Amin Ahmad and
Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, Steve Yuan, Chris Tar, Yun-hsuan Sung, Brian Strope, Ray Kurzweil

Keywords Paper

Semantic Retrieval, translation tasks, monolingual retrieval, translation retrieval

0

0

0

0

12:02

04/07/2020

Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation

Bo Pang, Erik Nijkamp, Wenjuan Han and
Linqi Zhou, Yixian Liu, Kewei Tu

Keywords Paper

Holistic Generation, Open-domain generation, Natural Processing, human evaluation

0

0

0

1

11:57

04/07/2020

Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

Hao Fei, Meishan Zhang, Donghong Ji

Keywords Paper

Cross-Lingual Labeling, semantic labeling, natural understanding, model transferring

0

0

0

0

10:32

16/11/2020

Simulated multiple reference training improves low-resource machine translation

Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Keywords Paper

machine mt, mt, simulated training, simulated

0

0

0

0

6:56

25/07/2020

Combining contextualized and non-contextualized query translations to improve CLIR

Suraj Nair, Petra Galuscakova, Douglas W. Oard

Keywords Paper

CLIR, machine translation

0

0

0

0

8:39

04/07/2020

Better Document-level Machine Translation with Bayes' Rule

Lei Yu, Laurent Sartran, Wojciech Stokowiec and
Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

Keywords Paper

Document-level Translation, inference, Bayes Rule, document models

0

0

0

0

10:57

08/12/2020

AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations

Lifeng Han, Gareth Jones, Alan Smeaton

Keywords Paper

0

0

0

0

14:26

16/11/2020

With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation

Bianca Scarlini, Tommaso Pasini, Roberto Navigli

Keywords Paper

natural processing, english task, word-in-context task, contextualized embeddings

0

0

0

0

12:11

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam and
Lidong Bing, Zhiyuan Liu

Keywords Paper

data-to-text task, generation task, dataset problem, over-generation problem

0

0

0

0

11:58

19/04/2021

Coordinate constructions in English enhanced Universal Dependencies: Analysis and computational modeling

Stefan Grünewald, Prisca Piccirilli, Annemarie Friedrich

Keywords Paper

0

0

0

0

12:44

08/12/2020

Informative Manual Evaluation of Machine Translation Output

Maja Popović

Keywords Paper

0

0

0

0

15:26

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu and
Luxi Xing, Heng Yu, Weihua Luo

Keywords Paper

hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

0

0

0

0

3:51

03/05/2021

Structured Prediction as Translation between Augmented Natural Languages

Giovanni Paolini, Ben Athiwaratkun, Jason Krone and
Jie Ma, Alessandro Achille, RISHITA ANUBHAI, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

Keywords Paper

sequence to sequence, structured prediction, language models, transfer learning, few-shot learning, multi-task learning, generative modeling

0

0

0

0

12:16

16/11/2020

Small but Mighty: New Benchmarks for Split and Rephrase

Li Zhang, Huaiyu Zhu, Siddhartha Brahma, Yunyao Li

Keywords Paper

text task, fine-grained evaluation, automatic process, rule-based model

0

0

0

0

6:58

04/07/2020

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su and
Zhiyu Chen, William Yang Wang

Keywords Paper

Logical Generation, neural NLG, surface-level realizations, logical inference

0

0

0

0

11:48

03/05/2021

Filtered Inner Product Projection for Crosslingual Embedding Alignment

Vin Sachidananda, Ziyi Yang, Chenguang Zhu

Keywords Paper

multilingual representations, natural language processing, word embeddings

0

0

0

0

5:22

14/06/2020

Image Search With Text Feedback by Visiolinguistic Attention Learning

Yanbei Chen, Shaogang Gong, Loris Bazzani

Keywords Paper

vision and language, image search, text feedback, attention mechanism, transformer, multimodal learning, representation learning, composition, image retrieval, interactive image search

0

0

0

0

1:00

26/04/2020

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang and
Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords Paper

Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

0

0

0

0

4:50

05/12/2020

Mixed-lingual pre-training for cross-lingual summarization

Ruochen Xu, Chenguang Zhu, Yu Shi and
Michael Zeng, Xuedong Huang

Keywords Paper

0

0

0

0

11:49

06/12/2020

Heavy-tailed Representations, Text Polarity Classification & Data Augmentation

Hamid Jalalzai, Pierre Colombo, Chloé Clavel and
Eric Gaussier, Giovanna Varni, Emmanuel Vignon, Anne Sabourin

Keywords Paper

0

0

0

0

2:57

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords Paper

Neural training, NMT training, document-level training, NMT objective

0

0

0

0

6:48

14/06/2020

Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning

Byungsoo Ko, Geonmo Gu

Keywords Paper

metric learning, image retrieval, image clustering, augmentation, sample generation, hard sample mining, pair-based loss, triplet loss, n-pair loss, multi-similarity loss

0

0

0

0

1:00

19/08/2021

Boundary Knowledge Translation based Reference Semantic Segmentation

Lechao Cheng, Zunlei Feng, Xinchao Wang and
Ya Jie Liu, Jie Lei, Mingli Song

Keywords Paper

Computer Vision, 2D and 3D Computer Vision, Deep Learning, Transfer, Adaptation, Multi-task Learning

0

0

0

0

14:22

16/11/2020

XGLUE: A New Benchmark Datasetfor Cross-lingual Pre-training, Understanding and Generation

Yaobo Liang, Nan Duan, Yeyun Gong and
Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

Keywords Paper

large-scale models, cross-lingual tasks, natural tasks, cross-lingual pre-training

0

0

0

0

10:06

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords Paper

data augmentation, text recognition, joint training

0

0

0

0

0:59

15/11/2020

Structure Interpretation of Text Formats

Sumit Gulwani, Vu Le, Arjun Radhakrishna and
Ivan Radiček, Mohammad Raza

Keywords Paper

format diversity, program synthesis, data extraction

0

0

0

0

15:16

19/08/2021

Exemplification Modeling: Can You Give Me an Example, Please?

Edoardo Barba, Luigi Procopio, Caterina Lacerra and
Tommaso Pasini, Roberto Navigli

Keywords Paper

Natural Language Processing, Natural Language Semantics, Resources and Evaluation

0

0

0

0

14:47

16/11/2020

Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains

Weijie Yu, Chen Xu, Jun Xu and
Liang Pang, Xiaopeng Gao, Xiaozhao Wang, Ji-Rong Wen

Keywords Paper

real-world practices, text matching, matching models, match method

0

0

0

0

11:43

03/05/2021

Distilling Knowledge from Reader to Retriever for Question Answering

Gautier Izacard, Edouard Grave

Keywords Paper

question answering, information retrieval

0

0

0

0

5:14

04/07/2020

Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model

Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura

Keywords Paper

Automatic Evaluation, machine translation, Cross-lingual Model, regression model

0

0

0

0

7:17

03/05/2021

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Yanru Qu, Dinghan Shen, Yelong Shen and
Sandra Sajeev, Weizhu Chen, Jiawei Han

Keywords Paper

consistency training, contrastive learning, data augmentation, natural language understanding

0

0

0

0

6:02

08/12/2020

Domain Transfer based Data Augmentation for Neural Query Translation

Liang Yao, Baosong Yang, Haibo Zhang and
Boxing Chen, Weihua Luo

Keywords Paper

0

0

0

0

10:57

04/07/2020

Self-Attention with Cross-Lingual Position Representation

Liang Ding, Longyue Wang, Dacheng Tao

Keywords Paper

natural tasks, WMT'17 tasks, Cross-Lingual Representation, Position encoding

0

0

0

0

7:46

16/11/2020

Template Guided Text Generation for Task-Oriented Dialogue

Mihir Kale, Abhinav Rastogi

Keywords Paper

natural generation, generation, nlg, domain-independent model

0

0

0

0

12:25