On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

04/07/2020

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger

Keywords: Evaluation encoders, zero-shot transfer, supervised tasks, web-scale systems

Abstract Paper Similar Papers

Abstract: Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual transfer in supervised downstream tasks or via unsupervised cross-lingual textual similarity. In this paper, we concern ourselves with reference-free machine translation (MT) evaluation where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. Reference-free evaluation holds the promise of web-scale comparison of MT systems. We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER. We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations, namely, (a) a semantic mismatch between representations of mutual translations and, more prominently, (b) the inability to punish "translationese", i.e., low-quality literal translations. We propose two partial remedies: (1) post-hoc re-alignment of the vector spaces and (2) coupling of semantic-similarity based metrics with target-side language modeling. In segment-level MT evaluation, our best metric surpasses reference-based BLEU by 5.7 correlation points.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ACL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

Jingjing Li, Wei Ji, Qi Bi and
Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li cheng

Keywords Paper

vision

0

0

0

0

9:03

12/07/2020

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

Keywords Paper

Applications - Language, Speech and Dialog

0

0

0

0

14:43

02/02/2021

Non-Autoregressive Coarse-to-Fine Video Captioning

Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

Keywords Paper

0

0

0

0

18:21

16/11/2020

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers

Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Keywords Paper

zero-shot transfer, downstream transfer, resource-lean scenarios, pos tagging

0

0

0

0

11:45

02/02/2021

FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

Yuwei Fang, Shuohang Wang, Zhe Gan and
Siqi Sun, Jingjing Liu

Keywords Paper

0

0

0

0

17:39

08/12/2020

Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism

Pan Xie, Zhi Cui, Xiuying Chen and
XiaoHui Hu, Jianwei Cui, Bin Wang

Keywords Paper

0

0

0

0

6:43

16/11/2020

If beam search is the answer, what was the question?

Clara Meister, Ryan Cotterell, Tim Vieira

Keywords Paper

language tasks, beam search, decoding, maximum decoding

0

0

0

0

12:18

02/02/2021

Semantic-guided Reinforced Region Embedding for Generalized Zero-Shot Learning

Jiannan Ge, Hongtao Xie, Shaobo Min, Yongdong Zhang

Keywords Paper

0

0

0

0

16:22

06/12/2021

HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning

Shiming Chen, Guosen Xie, Yang Liu and
Qinmu Peng, Baigui Sun, Hao Li, Xinge You, Ling Shao

Keywords Paper

generative model, domain adaptation

0

0

0

0

9:19

16/11/2020

Iterative Domain-Repaired Back-Translation

Hao-Ran Wei, Zhirui Zhang, Boxing Chen, Weihua Luo

Keywords Paper

domain-specific translation, domain adaptation, back-translation method, out-of-domain systems

0

0

0

0

11:35

16/11/2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou and
Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords Paper

language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

0

0

0

0

12:07

04/07/2020

Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

Dong Bok Lee, Seanie Lee, Woo Tae Jeong and
Donghwan Kim, Sung Ju Hwang

Keywords Paper

question answering, QA, QA, Information-Maximizing VAEs

0

0

0

0

11:40

04/07/2020

Better Document-level Machine Translation with Bayes' Rule

Lei Yu, Laurent Sartran, Wojciech Stokowiec and
Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

Keywords Paper

Document-level Translation, inference, Bayes Rule, document models

0

0

0

0

10:57

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti and
Anna Korhonen, Goran Glavaš

Keywords Paper

0

0

0

0

13:01

26/04/2020

Adversarially Robust Representations with Smooth Encoders

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli

Keywords Paper

Adversarial Learning, Robust Representations, Variational AutoEncoder, Wasserstein Distance, Variational Inference

0

0

0

0

5:16

14/06/2020

Learning Meta Face Recognition in Unseen Domains

Jianzhu Guo, Xiangyu Zhu, Chenxu Zhao and
Dong Cao, Zhen Lei, Stan Z. Li

Keywords Paper

face recognition, meta learning, domain generalization, metric learning

0

0

0

0

5:01

22/11/2021

A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark

Zhenxi Zhu, Limin Wang, Sheng Guo, Gangshan Wu

Keywords Paper

few-shot learning, classifier-based baseline, new benchmark, action recognition

0

0

0

0

2:58

05/01/2021

Improving Video Captioning With Temporal Composition of a Visual-Syntactic Embedding

Jesus Perez-Martin, Benjamin Bustos, Jorge Perez

Keywords Paper

0

0

0

0

5:01

16/11/2020

On the Sentence Embeddings from Pre-trained Language Models

Bohan Li, Hao Zhou, Junxian He and
Mingxuan Wang, Yiming Yang, Lei Li

Keywords Paper

natural processing, semantic task, semantic tasks, pre-trained representations

0

0

0

0

9:11

04/07/2020

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting

Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann

Keywords Paper

Unsupervised Translation, Unsupervised MT, MT, alignment

0

0

0

0

12:17

16/11/2020

Simulated multiple reference training improves low-resource machine translation

Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Keywords Paper

machine mt, mt, simulated training, simulated

0

0

0

0

6:56

08/12/2020

Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages

Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu and
Mona Diab, Kathleen McKeown

Keywords Paper

0

0

0

0

14:37

06/12/2021

Unsupervised Part Discovery from Contrastive Reconstruction

Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

Keywords Paper

machine learning, self-supervised learning, clustering, representation learning

0

0

0

0

6:46

06/12/2021

Making a (Counterfactual) Difference One Rationale at a Time

Mitchell Plyler, Michael Green, Min Chi

Keywords Paper

theory, generative model, language, interpretability

0

0

0

0

13:57

26/04/2020

Masked Based Unsupervised Content Transfer

Ron Mokady, Sagie Benaim, Lior Wolf, Amit Bermano

Keywords Paper

0

0

0

0

4:38

02/02/2021

DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation

Haoyue Bai, Rui Sun, Lanqing Hong and
Fengwei Zhou, Nanyang Ye, Han-Jia Ye, S.-H. Gary Chan, Zhenguo Li

Keywords Paper

0

0

0

0

15:59

16/11/2020

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Linyang Li, Ruotian Ma, Qipeng Guo and
Xiangyang Xue, Xipeng Qiu

Keywords Paper

adversarial attacks, downstream tasks, calculation, gradient-based methods

0

0

0

0

11:36

14/06/2020

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Runfa Chen, Wenbing Huang, Binghui Huang and
Fuchun Sun, Bin Fang

Keywords Paper

nice-gan, reusing discriminators for encoding, unsupervised image-to-image translation, decoupled training, multi-scale discriminators, adversarial loss, no independent component for encoding, shared layers, residual attention, cyclegan

0

0

0

0

1:01

02/02/2021

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

Pengfei Wang, Chengquan Zhang, Fei Qi and
Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

Keywords Paper

0

0

0

0

18:06

01/07/2020

Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference

Cemil Cengiz, Deniz Yuret

Keywords Paper

0

0

0

0

4:38

26/04/2020

A Probabilistic Formulation of Unsupervised Text Style Transfer

Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords Paper

unsupervised text style transfer, deep latent sequence model

0

0

0

0

5:02

14/06/2020

ViewAL: Active Learning With Viewpoint Entropy for Semantic Segmentation

Yawar Siddiqui, Julien Valentin, Matthias Nießner

Keywords Paper

active learning, semantic segmentation, deep learning, view consistency

0

0

0

0

1:01

08/12/2020

Emergent Communication Pretraining for Few-Shot Machine Translation

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen

Keywords Paper

0

0

0

0

14:42

02/02/2021

Semantic Grouping Network for Video Captioning

Hobin Ryu, Sunghun Kang, Haeyong Kang, Chang D. Yoo

Keywords Paper

0

0

0

0

17:41

08/12/2020

Understanding the effects of word-level linguistic annotations in under-resourced neural machine translation

Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

Keywords Paper

0

0

0

0

14:59

26/08/2020

Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces

David Alvarez-Melis, Youssef Mroueh, Tommi Jaakkola

Keywords Paper

0

0

1

1

15:14

06/12/2021

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation

Qiming Hu, Xiaojie Guo

Keywords Paper

deep learning

0

0

0

0

12:25

06/12/2021

Low-Rank Subspaces in GANs

Jiapeng Zhu, Ruili Feng, Yujun Shen and
Deli Zhao, Zheng-Jun Zha, Jingren Zhou, Qifeng Chen

Keywords Paper

generative model

0

0

0

0

11:41

16/11/2020

Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

Denis Emelin, Ivan Titov, Rico Sennrich

Keywords Paper

word disambiguation, nmt, prediction errors, adversarial strategy

0

0

0

0

12:57

14/06/2020

Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing

Zezheng Wang, Zitong Yu, Chenxu Zhao and
Xiangyu Zhu, Yunxiao Qin, Qiusheng Zhou, Feng Zhou, Zhen Lei

Keywords Paper

face anti-spoofing, depth supervised learning, multiple frames, detailed discriminative clues, 3d moving faces

0

0

0

0

4:57