On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation

04/07/2020

Chaojun Wang, Rico Sennrich

Keywords: Domain Translation, neural translation, NMT, beam problem

Abstract: The standard training algorithm in neural machine translation (NMT) suffers from exposure bias, and alternative algorithms have been proposed to mitigate this. However, the practical impact of exposure bias is under debate. In this paper, we link exposure bias to another well-known problem in NMT, namely the tendency to generate hallucinations under domain shift. In experiments on three datasets with multiple test domains, we show that exposure bias is partially to blame for hallucinations, and that training with Minimum Risk Training, which avoids exposure bias, can mitigate this. Our analysis explains why exposure bias is more problematic under domain shift, and also links exposure bias to the beam search problem, i.e. performance deterioration with increasing beam size. Our results provide a new justification for methods that reduce exposure bias: even if they do not increase performance on in-domain test sets, they can increase model robustness to domain shift.
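
As a rough illustration of the Minimum Risk Training objective mentioned in the abstract, the sketch below computes the expected cost over a small set of candidate translations, with candidate probabilities renormalised over that set. It is not the authors' implementation: the function name `mrt_loss`, the `alpha` sharpness parameter and its default, and the toy numbers are assumptions made for illustration, following the commonly used formulation of MRT for NMT.

```python
import math


def mrt_loss(log_probs, costs, alpha=1.0):
    """Expected cost over a candidate set (illustrative sketch only).

    log_probs: model log-probabilities log P(y|x), one per candidate translation
    costs:     cost of each candidate against the reference,
               e.g. 1 - sentence-level BLEU (assumed to be precomputed)
    alpha:     sharpness of the renormalised distribution (hypothetical default)
    """
    # Q(y|x) is proportional to P(y|x)^alpha, renormalised over the candidates.
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    # Expected cost under Q: candidates the model prefers contribute more.
    return sum(w / z * c for w, c in zip(weights, costs))


# Toy example: two candidates with probabilities 0.75 and 0.25,
# where the less likely candidate has the maximum cost of 1.
risk = mrt_loss([math.log(0.75), math.log(0.25)], [0.0, 1.0])
print(risk)  # approximately 0.25
```

Because the candidates are sampled from the model itself during training, the model is scored on its own outputs rather than only on reference prefixes, which is the sense in which this objective avoids exposure bias.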

The talk and the respective paper were published at the ACL 2020 virtual conference.

Similar Papers

12/07/2020

Adversarial Filters of Dataset Biases

Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew Peters, Ashish Sabharwal, Yejin Choi

Keywords: Deep Learning - Algorithms

12/07/2020

Improving Robustness of Deep-Learning-Based Image Reconstruction

Ankit Raj, Yoram Bresler, Bo Li

Keywords: Trustworthy Machine Learning

19/04/2021

Incremental beam manipulation for natural language generation

James Hargreaves, Andreas Vlachos, Guy Emerson

19/04/2021

Measuring and improving faithfulness of attention in neural machine translation

Pooya Moradi, Nishant Kambhatla, Anoop Sarkar

19/04/2021

On hallucination and predictive uncertainty in conditional language generation

Yijun Xiao, William Yang Wang

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

18/07/2021

Understanding Instance-Level Label Noise: Disparate Impacts and Treatments

Yang Liu

Keywords: Algorithms, Supervised Learning

06/12/2020

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh

06/12/2021

Bayesian Adaptation for Covariate Shift

Aurick Zhou, Sergey Levine

Keywords: deep learning, machine learning, robustness, vision, domain adaptation

03/05/2021

Explaining the Efficacy of Counterfactually Augmented Data

Divyansh Kaushik, Amrith Setlur, Eduard H Hovy, Zachary Lipton

Keywords: sentiment analysis, text classification, natural language inference, annotation artifacts, humans in the loop

14/06/2020

Modeling Biological Immunity to Adversarial Examples

Edward Kim, Jocelyn Rego, Yijing Watkins, Garrett T. Kenyon

Keywords: adversarial examples, sparse coding, retina, cortex, neuron, biology, robust, feedback

18/07/2021

Towards Better Robust Generalization with Shift Consistency Regularization

Shufei Zhang, Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi

Keywords: Algorithms, Adversarial Examples

26/04/2020

Understanding the Limitations of Variational Mutual Information Estimators

Jiaming Song, Stefano Ermon

26/04/2020

Improving Adversarial Robustness Requires Revisiting Misclassified Examples

Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun Ma, Quanquan Gu

Keywords: Robustness, Adversarial Defense, Adversarial Training

06/12/2021

Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics

Bhavin Choksi, Milad Mozafari, Callum Biggs O'May, B. ADOR, Andrea Alamia, Rufin VanRullen

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, neuroscience, vision

18/07/2021

DANCE: Enhancing saliency maps using decoys

Yang Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble

Keywords: Social Aspects of Machine Learning, Fairness, Accountability, and Transparency

02/02/2021

Training Spiking Neural Networks with Accumulated Spiking Flow

Hao Wu, Yueyi Zhang, Wenming Weng, Yongting Zhang, Zhiwei Xiong, Zheng-Jun Zha, Xiaoyan Sun, Feng Wu

07/09/2020

Transferring Pretrained Networks to Small Data via Category Decorrelation

Ying Jin, Zhangjie Cao, Mingsheng Long, Jianmin Wang

Keywords: Category Decorrelation, Under Transfer

02/02/2021

DIBS: Diversity Inducing Information Bottleneck in Model Ensembles

Samarth Sinha, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti

04/07/2020

Variational Neural Machine Translation with Normalizing Flows

Hendra Setiawan, Matthias Sperber, Udhyakumar Nallasamy, Matthias Paulik

Keywords: Variational Translation, Variational VNMT, Variational, generation translations

22/11/2021

Towards Dynamic and Scalable Active Learning with Neural Architecture Adaption for Object Detection

Fuhui Tang, ChenHan Jiang, Dafeng Wei, Hang Xu, Andi Zhang, Wei Zhang, Hongtao Lu, Chunjing Xu

Keywords: active learning, neural architecture adaption, object detection, dirichlet calibration, clustering sampling, network morphism modifications, uncertainty, dimension reduction, sample diversity, swap-expand strategy

26/08/2020

On Minimax Optimality of GANs for Robust Mean Estimation

Kaiwen Wu, Gavin Weiguang Ding, Ruitong Huang, Yaoliang Yu

14/06/2020

Regularizing Class-Wise Predictions via Self-Knowledge Distillation

Sukmin Yun, Jongjin Park, Kimin Lee, Jinwoo Shin

Keywords: image classification, regularization, self-knowledge distillation, generalization, calibration

06/12/2020

A new inference approach for training shallow and deep generalized linear models of noisy interacting neurons

Gabriel Mahuas, Giulio Isacchini, Olivier Marre, Ulisse Ferrari, Thierry Mora

14/06/2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan

Keywords: computer vision, image super-resolution, dual regression scheme, closed-loop

18/07/2021

RNNRepair: Automatic RNN Repair via Model-based Analysis

Xiaofei Xie, Wenbo Guo, Lei Ma, Wei Le, Jian Wang, Lingjun Zhou, Yang Liu, Xinyu Xing

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

02/02/2021

Adversarial Robustness through Disentangled Representations

Shuo Yang, Tianyu Guo, Yunhe Wang, Chang Xu

19/04/2021

Evaluating neural model robustness for machine comprehension

Winston Wu, Dustin Arendt, Svitlana Volkova

03/05/2021

Influence Functions in Deep Learning Are Fragile

Samyadeep Basu, Phil Pope, Soheil Feizi

Keywords: Influence Functions, Interpretability

04/07/2020

Evaluating Robustness to Input Perturbations for Neural Machine Translation

Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan

Keywords: Neural Translation, Neural models, subword methods, relative degradation

14/06/2020

What Makes Training Multi-Modal Classification Networks Hard?

Weiyao Wang, Du Tran, Matt Feiszli

Keywords: video classification, multi-modal, overfitting, action recognition, acoustic event detection

06/12/2021

A universal probabilistic spike count model reveals ongoing modulation of neural variability

David Liu, Mate Lengyel

Keywords: generative model, kernel methods

06/12/2020

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

Feiyang Pan, Jia He, Dandan Tu, Qing He

06/12/2020

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

Itai Gat, Idan Schwartz, Alex Schwing, Tamir Hazan

12/07/2020

Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions

Ahmed Alaa, Mihaela van der Schaar

Keywords: Applications - Other

08/12/2020

Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation

Bryan Eikema, Wilker Aziz

22/11/2021

Robust channel-wise illumination estimation

Firas Laakom, Jenni Raitoharju, Jarno Nikkanen, Alexandros Iosifidis, Moncef Gabbouj

Keywords: color constancy, illumination estimation, deep learning, uncertainty estimation, regression

06/12/2020

Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect

Kaihua Tang, Jianqiang Huang, Hanwang Zhang

Keywords: Deep Learning -> Optimization for Deep Networks, Applications -> Hardware and Systems

06/12/2020

Further Analysis of Outlier Detection with Deep Generative Models

Ziyu Wang, Bin Dai, David P Wipf, Jun Zhu

06/12/2021

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Lingxiao Wang, Zhuoran Yang, Zhaoran Wang

Keywords: deep learning, reinforcement learning and planning, causality