On Faithfulness and Factuality in Abstractive Summarization

04/07/2020

Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan McDonald

Keywords: Abstractive Summarization, likelihood objectives, open-ended tasks, language modeling

Abstract: It is well known that the standard likelihood training and approximate decoding objectives in neural text generation models lead to less human-like responses for open-ended tasks such as language modeling and story generation. In this paper we have analyzed limitations of these models for abstractive document summarization and found that these models are highly prone to hallucinate content that is unfaithful to the input document. We conducted a large scale human evaluation of several neural abstractive summarization systems to better understand the types of hallucinations they produce. Our human annotators found substantial amounts of hallucinated content in all model generated summaries. However, our analysis does show that pretrained models are better summarizers not only in terms of raw metrics, i.e., ROUGE, but also in generating faithful and factual summaries as evaluated by humans. Furthermore, we show that textual entailment measures better correlate with faithfulness than standard metrics, potentially leading the way to automatic evaluation metrics as well as training and decoding criteria.

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

26/04/2020

Learning The Difference That Makes A Difference With Counterfactually-Augmented Data

Divyansh Kaushik, Eduard Hovy, Zachary Lipton

Keywords: humans in the loop, annotation artifacts, text classification, sentiment analysis, natural language inference

4:25

03/05/2021

Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity

Sourya Basu, Govardana Sachithanandam Ramachandran, Nitish Shirish Keskar, Lav R Varshney

Keywords: cross-entropy, incoherence, repetitions, sampling algorithms, Neural text decoding

5:07

01/07/2020

Memory-bounded Neural Incremental Parsing for Psycholinguistic Prediction

Lifeng Jin, William Schuler

14:19

02/02/2021

Multi-Dimensional Explanation of Target Variables from Documents

Diego Antognini, Claudiu Musat, Boi Faltings

19:03

04/07/2020

FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization

Esin Durmus, He He, Mona Diab

Keywords: Faithfulness Assessment, Abstractive Summarization, evaluating summary, reading comprehension

12:13

06/12/2021

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

Keywords: machine learning, fairness, language

13:17

04/07/2020

Towards Transparent and Explainable Attention Models

Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

Keywords: interpretability distributions, attention mechanisms, Human evaluations, Transparent Models

11:58

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

04/07/2020

Automatic Detection of Generated Text is Easiest when Humans are Fooled

Daphne Ippolito, Daniel Duckworth, Chris Callison-Burch, Douglas Eck

Keywords: Automatic Text, detection, humanness systems, neural modelling

11:01

19/04/2021

On hallucination and predictive uncertainty in conditional language generation

Yijun Xiao, William Yang Wang

11:37

04/07/2020

Max-Margin Incremental CCG Parsing

Miloš Stanojević, Mark Steedman

Keywords: Incremental parsing, human processing, ASR, MT

11:39

16/11/2020

CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation

Tianlu Wang, Xuezhi Wang, Yao Qin, Ben Packer, Kang Li, Jilin Chen, Alex Beutel, Ed Chi

Keywords: sentiment classification, model re-training, nlp models, cat-gen model

6:58

19/04/2021

Disambiguatory signals are stronger in word-initial positions

Tiago Pimentel, Ryan Cotterell, Brian Roark

11:35

03/05/2021

Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

Keywords: variational information bottleneck, biases, robust, over-fitting, large-scale pre-trained language models, NLP, Transfer learning

5:07

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

02/02/2021

What's the Best Place for an AI Conference, Vancouver or _______: Why Completing Comparative Questions is Difficult

Avishai Zagoury, Einat Minkov, Idan Szpektor, William W. Cohen

15:15

14/06/2020

Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

Vedika Agarwal, Rakshith Shetty, Mario Fritz

Keywords: robustness, vqa, causality, gan, dataset, evaluation, automated semantic scene editing, data augmentation, invariance, covariance

1:00

19/10/2020

Distant supervision in BERT-based adhoc document retrieval

Koustav Rudra, Avishek Anand

Keywords: distant supervision, adhoc retrieval, document ranking

6:49

04/07/2020

How does BERT's attention change when you fine-tune? An analysis methodology and a case study in negation scope

Yiyun Zhao, Steven Bethard

Keywords: downstream task, NLP problems, knowledge-related tasks, downstream tasks

11:43

26/04/2020

The Curious Case of Neural Text Degeneration

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

Keywords: generation, text, NLG, NLP, natural language, natural language generation, language model, neural, neural language model

4:57

02/02/2021

HiGAN: Handwriting Imitation Conditioned on Arbitrary-Length Texts and Disentangled Styles

Ji Gan, Weiqiang Wang

15:15

02/02/2021

TextGAIL: Generative Adversarial Imitation Learning for Text Generation

Qingyang Wu, Lei Li, Zhou Yu

16:41

03/05/2021

Better Fine-Tuning by Reducing Representational Collapse

Armen Aghajanyan, Akshat Shrivastava, Anchit Gupta, Naman Goyal, Luke Zettlemoyer, Sonal Gupta

Keywords: nlp, glue, representational learning, finetuning

5:06

12/07/2020

Adversarial Filters of Dataset Biases

Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew Peters, Ashish Sabharwal, Yejin Choi

Keywords: Deep Learning - Algorithms

15:25

04/07/2020

Are we Estimating or Guesstimating Translation Quality?

Shuo Sun, Francisco Guzmán, Lucia Specia

Keywords: Estimating Quality, quality estimation, machine translation, QE task

5:56

04/07/2020

Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network

Yangming Li, Kaisheng Yao, Libo Qin, Wanxiang Che, Xiaolong Li, Ting Liu

Keywords: Task-oriented Systems, natural generation, natural NLG, NLG

10:53

02/02/2021

Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection

Taesun Whang, Dongyub Lee, Dongsuk Oh, Chanhee Lee, Kijong Han, Dong-hun Lee, Saebyeok Lee

17:37

08/12/2020

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

14:43

06/12/2021

Learning to Generate Visual Questions with Noisy Supervision

Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

Keywords: generative model

14:54

06/12/2020

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano

3:17

06/12/2021

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References

Pranay Manocha, Buye Xu, Anurag Kumar

Keywords: deep learning, robustness, self-supervised learning

14:30

08/12/2020

Emergent Communication Pretraining for Few-Shot Machine Translation

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen

14:42

04/07/2020

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

Keywords: Intermediate-Task Learning, natural tasks, data-rich task, intermediate-task training

14:47

04/07/2020

Low-Resource Generation of Multi-hop Reasoning Questions

Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Keywords: Low-Resource Questions, generating questions, machine comprehension, multi-hop model

11:54

06/12/2021

Uncertainty Quantification and Deep Ensembles

Rahul Rahaman, Alexandre Thiery

Keywords: deep learning, machine learning

14:40

06/12/2021

Overinterpretation reveals image classification model pathologies

Brandon Carter, Siddhartha Jain, Jonas Mueller, David Gifford

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, vision, interpretability

11:14

16/11/2020

An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction

Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer

Keywords: language understanding, semi-supervised setting, complex models, explainer

11:44

02/02/2021

Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals

Zhao Wang, Aron Culotta

17:39

02/02/2021

Non-Autoregressive Coarse-to-Fine Video Captioning

Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

18:21

16/11/2020

Evaluating the Factual Consistency of Abstractive Text Summarization

Wojciech Kryscinski, Bryan McCann, Caiming Xiong, Richard Socher

Keywords: assessing algorithms, natural inference, fact checking, auxiliary tasks

12:05