NoiseQA: Challenge set evaluation for user-centric question answering

19/04/2021

Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard Hovy, Alan W Black

Abstract: When Question-Answering (QA) systems are deployed in the real world, users query them through a variety of interfaces, such as speaking to voice assistants, typing questions into a search engine, or even translating questions to languages supported by the QA system. While there has been significant community attention devoted to identifying correct answers in passages assuming a perfectly formed question, we show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error, and performance can degrade substantially based on these upstream noise sources even for powerful pre-trained QA models. We conclude that there is substantial room for progress before QA systems can be effectively deployed, highlight the need for QA evaluation to expand to consider real-world use, and hope that our findings will spur greater community interest in the issues that arise when our systems actually need to be of utility to humans.

The talk and the corresponding paper were published at the EACL 2021 virtual conference.

Similar Papers

25/04/2020

An Honest Conversation: Transparently Combining Machine and Human Speech Assistance in Public Spaces

Thomas Reitmaier, Simon Robinson, Jennifer Pearson, Dani Kalarikalayil Raju, Matt Jones

Keywords: conversational agents, speech appliances, public space interaction, emergent users

15:04

25/04/2020

A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores

Maria De-Arteaga, Riccardo Fogliato, Alexandra Chouldechova

Keywords: human-in-the-loop, decision support, algorithm aversion, automation bias, algorithm assisted decision making, child welfare

15:04

04/07/2020

What Question Answering can Learn from Trivia Nerds

Jordan Boyd-Graber, Benjamin Börschinger

Keywords: Question Answering, machines questions, QA, QA research

12:03

16/11/2020

F1 is Not Enough! Models and Evaluation Towards User-Centered Explainable Question Answering

Hendrik Schuff, Heike Adel, Ngoc Thang Vu

Keywords: reasoning process, user study, model selection, explainable systems

12:03

16/11/2020

Unsupervised Quality Estimation for Neural Machine Translation

Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Keywords: machine mt, real-world applications, qe, uncertainty quantification

12:19

16/11/2020

What Does My QA Model Know? Devising Controlled Probes using Expert

Kyle Richardson, Ashish Sabharwal

Keywords: knowledge challenges, benchmark tasks, diagnostic tasks, taxonomic reasoning

12:16

05/12/2020

MetaCAT: A metadata-based task-oriented chatbot annotation tool

Ximing Liu, Wei Xue, Qi Su, Weiran Nie, Wei Peng

11:17

04/07/2020

uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems

Tsuta Yuma, Naoki Yoshinaga, Masashi Toyoda

Keywords: Open-Domain Systems, uBLEU, Uncertainty-Aware Method, ΔBLEU

11:07

16/11/2020

Interpretable Multi-dataset Evaluation for Named Entity Recognition

Jinlan Fu, Pengfei Liu, Graham Neubig

Keywords: natural tasks, interpretable evaluation, named task, analysis tool

11:11

19/04/2021

Do multi-hop question answering systems know how to answer the single-hop sub-questions?

Yixuan Tang, Hwee Tou Ng, Anthony Tung

6:23

16/11/2020

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

Wanrong Zhu, Xin Wang, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang

Keywords: visually generation, vision-and-language tasks, cider, utilities

6:47

04/07/2020

Automatic Detection of Generated Text is Easiest when Humans are Fooled

Daphne Ippolito, Daniel Duckworth, Chris Callison-Burch, Douglas Eck

Keywords: Automatic Text, detection, humanness systems, neural modelling

11:01

01/07/2020

Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation

Ryuichi Takanobu, Qi Zhu, Jinchao Li, Baolin Peng, Jianfeng Gao, Minlie Huang

11:39

16/11/2020

Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization

Jiaao Chen, Diyi Yang

Keywords: text summarization, nlp, summarizing text, human-humanmachine interaction

12:02

14/09/2020

AMQAN: Adaptive Multi-Attention Question-Answer Networks for Answer Selection

Haitian Yang, Xuan Zhao, Yan Wang, Bin Lv, Rui Mao, Ning Li, Yuyan Chen

Keywords: answer selection, adaptive multi-attention, community question answering

13:28

04/07/2020

Out of the Echo Chamber: Detecting Countering Debate Speeches

Matan Orbach, Yonatan Bilu, Assaf Toledo, Dan Lahav, Michal Jacovi, Ranit Aharonov, Noam Slonim

Keywords: Detecting Speeches, Echo Chamber, echo chambers, opposing stance

12:02

02/02/2021

Reinforced History Backtracking for Conversational Question Answering

Minghui Qiu, Xinjing Huang, Cen Chen, Feng Ji, Chen Qu, Wei Wei, Jun Huang, Yin Zhang

14:09

16/11/2020

Dialogue Response Ranking Training with Large-Scale Human Feedback Data

Xiang Gao, Yizhe Zhang, Michel Galley, Chris Brockett, Bill Dolan

Keywords: feedback prediction, ranking problem, predicting feedback, open-domain models

11:57

19/08/2021

Person Search Challenges and Solutions: A Survey

Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann

Keywords: Computer vision, General

13:56

16/11/2020

Counterfactual Off-Policy Training for Neural Dialogue Generation

Qingfu Zhu, Wei-Nan Zhang, Ting Liu, William Yang Wang

Keywords: open-domain generation, data problem, training, counterfactual reasoning

11:37

05/12/2020

A survey of the state of explainable AI for natural language processing

Marina Danilevsky, Kun Qian, Ranit Aharonov, Yannis Katsis, Ban Kawas, Prithviraj Sen

14:13

08/12/2020

Interactive Question Clarification in Dialogue via Reinforcement Learning

Xiang Hu, Zujie Wen, Yafang Wang, Xiaolong Li, Gerard de Melo

14:46

19/08/2021

Automated Facilitation Support in Online Forum

Wen Gu

Keywords: Multidisciplinary Topics and Applications, Social Sciences, Knowledge-based Software Engineering, Reasoning about Knowledge and Belief

13:48

25/07/2020

GoChat: Goal-oriented chatbots with hierarchical reinforcement learning

Jianfeng Liu, Feiyang Pan, Ling Luo

Keywords: dialogue system, reinforcement learning, goal-oriented chatbot

9:15

16/11/2020

Facilitating the Communication of Politeness through Fine-Grained Paraphrasing

Liye Fu, Susan Fussell, Cristian Danescu-Niculescu-Mizil

Keywords: communication approaches, paraphrases, speaker intentions, technology

11:38

16/11/2020

What is More Likely to Happen Next? Video-and-Language Future Event Prediction

Jie Lei, Licheng Yu, Tamara Berg, Mohit Bansal

Keywords: video-and-language prediction, ai models, vlep, adversarial procedure

11:58

04/07/2020

Learning an Unreferenced Metric for Online Dialogue Evaluation

Koustuv Sinha, Prasanna Parthasarathi, Jasmine Wang, Ryan Lowe, William L. Hamilton, Joelle Pineau

Keywords: Online Evaluation, inference, online setting, Unreferenced Metric

6:58

04/07/2020

More Diverse Dialogue Datasets via Diversity-Informed Data Collection

Katherine Stasaski, Grace Hui Yang, Marti A. Hearst

Keywords: Automated dialogue, diversity problem, Diversity-Informed Collection, emotion classification

12:06

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

11:49

16/11/2020

A Diagnostic Study of Explainability Techniques for Text Classification

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Keywords: downstream tasks, machine learning, explainability techniques, diverse techniques

11:24

23/08/2020

Towards building an intelligent chatbot for customer service: Learning to respond at the appropriate time

Che Liu, Junfeng Jiang, Chao Xiong, Yi Yang, Jieping Ye

Keywords: customer service, triggering model, chatbot, self-supervised learning

10:34

08/12/2020

Explainable Automated Fact-Checking: A Survey

Neema Kotonya, Francesca Toni

13:26

19/04/2021

Recipes for building an open-domain chatbot

Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Eric Michael Smith, Y-Lan Boureau, Jason Weston

11:33

19/10/2020

A toolkit for managing multiple crowdsourced top-k queries

Caihua Shan, Leong Hou U, Nikos Mamoulis, Reynold Cheng

Keywords: top-k query, crowdsourcing, query management

5:01

04/07/2020

Reverse Engineering Configurations of Neural Text Generation Models

Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins

Keywords: Reverse Models, neural modeling, Neural Models, generative models

6:16

04/07/2020

Fluent Response Generation for Conversational Question Answering

Ashutosh Baheti, Alan Ritter, Kevin Small

Keywords: Fluent Generation, Conversational Answering, Question answering, Question QA

11:29

16/11/2020

AnswerFact: Fact Checking in Product Question Answering

Wenxuan Zhang, Yang Deng, Jing Ma, Wai Lam

Keywords: online shopping, evidence-based tasks, answer problem, product-related platforms

12:15

04/07/2020

Unsupervised Opinion Summarization with Noising and Denoising

Reinald Kim Amplayo, Mirella Lapata

Keywords: Unsupervised Summarization, supervised models, abstractive summarization, Noising

12:16

06/12/2021

Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence

Alexander Hoyle, Pranav Goel, Andrew Hian-Cheong, Denis Peskov, Jordan Boyd-Graber, Philip Resnik

15:00

04/07/2020

How to Ask Good Questions? Try to Leverage Paraphrases

Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu

Keywords: question generation (QG), sentence-level generation, diversity training, Paraphrases

10:13