On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

14/06/2020

Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton van den Hengel, Liangwei Wang

Keywords: visual question answering, scene text, ocr

Abstract: Visual Question Answering (VQA) methods have made incredible progress, but suffer from a failure to generalize. This is visible in the fact that they are vulnerable to learning coincidental correlations in the data rather than deeper relations between image content and ideas expressed in language. We present a dataset that takes a step towards addressing this problem in that it contains questions expressed in two languages, and an evaluation process that co-opts a well-understood image-based metric to reflect the method's ability to reason. Measuring reasoning directly encourages generalization by penalizing answers that are coincidentally correct. The dataset reflects the scene-text version of the VQA problem, and the reasoning evaluation can be seen as a text-based version of a referring expression challenge. Experiments and analyses are provided that show the value of the dataset. The dataset is available at www.est-vqa.org.

The talk and the corresponding paper are published at the CVPR 2020 virtual conference.

Similar Papers

14/06/2020

Counterfactual Vision and Language Learning

Ehsan Abbasnejad, Damien Teney, Amin Parvaneh, Javen Shi, Anton van den Hengel

Keywords: counterfactual reasoning vision and language tasks vqa

5:00

06/12/2021

Debiased Visual Question Answering from Feature and Sample Perspectives

Zhiquan Wen, Guanghui Xu, Mingkui Tan, Qingyao Wu, Qi Wu

Keywords: vision

11:20

02/02/2021

A Case Study of the Shortcut Effects in Visual Commonsense Reasoning

Keren Ye, Adriana Kovashka

14:26

14/06/2020

Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

Vedika Agarwal, Rakshith Shetty, Mario Fritz

Keywords: robustness, vqa, causality, gan, dataset, evaluation, automated semantic scene editing, data augmentation, invariance, covariance

1:00

04/07/2020

Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks

Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang

Keywords: Syntax-Aware Labeling, Opinion labeling, ORL, opinion task

11:47

06/12/2021

Learning to Generate Visual Questions with Noisy Supervision

Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

Keywords: generative model

14:54

08/12/2020

Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training

Yingyao Wang, Junwei Bao, Guangyi Liu, Youzheng Wu, Xiaodong He, Bowen Zhou, Tiejun Zhao

10:55

19/04/2021

‘just because you are right, doesn’t mean I am wrong’: Overcoming a bottleneck in development and evaluation of open-ended VQA tasks

Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral

7:10

26/04/2020

CLEVRER: Collision Events for Video Representation and Reasoning

Kexin Yi, Chuang Gan, Yunzhu Li, Pushmeet Kohli, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum

Keywords: Neuro-symbolic, Reasoning

4:53

06/12/2021

Designing Counterfactual Generators using Deep Model Inversion

Jayaraman Thiagarajan, Vivek Sivaraman Narayanaswamy, Deepta Rajan, Jia Liang, Akshay Chaudhari, Andreas Spanias

Keywords: optimization, representation learning, interpretability

13:28

22/11/2021

Discriminative Clue Alignment Network for Both Image- and Video-Based Person Re-Identification

Panwen Hu, Xinyu Zhou, Rui Huang

Keywords: person reidentification, feature alignment, multiple attention

3:04

14/06/2020

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension

Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu

Keywords: compositional referring expression comprehension, visual reasoning

1:00

16/11/2020

Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data

Shachar Rosenman, Alon Jacovi, Yoav Goldberg

Keywords: data process, re collection, sota models, tacred

5:55

12/07/2020

Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"

Saeed Amizadeh, Hamid Palangi, Oleksandr Polozov, Yichen Huang, Kazuhito Koishida

Keywords: Applications - Computer Vision

10:29

06/12/2021

Supervising the Transfer of Reasoning Patterns in VQA

Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

Keywords: theory, deep learning, vision

12:54

07/09/2020

From Saturation to Zero-Shot Visual Relationship Detection Using Local Context

Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Maragos

Keywords: Visual Relationship Detection, Scene Graph Generation, Zero-shot Classification, Local Context, Language Bias

7:17

04/07/2020

Improving Image Captioning Evaluation by Considering Inter References Variance

Yanzhi Yi, Hangyu Deng, Jinglu Hu

Keywords: Image Evaluation, Evaluating captions, system-level tasks, BERTScore

11:31

16/11/2020

Learning to Contrast the Counterfactual Samples for Robust Visual Question Answering

Zujie Liang, Weitao Jiang, Haifeng Hu, Jiaying Zhu

Keywords: visual, generating samples, augmentation, self-supervised mechanism

2:00

22/11/2021

Duplicate Latent Representation Suppression for Multi-object Variational Autoencoders

Li Nanbo, Robert B Fisher

Keywords: object-centric representation learning, variational autoencoders, scene representation

2:58

07/09/2020

Object Detection as a Positive-Unlabeled Problem

Yuewei Yang, Kevin Liang, Lawrence Carin

Keywords: object detections, positive unlabeled learning

8:54

02/02/2021

Visual Relation Detection using Hybrid Analogical Learning

Kezhen Chen, Ken Forbus

18:14

26/04/2020

Learning The Difference That Makes A Difference With Counterfactually-Augmented Data

Divyansh Kaushik, Eduard Hovy, Zachary Lipton

Keywords: humans in the loop, annotation artifacts, text classification, sentiment analysis, natural language inference

4:25

06/12/2021

Making a (Counterfactual) Difference One Rationale at a Time

Mitchell Plyler, Michael Green, Min Chi

Keywords: theory, generative model, language, interpretability

13:57

26/08/2020

Robust Learning from Discriminative Feature Feedback

Sanjoy Dasgupta, Sivan Sabato

14:37

23/08/2020

Targeted data-driven regularization for out-of-distribution generalization

Mohammad Mahdi Kamani, Sadegh Farhang, Mehrdad Mahdavi, James Z. Wang

Keywords: data-driven regularization, out-of-distribution generalization, bilevel programming

6:36

06/12/2020

Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

Weili Nie, Zhiding Yu, Lei Mao, Ankit Patel, Yuke Zhu, Anima Anandkumar

3:23

14/06/2020

Attack to Explain Deep Representation

Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian

Keywords: interpreting deep learning, adversarial attack, explanation attack, explainable ai, image generation

1:01

08/12/2020

EmpDG: Multi-resolution Interactive Empathetic Dialogue Generation

Qintong Li, Hongshen Chen, Zhaochun Ren, Pengjie Ren, Zhaopeng Tu, Zhumin Chen

14:43

06/12/2021

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

Nicklas Hansen, Hao Su, Xiaolong Wang

Keywords: reinforcement learning and planning, transformers

8:43

04/07/2020

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

Alex Wang, Kyunghyun Cho, Mike Lewis

Keywords: summarization, automatic protocol, automatically text, abstractive models

12:14

04/07/2020

Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

Keywords: dialogue tasks, Unlikelihood Training, Generative models, maximum training

11:26

02/02/2021

LIREx: Augmenting Language Inference with Relevant Explanations

Xinyan Zhao, V.G. Vinod Vydiswaran

18:56

19/08/2021

Explaining Self-Supervised Image Representations with Visual Probing

Dominika Basaj, Witold Oleszkiewicz, Igor Sieradzki, Michał Górszczak, Barbara Rychalska, Tomasz Trzcinski, Bartosz Zieliński

Keywords: Computer Vision, Language and Vision, Unsupervised Learning, Explainability

11:03

06/12/2020

On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

Damien Teney, Ehsan Abbasnejad, Kushal Kafle, Robik Shrestha, Christopher Kanan, Anton van den Hengel

3:21

25/07/2020

Deep critiquing for VAE-based recommender systems

Kai Luo, Hojin Yang, Ga Wu, Scott Sanner

Keywords: deep learning, recommender systems, critiquing

14:15

03/05/2021

Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System

Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu

Keywords: Task-oriented Dialogue System, Hierarchical Reinforcement Learning, Policy Optimization, Natural Language Processing

5:44

04/07/2020

History for Visual Dialog: Do we really need it?

Shubham Agarwal, Trung Bui, Joon-Young Lee, Ioannis Konstas, Verena Rieser

Keywords: Visual Dialogue, Visual Dialog, co-attention models, crowdsourcing procedure

11:35

14/06/2020

More Grounded Image Captioning by Distilling Image-Text Matching Model

Yuanen Zhou, Meng Wang, Daqing Liu, Zhenzhen Hu, Hanwang Zhang

Keywords: grounded image captioning, image-text matching, visual grounding, cross-task knowledge distillation

1:01

19/08/2021

A Description Logic for Analogical Reasoning

Steven Schockaert, Yazmin Ibanez-Garcia, Victor Gutierrez-Basulto

Keywords: Knowledge Representation and Reasoning, Common-Sense Reasoning, Description Logics and Ontologies

12:47

16/11/2020

Pareto Probing: Trading Off Accuracy for Complexity

Tiago Pimentel, Naomi Saphra, Adina Williams, Ryan Cotterell

Keywords: simplistic tasks, pos labeling, dependency labeling, full parsing

13:03