Improving Image Captioning Evaluation by Considering Inter References Variance

04/07/2020

Yanzhi Yi, Hangyu Deng, Jinglu Hu

Keywords: Image Evaluation, Evaluating captions, system-level tasks, BERTScore

Abstract: Evaluating image captions is challenging, partly because every image admits multiple correct captions. Most existing one-to-one metrics penalize mismatches between the reference and the generated caption without considering the intrinsic variance among ground-truth captions. This usually leads to over-penalization and thus poor correlation with human judgment. The recent one-to-one metric BERTScore achieves high correlation with human judgment in system-level tasks, but some of its issues can be fixed for better performance. In this paper, we propose a novel metric based on BERTScore that handles this challenge and extends BERTScore with a few new features suited to image captioning evaluation. Experimental results show that our metric achieves state-of-the-art correlation with human judgment.

The talk and the respective paper were published at the ACL 2020 virtual conference.

Similar Papers

14/06/2020

SAM: The Sensitivity of Attribution Methods to Hyperparameters

Naman Bansal, Chirag Agarwal, Anh Nguyen

Keywords: xai, explainable, attribution, sensitivity, robustness, explanation, hyperparameters

8:50

14/06/2020

Counterfactual Vision and Language Learning

Ehsan Abbasnejad, Damien Teney, Amin Parvaneh, Javen Shi, Anton van den Hengel

Keywords: counterfactual reasoning vision and language tasks vqa

5:00

06/12/2021

Relative Uncertainty Learning for Facial Expression Recognition

Yuhang Zhang, Chengrui Wang, Weihong Deng

8:12

14/06/2020

RiFeGAN: Rich Feature Generation for Text-to-Image Synthesis From Prior Knowledge

Jun Cheng, Fuxiang Wu, Yanling Tian, Lei Wang, Dapeng Tao

Keywords: image synthesis, self-attentional embedding mixture, multi-captions, limited information, caption matching

1:01

06/12/2021

Learning to Generate Visual Questions with Noisy Supervision

Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

Keywords: generative model

14:54

19/04/2021

‘just because you are right, doesn’t mean I am wrong’: Overcoming a bottleneck in development and evaluation of open-ended VQA tasks

Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral

7:10

02/02/2021

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Lin Sun, Jiquan Wang, Kai Zhang, Yindu Su, Fangsheng Weng

17:21

05/01/2021

Saliency Driven Perceptual Image Compression

Yash Patel, Srikar Appalaraju, R. Manmatha

4:58

14/06/2020

On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton van den Hengel, Liangwei Wang

Keywords: visual question answering, scene text, ocr

1:01

04/07/2020

Towards Transparent and Explainable Attention Models

Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

Keywords: interpretability distributions, attention mechanisms, Human evaluations, Transparent Models

11:58

22/11/2021

Discriminative Clue Alignment Network for Both Image- and Video-Based Person Re-Identification

Panwen Hu, Xinyu Zhou, Rui Huang

Keywords: person reidentification, feature alignment, multiple attention

3:04

14/09/2020

On Saliency Maps and Adversarial Robustness

Puneet Mangla, Vedant Singh, Vineeth Balasubramanian

Keywords: adversarial robustness, saliency maps, deep neural networks

17:29

05/01/2021

Attention-Based Spatial Guidance for Image-to-Image Translation

Yu Lin, Yigong Wang, Yifan Li, Yang Gao, Zhuoyi Wang, Latifur Khan

4:57

19/04/2021

Removing word-level spurious alignment between images and pseudo-captions in unsupervised image captioning

Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

12:30

12/07/2020

Reliable Fidelity and Diversity Metrics for Generative Models

Muhammad Ferjad Naeem, Seong Joon Oh, Yunjey Choi, Youngjung Uh, Jaejun Yoo

Keywords: Deep Learning - Generative Models and Autoencoders

15:44

14/06/2020

Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

Vedika Agarwal, Rakshith Shetty, Mario Fritz

Keywords: robustness, vqa, causality, gan, dataset, evaluation, automated semantic scene editing, data augmentation, invariance, covariance

1:00

06/12/2021

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

Nicklas Hansen, Hao Su, Xiaolong Wang

Keywords: reinforcement learning and planning, transformers

8:43

02/02/2021

Appearance-Motion Memory Consistency Network for Video Anomaly Detection

Ruichu Cai, Hao Zhang, Wen Liu, Shenghua Gao, Zhifeng Hao

19:08

06/12/2021

Overinterpretation reveals image classification model pathologies

Brandon Carter, Siddhartha Jain, Jonas Mueller, David Gifford

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, vision, interpretability

11:14

14/06/2020

SER-FIQ: Unsupervised Estimation of Face Image Quality Based on Stochastic Embedding Robustness

Philipp Terhörst, Jan Niklas Kolf, Naser Damer, Florian Kirchbuchner, Arjan Kuijper

Keywords: image quality, face recognition, biometrics

1:01

16/11/2020

Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think!

Jack Hessel, Lillian Lee

Keywords: modeling interactions, multimodal tasks, visual answering, multimodal learning

12:02

14/06/2020

More Grounded Image Captioning by Distilling Image-Text Matching Model

Yuanen Zhou, Meng Wang, Daqing Liu, Zhenzhen Hu, Hanwang Zhang

Keywords: grounded image captioning, image-text matching, visual grounding, cross-task knowledge distillation

1:01

02/02/2021

Multi-Dimensional Explanation of Target Variables from Documents

Diego Antognini, Claudiu Musat, Boi Faltings

19:03

19/08/2021

Feature Space Targeted Attacks by Statistic Alignment

Lianli Gao, Yaya Cheng, Qilong Zhang, Xing Xu, Jingkuan Song

Keywords: Computer Vision, 2D and 3D Computer Vision, Recognition, Adversarial Machine Learning

12:17

22/11/2021

KonIQ++: Boosting No-Reference Image Quality Assessment in the Wild by Jointly Predicting Image Quality and Defects

Shaolin Su, Vlad Hosu, Hanhe Lin, Yanning Zhang, Dietmar Saupe

Keywords: image quality assessment, database, defects

7:54

06/12/2021

Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions

Huan Ma, Zongbo Han, Changqing Zhang, Huazhu Fu, Joey Tianyi Zhou, Qinghua Hu

5:37

06/12/2021

Debiased Visual Question Answering from Feature and Sample Perspectives

Zhiquan Wen, Guanghui Xu, Mingkui Tan, Qingyao Wu, Qi Wu

Keywords: vision

11:20

03/05/2021

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Shengyu Zhao, Jonathan Cui, Yilun Sheng, Yue Dong, Xiao Liang, Eric Chang, Yan Xu

Keywords: co-modulation, image completion, generative adversarial networks

10:10

16/11/2020

Towards Debiasing NLU Models from Unknown Biases

Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

Keywords: nlu tasks, nlu models, debiasing methods, self-debiasing framework

10:40

12/07/2020

On the consistency of top-k surrogate losses

Forest Yang, Sanmi Koyejo

Keywords: Learning Theory

15:54

16/11/2020

An Unsupervised Sentence Embedding Method by Mutual Information Maximization

Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

Keywords: sentence-pair tasks, clustering, semantic search, downstream tasks

12:22

06/12/2021

On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness

Eric Mintun, Alexander Kirillov, Saining Xie

Keywords: deep learning, robustness, vision

12:36

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

04/07/2020

Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks

Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang

Keywords: Syntax-Aware Labeling, Opinion labeling, ORL, opinion task

11:47

06/12/2021

How Well do Feature Visualizations Support Causal Understanding of CNN Activations?

Roland S. Zimmermann, Judy Borowski, Robert Geirhos, Matthias Bethge, Thomas Wallis, Wieland Brendel

Keywords: interpretability

11:49

05/01/2021

Ensembling Low Precision Models for Binary Biomedical Image Segmentation

Tianyu Ma, Hang Zhang, Hanley Ong, Amar Vora, Thanh D. Nguyen, Ajay Gupta, Yi Wang, Mert R. Sabuncu

4:53

30/11/2020

Jointly Discriminating and Frequent Visual Representation Mining

Qiannan Wang, Ying Zhou, ZhaoYan Zhu, Xuefeng Liang, Yu Gu

8:13

14/06/2020

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension

Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu

Keywords: compositional referring expression comprehension, visual reasoning

1:00

22/11/2021

Looking at the whole picture: constrained unsupervised anomaly segmentation

Julio Silva-Rodríguez, Valery Naranjo, Jose Dolz

Keywords: unsupervised anomaly localization, brain lesion segmentation, constrained segmentation, size-constrained loss, class-activations maps, CAMs, log-barrier extension, BRATS19

2:57

30/11/2020

Visualizing Color-wise Saliency of Black-Box Image Classification Models

Yuhki Hatakeyama, Hiroki Sakuma, Yoshinori Konishi, Kohei Suenaga

9:43