Abstract:
Most image captioning models generate captions based solely on the input image. However, images that are similar to the input image contain variations of the same or related concepts. Aggregating information over such similar images can therefore improve image captioning by reinforcing or inferring concepts present in the input image. In this paper, we propose an image captioning model based on KNN graphs composed of the input image and its nearest-neighbour images, where each node denotes an image or a caption. An attention-in-attention (AiA) model is developed to refine the node representations. Using the refined features significantly improves the baseline performance, e.g., the CIDEr score of the UpDown model increases from 120.1 to 125.6. Compared with the state of the art, our method obtains a CIDEr score of 129.3 and a SPICE score of 22.6 on the Karpathy test split, which is competitive with models that employ fine-grained image representations such as scene graphs and image parsing trees.
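To make the idea of refining a node with its KNN neighbours concrete, the following is a minimal sketch in PyTorch, not the authors' exact AiA formulation: an outer attention aggregates neighbour node features, and an inner attention-style gate re-weights that aggregate before it is added back to the query node. All names and dimensions (AiABlock, knn refinement, d, k) are illustrative assumptions.

```python
# Minimal sketch (assumed formulation, not the paper's exact AiA module):
# refine an input-image node feature with features of its k-NN image/caption nodes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AiABlock(nn.Module):
    """Toy attention-in-attention block: outer attention over neighbour nodes,
    whose output is re-weighted by an inner attention-style gate."""

    def __init__(self, d: int):
        super().__init__()
        self.q = nn.Linear(d, d)
        self.k = nn.Linear(d, d)
        self.v = nn.Linear(d, d)
        self.inner_gate = nn.Linear(d, d)  # inner attention acting on the outer output

    def forward(self, query: torch.Tensor, nodes: torch.Tensor) -> torch.Tensor:
        # query: (1, d) feature of the input-image node
        # nodes: (k, d) features of neighbour image/caption nodes
        scores = self.q(query) @ self.k(nodes).t() / nodes.size(-1) ** 0.5
        attn = F.softmax(scores, dim=-1)                # (1, k) outer attention weights
        outer = attn @ self.v(nodes)                    # (1, d) aggregated neighbour info
        gate = torch.sigmoid(self.inner_gate(outer))    # inner attention as a feature gate
        return query + gate * outer                     # refined node representation


# Usage: refine one image feature with 5 nearest-neighbour node features (placeholders).
d, k = 512, 5
block = AiABlock(d)
img_feat = torch.randn(1, d)
neighbour_feats = torch.randn(k, d)
refined = block(img_feat, neighbour_feats)
print(refined.shape)  # torch.Size([1, 512])
```

The refined feature would then replace the original image feature in a captioning decoder such as UpDown; the gating step is one plausible reading of "attention in attention", not a claim about the paper's implementation.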