19/08/2021

Leveraging Human Attention in Novel Object Captioning

Xianyu Chen, Ming Jiang, Qi Zhao

Keywords: Computer Vision, Language and Vision

Abstract: Image captioning models depend on training with paired image-text corpora, which makes it challenging to describe images containing novel objects absent from the training data. While previous novel object captioning methods rely on external image taggers or object detectors to describe novel objects, we present the Attention-based Novel Object Captioner (ANOC), which complements novel object captioners with human attention features that characterize information of general importance, independent of the task. It introduces a gating mechanism that adaptively combines human attention with self-learned machine attention, together with a Constrained Self-Critical Sequence Training method that addresses exposure bias while maintaining the constraints on novel object descriptions. Extensive experiments conducted on the nocaps and Held-Out COCO datasets demonstrate that our method considerably outperforms state-of-the-art novel object captioners. Our source code is available at https://github.com/chenxy99/ANOC.
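To make the gating idea concrete, below is a minimal PyTorch sketch of how a learned gate could adaptively mix a human-attention map with the captioner's self-learned machine attention. The module name, the choice of conditioning the gate on the decoder state, and all dimensions are illustrative assumptions for exposition, not the authors' released implementation (see the linked repository for that).

import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    # Hypothetical sketch: a per-region gate in (0, 1) decides, for each
    # image region, how much to trust machine attention vs. human attention.
    def __init__(self, hidden_dim: int, num_regions: int):
        super().__init__()
        # The gate is conditioned on the decoder state and both attention maps
        # (an assumption; the real conditioning signal may differ).
        self.gate_fc = nn.Linear(hidden_dim + 2 * num_regions, num_regions)

    def forward(self, decoder_state, machine_attn, human_attn):
        # decoder_state: (batch, hidden_dim)
        # machine_attn, human_attn: (batch, num_regions), each summing to 1
        gate_input = torch.cat([decoder_state, machine_attn, human_attn], dim=-1)
        g = torch.sigmoid(self.gate_fc(gate_input))       # per-region gate
        fused = g * machine_attn + (1.0 - g) * human_attn # convex mixture
        # Renormalize so the fused weights still form a distribution.
        return fused / fused.sum(dim=-1, keepdim=True)

# Example usage: fuse attentions over 36 image regions for a batch of 2.
gate = AttentionGate(hidden_dim=512, num_regions=36)
fused = gate(torch.randn(2, 512),
             torch.softmax(torch.randn(2, 36), dim=-1),
             torch.softmax(torch.randn(2, 36), dim=-1))

Because the gate is computed per region and per decoding step, the model can lean on human attention for generally salient content while falling back on machine attention for task-specific cues, which matches the adaptive behavior the abstract describes.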

This talk and the respective paper were published at the IJCAI 2021 virtual conference.
