Learning to Represent Image and Text with Denotation Graph

16/11/2020

Bowen Zhang, Hexiang Hu, Vihan Jain, Eugene Ie, Fei Sha

Keywords: cross-modal retrieval, referring expression, compositional recognition, pre-training

Abstract: Learning to fuse vision and language information and to represent them is an important research problem with many applications. Recent progress has leveraged the ideas of pre-training (from language modeling) and attention layers in Transformers to learn representations from datasets containing images aligned with linguistic expressions that describe the images. In this paper, we propose learning representations from a set of implied, visually grounded expressions between image and text, automatically mined from those datasets. In particular, we use denotation graphs to represent how specific concepts (such as sentences describing images) can be linked to abstract and generic concepts (such as short phrases) that are also visually grounded. Such generic-to-specific relations can be discovered using linguistic analysis tools. We propose methods to incorporate these relations into representation learning. We show that state-of-the-art multimodal learning models can be further improved by leveraging automatically harvested structural relations. The resulting representations lead to stronger empirical results on the downstream tasks of cross-modal image retrieval, referring expression, and compositional attribute-object recognition. Both our code and the extracted denotation graphs on the Flickr30K and COCO datasets are publicly available at https://sha-lab.github.io/DG.
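
To make the generic-to-specific idea concrete, here is a toy sketch of a denotation graph in Python. It is not the authors' pipeline: the `MODIFIERS` list, the `generalize` rule (drop one modifier at a time), and the two example captions are all hypothetical stand-ins for the linguistic analysis tools the abstract mentions. Each phrase maps to the set of images it describes, and each generic phrase links to the more specific phrases it was derived from.

```python
from collections import defaultdict

# Hypothetical modifier list; a real system would rely on a parser or chunker.
MODIFIERS = {"young", "brown", "small", "red"}


def generalize(phrase):
    """Return the phrases obtained by dropping exactly one modifier."""
    words = phrase.split()
    return [
        " ".join(words[:i] + words[i + 1:])
        for i, word in enumerate(words)
        if word in MODIFIERS
    ]


def build_denotation_graph(captions):
    """Build a toy denotation graph from (image_id, caption) pairs.

    Returns (denotation, edges):
      denotation maps a phrase to the set of image ids it describes;
      edges maps a generic phrase to its more specific child phrases.
    """
    denotation = defaultdict(set)
    edges = defaultdict(set)
    for image_id, caption in captions:
        frontier = [caption.lower()]
        seen = set()
        while frontier:
            phrase = frontier.pop()
            if phrase in seen:
                continue
            seen.add(phrase)
            # Every abstraction of a caption still describes the same image.
            denotation[phrase].add(image_id)
            for parent in generalize(phrase):
                edges[parent].add(phrase)
                frontier.append(parent)
    return denotation, edges


# Made-up captions, for illustration only.
captions = [
    ("img1", "a young girl rides a brown horse"),
    ("img2", "a girl rides a small horse"),
]
denotation, edges = build_denotation_graph(captions)
```

In this sketch the generic phrase "a girl rides a horse" ends up denoting both images, while each original caption denotes only its own image. Generic nodes grounded in several images are the kind of extra, visually grounded supervision the abstract describes.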

The talk and the accompanying paper were published at the EMNLP 2020 virtual conference.

Similar Papers

03/05/2021

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Keywords: self-supervised learning, unsupervised representation learning, data augmentation, MixUp, contrastive representation learning

5:04

22/11/2021

Rich Semantics Improve Few-Shot Learning

Mohamed Afham Mohamed Aflal, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

Keywords: few shot learning, multimodal learning, transformers in vision

2:47

02/02/2021

A Simple and Effective Self-Supervised Contrastive Learning Framework for Aspect Detection

Tian Shi, Liuqing Li, Ping Wang, Chandan K. Reddy

19:21

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords: data augmentation, text recognition, joint training

0:59

03/05/2021

Learning and Evaluating Representations for Deep One-Class Classification

Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Minho Jin, Tomas Pfister

Keywords: self-supervised learning, deep one-class classification

5:13

04/07/2020

Representation Learning for Information Extraction from Form-like Documents

Bodhisattwa Prasad Majumder, Navneet Potti, Sandeep Tata, James Bradley Wendt, Qi Zhao, Marc Najork

Keywords: Information Extraction, extraction task, Representation Learning, extraction system

10:58

19/04/2021

Interpretability for morphological inflection: From character-level predictions to subword-level rules

Tatyana Ruzsics, Olga Sozinova, Ximena Gutierrez-Vasques, Tanja Samardzic

10:53

06/12/2021

Visualizing the Emergence of Intermediate Visual Patterns in DNNs

Mingjie Li, Shaobo Wang, Quanshi Zhang

Keywords: deep learning, adversarial robustness and security

7:13

04/07/2020

Cross-Modality Relevance for Reasoning on Language and Vision

Chen Zheng, Quan Guo, Parisa Kordjamshidi

Keywords: Cross-Modality Relevance, Language Vision, visual answering, VQA

10:59

18/07/2021

Unifying Vision-and-Language Tasks via Text Generation

Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal

Keywords: Algorithms, Multimodal Learning

4:58

02/02/2021

Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition

Siteng Huang, Min Zhang, Yachen Kang, Donglin Wang

17:04

14/06/2020

Learning Representations by Predicting Bags of Visual Words

Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Keywords: representation learning, self-supervised learning, unsupervised learning, discrete representations, bag of visual words, image understanding, deep learning, convolutional neural networks

1:01

04/07/2020

Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge

Bowen Zhang, Min Yang, Xutao Li, Yunming Ye, Xiaofei Xu, Kuai Dai

Keywords: Cross-target Detection, Stance detection, knowledge transfer, stance classifier

11:57

16/11/2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Minki Kang, Moonsu Han, Sung Ju Hwang

Keywords: self-supervised pre-training, question answering, task, reinforcement learning

12:00

14/06/2020

PointAugment: An Auto-Augmentation Framework for Point Cloud Classification

Ruihui Li, Xianzhi Li, Pheng-Ann Heng, Chi-Wing Fu

Keywords: auto-augmentation framework, point cloud processing, sample-aware, jointly optimizing, classification

5:01

06/12/2021

Learning Knowledge Graph-based World Models of Textual Environments

Prithviraj Ammanabrolu, Mark Riedl

Keywords: reinforcement learning and planning, transformers, graph learning, language

15:32

03/05/2021

Adaptive and Generative Zero-Shot Learning

Yu-Ying Chou, Hsuan-Tien (Tien) Lin, Tyng-Luh Liu

Keywords: Generalized zero-shot learning, mixup

5:18

04/07/2020

Exploiting the Syntax-Model Consistency for Neural Relation Extraction

Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen

Keywords: Neural Extraction, Relation Extraction, RE, syntactic injection

11:03

06/12/2020

Learning to Learn Variational Semantic Memory

Xiantong Zhen, Yingjun Du, Huan Xiong, Qiang Qiu, Cees Snoek, Ling Shao

3:24

06/12/2021

Integrating Tree Path in Transformer for Code Representation

Han Peng, Ge Li, Wenhan Wang, YunFei Zhao, Zhi Jin

Keywords: machine learning, transformers

4:42

16/11/2020

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

Keywords: nlp, sentence-level modeling, discourse representation, pre-training methods

9:21

12/07/2020

Learning Structured Latent Factors from Dependent Data: A Generative Model Framework from Information-Theoretic Perspective

Ruixiang ZHANG, Katsuhiko Ishiguro, Masanori Koyama

Keywords: Learning Theory

14:46

06/12/2021

Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation

Lin Guan, Mudit Verma, Suna (Sihang) Guo, Ruohan Zhang, Subbarao Kambhampati

Keywords: reinforcement learning and planning, machine learning

13:41

16/11/2020

Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning

Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen

Keywords: self-supervised tasks, pre-training, entity linking, finetuning

11:38

26/04/2020

Progressive learning and disentanglement of hierarchical representations

Zhiyuan Li, Jaideep Vitthal Murkute, Prashnna Kumar Gyawali, Linwei Wang

Keywords: generative model, disentanglement, progressive learning, VAE

5:06

03/05/2021

IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning

Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang

Keywords: self-supervised learning, few-shot learning, episode-level pretext task

5:03

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

06/12/2020

VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain

Jinsung Yoon, Yao Zhang, James Jordon, Mihaela van der Schaar

3:25

14/09/2020

Learning a Sequence of Sentiment Classification Tasks

Zixuan Ke, Bing Liu, Hao Wang, Lei Shu

14:23

03/05/2021

Prototypical Representation Learning for Relation Extraction

Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang

Keywords: NLP, Representation Learning, Relation Extraction

5:14

02/02/2021

Learning a Few-shot Embedding Model with Contrastive Learning

Chen Liu, Yanwei Fu, Chengming Xu, Siqian Yang, Jilin Li, Chengjie Wang, Li Zhang

15:02

03/05/2021

Improving Transformation Invariance in Contrastive Representation Learning

Adam Foster, Rattana Pukdee, Tom Rainforth

Keywords: transformation invariance, contrastive learning, representation learning

5:23

07/09/2020

Attention Distillation for Learning Video Representations

Miao Liu, Xin Chen, Yun Zhang, Yin Li, James Rehg

Keywords: Action Recognition, Deep Learning, Representation Learning

9:50

05/01/2021

Towards Contextual Learning in Few-Shot Object Classification

Mathieu Page Fortin, Brahim Chaib-draa

4:57

02/02/2021

Progressive Multi-task Learning with Controlled Information Flow for Joint Entity and Relation Extraction

Kai Sun, Richong Zhang, Samuel Mensah, Yongyi Mao, Xudong Liu

13:45

02/02/2021

DeepCollaboration: Collaborative Generative and Discriminative Models for Class Incremental Learning

Bo Cui, Guyue Hu, Shan Yu

15:13

05/01/2021

Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-Shot Learning

Shivam Chandhok, Vineeth N Balasubramanian

4:59

26/04/2020

Automated Relational Meta-learning

Huaxiu Yao, Xian Wu, Zhiqiang Tao, Yaliang Li, Bolin Ding, Ruirui Li, Zhenhui Li

Keywords: meta-learning, task heterogeneity, meta-knowledge graph

5:13

14/06/2020

Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks

Fengda Zhu, Yi Zhu, Xiaojun Chang, Xiaodan Liang

Keywords: computer vision, vision language navigation, reinforcement learning

4:25