Show, Edit and Tell: A Framework for Editing Image Captions

14/06/2020

Fawaz Sammani, Luke Melas-Kyriazi

Keywords: image captioning, image description, editing captions, sequence editing, copy mechanism, adaptive copy mechanism, selecting mechanism, copy lstm

Abstract: Most image captioning frameworks generate captions directly from images, learning a mapping from visual features to natural language. However, editing existing captions can be easier than generating new ones from scratch. Intuitively, when editing captions, a model is not required to learn information that is already present in the caption (i.e. sentence structure), enabling it to focus on fixing details (e.g. replacing repetitive words). This paper proposes a novel approach to image captioning based on iterative adaptive refinement of an existing caption. Specifically, our caption-editing model consists of two sub-modules: (1) EditNet, a language module with an adaptive copy mechanism (Copy-LSTM) and a Selective Copy Memory Attention mechanism (SCMA), and (2) DCNet, an LSTM-based denoising auto-encoder. These components enable our model to directly copy from and modify existing captions. Experiments demonstrate that our new approach achieves state-of-the-art performance on the MS COCO dataset both with and without sequence-level training.
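
The idea of copying from an existing caption while still being able to replace words can be illustrated with a generic soft copy gate. The sketch below is not the paper's Copy-LSTM or SCMA; it is a minimal pointer-style mixture written under assumptions, and the function name `mix_copy_and_generate`, the toy vocabulary, and all scores are hypothetical values chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_copy_and_generate(gen_logits, copy_scores, caption_ids, gate):
    """Blend a generation distribution with a copy distribution.

    gen_logits  -- one score per vocabulary word (hypothetical decoder output)
    copy_scores -- one attention score per token of the existing caption
    caption_ids -- vocabulary ids of the existing caption's tokens
    gate        -- scalar in [0, 1]; weight given to copying (assumed given)
    """
    p_gen = softmax(gen_logits)
    attn = softmax(copy_scores)
    # Scatter attention mass from caption positions onto vocabulary ids.
    p_copy = [0.0] * len(gen_logits)
    for token_id, weight in zip(caption_ids, attn):
        p_copy[token_id] += weight
    return [gate * c + (1.0 - gate) * g for c, g in zip(p_copy, p_gen)]


def argmax_word(vocab, probs):
    """Return the vocabulary word with the highest probability."""
    return vocab[max(range(len(vocab)), key=probs.__getitem__)]


# Toy setup: the existing caption reads "a cat sitting on a couch".
vocab = ["a", "dog", "cat", "sitting", "on", "couch"]
existing_caption = [0, 2, 3, 4, 0, 5]
gen_logits = [0.1, 2.0, 0.3, 0.2, 0.1, 0.4]    # generator prefers "dog"
copy_scores = [0.0, 3.0, 0.0, 0.0, 0.0, 0.0]   # attention sits on "cat"

keep = mix_copy_and_generate(gen_logits, copy_scores, existing_caption, gate=0.9)
edit = mix_copy_and_generate(gen_logits, copy_scores, existing_caption, gate=0.2)
word_when_copying = argmax_word(vocab, keep)
word_when_editing = argmax_word(vocab, edit)
```

With a high gate the mixture keeps the word from the existing caption, and with a low gate the generator's preferred word wins, which is the behaviour a learned, per-step gate would have to choose between.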

The talk and the respective paper were published at the CVPR 2020 virtual conference.

Similar Papers

06/12/2020

Swapping Autoencoder for Deep Image Manipulation

Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei Efros, Richard Zhang

3:20

14/06/2020

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen, Shai Mazor, Roee Litman

Keywords: gan, semi-supervised, domain-adaptation, handwriting, generative, unlabeled, transfer learning, ocr, text, augmentation

1:01

14/06/2020

Bringing Old Photos Back to Life

Ziyu Wan, Bo Zhang, Dongdong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Keywords: image restoration, low-level vision, image translation

4:41

22/11/2021

Each Attribute Matters: Contrastive Attention for Sentence-based Image Editing

Liuqing Zhao, Fan Lyu, Fuyuan Hu, Kaizhu Huang, Fenglei Xu, Linyan Li

Keywords: Image manipulation, Generative adversarial network

3:10

06/12/2021

EditGAN: High-Precision Semantic Image Editing

Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler

Keywords: optimization, vision, generative model

11:28

14/06/2020

ManiGAN: Text-Guided Image Manipulation

Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H.S. Torr

Keywords: image manipulation, natural language, generative adversarial networks, gan

1:01

30/11/2020

Image Captioning through Image Transformer

Sen He, Wentong Liao, Hamed R. Tavakoli, Michael Yang, Bodo Rosenhahn, Nicolas Pugeault

9:49

22/11/2021

SuperStyleNet: Deep Image Synthesis with Superpixel Based Style Encoder

Jonghyun Kim, Gen Li, Cheolkon Jung, Joongkyu Kim

Keywords: image-to-image translation, semantic image synthesis, image generation, superpixel, style encoder, graph self-attention

2:52

06/12/2020

Generative View Synthesis: From Single-view Semantics to Novel-view Images

Tewodros Amberbir Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker

3:20

02/02/2021

Self-Supervised Sketch-to-Image Synthesis

Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal

14:42

12/07/2020

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao

Keywords: Representation Learning

14:51

30/11/2020

DeepVoxels++: Enhancing the Fidelity of Novel View Synthesis from 3D Voxel Embeddings

Tong He, John Collomosse, Hailin Jin, Stefano Soatto

7:47

22/11/2021

Separating Content and Style for Unsupervised Image-to-Image Translation

Yunfei Liu, Haofei Wang, Yang Yue, Feng Lu

Keywords: Image-to-Image Translation, unsupervised learning, CNN Interpretation

2:46

14/06/2020

Interpreting the Latent Space of GANs for Semantic Face Editing

Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou

Keywords: generative adversarial network, network interpretation, face editing

1:01

14/06/2020

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo

Keywords: facial image manipulation, face segmentation, image synthesis, generative adversarial network

1:00

06/12/2021

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, Steven Chu Hong Hoi

Keywords: transformers, vision, representation learning

9:40

05/01/2021

Intra-Class Part Swapping for Fine-Grained Image Classification

Lianbo Zhang, Shaoli Huang, Wei Liu

4:43

22/11/2021

MAGECally invert images for realistic editing

Asya Grechka, Jean Francois Goudou, Matthieu Cord

Keywords: gan inversion, gan, stylegan2, gan editing, image editing, gan projection, stylegan, semantic editing, latent space manipulation, latent editing

3:01

06/12/2020

Diverse Image Captioning with Context-Object Split Latent Spaces

Shweta Mahajan, Stefan Roth

3:19

14/06/2020

Semantic Image Manipulation Using Scene Graphs

Helisa Dhamo, Azade Farshad, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari, Christian Rupprecht

Keywords: image manipulation, semantic manipulation, scene graphs, image generation, generative adversarial networks, gans, gcns, image editing, removal, inpainting

1:01

26/04/2020

Decoupling Representation and Classifier for Long-Tailed Recognition

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

Keywords: long-tailed recognition, classification

5:00

02/02/2021

Deep Semantic Dictionary Learning for Multi-label Image Classification

Fengtao Zhou, Sheng Huang, Yun Xing

15:06

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

06/12/2021

MST: Masked Self-Supervised Transformer for Visual Representation

Zhaowen Li, Zhiyang Chen, Fan Yang, Wei Li, Yousong Zhu, Chaoyang Zhao, Rui Deng, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

Keywords: self-supervised learning, transformers, vision, language

7:13

06/12/2020

One-sample Guided Object Representation Disassembling

Zunlei Feng, Yongming He, Xinchao Wang, Xin Gao, Jie Lei, Cheng Jin, Mingli Song

Keywords: Deep Learning -> Efficient Inference Methods, Deep Learning

3:24

03/05/2021

Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis

Zhipeng Bao, Yu-Xiong Wang, Martial Hebert

Keywords: adversarial training, computer vision, object recognition, few-shot learning, generative models

5:11

05/01/2021

Automatic Object Recoloring Using Adversarial Learning

Siavash Khodadadeh, Saeid Motiian, Zhe Lin, Ladislau Boloni, Shabnam Ghadar

4:43

02/02/2021

Learning Intact Features by Erasing-Inpainting for Few-shot Classification

Junjie Li, Zilei Wang, Xiaoming Hu

15:15

02/02/2021

Robust PDF Document Conversion using Recurrent Neural Networks

Nikolaos Livathinos, Cesar Berrospi, Maksym Lysak, Viktor Kuropiatnyk, Ahmed Nassar, Andre Carvalho, Michele Dolfi, Christoph Auer, Kasper Dinkla, Peter Staar

20:33

14/06/2020

A U-Net Based Discriminator for Generative Adversarial Networks

Edgar Schönfeld, Bernt Schiele, Anna Khoreva

Keywords: gan, image synthesis, u-net, discriminator, consistency regularization, equivariance, generative adversarial networks, ffhq, biggan

1:01

05/01/2021

Foreground Color Prediction Through Inverse Compositing

Sebastian Lutz, Aljosa Smolic

4:51

19/08/2021

Context-Aware Image Inpainting with Learned Semantic Priors

Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang

Keywords: Computer Vision, 2D and 3D Computer Vision, Deep Learning

13:26

16/11/2020

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Keywords: document-level translation, translations, document-level model, selection module

11:36

04/07/2020

Multimodal Transformer for Multimodal Machine Translation

Shaowei Yao, Xiaojun Wan

Keywords: Multimodal MMT, Multimodal, MMT, representation images

5:11

30/11/2020

Show, Conceive and Tell: Image Captioning with Prospective Linguistic Information

Yiqing Huang, Jiansheng Chen

7:08

02/02/2021

Object-Centric Image Generation from Layouts

Tristan Sylvain, Pengchuan Zhang, Yoshua Bengio, R Devon Hjelm, Shikhar Sharma

17:44

19/08/2021

ALaSca: an Automated approach for Large-Scale Lexical Substitution

Caterina Lacerra, Tommaso Pasini, Rocco Tripodi, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

14:27

26/04/2020

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords: Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

4:50

26/04/2020

Controlling generative models with continuous factors of variations

Antoine Plumerault, Hervé Le Borgne, Céline Hudelot

Keywords: Generative models, factor of variation, GAN, beta-VAE, interpretable representation, interpretability

5:07

03/05/2021

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Weizhu Chen, Jiawei Han

Keywords: consistency training, contrastive learning, data augmentation, natural language understanding

6:02