Partially Non-Autoregressive Image Captioning

02/02/2021

Zhengcong Fei

Abstract: Current state-of-the-art image captioning systems usually generate descriptions autoregressively, i.e., every forward step conditions on the given image and the previously produced words. This sequential dependency causes unavoidable decoding latency. Non-autoregressive image captioning, on the other hand, predicts the entire sentence simultaneously and accelerates inference significantly. However, it removes the dependencies between the words of a caption and commonly suffers from repeated or missing words. To strike a better trade-off between speed and quality, we introduce a partially non-autoregressive model, named PNAIC, which treats a caption as a series of concatenated word groups. The groups are generated in parallel globally, while the words within each group are predicted from left to right, so the captioner can produce multiple non-adjacent words concurrently at each time step. More importantly, by incorporating curriculum-learning-based training tasks of group length prediction and invalid group deletion, our model is capable of generating accurate captions while preventing common incoherence errors. Extensive experiments on the MS COCO benchmark demonstrate that our proposed method achieves more than 3.5× speedup while maintaining competitive performance.
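
The group-wise decoding idea in the abstract can be illustrated with a toy sketch. This is not the PNAIC implementation: the function names, the fixed group lengths, and the lookup-based `toy_next_word` (standing in for a trained captioner) are all assumptions made for illustration. It only shows why extending every group by one word per step reduces the number of decoding steps from the caption length to the length of the longest group.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a trained captioner: given a group's index and the
# words already produced in that group, it returns the group's next word.
NextWordFn = Callable[[int, List[str]], str]


def decode_in_groups(
    group_lengths: List[int], next_word: NextWordFn
) -> Tuple[List[str], int]:
    """Extend all unfinished groups by one word per step.

    Words inside a group are produced left to right; different groups advance
    in the same step. Returns the concatenated caption and the step count.
    """
    groups: List[List[str]] = [[] for _ in group_lengths]
    steps = 0
    while any(len(g) < n for g, n in zip(groups, group_lengths)):
        # Compute every new word from the state at the start of the step,
        # mimicking one parallel forward pass over all groups.
        new_words: List[Optional[str]] = [
            next_word(i, g) if len(g) < n else None
            for i, (g, n) in enumerate(zip(groups, group_lengths))
        ]
        for g, w in zip(groups, new_words):
            if w is not None:
                g.append(w)
        steps += 1
    caption = [w for g in groups for w in g]
    return caption, steps


# Toy data (an assumption): a fixed caption already split into word groups.
reference_groups = [["a", "man"], ["is", "riding"], ["a", "horse"]]


def toy_next_word(group_index: int, prefix: List[str]) -> str:
    # Looks up the next word instead of running a real model.
    return reference_groups[group_index][len(prefix)]


caption, steps = decode_in_groups([2, 2, 2], toy_next_word)
print(" ".join(caption), "| steps:", steps)
```

In this toy case the six-word caption is produced in two steps, whereas a purely left-to-right decoder would need six. In the paper's setting the group lengths are predicted by the model and invalid groups are deleted; neither is modeled here.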

The video of this talk is available at: https://slideslive.com/38947873

The talk and the respective paper were published at the AAAI 2021 virtual conference.

Similar Papers

22/09/2020

MEANTIME: Mixture of attention mechanisms with multi-temporal embeddings for sequential recommendation

Sung Min Cho, Eunhyeok Park, Sungjoo Yoo

Keywords: Self-attention, Sequential Recommendation, Temporal Embedding, BERT

3:10

06/12/2021

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, Steven Chu Hong Hoi

Keywords: transformers, vision, representation learning

9:40

14/06/2020

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval

Hui Chen, Guiguang Ding, Xudong Liu, Zijia Lin, Ji Liu, Jungong Han

Keywords: cross-modal image text retrieval, iterative matching, recurrent attention memory

1:04

05/01/2021

Intra-Class Part Swapping for Fine-Grained Image Classification

Lianbo Zhang, Shaoli Huang, Wei Liu

4:43

06/12/2020

Unsupervised Representation Learning by Invariance Propagation

Feng Wang, Huaping Liu, Di Guo, Sun Fuchun

3:11

02/02/2021

Semantic Grouping Network for Video Captioning

Hobin Ryu, Sunghun Kang, Haeyong Kang, Chang D. Yoo

17:41

16/11/2020

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

Jianguo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip Yu, Richard Socher, Caiming Xiong

Keywords: intent detection, detecting intents, oos detection, large-scale task

11:43

26/04/2020

Decoupling Representation and Classifier for Long-Tailed Recognition

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

Keywords: long-tailed recognition, classification

5:00

06/12/2021

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling

Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Keywords: contrastive learning

10:12

19/08/2021

Dependent Multi-Task Learning with Causal Intervention for Image Captioning

Wenqing Chen, Jidong Tian, Caoyun Fan, Hao He, Yaohui Jin

Keywords: Machine Learning, Transfer, Adaptation, Multi-task Learning, Natural Language Generation, Language and Vision

12:02

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

03/05/2021

Support-set bottlenecks for video-text representation learning

Mandela Patrick, Po-Yao Huang, Yuki Asano, Florian Metze, Alexander G Hauptmann, Joao F. Henriques, Andrea Vedaldi

Keywords: contrastive learning, video-text learning, multi-modal learning, video representation learning

6:40

22/11/2021

Inter-intra Variant Dual Representations for Self-supervised Video Recognition

Lin ZHANG, Qi She, Zhengyang Shen, Changhu Wang

Keywords: video action recognition, self-supervised learning, contrastive learning, representation learning

2:55

02/02/2021

Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection

Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

16:08

16/11/2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Haitao Zheng, Shuming Shi

Keywords: response selection, retrieval-based systems, learning-to-rank problem, learning-to-rank

12:03

04/07/2020

The Summary Loop: Learning to Write Abstractive Summaries Without Examples

Philippe Laban, Andrew Hsi, John Canny, Marti A. Hearst

Keywords: unsupervised summarization, coverage model, unsupervised procedure, fluency model

11:54

02/02/2021

Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning

Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin

15:15

16/11/2020

MODE-LSTM: A Parameter-efficient Recurrent Network with Multi-Scale for Sentence Classification

Qianli Ma, Zhenxi Lin, Jiangyue Yan, Zipeng Chen, Liuhong Yu

Keywords: sentence classification, extracting features, generalization, cnn models

10:35

02/02/2021

C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Yutai Hou, Sanyuan Chen, Wanxiang Che, Cheng Chen, Ting Liu

15:01

03/05/2021

Disentangled Recurrent Wasserstein Autoencoder

Jun Han, Martin Min, Ligong Han, Li Erran Li, Xuan Zhang

Keywords: Recurrent Generative Model, Sequential Representation Learning, Disentanglement

9:17

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

4:49

06/12/2020

SMYRF - Efficient Attention using Asymmetric Clustering

Giannis Daras, Nikita Kitaev, Augustus Odena, Alex Dimakis

3:28

16/11/2020

Small but Mighty: New Benchmarks for Split and Rephrase

Li Zhang, Huaiyu Zhu, Siddhartha Brahma, Yunyao Li

Keywords: text task, fine-grained evaluation, automatic process, rule-based model

6:58

16/11/2020

A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis

Zehui Dai, Cheng Peng, Huajie Chen, Yadong Ding

Keywords: tacsa tasks, aspect-category analysis, targeted analysis, multi-task learning

11:41

06/12/2021

BooVAE: Boosting Approach for Continual Learning of VAE

Evgenii Egorov, Anna Kuzina, Evgeny Burnaev

Keywords: self-supervised learning, generative model, continual learning

8:54

02/02/2021

Object Relation Attention for Image Paragraph Captioning

Li-Chuan Yang, Chih-Yuan Yang, Jane Yung-jen Hsu

15:03

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

12/07/2020

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao

Keywords: Representation Learning

14:51

06/12/2021

MERLOT: Multimodal Neural Script Knowledge Models

Rowan Zellers, Ximing Lu, Jack Hessel, Youngjae Yu, Jae Sung Park, Jize Cao, Ali Farhadi, Yejin Choi

Keywords: representation learning

18:15

08/12/2020

A Semantically Consistent and Syntactically Variational Encoder-Decoder Framework for Paraphrase Generation

Wenqing Chen, Jidong Tian, Liqiang Xiao, Hao He, Yaohui Jin

14:50

18/11/2020

Bidirectional dependency-guided attention for relation extraction

Xingchen Deng, Lei Zhang, Yixing Fan, Long Bai, Jiafeng Guo, Pengfei Wang

10:02

30/11/2020

Show, Conceive and Tell: Image Captioning with Prospective Linguistic Information

Yiqing Huang, Jiansheng Chen

7:08

22/11/2021

From Seq2Seq Recognition to Handwritten Word Embeddings

George Retsinas, Giorgos Sfikas, Christophoros Nikou, Petros Maragos

Keywords: keyword spotting, handwritten text recognition, sequence-to-sequence

2:59

02/02/2021

Adaptive Beam Search Decoding for Discrete Keyphrase Generation

Xiaoli Huang, Tongge Xu, Lvan Jiao, Yueran Zu, Youmin Zhang

14:36

14/06/2020

Learning Selective Self-Mutual Attention for RGB-D Saliency Detection

Nian Liu, Ni Zhang, Junwei Han

Keywords: rgb-d saliency detection, middle fusion, self-attention, mutual-attention, non-local network, two-stream cnn

1:01

04/07/2020

Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation

Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

Keywords: Non-Autoregressive Translation, Non-Autoregressive, inference process, multi-modality problem

8:34

06/12/2020

A causal view of compositional zero-shot recognition

Yuval Atzmon, Felix Kreuk, Uri Shalit, Gal Chechik

3:22

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

4:44

04/07/2020

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

Alexandre Tamborrino, Nicola Pellicanò, Baptiste Pannier, Pascal Voitot, Louise Naudin

Keywords: Commonsense Reasoning, common tasks, plausibility task, pre-training phase

11:39