Reformulating Zero-shot Action Recognition for Multi-label Actions

06/12/2021

Alec Kerrigan, Kevin Duarte, Yogesh Rawat, Mubarak Shah

Keywords: machine learning, vision

Abstract: The goal of zero-shot action recognition (ZSAR) is to classify action classes that were not seen during training. Traditionally, this is achieved by training a network to map, or regress, visual inputs to a semantic space, where a nearest-neighbor classifier selects the closest target class. We argue that this approach is sub-optimal because it applies nearest neighbor to a static semantic space, and that it is ineffective on multi-label videos, where two semantically distinct co-occurring action categories cannot both be predicted with high confidence. To overcome these limitations, we propose a ZSAR framework that does not rely on nearest-neighbor classification but instead consists of a pairwise scoring function. Given a video and a set of action classes, our method predicts a confidence score for each class independently. This allows several semantically distinct classes to be predicted for one video input. Our evaluations show that our method not only achieves strong performance on three single-label action classification datasets (UCF-101, HMDB, and RareAct), but also outperforms previous ZSAR approaches on a challenging multi-label dataset (AVA) and a real-world surprise activity detection dataset (MEVA).
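
The contrast the abstract draws between nearest-neighbor classification and independent pairwise scoring can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the 3-d "semantic space", the class vectors, and the scoring function (a scaled cosine similarity passed through a sigmoid, with hypothetical `scale` and `bias` values standing in for learned parameters). The abstract does not specify the paper's actual scoring function, so this is not the authors' method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest_neighbor(video_vec, class_vecs):
    """Baseline: return the single class whose embedding is closest to the video."""
    return max(class_vecs, key=lambda name: cosine(video_vec, class_vecs[name]))

def pairwise_scores(video_vec, class_vecs, scale=5.0, bias=-2.5):
    """Score each (video, class) pair independently with a sigmoid.

    scale and bias are hypothetical stand-ins for learned parameters.
    """
    scores = {}
    for name, vec in class_vecs.items():
        logit = scale * cosine(video_vec, vec) + bias
        scores[name] = 1.0 / (1.0 + math.exp(-logit))
    return scores

# Toy semantic space with three mutually distinct action classes (assumed values).
class_vecs = {
    "run": [1.0, 0.0, 0.0],
    "talk": [0.0, 1.0, 0.0],
    "swim": [0.0, 0.0, 1.0],
}
# A video in which "run" and "talk" co-occur (assumed embedding).
video_vec = [0.75, 0.65, 0.05]

top1 = nearest_neighbor(video_vec, class_vecs)      # one label only
scores = pairwise_scores(video_vec, class_vecs)     # one confidence per class
predicted = sorted(name for name, s in scores.items() if s >= 0.5)

print(top1)
print(predicted)
```

In this toy setup the nearest-neighbor rule can only return one class, while thresholding the independent scores can return both co-occurring classes, because the scores are not forced to compete with each other.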

The talk and the respective paper are published at the NeurIPS 2021 virtual conference.

Similar Papers

02/02/2021

Weakly-supervised Temporal Action Localization by Uncertainty Modeling

Pilhyeon Lee, Jinglu Wang, Yan Lu, Hyeran Byun

14:01

19/08/2021

Self-Supervised Video Action Localization with Adversarial Temporal Transforms

Guoqiang Gong, Liangfeng Zheng, Wenhao Jiang, Yadong Mu

Keywords: Computer Vision, Action Recognition, Video

14:39

02/02/2021

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context

Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

19:49

19/08/2021

Reinforcement Learning Based Sparse Black-box Adversarial Attack on Video Recognition Models

Zeyuan Wang, Chaofeng Sha, Su Yang

Keywords: Machine Learning, Adversarial Machine Learning, Applications of Reinforcement Learning

12:50

05/01/2021

Only Time Can Tell: Discovering Temporal Data for Temporal Modeling

Laura Sevilla-Lara, Shengxin Zha, Zhicheng Yan, Vedanuj Goswami, Matt Feiszli, Lorenzo Torresani

4:14

14/06/2020

Weakly-Supervised Action Localization by Generative Attention Modeling

Baifeng Shi, Qi Dai, Yadong Mu, Jingdong Wang

Keywords: action localization, weakly-supervised, action-context confusion, vae, generative

0:58

19/08/2021

Learning Implicit Temporal Alignment for Few-shot Video Classification

Songyang Zhang, Jiale Zhou, Xuming He

Keywords: Computer Vision, Action Recognition, Deep Learning

6:20

02/02/2021

Semantic Grouping Network for Video Captioning

Hobin Ryu, Sunghun Kang, Haeyong Kang, Chang D. Yoo

17:41

22/11/2021

Zero-Shot Action Recognition from Diverse Object-Scene Compositions

Carlo Bretti, Pascal Mettes

Keywords: action recognition, zero-shot learning, object-scene compositions

2:43

05/01/2021

Self-Supervised Training for Blind Multi-Frame Video Denoising

Valery Dewil, Jeremy Anger, Axel Davy, Thibaud Ehret, Gabriele Facciolo, Pablo Arias

5:02

14/06/2020

Evolving Losses for Unsupervised Video Representation Learning

AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo

Keywords: unsupervised, video, representation learning, multi-task, multimodal

5:01

14/06/2020

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

Mohsen Fayyaz, Jürgen Gall

Keywords: action segmentation, action recognition, weakly supervised, set

1:01

02/02/2021

Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation

Fanchao Lin, Hongtao Xie, Yan Li, Yongdong Zhang

14:19

02/02/2021

Spatial-temporal Causal Inference for Partial Image-to-video Adaptation

Jin Chen, Xinxiao Wu, Yao Hu, Jiebo Luo

20:01

02/02/2021

Generalized Zero-Shot Learning via Disentangled Representation

Xiangyu Li, Zhe Xu, Kun Wei, Cheng Deng

12:10

06/12/2020

ContraGAN: Contrastive Learning for Conditional Image Generation

Minguk Kang, Jaesik Park

Keywords: Neuroscience and Cognitive Science -> Brain Mapping, Neuroscience and Cognitive Science -> Visual Perception

3:21

03/05/2021

Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization

Juntae Lee, Mihir Jain, Hyoungwoo Park, Sungrack Yun

Keywords: Action localization, Multimodal Attention, Audio-Visual, Weak-supervision, Event localization

5:11

14/06/2020

Learning a Weakly-Supervised Video Actor-Action Segmentation Model With a Wise Selection

Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu

Keywords: video object segmentation, video actor action segmentation, weakly-supervised learning, action recognition, non-reference metric, attention map, self-supervised learning, video understanding, action localization, pseudo-annotation

5:00

14/06/2020

Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention

Dat Huynh, Ehsan Elhamifar

Keywords: zero-shot learning, few-shot learning, fine-grained recognition, transfer learning, attention

1:01

14/06/2020

Dense Regression Network for Video Grounding

Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan

Keywords: video grounding, sparse annotations, dense regression, multi-level fusion

0:57

02/02/2021

Contrastive Transformation for Self-supervised Correspondence Learning

Ning Wang, Wengang Zhou, Houqiang Li

13:41

14/06/2020

Transformation GAN for Unsupervised Image Synthesis and Representation Learning

Jiayu Wang, Wengang Zhou, Guo-Jun Qi, Zhongqian Fu, Qi Tian, Houqiang Li

Keywords: gan, unsupervised learning, representation learning

1:00

06/12/2021

CLIP-It! Language-Guided Video Summarization

Medhini Narasimhan, Anna Rohrbach, Trevor Darrell

Keywords: transformers

6:14

06/12/2020

Make One-Shot Video Object Segmentation Efficient Again

Tim Meinhardt, Laura Leal-Taixé

3:17

06/12/2021

Self-Supervised Multi-Object Tracking with Cross-input Consistency

Favyen Bastani, Songtao He, Samuel Madden

Keywords: self-supervised learning

14:59

22/11/2021

Noise-Aware Video Saliency Prediction

Ekta Prashnani, Orazio Gallo, Joohwan Kim, Josef Spjut, Pradeep Sen, Iuri Frosio

Keywords: video saliency prediction, video game saliency, video saliency dataset, noise-aware training, learning from noisy labels, gaze data acquisition

2:58

06/12/2020

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran

Keywords: Applications -> Computer Vision

3:17

02/02/2021

Task Aligned Generative Meta-learning for Zero-shot Learning

Zhe Liu, Yun Li, Lina Yao, Xianzhi Wang, Guodong Long

15:48

14/06/2020

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning

Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang

Keywords: generalized zero-shot learning, domain detection, recognition, segmentation, margin loss, bilinear pooling, nas, transfer learning, domain adaption, computer vision

0:58

06/12/2020

Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding

Zhu Zhang, Zhou Zhao, Zhijie Lin, Jieming Zhu, Xiuqiang He

3:14

22/11/2021

SVD-GAN for Real-Time Unsupervised Video Anomaly Detection

Dinesh Jackson Samuel, Fabio Cuzzolin

Keywords: Unsupervised anomaly detection, SVD-GAN, depth-wise separable convolutions, spatiotemporal features, GAN convergence, Singular Value Decomposition loss, GAN reconstruction, lightweight GAN model, minimized KL divergence

2:54

02/02/2021

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization

Ziyi Liu, Le Wang, Qilin Zhang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

18:34

22/11/2021

Fine-grained Multi-Modal Self-Supervised Learning

Duo Wang, Salah Karout

Keywords: self-supervised learning, multi-modal learning

2:46

02/02/2021

Error-Aware Density Isomorphism Reconstruction for Unsupervised Cross-Domain Crowd Counting

Yuhang He, Zhiheng Ma, Xing Wei, Xiaopeng Hong, Wei Ke, Yihong Gong

16:43

05/01/2021

Intra-Class Part Swapping for Fine-Grained Image Classification

Lianbo Zhang, Shaoli Huang, Wei Liu

4:43

14/06/2020

Syntax-Aware Action Targeting for Video Captioning

Qi Zheng, Chaoyue Wang, Dacheng Tao

Keywords: video and language, video captioning, action predicting

1:01

02/02/2021

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan

14:14

06/12/2021

Exploring Cross-Video and Cross-Modality Signals for Weakly-Supervised Audio-Visual Video Parsing

Yan-Bo Lin, Hung-Yu Tseng, Hsin-Ying Lee, Yen-Yu Lin, Ming-Hsuan Yang

14:06

30/11/2020

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

Sovan Biswas, Juergen Gall

10:06

02/02/2021

Self-supervised Pre-training and Contrastive Representation Learning for Multiple-choice Video QA

Seonhoon Kim, Seohyeong Jeong, Eunbyul Kim, Inho Kang, Nojun Kwak

15:23