Actor-Transformers for Group Activity Recognition

14/06/2020

Kirill Gavrilyuk, Ryan Sanford, Mehrsan Javan, Cees G. M. Snoek

Keywords: group activity recognition, action recognition, transformer, pose, 3d cnn

Abstract: This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on the location of individual actors, we propose an actor-transformer model able to learn and selectively extract information relevant for group activity recognition. We feed the transformer with rich actor-specific static and dynamic representations expressed by features from a 2D pose network and 3D CNN, respectively. We empirically study different ways to combine these representations and show their complementary benefits. Experiments show what is important to transform and how it should be transformed. What is more, actor-transformers achieve state-of-the-art results on two publicly available benchmarks for group activity recognition, outperforming the previous best published results by a considerable margin.

The talk and the respective paper are published at the CVPR 2020 virtual conference.

Similar Papers

02/02/2021

Learning Visual Context for Group Activity Recognition

Hangjie Yuan, Dong Ni

16:54

16/11/2020

Attentional Separation-and-Aggregation Network for Self-supervised Depth-Pose Learning in Dynamic Scenes

Feng Gao, Jincheng Yu, Hao Shen, Yu Wang, Huazhong Yang

4:39

14/06/2020

Syntax-Aware Action Targeting for Video Captioning

Qi Zheng, Chaoyue Wang, Dacheng Tao

Keywords: video and language, video captioning, action predicting

1:01

07/09/2020

Attentive Action and Context Factorization

Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai Nguyen

Keywords: action factorization, attention, conjugate samples

9:59

14/06/2020

Intra- and Inter-Action Understanding via Temporal Action Parsing

Dian Shao, Yue Zhao, Bo Dai, Dahua Lin

Keywords: action understanding, temporal action parsing, action dataset

1:00

14/06/2020

Searching for Actions on the Hyperbole

Teng Long, Pascal Mettes, Heng Tao Shen, Cees G. M. Snoek

Keywords: video retrieval, hyperbolic learning, hierarchical, zero-shot learning, action recognition, hyperbolic geometry

1:00

06/12/2021

Associating Objects with Transformers for Video Object Segmentation

Zongxin Yang, Yunchao Wei, Yi Yang

Keywords: transformers

12:29

30/11/2020

Mask-Ranking Network for Semi-Supervised Video Object Segmentation

Wenjing Li, Xiang Zhang, Yujie Hu, Yingqi Tang

5:36

14/06/2020

Video Instance Segmentation Tracking With a Modified VAE Architecture

Chung-Ching Lin, Ying Hung, Rogerio Feris, Linglin He

Keywords: video instance segmentation, video object tracking, variational autoencoder, vae, gaussian process, multi-task learning

1:01

06/12/2021

End-to-end Multi-modal Video Temporal Grounding

Yi-Wen Chen, Yi-Hsuan Tsai, Ming-Hsuan Yang

Keywords: self-supervised learning, transformers, vision, contrastive learning

8:46

05/01/2021

3D Human Pose and Shape Estimation Through Collaborative Learning and Multi-View Model-Fitting

Zhongguo Li, Magnus Oskarsson, Anders Heyden

5:13

12/07/2020

Feature-map-level Online Adversarial Knowledge Distillation

Inseop Chung, SeongUk Park, Jangho Kim, Nojun Kwak

Keywords: Applications - Computer Vision

14:06

02/02/2021

CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation

Yang Fu, Linjie Yang, Ding Liu, Thomas S. Huang, Humphrey Shi

16:24

06/12/2021

Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Aadarsh Sahoo, Rutav Shah, Rameswar Panda, Kate Saenko, Abir Das

Keywords: domain adaptation, contrastive learning

13:20

02/02/2021

Adversarial Pose Regression Network for Pose-Invariant Face Recognitions

Pengyu Li, Biao Wang, Lei Zhang

15:17

14/06/2020

Deep Image Spatial Transformation for Person Image Generation

Yurui Ren, Xiaoming Yu, Junming Chen, Thomas H. Li, Ge Li

Keywords: pose transfer, image animation, spatial transformation, local attention, novel view synthesis, pose-guided person image generation

1:00

02/02/2021

Generalized Adversarially Learned Inference

Yatin Dandi, Homanga Bharadhwaj, Abhishek Kumar, Piyush Rai

16:22

22/11/2021

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

Khoa HV Vo, Hyekang Joo, Kashu Yamazaki, Sang Q Truong, Kris Kitani, Minh-Triet Tran, Ngan Le

Keywords: temporal action proposal, temporal action detection, video understanding

9:10

06/12/2020

Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation

Yuxi Li, Ning Xu, Jinlong Peng, John See, Weiyao Lin

2:56

22/11/2021

Hierarchical Contrastive Motion Learning for Video Action Recognition

Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

Keywords: action recognition, motion hierarchy, motion representation, contrastive learning

8:29

02/02/2021

Augmented Partial Mutual Learning with Frame Masking for Video Captioning

Ke Lin, Zhuoxin Gan, Liwei Wang

16:57

30/11/2020

Learning End-to-End Action Interaction by Paired-Embedding Data Augmentation

Ziyang Song, Zejian Yuan, Chong Zhang, Wanchao Chi, Yonggen Ling, Shenghao Zhang

6:47

22/11/2021

Few-Shot Temporal Action Localization with Query Adaptive Transformer

Sauradip Nag, Xiatian Zhu, Tao Xiang

Keywords: temporal action localization, few shot learning, transformer, class imbalance, meta learning, action detection

2:56

16/11/2020

Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation

Jiexiong Tang, Rareș Ambruș, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Patric Jensfelt, Adrien Gaidon

5:05

02/02/2021

Regional Attention with Architecture-Rebuilt 3D Network for RGB-D Gesture Recognition

Benjia Zhou, Yunan Li, Jun Wan

13:16

30/11/2020

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

10:03

07/09/2020

Attention Distillation for Learning Video Representations

Miao Liu, Xin Chen, Yun Zhang, Yin Li, James Rehg

Keywords: Action Recognition, Deep Learning, Representation Learning

9:50

16/11/2020

CoT-AMFlow: Adaptive Modulation Network with Co-Teaching Strategy for Unsupervised Optical Flow Estimation

Hengli Wang, Rui Fan, Ming Liu

4:57

02/02/2021

Proposal-Free Video Grounding with Contextual Pyramid Network

Kun Li, Dan Guo, Meng Wang

14:19

05/01/2021

Coarse Temporal Attention Network (CTA-Net) for Driver's Activity Recognition

Zachary Wharton, Ardhendu Behera, Yonghuai Liu, Nik Bessis

5:30

14/06/2020

ActionBytes: Learning From Trimmed Videos to Localize Actions

Mihir Jain, Amir Ghodrati, Cees G. M. Snoek

Keywords: action localization, weakly-supervised, self-supervised learning, action proposals, zero-shot, thumos14, activitynet, multithumos, self-training, temporal segmentation

1:01

14/06/2020

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs

Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles

Keywords: action recognition, scene graph, video understanding, relationships, composition, action, activity, video

1:01

06/12/2021

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

Shih-Yang Su, Frank Yu, Michael Zollhoefer, Helge Rhodin

Keywords: deep learning, vision, generative model

8:02

22/11/2021

LARNet: Latent Action Representation for Human Action Synthesis

Naman Biyani, Aayush Jung Bahadur Rana, Shruti Vyas, Yogesh Rawat

Keywords: action synthesis, video synthesis, joint generative model, human action generation, end-to-end learning, conditional video generation

3:02

05/01/2021

Hand Pose Guided 3D Pooling for Word-Level Sign Language Recognition

Al Amin Hosain, Panneer Selvam Santhalingam, Parth Pathak, Huzefa Rangwala, Jana Kosecka

4:39

12/07/2020

ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

Tonghan Wang, Heng Dong, Victor Lesser, Chongjie Zhang

Keywords: Planning, Control, and Multiagent Learning

13:48

14/06/2020

AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation

Hyeongmin Lee, Taeoh Kim, Tae-young Chung, Daehyun Pak, Yuseok Ban, Sangyoun Lee

Keywords: video frame interpolation, video temporal super-resolution, frame rate up conversion, frame synthesis, motion estimation, motion compensation, frame warping

1:01

14/06/2020

Light-weight Calibrator: A Separable Component for Unsupervised Domain Adaptation

Shaokai Ye, Kailu Wu, Mu Zhou, Yunfei Yang, Sia Huat Tan, Kaidi Xu, Jiebo Song, Chenglong Bao, Kaisheng Ma

Keywords: domain adaptation, adversarial attack, adversarial learning, unsupervised learning, model compression, generative adversarial networks

0:51

06/12/2021

Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering

Youngjoong Kwon, Dahun Kim, Duygu Ceylan, Henry Fuchs

Keywords: transformers, vision

11:55

05/01/2021

Mask Selection and Propagation for Unsupervised Video Object Segmentation

Shubhika Garg, Vidit Goel

4:38