22/11/2021

Dynamic Graph Warping Transformer for Video Alignment

Junyan Wang, Yang Long, Maurice Pagnucco, Yang Song

Keywords: Video alignment, Transformer, Graph Neural Network

Abstract: Video alignment aims to match synchronised action information between multiple video sequences. Existing methods are typically based on supervised learning to align video frames according to annotated action phases. However, such phase-level annotation cannot effectively guide frame-level alignment, since each phase can be completed at different speeds across individuals. In this paper, we introduce dynamic warping to take between-video information into account with a new Dynamic Graph Warping Transformer (DGWT) network model. Our approach is the first Graph Transformer framework designed for video analysis and alignment. In particular, a novel dynamic warping loss function is designed to align videos of arbitrary length using attention-level features. A Temporal Segment Graph (TSG) is proposed to enable the adjacency matrix to cope with temporal information in video data. Our experimental results on two public datasets (Penn Action and Pouring) demonstrate significant improvements over state-of-the-art approaches.
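The abstract does not specify the form of the dynamic warping loss, so the sketch below is only a rough illustration of the general idea: a soft dynamic-warping alignment cost between two frame-feature sequences of arbitrary lengths. It follows the standard soft-DTW recursion (Cuturi and Blondel, 2017) rather than the authors' attention-level formulation, and all names here (soft_min, dynamic_warping_cost, gamma) are illustrative, not from the paper.

import numpy as np

def soft_min(values, gamma=0.1):
    # Differentiable soft minimum: -gamma * logsumexp(-values / gamma).
    v = -np.asarray(values, dtype=float) / gamma
    m = v.max()
    return -gamma * (m + np.log(np.exp(v - m).sum()))

def dynamic_warping_cost(x, y, gamma=0.1):
    # Alignment cost between feature sequences x of shape (T1, d)
    # and y of shape (T2, d); the sequences may have different lengths.
    t1, t2 = len(x), len(y)
    # Pairwise squared-Euclidean distances between frame features.
    dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    # r[i, j] = soft-minimal cumulative cost of aligning x[:i] with y[:j].
    r = np.full((t1 + 1, t2 + 1), np.inf)
    r[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            r[i, j] = dist[i - 1, j - 1] + soft_min(
                [r[i - 1, j], r[i, j - 1], r[i - 1, j - 1]], gamma)
    return r[t1, t2]

# Two "videos" of different lengths with 8-dimensional frame features.
rng = np.random.default_rng(0)
a = rng.normal(size=(30, 8))
b = rng.normal(size=(45, 8))
print(dynamic_warping_cost(a, b))

Replacing the hard minimum in classic dynamic time warping with the soft minimum makes the cumulative cost differentiable, which is what allows a warping-based objective to be trained end to end over learned (e.g. attention-level) features.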
