Abstract:
Detecting objects in video is difficult because occlusions and motion blur easily degrade the extracted features. Recent state-of-the-art methods enhance the features of a key frame with those of reference frames using attention modules. However, this feature enhancement operates on features extracted from a fixed backbone, and it is fundamentally hard for a fixed backbone to generate discriminative features for frames of both low and high quality. To mitigate this challenge, we present a meta-learning scheme that learns to adapt the backbone using temporal features. Specifically, we propose to summarise the temporal features into a fixed-size representation, which is then used to adapt the backbone so that it generates discriminative features for both low- and high-quality frames. We demonstrate that the proposed approach can be easily incorporated into the latest temporal aggregation approaches with almost no impact on inference speed. Experiments on the ImageNet VID dataset show consistent gains over state-of-the-art methods.
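To make the adaptation idea concrete, below is a minimal PyTorch sketch of one plausible realisation: reference-frame features are pooled into a fixed-size summary vector, which then produces channel-wise scale and shift parameters (FiLM-style conditioning) applied inside a backbone block. All module names, the pooling-based summariser, and the FiLM-style mechanism are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TemporalSummariser(nn.Module):
    """Pools per-frame reference features into one fixed-size vector
    (a simple illustrative choice, not the paper's exact summariser)."""
    def __init__(self, feat_dim: int, summary_dim: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, summary_dim)

    def forward(self, ref_feats: torch.Tensor) -> torch.Tensor:
        # ref_feats: (num_ref_frames, C, H, W). Global-average-pool each
        # frame spatially, then average over frames for a single summary.
        pooled = ref_feats.mean(dim=(2, 3))       # (num_ref_frames, C)
        return self.proj(pooled.mean(dim=0))      # (summary_dim,)

class AdaptiveBlock(nn.Module):
    """A backbone block whose output is modulated by the temporal summary
    via per-channel scale/shift (an assumed FiLM-style mechanism)."""
    def __init__(self, channels: int, summary_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.to_scale = nn.Linear(summary_dim, channels)
        self.to_shift = nn.Linear(summary_dim, channels)

    def forward(self, x: torch.Tensor, summary: torch.Tensor) -> torch.Tensor:
        h = self.conv(x)                                   # (B, C, H, W)
        scale = self.to_scale(summary).view(1, -1, 1, 1)
        shift = self.to_shift(summary).view(1, -1, 1, 1)
        # Adapt the block's features conditioned on the video-level summary.
        return h * (1 + scale) + shift

# Usage: summarise reference-frame features, then run the key frame
# through the adapted block (shapes are arbitrary placeholders).
summariser = TemporalSummariser(feat_dim=256, summary_dim=128)
block = AdaptiveBlock(channels=256, summary_dim=128)
ref_feats = torch.randn(8, 256, 14, 14)    # 8 reference frames
key_feat = torch.randn(1, 256, 14, 14)     # key-frame feature map
summary = summariser(ref_feats)
adapted = block(key_feat, summary)         # (1, 256, 14, 14)
```

Because the summary is computed once per video and the conditioning adds only two small linear layers per block, a design along these lines would add almost no inference cost, consistent with the claim above.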