RefineLoc: Iterative Refinement for Weakly-Supervised Action Localization

05/01/2021

Alejandro Pardo, Humam Alwassel, Fabian Caba, Ali Thabet, Bernard Ghanem

Abstract: Video action detectors are usually trained using datasets with fully-supervised temporal annotations. Building such datasets is an expensive task. To alleviate this problem, recent methods have tried to leverage weak labeling, where videos are untrimmed and only a video-level label is available. In this paper, we propose RefineLoc, a novel weakly-supervised temporal action localization method. RefineLoc uses an iterative refinement approach by estimating and training on snippet-level pseudo ground truth at every iteration. We show the benefit of this iterative approach and present an extensive analysis of five different pseudo ground truth generators. We show the effectiveness of our model on two standard action datasets, ActivityNet v1.2 and THUMOS14. RefineLoc shows competitive results with the state-of-the-art in weakly-supervised temporal localization. Additionally, our iterative refinement process is able to significantly improve the performance of two state-of-the-art methods, setting a new state-of-the-art on THUMOS14.
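The iterative idea in the abstract (estimate snippet-level pseudo ground truth, train on it, repeat) can be illustrated with a toy self-training loop. This is only a sketch under stated assumptions, not the authors' implementation: the one-dimensional snippet features, the logistic scorer, the fixed threshold used as the pseudo ground truth generator, and all function names are hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def generate_pseudo_labels(scores, threshold=0.5):
    """Hypothetical pseudo ground truth generator: threshold snippet scores."""
    return [1 if s >= threshold else 0 for s in scores]


def train_snippet_scorer(features, pseudo_labels, epochs=200, lr=0.5):
    """Fit a 1-D logistic scorer to the pseudo labels with gradient descent."""
    w, b = 0.0, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, pseudo_labels):
            p = sigmoid(w * x + b)
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def iterative_refinement(features, initial_scores, num_iterations=3):
    """Alternate between generating pseudo labels and retraining on them."""
    scores = list(initial_scores)
    history = []  # pseudo labels used at each iteration
    for _ in range(num_iterations):
        pseudo = generate_pseudo_labels(scores)
        w, b = train_snippet_scorer(features, pseudo)
        # Re-score every snippet with the freshly trained model.
        scores = [sigmoid(w * x + b) for x in features]
        history.append(pseudo)
    return scores, history


if __name__ == "__main__":
    # Toy data: one scalar feature per snippet and noisy initial scores.
    feats = [0.1, 0.2, 0.9, 1.0, 0.8, 0.15]
    init = [0.4, 0.6, 0.7, 0.9, 0.45, 0.3]
    final_scores, label_history = iterative_refinement(feats, init)
    print(label_history)
    print([round(s, 3) for s in final_scores])
```

In this sketch the initial scores stand in for whatever a model trained only on video-level labels would produce; each later iteration reuses the previous model's snippet scores as the source of its training targets.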

The talk and the corresponding paper were published at the WACV 2021 virtual conference.

Similar Papers

14/06/2020

Self-Supervised Monocular Trained Depth Estimation Using Self-Attention and Discrete Disparity Volume

Adrian Johnston, Gustavo Carneiro

Keywords: self-supervised depth estimation, self-supervised learning, self-attention, depth estimation, uncertainty

1:01

14/06/2020

Time Flies: Animating a Still Image With Time-Lapse Video As Reference

Chia-Chi Cheng, Hung-Yu Chen, Wei-Chen Chiu

Keywords: time-lapse video animation, self-supervised learning, style transfer, temporal consistency

1:01

14/06/2020

Few-Shot Video Classification via Temporal Alignment

Kaidi Cao, Jingwei Ji, Zhangjie Cao, Chien-Yi Chang, Juan Carlos Niebles

Keywords: video classification, few-shot learning, action recognition, temporal alignment

0:57

02/02/2021

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding

14:02

14/06/2020

Learning Fast and Robust Target Models for Video Object Segmentation

Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

Keywords: video object segmentation, semi-supervised

4:57

07/09/2020

Making a Case for 3D Convolutions for Object Segmentation in Videos

Sabarinath Mahadevan, Ali Athar, Aljosa Osep, Laura Leal-Taixé, Bastian Leibe, Sebastian Hennen

Keywords: object tracking, video segmentation, video object segmentation, video scene understanding, object segmentation

8:16

02/02/2021

SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning

Ting Yao, Yiheng Zhang, Zhaofan Qiu, Yingwei Pan, Tao Mei

16:17

06/12/2020

Blind Video Temporal Consistency via Deep Video Prior

Chenyang Lei, Yazhou Xing, Qifeng Chen

3:05

14/06/2020

TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model

Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu

Keywords: bounding-tube, mot, one-stage, tube-nms, fcn

4:55

05/01/2021

Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition

Theo Ayral, Marco Pedersoli, Simon Bacon, Eric Granger

5:00

05/01/2021

Temporal Shift GAN for Large Scale Video Generation

Andres Munoz, Mohammadreza Zolfaghari, Max Argus, Thomas Brox

5:01

14/06/2020

Scene-Adaptive Video Frame Interpolation via Meta-Learning

Myungsub Choi, Janghoon Choi, Sungyong Baik, Tae Hyun Kim, Kyoung Mu Lee

Keywords: video frame interpolation, test-time adaptation, meta-learning, self-supervision, image synthesis, slow motion, motion estimation, error correction, maml, input-adaptive neural network

0:55

14/06/2020

Self-Trained Deep Ordinal Regression for End-to-End Video Anomaly Detection

Guansong Pang, Cheng Yan, Chunhua Shen, Anton van den Hengel, Xiao Bai

Keywords: anomaly detection, deep ordinal regression, human-in-the-loop machine learning, anomaly explanation, self-training, unsupervised representation learning, abnormal activity detection, video learning

1:01

14/06/2020

Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction

Ruixu Liu, Ju Shen, He Wang, Chen Chen, Sen-ching Cheung, Vijayan Asari

Keywords: 3d human pose, attention mechanism, multi-scale dilation convolution, monocular motion reconstruction

5:01

22/11/2021

StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN

Gereon Fox, Ayush Tewari, Mohamed Elgharib, Christian Theobalt

Keywords: video generation, StyleGAN, GAN, embedding, faces, hands, cars, RNN

8:07

19/08/2021

Learning Implicit Temporal Alignment for Few-shot Video Classification

Songyang Zhang, Jiale Zhou, Xuming He

Keywords: Computer Vision, Action Recognition, Deep Learning

6:20

26/04/2020

CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning

Rohit Girdhar, Deva Ramanan

Keywords: Video Understanding, Temporal Reasoning

14:56

02/02/2021

Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation

Fanchao Lin, Hongtao Xie, Yan Li, Yongdong Zhang

14:19

14/06/2020

Learning a Weakly-Supervised Video Actor-Action Segmentation Model With a Wise Selection

Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu

Keywords: video object segmentation, video actor action segmentation, weakly-supervised learning, action recognition, non-reference metric, attention map, self-supervised learning, video understanding, action localization, pseudo-annotation

5:00

22/11/2021

Conditional Model Selection for Efficient Video Understanding

Mihir Jain, Haitam Ben Yahia, Amir Ghodrati, Amirhossein Habibian, Fatih Porikli

Keywords: action recognition, efficient classification, efficient localization, conditional compute

2:49

14/06/2020

Set-Constrained Viterbi for Set-Supervised Action Segmentation

Jun Li, Sinisa Todorovic

Keywords: weakly supervised learning, action segmentation, set-constrained viterbi

1:01

02/02/2021

CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation

Yang Fu, Linjie Yang, Ding Liu, Thomas S. Huang, Humphrey Shi

16:24

07/09/2020

Revisiting Temporal Modeling for Video Super-resolution

Takashi Isobe, Fang Zhu, Shengjin Wang

Keywords: Video Super-Resolution, Recurrent Neural Network, Temporal Modeling

5:56

22/11/2021

SVD-GAN for Real-Time Unsupervised Video Anomaly Detection

Dinesh Jackson Samuel, Fabio Cuzzolin

Keywords: Unsupervised anomaly detection, SVD-GAN, depth-wise separable convolutions, spatiotemporal features, GAN convergence, Singular Value Decomposition loss, GAN reconstruction, lightweight GAN model, minimized KL divergence

2:54

06/12/2020

An Unsupervised Information-Theoretic Perceptual Quality Metric

Sangnie Bhardwaj, Ian Fischer, Johannes Ballé, Troy Chinen

3:08

14/06/2020

Rethinking Zero-Shot Video Classification: End-to-End Training for Realistic Applications

Biagio Brattoli, Joseph Tighe, Fedor Zhdanov, Pietro Perona, Krzysztof Chalupka

Keywords: zero-shot learning, video classification, end-to-end, word2vec, visual to semantic, limited supervision, r3d, kinetics, sun, ucf101

1:01

22/11/2021

Knowing What, Where and When to Look: Video Action modelling with Attention

Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu, Antoine S Toisoul, Victor A Escorcia, Tao Xiang

Keywords: Action recognition, Fine-grained action, video attention, Spatial attention, Channel attention, Temporal attention, Spatio-temporal attention, Feature refinement

2:46

14/06/2020

MAST: A Memory-Augmented Self-Supervised Tracker

Zihang Lai, Erika Lu, Weidi Xie

Keywords: self-supervised learning, video segmentation, memory-augmented model, video understanding, tracking, unsupervised learning, generalization, attention, representation learning, metric learning

1:01

14/06/2020

Unsupervised Learning From Video With Deep Neural Embeddings

Chengxu Zhuang, Tianwei She, Alex Andonian, Max Sobol Mark, Daniel Yamins

Keywords: unsupervised learning, self-supervised learning, video learning, contrastive learning, deep neural networks, action recognition, object recognition, two-pathway models

1:01

18/07/2021

Is Space-Time Attention All You Need for Video Understanding?

Gedas Bertasius, Heng Wang, Lorenzo Torresani

Keywords: Algorithms, AutoML, Deep Learning, Architectures

5:15

05/01/2021

Weakly Supervised Deep Reinforcement Learning for Video Summarization With Semantically Meaningful Reward

Zutong Li, Lei Yang

4:54

06/12/2021

Dynamic Normalization and Relay for Video Action Recognition

Dongqi Cai, Anbang Yao, Yurong Chen

Keywords: deep learning, representation learning

10:42

14/06/2020

Learning Multi-Object Tracking and Segmentation From Automatic Annotations

Lorenzo Porzi, Markus Hofinger, Idoia Ruiz, Joan Serrat, Samuel Rota Bulò, Peter Kontschieder

Keywords: multi-object tracking and segmentation, mots, object tracking, instance segmentation, automatic annotations, deep learning

1:01

26/04/2020

On the Relationship between Self-Attention and Convolutional Layers

Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi

Keywords: self-attention, attention, transformers, convolution, CNN, image, expressivity, capacity

5:18

14/06/2020

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

Jiaxu Miao, Yunchao Wei, Yi Yang

Keywords: interactive video object segmentation, pixel embedding learning, memory aggregation networks

0:59

14/06/2020

Video to Events: Recycling Video Datasets for Event Cameras

Daniel Gehrig, Mathias Gehrig, Javier Hidalgo-Carrió, Davide Scaramuzza

Keywords: event camera, video, neuromorphic, low-level vision, frame interpolation, generative modelling

1:00

07/09/2020

Refinement of Boundary Regression Using Uncertainty in Temporal Action Localization

Yunze Chen, Mengjuan Chen, Rui Wu, Jiagang Zhu, Zheng Zhu, Qingyi Gu

Keywords: Temporal Action Localization, Temporal Action Detection, Activity recognition and understanding

5:09

02/02/2021

Temporal ROI Align for Video Object Recognition

Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

14:29

06/12/2021

Associating Objects with Transformers for Video Object Segmentation

Zongxin Yang, Yunchao Wei, Yi Yang

Keywords: transformers

12:29

26/04/2020

Training binary neural networks with real-to-binary convolutions

Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos

Keywords: binary networks

4:41