Self-Supervised Multi-Object Tracking with Cross-input Consistency

06/12/2021

Favyen Bastani, Songtao He, Samuel Madden

Keywords: self-supervised learning

Abstract: In this paper, we propose a self-supervised learning procedure for training a robust multi-object tracking (MOT) model given only unlabeled video. While several self-supervisory learning signals have been proposed in prior work on single-object tracking, such as color propagation and cycle-consistency, these signals are not effective for training RNN models, which are needed to achieve accurate MOT: they yield degenerate models that, for instance, always match new detections to tracks with the closest initial detections. We propose a novel self-supervisory signal that we call cross-input consistency: we construct two distinct inputs for the same sequence of video by hiding different information about the sequence in each input. We then compute tracks in that sequence by applying an RNN model independently on each input, and train the model to produce consistent tracks across the two inputs. We evaluate our unsupervised method on MOT17 and KITTI. Remarkably, we find that, despite training only on unlabeled video, our unsupervised approach outperforms four supervised methods published in the last 1-2 years, including Tracktor++, FAMNet, GSM, and mmMOT.
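
To make the idea of cross-input consistency concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy setting in which one input view keeps only box positions and the other keeps only appearance vectors (the abstract does not say which information is hidden), replaces the RNN with a hypothetical distance-based matcher, and uses a symmetric cross-entropy as the consistency loss. All function names are illustrative.

```python
import numpy as np

def row_softmax(scores):
    """Turn a (tracks x detections) score matrix into row-wise match probabilities."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def match_probs(track_feats, det_feats):
    """Hypothetical stand-in for the tracker: score = negative squared distance."""
    diff = track_feats[:, None, :] - det_feats[None, :, :]
    return row_softmax(-(diff ** 2).sum(axis=2))

def consistency_loss(p_a, p_b, eps=1e-9):
    """Symmetric cross-entropy between the association matrices of two views."""
    ce_ab = -(p_a * np.log(np.clip(p_b, eps, 1.0))).sum(axis=1).mean()
    ce_ba = -(p_b * np.log(np.clip(p_a, eps, 1.0))).sum(axis=1).mean()
    return 0.5 * (ce_ab + ce_ba)

# Toy data (assumed): 3 tracks and 3 detections, each with a 2-D position
# and a 4-D appearance vector; detections are slightly perturbed tracks.
rng = np.random.default_rng(0)
track_pos, track_app = rng.normal(size=(3, 2)), rng.normal(size=(3, 4))
det_pos = track_pos + 0.01 * rng.normal(size=(3, 2))
det_app = track_app + 0.01 * rng.normal(size=(3, 4))

p_pos = match_probs(track_pos, det_pos)  # view A: appearance hidden
p_app = match_probs(track_app, det_app)  # view B: position hidden
loss = consistency_loss(p_pos, p_app)    # small when both views agree on matches
```

In a trainable version, the loss would be backpropagated through the model applied to each view, so that neither view can be matched correctly by a shortcut that the other view lacks.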

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2020

The Autoencoding Variational Autoencoder

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy Dvijotham, Sven Gowal, Pushmeet Kohli

3:32

05/01/2021

Self-Supervised Training for Blind Multi-Frame Video Denoising

Valery Dewil, Jeremy Anger, Axel Davy, Thibaud Ehret, Gabriele Facciolo, Pablo Arias

5:02

22/11/2021

Deep Video Decaptioning

Pengpeng Chu, Weize Quan, Tong Wang, Pan Wang, Peiran Ren, Dong-Ming Yan

Keywords: video decaptioning, caption mask extraction, frame attention, real time

2:59

02/02/2021

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan

14:14

07/09/2020

Procedure Completion by Learning from Partial Summaries

Ehsan Elhamifar, Zwe Naing

Keywords: procedure learning, instructional videos, summarization, subset selection, representation learning, partial summaries

7:34

06/12/2020

Robust Disentanglement of a Few Factors at a Time

Benjamin Estermann, Markus Marks, Mehmet Fatih Yanik

3:22

02/02/2021

Weakly-supervised Temporal Action Localization by Uncertainty Modeling

Pilhyeon Lee, Jinglu Wang, Yan Lu, Hyeran Byun

14:01

22/11/2021

Few-Shot Temporal Action Localization with Query Adaptive Transformer

Sauradip Nag, Xiatian Zhu, Tao Xiang

Keywords: temporal action localization, few shot learning, transformer, class imbalance, meta learning, action detection

2:56

02/02/2021

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context

Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

19:49

14/06/2020

ActionBytes: Learning From Trimmed Videos to Localize Actions

Mihir Jain, Amir Ghodrati, Cees G. M. Snoek

Keywords: action localization, weakly-supervised, self-supervised learning, action proposals, zero-shot, thumos14, activitynet, multithumos, self-training, temporal segmentation

1:01

06/12/2021

Reformulating Zero-shot Action Recognition for Multi-label Actions

Alec Kerrigan, Kevin Duarte, Yogesh Rawat, Mubarak Shah

Keywords: machine learning, vision

15:01

06/12/2021

STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data

Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, Shiliang Pu

Keywords: optimization, semi-supervised learning

11:24

14/06/2020

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang

Keywords: nice-gan, reusing discriminators for encoding, unsupervised image-to-image translation, decoupled training, multi-scale discriminators, adversarial loss, no independent component for encoding, shared layers, residual attention, cyclegan

1:01

19/08/2021

Self-Supervised Video Action Localization with Adversarial Temporal Transforms

Guoqiang Gong, Liangfeng Zheng, Wenhao Jiang, Yadong Mu

Keywords: Computer Vision, Action Recognition, Video

14:39

06/12/2020

Make One-Shot Video Object Segmentation Efficient Again

Tim Meinhardt, Laura Leal-Taixé

3:17

14/06/2020

Action Modifiers: Learning From Adverbs in Instructional Videos

Hazel Doughty, Ivan Laptev, Walterio Mayol-Cuevas, Dima Damen

Keywords: vision and language, video understanding, action recognition, action retrieval, instructional videos, weakly-supervised videos, action and behaviour, attributes, attention, adverbs

1:01

03/05/2021

Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning

Dong Bok Lee, Dongchan Min, Seanie Lee, Sung Ju Hwang

Keywords: Unsupervised Learning, Variational Autoencoders, Unsupervised Meta-learning, Meta-Learning

13:31

14/06/2020

ZSTAD: Zero-Shot Temporal Activity Detection

Lingling Zhang, Xiaojun Chang, Jun Liu, Minnan Luo, Sen Wang, Zongyuan Ge, Alexander Hauptmann

Keywords: zero-shot learning, temporal activity detection, r-c3d, super class

1:01

05/01/2021

Noisy Concurrent Training for Efficient Learning Under Label Noise

Fahad Sarfraz, Elahe Arani, Bahram Zonooz

5:00

14/06/2020

Non-Adversarial Video Synthesis With Learned Priors

Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman Asif, Amit K. Roy-Chowdhury

Keywords: video synthesis, non-adversarial learning, generative network, latent space, triplet condition

0:58

19/08/2021

Learning Implicit Temporal Alignment for Few-shot Video Classification

Songyang Zhang, Jiale Zhou, Xuming He

Keywords: Computer Vision, Action Recognition, Deep Learning

6:20

14/06/2020

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

Mohsen Fayyaz, Jürgen Gall

Keywords: action segmentation, action recognition, weakly supervised, set

1:01

30/11/2020

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

Sovan Biswas, Juergen Gall

10:06

06/12/2020

Improving Generalization in Reinforcement Learning with Mixture Regularization

KAIXIN WANG, Bingyi Kang, Jie Shao, Jiashi Feng

3:14

06/12/2020

Adversarial Self-Supervised Contrastive Learning

Minseon Kim, Jihoon Tack, Sung Ju Hwang

3:19

03/05/2021

Self-Supervised Learning of Compressed Video Representations

Youngjae Yu, Sangho Lee, Gunhee Kim, Yale Song

Keywords: self-supervised learning, Compressed videos

4:34

02/02/2021

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization

Ziyi Liu, Le Wang, Qilin Zhang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

18:34

02/02/2021

A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization

Ashraful Islam, Chengjiang Long, Richard Radke

16:53

05/12/2020

Systematic generalization on gSCAN with language conditioned embedding

Tong Gao, Qi Huang, Raymond Mooney

14:19

18/07/2021

On Recovering from Modeling Errors Using Testing Bayesian Networks

Haiying Huang, Adnan Darwiche

Keywords: Probabilistic Methods, Graphical Models

5:09

14/06/2020

Straight to the Point: Fast-Forwarding Videos via Reinforcement Learning Using Textual Data

Washington Ramos, Michel Silva, Edson Araujo, Leandro Soriano Marcolino, Erickson Nascimento

Keywords: video fast-forwarding, vision and language, reinforcement learning, multi-modal embedding, hyperlapse, video processing, video acceleration, textual-visual embedding space, reinforce, instructional videos

1:01

14/06/2020

Dense Regression Network for Video Grounding

Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan

Keywords: video grounding, sparse annotations, dense regression, multi-level fusion

0:57

14/06/2020

Few-Shot Video Classification via Temporal Alignment

Kaidi Cao, Jingwei Ji, Zhangjie Cao, Chien-Yi Chang, Juan Carlos Niebles

Keywords: video classification, few-shot learning, action recognition, temporal alignment

0:57

07/09/2020

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang

Keywords: Spatio-Temporal Action Localization, Unsupervised Domain Adaptation, Adversarial Learning, Video Analysis, Deep Learning

9:28

14/06/2020

Set-Constrained Viterbi for Set-Supervised Action Segmentation

Jun Li, Sinisa Todorovic

Keywords: weakly supervised learning, action segmentation, set-constrained viterbi

1:01

22/11/2021

Deep Motion Blind Video Stabilization

Muhammad Kashif Ali, Sangjoon Yu, Tae Hyun Kim

Keywords: Video Stabilization, Video enhancement, Temporally Consistent Video Generation

3:03

02/02/2021

A Case Study of the Shortcut Effects in Visual Commonsense Reasoning

Keren Ye, Adriana Kovashka

14:26

26/04/2020

Meta-Learning without Memorization

Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

Keywords: meta-learning, memorization, regularization, overfitting, mutually-exclusive

5:09

14/06/2020

Self2Self With Dropout: Learning Self-Supervised Denoising From Single Image

Yuhui Quan, Mingqin Chen, Tongyao Pang, Hui Ji

Keywords: image denoising, deep learning, unsupervised learning, self-supervised learning, single-image learning

1:01

14/06/2020

Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen

Keywords: weakly supervised semantic segmentation, self-supervision, self-attention

4:55