ALBA: Reinforcement Learning for Video Object Segmentation

Abstract: We consider the challenging problem of zero-shot video object segmentation (VOS): segmenting and tracking multiple moving objects within a video fully automatically, without any manual initialization. We treat this as a grouping problem by exploiting object proposals and making a joint inference about grouping over both space and time. We propose a network architecture for tractably performing proposal selection and joint grouping. Crucially, we then show how to train this network with reinforcement learning so that it learns to perform the optimal non-myopic sequence of grouping decisions to segment the whole video. Unlike standard supervised techniques, this also enables us to directly optimize for the non-differentiable overlap-based metrics used to evaluate VOS. We show state-of-the-art results on the DAVIS-2017 and YouTube-VOS benchmarks.
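
As an illustration of the overlap-based metrics mentioned in the abstract, the following is a minimal sketch of region similarity (Jaccard index, or intersection-over-union) between binary masks. It counts discrete pixels, so it gives no useful gradient with respect to the mask, which is why it is more naturally used as a reward than as a supervised loss. The abstract does not specify the reward actually used; the function names, the list-of-lists mask type, the empty-mask convention, and the per-video averaging in `episode_reward` are all assumptions made for this example.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def jaccard(pred: Mask, gt: Mask) -> float:
    """Intersection-over-union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def episode_reward(pred_masks: List[Mask], gt_masks: List[Mask]) -> float:
    """Hypothetical episode reward: mean per-frame IoU over a video."""
    scores = [jaccard(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)
```

A policy-gradient method could use a scalar such as `episode_reward` directly, since it only needs the value of the metric and not its derivative.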

18/07/2021

22/11/2021

Shreyank Gowda, Panagiotis Eustratiadis, Timothy Hospedales, Laura Sevilla-Lara

Similar Papers

Compositional Video Synthesis with Action Graphs

Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Prof. Darrell, Amir Globerson

Keywords: Applications, Computer Vision

Searching for Actions on the Hyperbole

Teng Long, Pascal Mettes, Heng Tao Shen, Cees G. M. Snoek

Keywords: video retrieval, hyperbolic learning, hierarchical, zero-shot learning, action recognition, hyperbolic geometry

Make One-Shot Video Object Segmentation Efficient Again

Tim Meinhardt, Laura Leal-Taixé

Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching

Xuhua Huang, Jiarui Xu, Yu-Wing Tai, Chi-Keung Tang

Keywords: video object segmentation, tracking, segmentation, detection, semi-supervised learning

ZSTAD: Zero-Shot Temporal Activity Detection

Lingling Zhang, Xiaojun Chang, Jun Liu, Minnan Luo, Sen Wang, Zongyuan Ge, Alexander Hauptmann

Keywords: zero-shot learning, temporal activity detection, r-c3d, super class

Error-Aware Density Isomorphism Reconstruction for Unsupervised Cross-Domain Crowd Counting

Yuhang He, Zhiheng Ma, Xing Wei, Xiaopeng Hong, Wei Ke, Yihong Gong

Self-Supervised Multi-Object Tracking with Cross-input Consistency

Favyen Bastani, Songtao He, Samuel Madden

Keywords: self-supervised learning

ActionBytes: Learning From Trimmed Videos to Localize Actions

Mihir Jain, Amir Ghodrati, Cees G. M. Snoek

Keywords: action localization, weakly-supervised, self-supervised learning, action proposals, zero-shot, thumos14, activitynet, multithumos, self-training, temporal segmentation

Unsupervised computation of salient motion maps from the interpretation of a frame-based classification network

Etienne Meunier, Patrick Bouthemy

Keywords: Motion saliency, motion segmentation, interpretation neural network, LRP

Evolving Losses for Unsupervised Video Representation Learning

AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo

Keywords: unsupervised, video, representation learning, multi-task, multimodal

Unsupervised Co-part Segmentation through Assembly

Qingzhe Gao, Bin Wang, Libin Liu, Baoquan Chen

Keywords: Applications, Computer Vision

Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching

Bofeng Wu, Guocheng Niu, Jun Yu, Xinyan Xiao, Jian Zhang, Hua Wu

Keywords: Computer Vision, Language and Vision, Multi-instance; Multi-label; Multi-view learning

Procedure Completion by Learning from Partial Summaries

Ehsan Elhamifar, Zwe Naing

Keywords: procedure learning, instructional videos, summarization, subset selection, representation learning, partial summaries

Learning to transfer graph embeddings for inductive graph based recommendation

Le Wu, Yonghui Yang, Lei Chen, Defu Lian, Richang Hong, Meng Wang

Keywords: graph neural network, content based recommendation, inductive graph learning

ROLL: Visual Self-Supervised Reinforcement Learning with Object Reasoning

Yufei Wang, Narasimhan Gautham, Xingyu Lin, Brian Okorn, David Held

Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context

Chenchen Liu, Yang Jin, Kehan Xu, Guoqiang Gong, Yadong Mu

Keywords: video visual relation detection, visual relation detection, deep learning

Reformulating Zero-shot Action Recognition for Multi-label Actions

Alec Kerrigan, Kevin Duarte, Yogesh Rawat, Mubarak Shah

Keywords: machine learning, vision

Zero-Shot Action Recognition from Diverse Object-Scene Compositions

Carlo Bretti, Pascal Mettes

Keywords: action recognition, zero-shot learning, object-scene compositions

Weakly-supervised Temporal Action Localization by Uncertainty Modeling

Pilhyeon Lee, Jinglu Wang, Yan Lu, Hyeran Byun

Meta Learning Backpropagation And Improving It

Louis Kirsch, Jürgen Schmidhuber

Keywords: deep learning, optimization, generative model, meta learning

