Unsupervised Motion Representation Learning with Capsule Autoencoders

06/12/2021

Ziwei Xu, Xudong Shen, Yongkang Wong, Mohan Kankanhalli

Keywords: self-supervised learning, representation learning

Abstract: We propose the Motion Capsule Autoencoder (MCAE), which addresses a key challenge in the unsupervised learning of motion representations: transformation invariance. MCAE models motion in a two-level hierarchy. In the lower level, a spatio-temporal motion signal is divided into short, local, and semantic-agnostic snippets. In the higher level, the snippets are aggregated to form full-length semantic-aware segments. For both levels, we represent motion with a set of learned transformation-invariant templates and the corresponding geometric transformations by using capsule autoencoders of a novel design. This leads to a robust and efficient encoding of viewpoint changes. MCAE is evaluated on a novel Trajectory20 motion dataset and various real-world skeleton-based human action datasets. Notably, it achieves better results than baselines on Trajectory20 with considerably fewer parameters and state-of-the-art performance on the unsupervised skeleton-based action recognition task.
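
The idea of describing motion as "a template plus a geometric transformation" can be illustrated with a toy sketch. This is not the authors' MCAE implementation: the template, the similarity transform, the fixed snippet length, and the function names `split_into_snippets` and `transform` are all assumptions made for illustration only. It shows a 2D trajectory being cut into short snippets, where each snippet is the same hypothetical template observed under a different rotation, scale, and shift.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def split_into_snippets(trajectory: List[Point], snippet_len: int) -> List[List[Point]]:
    """Cut a trajectory into consecutive, non-overlapping snippets of fixed length."""
    return [trajectory[i:i + snippet_len]
            for i in range(0, len(trajectory) - snippet_len + 1, snippet_len)]


def transform(template: List[Point], angle: float, scale: float, shift: Point) -> List[Point]:
    """Apply a 2D similarity transform (rotate, then scale, then translate)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in template]


# Hypothetical template: a short straight stroke along the x-axis.
template = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]

# Two observations of the same template under different transformations.
obs_a = transform(template, angle=0.0, scale=1.0, shift=(0.0, 0.0))
obs_b = transform(template, angle=math.pi / 2, scale=2.0, shift=(1.0, 1.0))

# A longer motion signal made of both observations, split back into snippets.
trajectory = obs_a + obs_b
snippets = split_into_snippets(trajectory, snippet_len=3)
```

In this toy setting both snippets share one template and differ only in their transformation parameters, which is the kind of separation a transformation-invariant representation aims for. A learned model would have to infer both the templates and the transformations from data rather than being given them.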

This is an embedded video. The talk and the respective paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2021

MAU: A Motion-Aware Unit for Video Prediction and Beyond

Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xiang Xinguang, Wen Gao

Keywords: vision

9:54

02/02/2021

Learning Comprehensive Motion Representation for Action Recognition

Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang

15:15

26/04/2020

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation

Junho Kim, Minjae Kim, Hyeonwoo Kang, Kwang Hee Lee

Keywords: Image-to-Image Translation, Generative Attentional Networks, Adaptive Layer-Instance Normalization

5:08

14/06/2020

Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction

Ruixu Liu, Ju Shen, He Wang, Chen Chen, Sen-ching Cheung, Vijayan Asari

Keywords: 3d human pose, attention mechanism, multi-scale dilation convolution, monocular motion reconstruction

5:01

14/06/2020

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms

Lu Chi, Zehuan Yuan, Yadong Mu, Changhu Wang

Keywords: attention, non-local, bilinear, image classification, video classification, grouped, data-adaptive

1:01

30/11/2020

DeepVoxels++: Enhancing the Fidelity of Novel View Synthesis from 3D Voxel Embeddings

Tong He, John Collomosse, Hailin Jin, Stefano Soatto

7:47

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

22/11/2021

Learning Attention Map for 3D Human Recovery from a Single RGB Image

Peng Xu, Na Jiang, Jun Li, Zhiping Shi

Keywords: 3D Human Recovery, Human Parsing, Depth Estimation

8:14

22/11/2021

Segmenting Invisible Moving Objects

Hala Lamdouar, Weidi Xie, Andrew Zisserman

Keywords: synthetic data generation, motion segmentation, amodal segmentation, video camouflage breaking, self-attention

3:05

06/12/2020

Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views

Nanbo Li, Cian Eastwood, Robert Fisher

3:19

26/04/2020

Controlling generative models with continuous factors of variations

Antoine Plumerault, Hervé Le Borgne, Céline Hudelot

Keywords: Generative models, factor of variation, GAN, beta-VAE, interpretable representation, interpretability

5:07

14/06/2020

Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking

Peiliang Li, Jieqi Shi, Shaojie Shen

Keywords: 3d object tracking, stereo cameras, autonomous driving

1:01

22/11/2021

Bird’s Eye View Segmentation Using Lifted 2D Semantic Features

Isht Dwivedi, Srikanth Malla, Yi-Ting Chen, Behzad Dariush

Keywords: segmentation, bird's eye view, pseudo-lidar, video understanding, autonomous driving, monocular camera, depth estimation

3:02

16/11/2020

Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity

William Agnew, Christopher Xie, Aaron Walsman, Octavian Murad, Yubo Wang, Pedro Domingos, Siddhartha Srinivasa

4:37

05/01/2021

Coarse-to-Fine Gaze Redirection With Numerical and Pictorial Guidance

Jingjing Chen, Jichao Zhang, Enver Sangineto, Tao Chen, Jiayuan Fan, Nicu Sebe

4:34

14/06/2020

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

Yancheng Wang, Yang Xiao, Fu Xiong, Wenxiang Jiang, Zhiguo Cao, Joey Tianyi Zhou, Junsong Yuan

Keywords: 3d action recognition, point cloud, 3d motion, temporal rank pooling, pointnet++, multi-stream network

1:01

14/06/2020

ASLFeat: Learning Local Features of Accurate Shape and Localization

Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan

Keywords: image matching, local feature keypoints, local feature descriptors, deep learning

1:01

22/11/2021

Paying Attention to Varying Receptive Fields: Object Detection with Atrous Filters and Vision Transformers

Arthur Jian Shun Lam, Jun Yi Lim, Ricky Sutopo, Vishnu Monn Baskaran

Keywords: object detection, atrous convolution, vision transformers, attention mechanism

3:01

14/06/2020

Blur Aware Calibration of Multi-Focus Plenoptic Camera

Mathieu Labussière, Céline Teulière, Frédéric Bernardin, Omar Ait-Aider

Keywords: calibration, plenoptic camera, multi-focus, blur circle, micro-lenses array, light field

4:56

02/02/2021

DecAug: Augmenting HOI Detection via Decomposition

Hao-Shu Fang, Yichen Xie, Dian Shao, Yong-Lu Li, Cewu Lu

9:02

12/07/2020

ControlVAE: Controllable Variational Autoencoder

Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher

Keywords: Deep Learning - Generative Models and Autoencoders

14:22

14/06/2020

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li

Keywords: 3d object detection, point cloud, 3d scene understanding, lidar, autonomous driving, kitti dataset, waymo open dataset

1:01

05/01/2021

Structured Visual Search via Composition-Aware Learning

Mert Kilickaya, Arnold W.M. Smeulders

0:44

22/11/2021

Unsupervised View-Invariant Human Posture Representation

Faegheh Sardari, Bjorn Ommer, Majid Mirmehdi

Keywords: Representation Learning, Self-supervised Learning, Unsupervised 3D Pose Estimation, View-Invariant Pose Estimation, View-Invariant Action Recognition, View-Invariant Action Assessment, View-Invariant Human Movement Assessment, Human Posture Representation, Unsupervised Action Recognition, Unsupervised Action Assessment

2:59

22/11/2021

Temporal Meta-Adaptor for Video Object Detection

Chi Wang, Yang Hua, Zheng Lu, Jian Gao, Neil Robertson

Keywords: video object detection, temporal aggregation, meta-learning, ImageNet VID

6:58

02/02/2021

Object-Centric Image Generation from Layouts

Tristan Sylvain, Pengchuan Zhang, Yoshua Bengio, R Devon Hjelm, Shikhar Sharma

17:44

06/12/2020

Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Giorgos Mamakoukas, Orest Xherija, Todd Murphey

Keywords: Optimization -> Non-Convex Optimization, Optimization -> Stochastic Optimization

3:13

07/09/2020

DESC: Domain Adaptation for Depth Estimation via Semantic Consistency

Adrian Lopez-Rodriguez, Krystian Mikolajczyk

Keywords: domain adaptation, depth estimation, monocular, depth, domain, KITTI, Virtual KITTI

9:58

14/06/2020

Learning to Manipulate Individual Objects in an Image

Yanchao Yang, Yutong Chen, Stefano Soatto

Keywords: representation learning, disentangled, spatial disentanglement, unsupervised, spatially localized, object-centric, scene manipulation, independent factors, controllable factors, multiple objects

1:01

14/06/2020

A U-Net Based Discriminator for Generative Adversarial Networks

Edgar Schönfeld, Bernt Schiele, Anna Khoreva

Keywords: gan, image synthesis, u-net, discriminator, consistency regularization, equivariance, generative adversarial networks, ffhq, biggan

1:01

07/09/2020

Explicit Residual Descent for 3D Human Pose Estimation from 2D Joint Locations

Yangyuxuan Kang, Anbang Yao, Shandong Wang, Ming Lu, Yurong Chen, Enhua Wu

Keywords: 3D human pose estimation, pose lifting network, feedback optimization, deep neural network, supervised learning

6:02

22/11/2021

Attention to Action: Leveraging Attention for Object Navigation

Shi Chen, Qi Zhao

Keywords: Object-goal Navigation, Attention, Visual Navigation

2:51

14/06/2020

On Joint Estimation of Pose, Geometry and svBRDF From a Handheld Scanner

Carolin Schmitt, Simon Donné, Gernot Riegler, Vladlen Koltun, Andreas Geiger

Keywords: 3d reconstruction, mobile lightstage, multiview photometric stereo, svbrdf estimation, shape from shading, material segmentation, handheld 3d sensor, non-lambertian surfaces

1:01

19/08/2021

Context-Aware Image Inpainting with Learned Semantic Priors

Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang

Keywords: Computer Vision, 2D and 3D Computer Vision, Deep Learning

13:26

02/02/2021

Learning Intact Features by Erasing-Inpainting for Few-shot Classification

Junjie Li, Zilei Wang, Xiaoming Hu

15:15

26/08/2020

Bayesian Image Classification with Deep Convolutional Gaussian Processes

Vincent Dutordoir, Mark van der Wilk, Artem Artemev, James Hensman

16:29

22/11/2021

Monocular Arbitrary Moving Object Discovery and Segmentation

Michal Neoral, Jan Sochman, Jiri Matas

Keywords: motion segmentation, instance motion segmentation

2:55

14/06/2020

Blurry Video Frame Interpolation

Wang Shen, Wenbo Bao, Guangtao Zhai, Li Chen, Xiongkuo Min, Zhiyong Gao

Keywords: video frame interpolation, frame-rate up-conversion, video deblurring, pyramid framework, spatial and temporal optimization

5:01

14/06/2020

Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

Yu-Lun Liu, Wei-Sheng Lai, Yu-Sheng Chen, Yi-Lung Kao, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

Keywords: high dynamic range, inverse tone mapping, image sensor, dynamic range, camera response function, quantization, computational photography, deep learning, convolutional neural network, computer vision

1:01

06/12/2021

Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding

Yang Li, Si Si, Gang Li, Cho-Jui Hsieh, Samy Bengio

Keywords: machine learning, transformers, vision

10:54