Learning Long-term Visual Dynamics with Region Proposal Interaction Networks

03/05/2021

Haozhi Qi, Xiaolong Wang, Deepak Pathak, Yi Ma, Jitendra Malik

Keywords: dynamics prediction, physical reasoning, interaction networks

Abstract: Learning long-term dynamics models is the key to understanding physical common sense. Most existing approaches to learning dynamics from visual input sidestep long-term predictions by resorting to rapid re-planning with short-term models. This not only requires such models to be highly accurate but also limits them to tasks where an agent can continuously obtain feedback and take action at each step until completion. In this paper, we aim to leverage ideas from success stories in visual recognition tasks to build object representations that can capture inter-object and object-environment interactions over a long range. To this end, we propose Region Proposal Interaction Networks (RPIN), which reason about each object's trajectory in a latent region-proposal feature space. Thanks to the simple yet effective object representation, our approach outperforms prior methods by a significant margin both in prediction quality and in its ability to plan for downstream tasks, and it also generalizes well to novel environments. Code, pre-trained models, and more visualization results are available at https://haozhi.io/RPIN.
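
To make the idea of reasoning about object trajectories in a feature space more concrete, here is a minimal sketch of an interaction-network-style rollout. It is not the authors' RPIN implementation: the per-object feature vectors are plain Python lists standing in for region-proposal features, and `relation`, `step`, and `rollout` are hypothetical names with a hand-written pairwise effect in place of learned networks.

```python
from typing import List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]


def relation(receiver: Vector, sender: Vector) -> Vector:
    """Hypothetical pairwise effect of `sender` on `receiver`.

    A stand-in for a learned network: a small pull toward the sender.
    """
    return [0.1 * (s - r) for r, s in zip(receiver, sender)]


def step(features: List[Vector]) -> List[Vector]:
    """One prediction step: update each object from its summed pairwise effects."""
    updated = []
    for i, receiver in enumerate(features):
        effect = [0.0] * len(receiver)
        for j, sender in enumerate(features):
            if i != j:  # an object does not interact with itself
                effect = add(effect, relation(receiver, sender))
        updated.append(add(receiver, effect))
    return updated


def rollout(features: List[Vector], horizon: int) -> List[List[Vector]]:
    """Feed each prediction back in as input to predict `horizon` future steps."""
    trajectory = []
    for _ in range(horizon):
        features = step(features)
        trajectory.append(features)
    return trajectory


if __name__ == "__main__":
    # Two toy objects with two-dimensional features.
    for t, state in enumerate(rollout([[0.0, 0.0], [1.0, 1.0]], horizon=3), start=1):
        print(t, state)
```

The rollout feeds every prediction back in as the next input, which is why errors in a short-term model can compound over a long horizon; that compounding is the difficulty the abstract refers to.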

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

14/06/2020

A Unified Object Motion and Affinity Model for Online Multi-Object Tracking

Junbo Yin, Wenguan Wang, Qinghao Meng, Ruigang Yang, Jianbing Shen

Keywords: mot, multi-task learning, motion, affinity, attention, online

1:03

05/01/2021

Where to Look?: Mining Complementary Image Regions for Weakly Supervised Object Localization

Sadbhavana Babar, Sukhendu Das

5:01

14/06/2020

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Xingjia Pan, Yuqiang Ren, Kekai Sheng, Weiming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

Keywords: object detection, oriented, densely packed, sku110k, feature selection, dynamic, anchor-free

5:01

06/12/2021

On Contrastive Representations of Stochastic Processes

Emile Mathieu, Adam Foster, Yee Teh

Keywords: machine learning, meta learning, contrastive learning, representation learning

10:59

06/12/2021

Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space

Jiehong Lin, Hongyang Li, Ke Chen, Jiangbo Lu, Kui Jia

Keywords: vision

12:29

14/06/2020

SESS: Self-Ensembling Semi-Supervised 3D Object Detection

Na Zhao, Tat-Seng Chua, Gim Hee Lee

Keywords: 3d object detection, semi-supervised learning, self-ensembling technique, point cloud analysis

5:01

26/04/2020

SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition

Zhixuan Lin, Yi-Fu Wu, Skand Vishwanath Peri, Weihao Sun, Gautam Singh, Fei Deng, Jindong Jiang, Sungjin Ahn

Keywords: Generative models, Unsupervised scene representation, Object-oriented representation, spatial attention

4:55

26/04/2020

Computation Reallocation for Object Detection

Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang

Keywords: Neural Architecture Search, Object Detection

5:29

16/11/2020

Learning hierarchical relationships for object-goal navigation

Anwesan Pal, Yiding Qiu, Henrik Christensen

4:55

06/12/2020

Convolutional Tensor-Train LSTM for Spatio-Temporal Learning

Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Anima Anandkumar

3:29

14/06/2020

Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking

Peiliang Li, Jieqi Shi, Shaojie Shen

Keywords: 3d object tracking, stereo cameras, autonomous driving

1:01

14/06/2020

Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction

Ruixu Liu, Ju Shen, He Wang, Chen Chen, Sen-ching Cheung, Vijayan Asari

Keywords: 3d human pose, attention mechanism, multi-scale dilation convolution, monocular motion reconstruction

5:01

14/06/2020

DUNIT: Detection-Based Unsupervised Image-to-Image Translation

Deblina Bhattacharjee, Seungryong Kim, Guillaume Vizier, Mathieu Salzmann

Keywords: style transfer, object detection, unsupervised image to image translation, domain adaptation

1:01

14/06/2020

Towards Better Generalization: Joint Depth-Pose Learning Without PoseNet

Wang Zhao, Shaohui Liu, Yezhi Shu, Yong-Jin Liu

Keywords: monocular depth estimation, self-supervised learning, deep visual odometry, 3d deep learning, multi-task learning

1:01

14/06/2020

Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection

Liang Du, Xiaoqing Ye, Xiao Tan, Jianfeng Feng, Zhenbo Xu, Errui Ding, Shilei Wen

Keywords: 3d object detection, domain adaptation, associative recognition, lidar, point cloud, convolutional neural network, autonomous driving

1:01

14/06/2020

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala

Keywords: visual localization, camera relocalization, scene coordinate regression

1:01

14/06/2020

Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds

Yongming Rao, Jiwen Lu, Jie Zhou

Keywords: point cloud, unsupervised learning, 3d vision, representation learning

1:01

06/12/2020

Self-Learning Transformations for Improving Gaze and Head Redirection

Yufeng Zheng, Seonwook Park, Xucong Zhang, Shalini De Mello, Otmar Hilliges

3:20

14/06/2020

Learning Fast and Robust Target Models for Video Object Segmentation

Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

Keywords: video object segmentation, semi-supervised

4:57

13/04/2021

Learning bijective feature maps for linear ICA

Alexander Camuto, Matthew Willetts, Chris Holmes, Brooks Paige, Stephen Roberts

3:02

06/12/2021

Progressive Coordinate Transforms for Monocular 3D Object Detection

Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue

Keywords: vision

13:21

14/06/2020

Webly Supervised Knowledge Embedding Model for Visual Reasoning

Wenbo Zheng, Lan Yan, Chao Gou, Fei-Yue Wang

Keywords: visual reasoning, webly supervised learning

1:01

14/06/2020

Cascaded Human-Object Interaction Recognition

Tianfei Zhou, Wenguan Wang, Siyuan Qi, Haibin Ling, Jianbing Shen

Keywords: human-object interaction recognition, cascade reasoning, fine-grained relation segmentation

1:01

14/06/2020

Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation

Yangtao Zheng, Di Huang, Songtao Liu, Yunhong Wang

Keywords: transfer learning, unsupervised domain adaptation, object detection, adversarial learning, spatial attention, prototypes

1:01

22/11/2021

Knowing What, Where and When to Look: Video Action modelling with Attention

Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu, Antoine S Toisoul, Victor A Escorcia, Tao Xiang

Keywords: Action recognition, Fine-grained action, video attention, Spatial attention, Channel attention, Temporal attention, Spatio-temporal attention, Feature refinement

2:46

07/09/2020

Advancing weakly supervised cross-domain alignment with optimal transport

Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin Duke

Keywords: Optimal Transport, Cross Domain Alignment

10:04

06/12/2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Qi Zhou, Yufei Kuang, Zherui Qiu, Houqiang Li, Jie Wang

3:10

30/11/2020

COG: COnsistent data auGmentation for object perception

Zewen He, Rui Wu, Dingqian Zhang

5:16

03/05/2021

Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting

Yuan Yin, Vincent Le Guen, Jérémie Dona, Emmanuel de Bézenac, Ibrahim Ayed, Nicolas Thome, Patrick Gallinari

Keywords: physics, deep learning, hybrid systems, spatio-temporal forecasting, differential equations

13:30

30/11/2020

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

10:03

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

05/01/2021

Intra-Class Part Swapping for Fine-Grained Image Classification

Lianbo Zhang, Shaoli Huang, Wei Liu

4:43

02/02/2021

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Yi-Syuan Chen, Hong-Han Shuai

19:10

05/01/2021

Weakly-Supervised Object Representation Learning for Few-Shot Semantic Segmentation

Xiaowen Ying, Xin Li, Mooi Choo Chuah

5:00

06/12/2021

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, Steven Chu Hong Hoi

Keywords: transformers, vision, representation learning

9:40

03/05/2021

Spatially Structured Recurrent Modules

Nasim Rahaman, Anirudh Goyal, Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schoelkopf

Keywords: spatio-temporal modelling, partially observed environments, recurrent neural networks, modular architectures

5:27

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

14:48

19/08/2021

Two Birds with One Stone: Series Saliency for Accurate and Interpretable Multivariate Time Series Forecasting

Qingyi Pan, Wenbo Hu, Ning Chen

Keywords: Machine Learning, Explainable/Interpretable Machine Learning, Time-series; Data Streams

15:01

30/11/2020

SDP-Net: Scene Flow Based Real-time Object Detection and Prediction from Sequential 3D Point Clouds

Yi Zhang, Yuwen Ye, Zhiyu Xiang, Jiaqi Gu

9:45

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords: multitask learning, deep learning, pre-training, gradient decomposition

5:22