26/04/2020

Structured Object-Aware Physics Prediction for Video Modeling and Planning

Jannik Kossen, Karl Stelzner, Marcel Hussing, Claas Voelcker, Kristian Kersting

Keywords: self-supervised learning, probabilistic deep learning, structured models, video prediction, physics prediction, planning, variational autoencoders, model-based reinforcement learning, VAEs, unsupervised, variational, graph neural networks, tractable probabilistic models, attend-infer-repeat, relational learning, AIR, sum-product networks, object-oriented, object-centric, object-aware, MCTS

Abstract: When humans observe a physical system, they can easily locate components, understand their interactions, and anticipate future behavior, even in settings with complicated and previously unseen interactions. For computers, however, learning such models from videos in an unsupervised fashion is an unsolved research problem. In this paper, we present STOVE, a novel state-space model for videos, which explicitly reasons about objects and their positions, velocities, and interactions. It is constructed by combining an image model and a dynamics model in a compositional manner and improves on previous work by reusing the dynamics model for inference, accelerating and regularizing training. STOVE predicts videos with convincing physical behavior over hundreds of timesteps, outperforms previous unsupervised models, and even approaches the performance of supervised baselines. We further demonstrate the strength of our model as a simulator for sample-efficient model-based control in a task with heavily interacting objects.

The talk and the accompanying paper were published at the ICLR 2020 virtual conference.
