Improving Generative Imagination in Object-Centric World Models

10/05/2020
by Zhixuan Lin, et al.

The remarkable recent advances in object-centric generative world models raise a few questions. First, while many of the recent achievements are indispensable for building a general and versatile world model, it is unclear how these ingredients can be integrated into a unified framework. Second, despite the use of generative objectives, prior work has mainly investigated object detection and tracking, leaving the crucial ability of temporal imagination largely unexamined. Third, a few key abilities needed for more faithful temporal imagination, such as multimodal uncertainty and situation-awareness, are missing. In this paper, we introduce Generative Structured World Models (G-SWM). G-SWM achieves versatile world modeling not only by unifying the key properties of previous models in a principled framework but also by adding two crucial new abilities: multimodal uncertainty and situation-awareness. Our thorough investigation of temporal generation ability, in comparison with previous models, demonstrates that G-SWM achieves this versatility with the best or comparable performance in all experimental settings, including a few complex settings that have not been tested before.


