GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

07/30/2019
by Martin Engelcke, et al.

Generative models are emerging as promising tools in robotics and reinforcement learning. Yet, even though tasks in these domains typically involve distinct objects, most state-of-the-art methods do not explicitly capture the compositional nature of visual scenes. Two exceptions, MONet and IODINE, decompose scenes into objects in an unsupervised fashion via a set of latent variables. Their underlying generative processes, however, do not account for component interactions. Hence, neither of them allows for principled sampling of coherent scenes. Here we present GENESIS, the first object-centric generative model of visual scenes capable of both decomposing and generating complete scenes by explicitly capturing relationships between scene components. GENESIS parameterises a spatial Gaussian mixture model (GMM) over pixels that is encoded by component-wise latent variables, which are either inferred sequentially or sampled from an autoregressive prior. We train GENESIS on two publicly available datasets and probe the information in the latent representations through a set of classification tasks, outperforming several baselines.
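
To make the phrase "spatial GMM over pixels" concrete, the following is a minimal NumPy sketch of how the log-likelihood of an image under a pixel-wise Gaussian mixture could be evaluated. It is an illustration under stated assumptions, not the authors' implementation: the function name `spatial_gmm_log_likelihood`, the array shapes, the use of a softmax over per-component mask logits, and the fixed standard deviation `sigma` are all hypothetical choices. The inference network and the autoregressive prior over latents are not modelled here.

```python
import numpy as np


def log_softmax(logits, axis=0):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def spatial_gmm_log_likelihood(image, means, mask_logits, sigma=1.0):
    """Log-likelihood of an image under a pixel-wise Gaussian mixture.

    image:       (H, W, C) observed pixel values
    means:       (K, H, W, C) per-component predicted appearance
    mask_logits: (K, H, W) unnormalised per-pixel mixing scores
    sigma:       shared standard deviation (an assumption of this sketch)
    """
    # Mixing weights: at every pixel they sum to one across the K components.
    log_pi = log_softmax(mask_logits, axis=0)                # (K, H, W)

    # Per-component Gaussian log-density, summed over colour channels.
    sq_err = (image[None] - means) ** 2                      # (K, H, W, C)
    log_norm = -0.5 * np.log(2.0 * np.pi * sigma ** 2)
    log_gauss = (log_norm - 0.5 * sq_err / sigma ** 2).sum(axis=-1)

    # Marginalise over components with a stable log-sum-exp, then sum pixels.
    joint = log_pi + log_gauss                               # (K, H, W)
    top = joint.max(axis=0, keepdims=True)
    per_pixel = top[0] + np.log(np.exp(joint - top).sum(axis=0))
    return per_pixel.sum()


# Usage: a 2x2 RGB image explained by two hypothetical components.
rng = np.random.default_rng(0)
image = rng.random((2, 2, 3))
means = rng.random((2, 2, 2, 3))
mask_logits = rng.normal(size=(2, 2, 2))
print(spatial_gmm_log_likelihood(image, means, mask_logits))
```

In this sketch each component contributes an appearance map and a mask, and the masks compete for every pixel through the softmax, which is one common way to realise a spatial mixture.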


research
10/11/2022

Robust and Controllable Object-Centric Learning through Energy-based Models

Humans are remarkably good at understanding and reasoning about complex ...
research
04/11/2020

Learning to Manipulate Individual Objects in an Image

We describe a method to train a generative model with latent factors tha...
research
04/27/2020

Towards causal generative scene models via competition of experts

Learning how to model complex scenes in a modular way with recombinable ...
research
07/02/2020

RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces

We present RELATE, a model that learns to generate physically plausible ...
research
03/21/2022

Generating Fast and Slow: Scene Decomposition via Reconstruction

We consider the problem of segmenting scenes into constituent entities, ...
research
10/28/2019

Entity Abstraction in Visual Model-Based Reinforcement Learning

This paper tests the hypothesis that modeling a scene in terms of entiti...
research
01/21/2023

Time-Conditioned Generative Modeling of Object-Centric Representations for Video Decomposition and Prediction

When perceiving the world from multiple viewpoints, humans have the abil...
