Robust and Controllable Object-Centric Learning through Energy-based Models

10/11/2022
by   Ruixiang Zhang, et al.

Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability to decompose low-level observations into discrete objects allows us to build a grounded abstract representation and identify the compositional structure of the world. Accordingly, a crucial step for machine learning models is to infer objects and their properties from visual scenes without explicit supervision. However, existing works on object-centric representation learning rely either on tailor-made neural network modules or on strong probabilistic assumptions in the underlying generative and inference processes. In this work, we present a conceptually simple and general approach to learning object-centric representations through an energy-based model. By forming a permutation-invariant energy function using vanilla attention blocks readily available in Transformers, we can infer object-centric latent variables via gradient-based MCMC methods, where permutation equivariance is automatically guaranteed. We show that the approach can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations, leading to better segmentation accuracy and competitive downstream task performance. Further, empirical evaluations show that the learned representations are robust against distribution shift. Finally, we demonstrate the effectiveness of the approach in systematic compositional generalization, by re-composing learned energy functions for novel scene generation and manipulation.
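The inference idea in the abstract (a permutation-invariant energy over a set of latents, sampled with gradient-based MCMC) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy energy is a hand-written stand-in with an attention-like softmax readout, not the Transformer-based energy from the paper, and the names `energy`, `grad_energy`, `langevin_step`, and `infer_slots` are hypothetical. The point it tries to show is that when the energy is a symmetric function of the slots, a Langevin update treats every slot identically, so permuting the initial slots (and their noise) only permutes the result.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def energy(z, x, lam=0.1):
    """Toy energy E(z; x). Summing over slots makes it permutation-invariant in z."""
    total = 0.0
    for zk in z:
        scores = [dot(zk, xn) for xn in x]
        m = max(scores)
        # Attention-like term: negative log-sum-exp of slot/input scores.
        total -= m + math.log(sum(math.exp(s - m) for s in scores))
        # Quadratic term keeps each slot bounded.
        total += 0.5 * dot(zk, zk)
    # Symmetric pairwise coupling between distinct slots.
    for j in range(len(z)):
        for k in range(len(z)):
            if j != k:
                total += 0.5 * lam * dot(z[j], z[k])
    return total


def grad_energy(z, x, lam=0.1):
    """Analytic gradient of the toy energy with respect to every slot."""
    dim = len(z[0])
    slot_sum = [sum(zk[d] for zk in z) for d in range(dim)]
    grads = []
    for zk in z:
        scores = [dot(zk, xn) for xn in x]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        tot = sum(w)
        # Softmax-weighted readout of the inputs, with the slot as the query.
        readout = [sum(w[n] / tot * x[n][d] for n in range(len(x)))
                   for d in range(dim)]
        grads.append([-readout[d] + zk[d] + lam * (slot_sum[d] - zk[d])
                      for d in range(dim)])
    return grads


def langevin_step(z, x, noise, step=0.05):
    """One Langevin update: z <- z - (step / 2) * grad E + sqrt(step) * noise."""
    g = grad_energy(z, x)
    return [[zk[d] - 0.5 * step * gk[d] + math.sqrt(step) * nk[d]
             for d in range(len(zk))]
            for zk, gk, nk in zip(z, g, noise)]


def infer_slots(x, num_slots, steps=50, step=0.05, seed=0):
    """Run a short Langevin chain from random initial slots (toy settings)."""
    rng = random.Random(seed)
    dim = len(x[0])
    z = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_slots)]
    for _ in range(steps):
        noise = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_slots)]
        z = langevin_step(z, x, noise, step)
    return z
```

In this sketch the update rule is shared by all slots and the only cross-slot term is a symmetric sum, which is why no slot ordering is baked in. A learned energy built from attention blocks would replace the hand-written terms, with gradients obtained by automatic differentiation rather than by hand.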


