Object-Centric Representation Learning with Generative Spatial-Temporal Factorization

11/09/2021
by   Li Nanbo, et al.

Learning object-centric scene representations is essential for attaining structural understanding and abstraction of complex scenes. Yet, because current approaches to unsupervised object-centric representation learning are built upon either a stationary-observer assumption or a static-scene assumption, they often: i) suffer from single-view spatial ambiguities, or ii) infer incorrect or inaccurate object representations from dynamic scenes. To address this, we propose the Dynamics-aware Multi-Object Network (DyMON), a method that broadens the scope of multi-view object-centric representation learning to dynamic scenes. We train DyMON on multi-view-dynamic-scene data and show that DyMON learns, without supervision, to factorize the entangled effects of observer motions and scene object dynamics from a sequence of observations, and constructs scene object spatial representations suitable for rendering at arbitrary times (querying across time) and from arbitrary viewpoints (querying across space). We also show that the factorized scene representations (w.r.t. objects) support querying about a single object by space and time independently.


research
11/13/2021

Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views

Learning object-centric representations of multi-object scenes is a prom...
research
06/16/2023

OCTScenes: A Versatile Real-World Dataset of Tabletop Scenes for Object-Centric Learning

Humans possess the cognitive ability to comprehend scenes in a compositi...
research
04/30/2023

Object-Centric Voxelization of Dynamic Scenes via Inverse Neural Rendering

Understanding the compositional dynamics of the world in unsupervised 3D...
research
06/07/2022

ObPose: Leveraging Canonical Pose for Object-Centric Scene Inference in 3D

We present ObPose, an unsupervised object-centric generative model that ...
research
10/05/2022

Differentiable Mathematical Programming for Object-Centric Representation Learning

We propose topology-aware feature partitioning into k disjoint partition...
research
05/23/2023

Provably Learning Object-Centric Representations

Learning structured representations of the visual world in terms of obje...
research
04/09/2021

GATSBI: Generative Agent-centric Spatio-temporal Object Interaction

We present GATSBI, a generative model that can transform a sequence of r...
