Compositional Scene Modeling with Global Object-Centric Representations

11/21/2022
by   Tonglin Chen, et al.
0

The appearance of the same object may vary in different scene images due to perspectives and occlusions between objects. Humans can easily identify the same object, even if occlusions exist, by completing the occluded parts based on its canonical image in the memory. Achieving this ability is still a challenge for machine learning, especially under the unsupervised learning setting. Inspired by such an ability of humans, this paper proposes a compositional scene modeling method to infer global representations of canonical images of objects without any supervision. The representation of each object is divided into an intrinsic part, which characterizes globally invariant information (i.e. canonical representation of an object), and an extrinsic part, which characterizes scene-dependent information (e.g., position and size). To infer the intrinsic representation of each object, we employ a patch-matching strategy to align the representation of a potentially occluded object with the canonical representations of objects, and sample the most probable canonical representation based on the category of object determined by amortized variational inference. Extensive experiments are conducted on four object-centric learning benchmarks, and experimental results demonstrate that the proposed method not only outperforms state-of-the-arts in terms of segmentation and reconstruction, but also achieves good global object identification performance.

READ FULL TEXT

page 6

page 7

page 12

page 16

page 17

page 18

page 19

page 20

research
12/07/2021

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

Visual scenes are extremely rich in diversity, not only because there ar...
research
05/08/2022

Unsupervised Discovery and Composition of Object Light Fields

Neural scene representations, both continuous and discrete, have recentl...
research
06/11/2020

Learning to Infer 3D Object Models from Images

A crucial ability of human intelligence is to build up models of individ...
research
04/01/2020

Object-Centric Image Generation with Factored Depths, Locations, and Appearances

We present a generative model of images that explicitly reasons over the...
research
03/01/2019

Multi-Object Representation Learning with Iterative Variational Inference

Human perception is structured around objects which form the basis for o...
research
10/11/2022

Robust and Controllable Object-Centric Learning through Energy-based Models

Humans are remarkably good at understanding and reasoning about complex ...
research
11/18/2019

Localizing Occluders with Compositional Convolutional Networks

Compositional convolutional networks are generative compositional models...

Please sign up or login with your details

Forgot password? Click here to reset