Unsupervised Discovery of 3D Physical Objects from Video

07/24/2020
by Yilun Du, et al.

We study the problem of unsupervised physical object discovery. Unlike existing frameworks that aim to learn to decompose scenes into 2D segments purely based on each object's appearance, we explore how physics, especially object interactions, facilitates learning to disentangle and segment instances from raw videos, and to infer the 3D geometry and position of each object, all without supervision. Drawing inspiration from developmental psychology, our Physical Object Discovery Network (POD-Net) uses both multi-scale pixel cues and physical motion cues to accurately segment observable and partially occluded objects of varying sizes, and infer properties of those objects. Our model reliably segments objects on both synthetic and real scenes. The discovered object properties can also be used to reason about physical events.

research
06/10/2019

DensePhysNet: Learning Dense Physical Object Representations via Multi-step Dynamic Interactions

We study the problem of learning physical object representations for rob...
research
03/18/2022

Discovering Objects that Can Move

This paper studies the problem of object discovery – separating objects ...
research
07/07/2022

Finding Fallen Objects Via Asynchronous Audio-Visual Integration

The way an object looks and sounds provide complementary reflections of ...
research
02/28/2018

Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions

Common-sense physical reasoning is an essential ingredient for any intel...
research
11/04/2014

Simultaneous Localization, Mapping, and Manipulation for Unsupervised Object Discovery

We present an unsupervised framework for simultaneous appearance-based o...
research
09/13/2018

Physical Primitive Decomposition

Objects are made of parts, each with distinct geometry, physics, functio...
research
06/05/2018

Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects

We present Sequential Attend, Infer, Repeat (SQAIR), an interpretable de...
