Discovering Objects that Can Move

03/18/2022
by Zhipeng Bao, et al.

This paper studies the problem of object discovery – separating objects from the background without manual labels. Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions. However, by relying on appearance alone, these methods fail to separate objects from the background in cluttered scenes. This is a fundamental limitation since the definition of an object is inherently ambiguous and context-dependent. To resolve this ambiguity, we choose to focus on dynamic objects – entities that can move independently in the world. We then scale the recent auto-encoder-based frameworks for unsupervised object discovery from toy synthetic images to complex real-world scenes. To this end, we simplify their architecture, and augment the resulting model with a weak learning signal from general motion segmentation algorithms. Our experiments demonstrate that, despite only capturing a small subset of the objects that move, this signal is enough to generalize to segment both moving and static instances of dynamic objects. We show that our model scales to a newly collected, photo-realistic synthetic dataset with street driving scenarios. Additionally, we leverage ground truth segmentation and flow annotations in this dataset for thorough ablation and evaluation. Finally, our experiments on the real-world KITTI benchmark demonstrate that the proposed approach outperforms both heuristic- and learning-based methods by capitalizing on motion cues.

research
03/20/2022

Stochastic Video Prediction with Structure and Motion

While stochastic video prediction models enable future prediction under ...
research
03/27/2023

Object Discovery from Motion-Guided Tokens

Object discovery – separating objects from the background without manual...
research
07/24/2020

Unsupervised Discovery of 3D Physical Objects from Video

We study the problem of unsupervised physical object discovery. Unlike e...
research
09/14/2017

MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

We propose a novel multi-task learning system that combines appearance a...
research
08/04/2017

Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes

The success of deep learning in computer vision is based on availability...
research
02/11/2019

Towards Segmenting Everything That Moves

Video analysis is the task of perceiving the world as it changes. Often,...
research
11/19/2021

ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

There has been a recent surge in methods that aim to decompose and segme...