Instance Segmentation with Cross-Modal Consistency

10/14/2022
by Alex Zihao Zhu, et al.

Segmenting object instances is a key task in machine perception, with safety-critical applications in robotics and autonomous driving. We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities, such as cameras and LiDAR. Our method learns to predict embeddings for each pixel or point that give rise to a dense segmentation of the scene. Specifically, our technique applies contrastive learning to points in the scene both across sensor modalities and the temporal domain. We demonstrate that this formulation encourages the models to learn embeddings that are invariant to viewpoint variations and consistent across sensor modalities. We further demonstrate that the embeddings are stable over time as objects move around the scene. This not only provides stable instance masks, but can also provide valuable signals to downstream tasks, such as object tracking. We evaluate our method on the Cityscapes and KITTI-360 datasets. We further conduct a number of ablation studies, demonstrating benefits when applying additional inputs for the contrastive loss.
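The abstract does not spell out the exact loss, so the following is only a minimal sketch of what a cross-modal contrastive objective over per-point embeddings could look like. It assumes a supervised, InfoNCE-style formulation in which a camera pixel embedding is pulled toward LiDAR point embeddings of the same instance and pushed away from those of other instances. The function name `cross_modal_contrastive_loss`, the `temperature` parameter, and the use of instance ids as the positive-pair criterion are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def cross_modal_contrastive_loss(cam_emb, lidar_emb, cam_ids, lidar_ids,
                                 temperature=0.1):
    """Hypothetical InfoNCE-style loss between two sets of embeddings.

    cam_emb:   (N_cam, D) embeddings of camera pixels (the anchors).
    lidar_emb: (N_lidar, D) embeddings of LiDAR points.
    cam_ids, lidar_ids: integer instance ids, one per embedding.
    Positives for an anchor are all LiDAR points with the same instance id.
    """
    cam = l2_normalize(np.asarray(cam_emb, dtype=float))
    lidar = l2_normalize(np.asarray(lidar_emb, dtype=float))
    cam_ids = np.asarray(cam_ids)
    lidar_ids = np.asarray(lidar_ids)

    # Pairwise cosine similarities scaled by the temperature.
    logits = cam @ lidar.T / temperature  # shape (N_cam, N_lidar)

    # Numerically stable log-softmax over the LiDAR axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Mask of positive pairs: same instance id in both modalities.
    pos = (cam_ids[:, None] == lidar_ids[None, :]).astype(float)
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0  # anchors with no positive contribute nothing
    if not valid.any():
        return 0.0

    # Average negative log-likelihood of the positives, per anchor.
    per_anchor = -(pos * log_prob).sum(axis=1)[valid] / n_pos[valid]
    return float(per_anchor.mean())


# Toy data: two instances, each observed by two pixels and two points.
emb = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
ids = np.array([0, 0, 1, 1])

consistent = cross_modal_contrastive_loss(emb, emb, ids, ids)
inconsistent = cross_modal_contrastive_loss(emb, emb, ids, ids[::-1])
print(consistent < inconsistent)
```

In this toy setup the loss is lower when embeddings of the same instance agree across modalities than when they agree with the wrong instance. Under the same assumptions, the temporal variant described in the abstract could reuse the function by passing embeddings from two different frames in place of the two modalities.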


