Panoptic Lifting for 3D Scene Understanding with Neural Fields

by Yawar Siddiqui, et al.

We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches that use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks inferred from a pre-trained network. Our core contribution is a panoptic lifting scheme based on a neural field representation that generates a unified, multi-view-consistent 3D panoptic representation of the scene. To account for inconsistencies of 2D instance identifiers across views, we solve a linear assignment problem with a cost based on the model's current predictions and the machine-generated segmentation masks, which enables us to lift 2D instances to 3D in a consistent way. We further propose and ablate contributions that make our method more robust to noisy, machine-generated labels, including test-time augmentations for confidence estimates, a segment consistency loss, bounded segmentation fields, and gradient stopping. Experimental results validate our approach on the challenging Hypersim, Replica, and ScanNet datasets, improving over the state of the art by 8.4, 13.8, and 10.6 PQ, respectively.
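The linear assignment step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `match_instances`, the cost (negative mean predicted probability of a 3D instance slot over the pixels of a 2D instance), and the toy data are all assumptions made for illustration. The brute-force search over permutations is used only to keep the example dependency-free; a Hungarian-style solver would normally be used for realistic instance counts.

```python
from itertools import permutations


def match_instances(pred_probs, mask_ids):
    """Map each 2D instance id in one view to a distinct 3D instance slot.

    pred_probs: per-pixel lists of length K, the model's current
                probabilities over K 3D instance slots (hypothetical format).
    mask_ids:   per-pixel 2D instance ids from the 2D segmentation network.
    Returns a dict {2D instance id: slot index}.
    """
    ids = sorted(set(mask_ids))
    num_slots = len(pred_probs[0])
    if len(ids) > num_slots:
        raise ValueError("more 2D instances than available 3D slots")

    # Assumed cost: cost[i][j] is the negative mean probability of slot j
    # over the pixels that the 2D mask assigns to instance ids[i].
    cost = []
    for inst in ids:
        pixels = [p for p, m in zip(pred_probs, mask_ids) if m == inst]
        cost.append(
            [-sum(p[j] for p in pixels) / len(pixels) for j in range(num_slots)]
        )

    # Brute-force minimum-cost injective assignment (fine for tiny examples).
    best = min(
        permutations(range(num_slots), len(ids)),
        key=lambda perm: sum(cost[i][j] for i, j in enumerate(perm)),
    )
    return dict(zip(ids, best))


# Toy data: four pixels, three 3D slots. The same two objects are seen in
# two views, but the 2D network gives them unrelated ids in each view.
probs = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.6, 0.3, 0.1],
]
view_a = match_instances(probs, [7, 7, 2, 2])
view_b = match_instances(probs, [1, 1, 5, 5])
print(view_a, view_b)
```

In this toy case the first two pixels are matched to the same slot in both views even though their 2D ids differ (7 in one view, 1 in the other), which is the kind of consistency the assignment is meant to provide.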




Neural Volumetric Object Selection

We introduce an approach for selecting objects in neural volumetric 3D r...

Instance Neural Radiance Field

This paper presents one of the first learning-based NeRF 3D instance seg...

Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

Neural radiance field is an emerging rendering method that generates hig...

Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation

We present radiance field propagation (RFP), a novel approach to segment...

StylizedNeRF: Consistent 3D Scene Stylization as Stylized NeRF via 2D-3D Mutual Learning

3D scene stylization aims at generating stylized images of the scene fro...

Fusing RGBD Tracking and Segmentation Tree Sampling for Multi-Hypothesis Volumetric Segmentation

Despite rapid progress in scene segmentation in recent years, 3D segment...

PAg-NeRF: Towards fast and efficient end-to-end panoptic 3D representations for agricultural robotics

Precise scene understanding is key for most robot monitoring and interve...
