OG: Equip vision occupancy with instance segmentation and visual grounding

07/12/2023
by ZiChao Dong, et al.

Occupancy prediction infers both the geometry and a semantic label for each voxel, and is an important perception task. However, it remains a semantic segmentation task that does not distinguish individual instances. Furthermore, although existing works such as Open-Vocabulary Occupancy (OVO) have already solved the problem of open-vocabulary detection, visual grounding in occupancy has, to the best of our knowledge, not yet been solved. To tackle these two limitations, this paper proposes Occupancy Grounding (OG), a novel method that equips vanilla occupancy with instance segmentation ability and can perform visual grounding at the voxel level with the help of Grounded-SAM. The keys to our approach are (1) affinity field prediction for instance clustering and (2) an association strategy for aligning 2D instance masks with 3D occupancy instances. Extensive experiments have been conducted, and their visualization results and analysis are presented in the paper. Our code will be publicly released soon.
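The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the affinity field or the association rule, so the sketch assumes a per-axis affinity between neighbouring voxels, a fixed threshold, union-find clustering, and a majority vote over projected voxels. The names `cluster_instances`, `associate`, `affinity`, `pixels`, and `thresh` are all hypothetical.

```python
import numpy as np


def cluster_instances(semantic, affinity, thresh=0.5):
    """Group occupied voxels into instances (assumed scheme, not the paper's).

    semantic: (X, Y, Z) int array of class labels, 0 = empty.
    affinity: (3, X, Y, Z) float array; affinity[a, x, y, z] is an assumed
              score that voxel (x, y, z) and its +1 neighbour along axis a
              belong to the same instance.
    Returns an (X, Y, Z) int array of instance ids, 0 = empty.
    """
    shape = semantic.shape
    parent = np.arange(semantic.size)
    idx = np.arange(semantic.size).reshape(shape)

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for axis in range(3):
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[axis], hi[axis] = slice(0, -1), slice(1, None)
        lo, hi = tuple(lo), tuple(hi)
        # Link neighbours that are occupied, share a class, and have high affinity.
        link = (
            (semantic[lo] > 0)
            & (semantic[lo] == semantic[hi])
            & (affinity[axis][lo] > thresh)
        )
        for i, j in zip(idx[lo][link], idx[hi][link]):
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    labels = np.zeros(semantic.size, dtype=np.int64)
    ids = {}
    for i in np.flatnonzero(semantic.ravel()):
        labels[i] = ids.setdefault(int(find(i)), len(ids) + 1)
    return labels.reshape(shape)


def associate(instances, pixels, masks):
    """Match each 3D instance to a 2D mask by majority vote (assumed rule).

    instances: (X, Y, Z) instance ids, 0 = empty.
    pixels:    (X, Y, Z, 2) int (row, col) of each voxel's projection into the
               image; negative values mean the voxel is not visible.
    masks:     (M, H, W) bool array of 2D instance masks.
    Returns {instance_id: mask_index}.
    """
    matches = {}
    for inst in np.unique(instances):
        if inst == 0:
            continue
        rc = pixels[instances == inst]
        rc = rc[(rc >= 0).all(axis=1)]  # keep visible voxels only
        if len(rc) == 0:
            continue
        votes = masks[:, rc[:, 0], rc[:, 1]].sum(axis=1)
        if votes.max() > 0:
            matches[int(inst)] = int(votes.argmax())
    return matches
```

In this sketch the 2D masks would come from a text-prompted segmenter such as Grounded-SAM, so a text query that selects a 2D mask also selects the 3D instance matched to it; how OG actually performs this step is described in the full paper, not here.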

research
06/23/2023

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

We introduce the task of open-vocabulary 3D instance segmentation. Tradi...
research
01/02/2023

Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

In this work, we focus on instance-level open vocabulary segmentation, i...
research
09/08/2023

Three Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

3D visual grounding is the task of localizing the object in a 3D scene w...
research
12/18/2021

3D Instance Segmentation of MVS Buildings

We present a novel framework for instance segmentation of 3D buildings f...
research
06/20/2019

3D Instance Segmentation via Multi-task Metric Learning

We propose a novel method for instance label segmentation of dense 3D vo...
research
05/19/2022

Voxel-informed Language Grounding

Natural language applied to natural 2D images describes a fundamentally ...
research
07/28/2022

DoRO: Disambiguation of referred object for embodied agents

Robotic task instructions often involve a referred object that the robot...