Going Denser with Open-Vocabulary Part Segmentation

05/18/2023
by Peize Sun, et al.

Object detection has expanded from a limited number of categories to open vocabulary. Moving forward, a complete intelligent vision system must also understand finer-grained object descriptions, namely object parts. In this paper, we propose a detector that predicts both open-vocabulary objects and their part segmentation. This ability comes from two designs. First, we train the detector jointly on part-level, object-level, and image-level data to build multi-granularity alignment between language and image. Second, we parse a novel object into its parts through its dense semantic correspondence with a base object. These two designs enable the detector to benefit substantially from various data sources and foundation models. In open-vocabulary part segmentation experiments, our method outperforms the baseline by 3.3∼7.3 mAP in cross-dataset generalization on PartImageNet and improves the baseline by 7.3 novel AP_50 in cross-category generalization on Pascal Part. Finally, we train a detector that generalizes to a wide range of part segmentation datasets while achieving better performance than dataset-specific training.
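
The second design, parsing a novel object through its dense correspondence with a base object, can be illustrated with a toy label-transfer sketch. This is not the paper's implementation: it assumes each pixel is already described by a feature vector (for example, from some pretrained backbone) and uses plain cosine-similarity nearest-neighbor matching as a stand-in for dense semantic correspondence. The function names, feature values, and part names below are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transfer_part_labels(base_feats, base_labels, novel_feats):
    """Give each novel-object pixel the part label of its most similar base-object pixel.

    Hypothetical helper: nearest-neighbor matching in feature space stands in
    for the dense semantic correspondence described in the abstract.
    """
    transferred = []
    for feat in novel_feats:
        best = max(range(len(base_feats)), key=lambda i: cosine(feat, base_feats[i]))
        transferred.append(base_labels[best])
    return transferred


# Toy data (made up): two annotated base-object pixels, three unlabeled novel-object pixels.
base_feats = [[1.0, 0.0], [0.0, 1.0]]
base_labels = ["head", "body"]
novel_feats = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

novel_labels = transfer_part_labels(base_feats, base_labels, novel_feats)
print(novel_labels)  # ['head', 'body', 'head']
```

In this toy setting the novel object needs no part annotations of its own; its part labels come entirely from the annotated base object, which is the intuition behind cross-category generalization.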


