Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection

03/25/2023
by   Hwanjun Song, et al.
Prompt-OVD is an efficient and effective framework for open-vocabulary object detection that uses class embeddings from CLIP as prompts, guiding the Transformer decoder to detect objects in both base and novel classes. Additionally, our novel RoI-based masked attention and RoI pruning techniques leverage the zero-shot classification ability of the Vision Transformer-based CLIP, improving detection performance at minimal computational cost. Our experiments on the OV-COCO and OV-LVIS datasets demonstrate that Prompt-OVD runs inference 21.2 times faster than the first end-to-end open-vocabulary detection method (OV-DETR), while also achieving higher APs than four two-stage-based methods operating within similar inference time ranges. Code will be made available soon.
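
The abstract does not say how RoI-based masked attention or the zero-shot classification step is computed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a ViT-style grid of patch tokens, a pixel-space box `(x0, y0, x1, y1)`, single-head scaled dot-product attention, and cosine similarity against per-class text embeddings. All function names (`roi_patch_mask`, `masked_attention_pool`, `zero_shot_classify`) and the toy vectors are hypothetical.

```python
import math

def roi_patch_mask(box, grid_h, grid_w, patch):
    """Flat list of booleans: True where a patch overlaps the box (pixels)."""
    x0, y0, x1, y1 = box
    mask = []
    for r in range(grid_h):
        for c in range(grid_w):
            px0, py0 = c * patch, r * patch
            px1, py1 = px0 + patch, py0 + patch
            mask.append(px0 < x1 and px1 > x0 and py0 < y1 and py1 > y0)
    return mask

def masked_attention_pool(query, keys, values, mask):
    """Single-head attention in which patches outside the RoI get zero weight."""
    if not any(mask):
        raise ValueError("RoI does not overlap any patch")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if keep else -math.inf
        for key, keep in zip(keys, mask)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def zero_shot_classify(roi_embedding, class_embeddings):
    """Pick the class whose (assumed) text embedding is most similar to the RoI."""
    return max(class_embeddings, key=lambda name: cosine(roi_embedding, class_embeddings[name]))

# Toy example: a 2x2 grid of 16-pixel patches with made-up 2-D features.
patch_features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
mask = roi_patch_mask((0, 0, 16, 16), grid_h=2, grid_w=2, patch=16)
pooled = masked_attention_pool([1.0, 0.0], patch_features, patch_features, mask)
label = zero_shot_classify(pooled, {"cat": [1.0, 0.0], "dog": [0.0, 1.0]})
```

In this toy setup the box covers only the top-left patch, so the pooled embedding depends on that patch alone. The point of the mask is that one shared set of patch features can serve many candidate boxes without re-encoding a crop for each of them.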
