F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

09/30/2022
by Weicheng Kuo, et al.

We present F-VLM, a simple open-vocabulary object detection method built upon Frozen Vision and Language Models. F-VLM simplifies the current multi-stage training pipeline by eliminating the need for knowledge distillation or detection-tailored pretraining. Surprisingly, we observe that a frozen VLM 1) retains the locality-sensitive features necessary for detection, and 2) is a strong region classifier. We finetune only the detector head and combine the detector and VLM outputs for each region at inference time. F-VLM shows compelling scaling behavior and achieves a +6.5 mask AP improvement over the previous state of the art on the novel categories of the LVIS open-vocabulary detection benchmark. We also demonstrate very competitive results on the COCO open-vocabulary detection benchmark and on cross-dataset transfer detection, along with significant training speed-up and compute savings. Code will be released.
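
The abstract says the detector and VLM outputs are combined per region at inference time but does not state the combination rule. The following is a minimal sketch of one plausible rule, a weighted geometric mean of the two per-category scores, with a different weight for base (seen in training) and novel categories. The function name `combine_scores`, the parameters `alpha` and `beta`, and their default values are illustrative assumptions, not details taken from the text above.

```python
from typing import List, Sequence


def combine_scores(
    detector_probs: Sequence[float],
    vlm_probs: Sequence[float],
    is_base: Sequence[bool],
    alpha: float = 0.35,  # hypothetical VLM weight for base categories
    beta: float = 0.65,   # hypothetical VLM weight for novel categories
) -> List[float]:
    """Fuse detector and frozen-VLM scores for a single region.

    Assumed rule (not confirmed by the abstract): a weighted geometric
    mean, score = detector ** (1 - w) * vlm ** w, where w is `alpha`
    for base categories and `beta` for novel ones.
    """
    if not (len(detector_probs) == len(vlm_probs) == len(is_base)):
        raise ValueError("all inputs must have one entry per category")

    combined = []
    for det, vlm, base in zip(detector_probs, vlm_probs, is_base):
        w = alpha if base else beta
        combined.append((det ** (1.0 - w)) * (vlm ** w))
    return combined


if __name__ == "__main__":
    # One region scored against two categories: the first base, the second novel.
    scores = combine_scores([0.9, 0.2], [0.6, 0.8], [True, False])
    print(scores)
```

Giving the VLM a larger weight on novel categories reflects the intuition that the finetuned detector head has never seen those classes, so the frozen VLM's region classification is the more reliable signal there. The specific weights would need to be tuned on held-out data.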


