Augmenting Zero-Shot Detection Training with Image Labels

06/12/2023
by Katharina Kornmeier, et al.

Zero-shot detection (ZSD), i.e., detection of classes not seen during training, is essential for real-world detection use cases but remains a difficult task. Recent research approaches ZSD with detection models that output embeddings instead of direct class labels. To this end, the output of the detection model must be aligned to a learned embedding space such as CLIP. However, this alignment is hindered by the fact that detection data sets are expensive to produce compared to image classification annotations, and by the resulting lack of category diversity in the training data. We address this challenge by leveraging the CLIP embedding space in combination with image labels from ImageNet. Our results show that image labels better align the detector output to the embedding space and thus have high potential for ZSD. Compared to training on detection data alone, adding image label data yields a significant gain of 3.3 mAP on the unseen classes for the 65/15 split on COCO, i.e., more than double the gain of related work.
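To illustrate what it means for a detector to output embeddings instead of class labels, the following is a minimal sketch, not the authors' implementation. It assumes each detected box comes with an embedding vector and that every class name has a reference vector in the same space (in practice these could be CLIP text embeddings; here they are hand-written toy vectors). The function names `l2_normalize` and `classify_by_embedding` are hypothetical. An unseen class can be recognized simply by adding its reference vector, without retraining the classifier head.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_by_embedding(box_embeddings, class_embeddings, class_names):
    """Assign each box the class whose reference embedding is most similar.

    box_embeddings:   (num_boxes, dim) array, assumed detector output.
    class_embeddings: (num_classes, dim) array, assumed reference vectors
                      (e.g. text embeddings of the class names).
    """
    boxes = l2_normalize(np.asarray(box_embeddings, dtype=np.float64))
    classes = l2_normalize(np.asarray(class_embeddings, dtype=np.float64))
    similarities = boxes @ classes.T          # cosine similarity matrix
    best = similarities.argmax(axis=1)        # most similar class per box
    return [class_names[i] for i in best], similarities


# Toy data: three classes in a 3-dimensional embedding space.
# "umbrella" plays the role of a class with no detection training data.
class_names = ["cat", "dog", "umbrella"]
class_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
box_embeddings = np.array([
    [0.9, 0.1, 0.0],   # points mostly along the "cat" direction
    [0.1, 0.2, 2.0],   # points mostly along the "umbrella" direction
])

labels, similarities = classify_by_embedding(
    box_embeddings, class_embeddings, class_names
)
print(labels)
```

The quality of such a classifier depends entirely on how well the detector's embeddings line up with the reference space, which is the alignment problem the abstract describes.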


research
08/25/2020

Bias-Awareness for Zero-Shot Learning the Seen and Unseen

Generalized zero-shot learning recognizes inputs from both seen and unse...
research
06/16/2014

Semantic Graph for Zero-Shot Learning

Zero-shot learning aims to classify visual objects without any training ...
research
09/24/2021

ZSD-YOLO: Zero-Shot YOLO Detection using Vision-Language Knowledge Distillation

Real-world object sampling produces long-tailed distributions requiring ...
research
05/12/2021

Semantic Diversity Learning for Zero-Shot Multi-label Classification

Training a neural network model for recognizing multiple labels associat...
research
03/01/2020

Novelty-Prepared Few-Shot Classification

Few-shot classification algorithms can alleviate the data scarceness iss...
research
04/14/2023

Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks

The strength of machine learning models stems from their ability to lear...
research
03/23/2023

Three ways to improve feature alignment for open vocabulary detection

The core problem in zero-shot open vocabulary detection is how to align ...
