LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

05/29/2023
by   M. Jehanzeb Mirza, et al.

Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification, enabling open-vocabulary recognition of a potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zero-shot classifiers still falls short of the results of dedicated (closed category set) classifiers trained with supervised fine-tuning. In this paper we show, for the first time, how to reduce this gap without any labels and without any paired VL data, using an unlabeled image collection and a set of texts auto-generated using a Large Language Model (LLM) describing the categories of interest and effectively substituting labeled visual instances of those categories. Using our label-free approach, we are able to attain significant performance improvements over the zero-shot performance of the base VL model and other contemporary methods and baselines on a wide variety of datasets, demonstrating absolute improvement of up to 11.7%. Moreover, despite our approach being label-free, we observe gains of 1.3% over prompting baselines that do use 5-shot supervision.
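
The idea that texts can stand in for labeled images in a shared VL embedding space can be illustrated with a toy sketch. This is not the authors' implementation: no VL model or LLM is involved, the "embeddings" are synthetic (a class prototype plus Gaussian noise), and every name here (`noisy_copy`, `train_text_classifier`, `run_demo`) is hypothetical. It only shows that a linear classifier fitted on text-side vectors can, under the assumption that both modalities cluster around the same class directions, be applied unchanged to image-side vectors.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length, as is common for VL embeddings."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def noisy_copy(proto, rng, sigma):
    """Assumed stand-in for an encoder output: prototype plus Gaussian noise."""
    return normalize([p + rng.gauss(0.0, sigma) for p in proto])


def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def logits(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def train_text_classifier(text_emb, labels, n_classes, dim, lr=1.0, steps=100):
    """Full-batch gradient descent on softmax cross-entropy, text vectors only."""
    W = [[0.0] * dim for _ in range(n_classes)]
    n = len(text_emb)
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(n_classes)]
        for x, y in zip(text_emb, labels):
            p = softmax(logits(W, x))
            for c in range(n_classes):
                g = p[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    grad[c][j] += g * x[j] / n
        for c in range(n_classes):
            for j in range(dim):
                W[c][j] -= lr * grad[c][j]
    return W


def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)


def run_demo(seed=0, n_classes=4, dim=16, n_text=20, n_img=25, sigma=0.05):
    rng = random.Random(seed)
    # One random unit direction per class, shared by both synthetic modalities.
    protos = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
              for _ in range(n_classes)]

    # "Text" side: labeled for free, since each text is generated per class.
    text_emb, text_lab = [], []
    for c, p in enumerate(protos):
        for _ in range(n_text):
            text_emb.append(noisy_copy(p, rng, sigma))
            text_lab.append(c)

    W = train_text_classifier(text_emb, text_lab, n_classes, dim)

    # "Image" side: never seen during training; labels used only for scoring.
    correct = total = 0
    for c, p in enumerate(protos):
        for _ in range(n_img):
            correct += int(predict(W, noisy_copy(p, rng, sigma)) == c)
            total += 1
    return correct / total


if __name__ == "__main__":
    print(f"accuracy on synthetic image vectors: {run_demo():.2f}")
```

In a real setting the synthetic vectors would be replaced by the text and image encoders of a pre-trained VL model, and the two modalities would not align this cleanly, which is where the unlabeled image collection mentioned in the abstract comes in.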


