Text-to-Image Diffusion Models are Zero-Shot Classifiers

03/27/2023
by Kevin Clark, et al.

The excellent generative capabilities of text-to-image diffusion models suggest they learn informative representations of image-text data. However, what knowledge their representations capture is not fully understood, and they have not been thoroughly explored on downstream tasks. We investigate diffusion models by proposing a method for evaluating them as zero-shot classifiers. The key idea is using a diffusion model's ability to denoise a noised image given a text description of a label as a proxy for that label's likelihood. We apply our method to Imagen, using it to probe fine-grained aspects of Imagen's knowledge and comparing it with CLIP's zero-shot abilities. Imagen performs competitively with CLIP on a wide range of zero-shot image classification datasets. Additionally, it achieves state-of-the-art results on shape/texture bias tests and can successfully perform attribute binding while CLIP cannot. Although generative pre-training is prevalent in NLP, visual foundation models often use other methods such as contrastive learning. Based on our findings, we argue that generative pre-training should be explored as a compelling alternative for vision and vision-language problems.
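The key idea in the abstract, scoring each label by how well the model denoises a noised image when conditioned on that label's text, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `toy_denoise` is a hypothetical stand-in for a text-conditioned diffusion model (it is not Imagen), the prompts and four-pixel "images" are made up, and mean squared error averaged over random noise levels is only one plausible scoring choice, since the abstract does not specify the exact scoring function.

```python
import math
import random

# Hypothetical prompts and toy "images" (flat lists of pixel values).
PROMPTS = {"cat": "a photo of a cat", "dog": "a photo of a dog"}
PROTOTYPES = {
    "a photo of a cat": [1.0, 0.0, 0.0, 1.0],
    "a photo of a dog": [0.0, 1.0, 1.0, 0.0],
}


def toy_denoise(noised, alpha, prompt):
    """Stand-in for a text-conditioned denoiser (an assumption, not a real
    model): blends the noisy input with the prototype the prompt describes."""
    proto = PROTOTYPES[prompt]
    return [math.sqrt(alpha) * n + (1.0 - alpha) * p
            for n, p in zip(noised, proto)]


def zero_shot_classify(image, prompts, denoise_fn, num_trials=64, seed=0):
    """Return (best_label, scores), where each score is the mean squared
    denoising error under that label's prompt; lower means more likely."""
    rng = random.Random(seed)
    scores = {label: 0.0 for label in prompts}
    for _ in range(num_trials):
        alpha = rng.uniform(0.1, 0.9)  # signal level for this trial
        # Noise the image once so every label is scored on the same input.
        noised = [math.sqrt(alpha) * x
                  + math.sqrt(1.0 - alpha) * rng.gauss(0.0, 1.0)
                  for x in image]
        for label, prompt in prompts.items():
            predicted = denoise_fn(noised, alpha, prompt)
            error = sum((p - x) ** 2
                        for p, x in zip(predicted, image)) / len(image)
            scores[label] += error / num_trials
    best = min(scores, key=scores.get)
    return best, scores


# Usage: classify an image that matches the "cat" prototype.
label, scores = zero_shot_classify(PROTOTYPES["a photo of a cat"],
                                   PROMPTS, toy_denoise)
print(label, scores)
```

In this sketch the same noised image is reused across all labels within a trial, so differences in score come from the text conditioning rather than from different noise draws. A real diffusion model would replace `toy_denoise`, and the number of trials and the range of noise levels would need tuning.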
