Contrastive Language-Image Pretrained (CLIP) Models are Powerful Out-of-Distribution Detectors

03/10/2023
by   Felix Michels, et al.
0

We present a comprehensive experimental study on pretrained feature extractors for visual out-of-distribution (OOD) detection. We examine several setups, based on the availability of labels or image captions and using different combinations of in- and out-distributions. Intriguingly, we find that (i) contrastive language-image pretrained models achieve state-of-the-art unsupervised out-of-distribution performance using nearest neighbors feature similarity as the OOD detection score, (ii) supervised state-of-the-art OOD detection performance can be obtained without in-distribution fine-tuning, (iii) even top-performing billion-scale vision transformers trained with natural language supervision fail at detecting adversarially manipulated OOD images. Finally, we argue whether new benchmarks for visual anomaly detection are needed based on our experiments. Using the largest publicly available vision transformer, we achieve state-of-the-art performance across all 18 reported OOD benchmarks, including an AUROC of 87.6% (9.2% gain, unsupervised) and 97.4% (1.2% gain, supervised) for the challenging task of CIFAR100 → CIFAR10 OOD detection. The code will be open-sourced.

READ FULL TEXT

page 5

page 8

page 14

page 15

research
06/09/2021

Sentence Embeddings using Supervised Contrastive Learning

Sentence embeddings encode sentences in fixed dense vectors and have pla...
research
03/22/2023

JaCoText: A Pretrained Model for Java Code-Text Generation

Pretrained transformer-based models have shown high performance in natur...
research
08/21/2023

SupEuclid: Extremely Simple, High Quality OoD Detection with Supervised Contrastive Learning and Euclidean Distance

Out-of-Distribution (OoD) detection has developed substantially in the p...
research
04/18/2021

Contrastive Out-of-Distribution Detection for Pretrained Transformers

Pretrained transformers achieve remarkable performance when the test dat...
research
03/31/2023

Exploring the Limits of Deep Image Clustering using Pretrained Models

We present a general methodology that learns to classify images without ...
research
04/24/2017

Using Global Constraints and Reranking to Improve Cognates Detection

Global constraints and reranking have not been used in cognates detectio...
research
06/16/2021

Probing Image-Language Transformers for Verb Understanding

Multimodal image-language transformers have achieved impressive results ...

Please sign up or login with your details

Forgot password? Click here to reset