Is it all a cluster game? – Exploring Out-of-Distribution Detection based on Clustering in the Embedding Space

03/16/2022
by Poulami Sinhamahapatra, et al.

It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution. In this paper, we explore this out-of-distribution (OOD) detection problem for image classification using clusters of semantically similar embeddings of the training data, and we exploit the differences in distance relationships to these clusters between in- and out-of-distribution data. We study the structure and separation of clusters in the embedding space and find that supervised contrastive learning leads to well-separated clusters while its self-supervised counterpart fails to do so. In our extensive analysis of different training methods, clustering strategies, distance metrics, and thresholding approaches, we observe that there is no clear winner: the optimal approach depends on the model architecture and on the datasets selected as in- and out-of-distribution data. While we could reproduce the outstanding results for contrastive training on CIFAR-10 as in-distribution data, we find that standard cross-entropy paired with cosine similarity outperforms all contrastive training methods when training on CIFAR-100 instead. Cross-entropy provides results competitive with expensive contrastive training methods.
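To make the general idea in the abstract concrete, the following is a minimal sketch of cluster-based OOD scoring with cosine similarity. It is not the authors' implementation. It assumes one cluster per class, represented by the mean of the normalized training embeddings, and a threshold set at the 95th percentile of in-distribution validation scores. Both are illustrative choices among the clustering and thresholding strategies the abstract mentions. The function names (`class_centroids`, `ood_score`, `fit_threshold`) and the synthetic 8-dimensional "embeddings" are hypothetical stand-ins for the output of a trained encoder.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def class_centroids(embeddings, labels):
    """One cluster per class (an assumption): mean of the normalized embeddings."""
    classes = np.unique(labels)
    cents = np.stack(
        [l2_normalize(embeddings[labels == c]).mean(axis=0) for c in classes]
    )
    return l2_normalize(cents)


def ood_score(embeddings, centroids):
    """Higher score = less similar to every training cluster."""
    sims = l2_normalize(embeddings) @ centroids.T  # shape (n_samples, n_clusters)
    return 1.0 - sims.max(axis=1)


def fit_threshold(id_scores, keep_fraction=0.95):
    """Threshold that accepts `keep_fraction` of in-distribution scores (assumed rule)."""
    return np.quantile(id_scores, keep_fraction)


# Synthetic stand-in for encoder outputs: three classes along separate axes.
rng = np.random.default_rng(0)
class_means = np.eye(3, 8) * 5.0


def sample_id(n_per_class):
    labels = np.repeat(np.arange(3), n_per_class)
    return class_means[labels] + rng.normal(scale=0.5, size=(labels.size, 8)), labels


train_x, train_y = sample_id(100)
val_x, _ = sample_id(100)
test_x, _ = sample_id(100)

# OOD samples lie along a direction that no training class occupies.
ood_mean = np.zeros(8)
ood_mean[6] = 5.0
ood_x = ood_mean + rng.normal(scale=0.5, size=(300, 8))

centroids = class_centroids(train_x, train_y)
threshold = fit_threshold(ood_score(val_x, centroids))

id_scores = ood_score(test_x, centroids)
ood_scores = ood_score(ood_x, centroids)
id_flagged = float((id_scores > threshold).mean())
ood_flagged = float((ood_scores > threshold).mean())
print(f"flagged as OOD: in-distribution {id_flagged:.2f}, out-of-distribution {ood_flagged:.2f}")
```

On real data the separation is far less clean than in this toy setup, and the abstract reports that the best combination of training method, clustering strategy, distance metric, and threshold depends on the architecture and datasets.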

research
03/24/2021

A Broad Study on the Transferability of Visual Representations with Contrastive Learning

Tremendous progress has been made in visual representation learning, not...
research
09/15/2023

Supervised Stochastic Neighbor Embedding Using Contrastive Learning

Stochastic neighbor embedding (SNE) methods t-SNE, UMAP are two most pop...
research
11/19/2015

Neural network-based clustering using pairwise constraints

This paper presents a neural network-based end-to-end clustering framewo...
research
06/17/2020

LSD-C: Linearly Separable Deep Clusters

We present LSD-C, a novel method to identify clusters in an unlabeled da...
research
04/14/2023

Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks

The strength of machine learning models stems from their ability to lear...
research
07/19/2021

OODformer: Out-Of-Distribution Detection Transformer

A serious problem in image classification is that a trained model might ...
research
04/06/2020

Class Anchor Clustering: a Distance-based Loss for Training Open Set Classifiers

Existing open set classifiers distinguish between known and unknown inpu...
