Are Out-of-Distribution Detection Methods Effective on Large-Scale Datasets?

10/30/2019
by   Ryne Roady, et al.
19

Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem has been studied under multiple paradigms including out-of-distribution detection and open set recognition. For convolutional neural networks, there have been two major approaches: 1) inference methods to separate knowns from unknowns and 2) feature space regularization strategies to improve model robustness to outlier inputs. There has been little effort to explore the relationship between the two approaches and directly compare performance on anything other than small-scale datasets that have at most 100 categories. Using ImageNet-1K and Places-434, we identify novel combinations of regularization and specialized inference methods that perform best across multiple outlier detection problems of increasing difficulty level. We found that input perturbation and temperature scaling yield the best performance on large scale datasets regardless of the feature space regularization strategy. Improving the feature space by regularizing against a background class can be helpful if an appropriate background class can be found, but this is impractical for large scale image classification datasets.

READ FULL TEXT

page 1

page 5

research
09/10/2020

Improved Robustness to Open Set Inputs via Tempered Mixup

Supervised classification methods often assume that evaluation data is d...
research
06/20/2022

Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities

It is an important problem in trustworthy machine learning to recognize ...
research
08/24/2023

LORD: Leveraging Open-Set Recognition with Unknown Data

Handling entirely unknown data is a challenge for any deployed classifie...
research
07/21/2022

Towards Accurate Open-Set Recognition via Background-Class Regularization

In open-set recognition (OSR), classifiers should be able to reject unkn...
research
09/20/2018

Distribution Networks for Open Set Learning

In open set learning, a model must be able to generalize to novel classe...
research
11/16/2021

Which CNNs and Training Settings to Choose for Action Unit Detection? A Study Based on a Large-Scale Dataset

In this paper we explore the influence of some frequently used Convoluti...
research
01/30/2023

Massively Scaling Heteroscedastic Classifiers

Heteroscedastic classifiers, which learn a multivariate Gaussian distrib...

Please sign up or login with your details

Forgot password? Click here to reset