Neural networks have become increasingly popular for addressing real-world computer vision (CV) [2, 10] and natural language processing (NLP) problems [16, 1]. With supervised training, deep neural networks are able to achieve human-level performance on various tasks such as image classification [2, 10], reading comprehension, and machine translation. However, deep neural networks are also known to offer little control over their output distribution in unseen scenarios, which can cause serious problems when they are deployed in real-world applications. Even worse, such models are prone to adversarial attacks and raise concerns about AI safety [11, 17]. In order to resolve this issue and build more robust deep learning frameworks, there has recently been rising interest in the out-of-distribution detection problem: a high-dimensional, large-scale anomaly/novelty detection problem that aims to separate in-distribution (IND) real images from out-of-distribution (OOD) real images.
Such a detection problem draws much attention from the community, and many research studies have been published [3, 12, 13, 22, 15, 14]. These methodologies are quite versatile in terms of their settings and assumptions, so we first taxonomize out-of-distribution detection algorithms and position previous works in Figure 1 with respect to their complexity and knowledge. The vertical axis measures complexity at the model level, and the horizontal axis denotes the knowledge required for training the detection algorithm:
Retraining: introduces a new loss function and requires the neural network to be re-trained.
White-Box: requires no re-training, but the detailed network architecture is known.
Black-Box: requires no re-training, and the network architecture is hidden.
OOD Val: requires an out-of-distribution validation set.
IND Train: requires the in-distribution training set.
IND Val: requires an in-distribution validation set.
In real-world applications, especially large-scale and distributed AI systems, the re-training assumption is too strong and hardly practical due to the complexity it introduces. The white-box assumption is stronger than the black-box assumption because knowing the network architecture makes protection easier; the black-box model is more practical since many real-world AI systems only expose interfaces to their outputs, with the low-level architecture hidden. In terms of data knowledge, OOD Val is the strongest assumption in practice because it is unrealistic to know the source of out-of-distribution images in advance. IND Train is still a strong assumption in real-world applications, especially with large-scale distributed AI systems. In comparison, IND Val is much more practical, as only a small-scale held-out dataset is required. Overall, with increasing knowledge (horizontal) and complexity (vertical), the approaches become less and less practically useful due to increasing implementation difficulty and decreasing knowledge availability.
The Baseline has the least application difficulty, as it only requires the highest softmax score to represent the model's confidence. Mahalanobis assumes a white-box model and uses its low-level features to compute Mahalanobis distances as evidence to train an ensemble discriminator for out-of-distribution detection. ODIN, though it requires no knowledge of the model architecture, uses an out-of-distribution validation set to fine-tune its hyper-parameters (temperature and perturbation magnitude). Finally, Confidence Learning, Semantic Label, Adversarial Training, and Deep Prior Network (DPN) make the strongest assumption: they require re-training the model on the training set with a newly proposed loss function.
In this paper, we propose a highly practical method whose application difficulty is only slightly beyond Baseline. We assume the classification model is a black box and that only a small-scale held-out in-domain validation set exists. Such a weak assumption makes our method well suited for real-world applications. In a nutshell, our proposed method is built on the Dirichlet prior network, which we first degenerate into a softmax-based neural network. Thus, we can directly rely on pre-trained neural networks to compute the predictive uncertainty measure. To tackle the over-confidence issue of the pre-trained classifier, which compromises detection accuracy, we next design a concentration perturbation algorithm that enhances the robustness of the established uncertainty measure by learning an adaptive perturbation function to better separate in- and out-of-distribution images. We illustrate our methodology in Figure 2: we first add an EXP operator to the degenerated Dirichlet prior network (the pre-trained classifier) to compute the Dirichlet concentration parameters, which are fed into the perturbation function to generate a noise term. We add this noise to the original prior network to increase the robustness of the established uncertainty measure. A threshold-based detector then tells whether an input image is from in- or out-of-distribution.
Our main contributions are listed below:
We degenerate the Dirichlet prior network into a softmax-based classification model and directly use a pre-trained classifier to estimate the uncertainty measure.
We are the first to propose a concentration perturbation algorithm, and we use it to greatly enhance the current predictive uncertainty measure.
Our proposed method achieves or approaches state-of-the-art results across different datasets and architectures.
Here we particularly consider the image classification problem, where $x$ corresponds to the images and $y \in \{\omega_1, \dots, \omega_K\}$ corresponds to object labels. Given the training data $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$, a Bayesian framework depicts the predictive uncertainty $P(y \mid x^*, \mathcal{D})$ (shorthand for $P(y = \omega_k \mid x^*, \mathcal{D})$) over an unseen image $x^*$ as follows:

$$P(y \mid x^*, \mathcal{D}) = \int P(y \mid x^*, \theta)\, p(\theta \mid \mathcal{D})\, d\theta \quad (1)$$

where data (aleatoric) uncertainty is described by the label-level posterior $P(y \mid x^*, \theta)$ and model (epistemic) uncertainty is described by the model-level posterior $p(\theta \mid \mathcal{D})$. The integral in Equation 1 is intractable in deep neural networks, so a Monte-Carlo sampling algorithm is used to approximate it as follows:

$$P(y \mid x^*, \mathcal{D}) \approx \frac{1}{M} \sum_{m=1}^{M} P(y \mid x^*, \theta^{(m)}), \qquad \theta^{(m)} \sim p(\theta \mid \mathcal{D})$$
where each $P(y \mid x^*, \theta^{(m)})$ is a categorical distribution $\mu^{(m)}$ over the simplex $\mathcal{S}_K$, with $\mu^{(m)}_k = P(y = \omega_k \mid x^*, \theta^{(m)})$. This ensemble $\{\mu^{(m)}\}_{m=1}^{M}$ is a collection of points on the simplex, as depicted in Figure 3, which can be viewed as an implicit distribution induced by the posterior over the model parameters $\theta$.
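The Monte-Carlo approximation above can be sketched in a few lines of PyTorch. In this illustrative sketch, MC-dropout stands in for sampling model parameters $\theta^{(m)}$ from the posterior, and the toy classifier is a hypothetical stand-in, not a model from the paper:

```python
import torch
import torch.nn as nn

def mc_predictive(model, x, num_samples=10):
    """Approximate P(y | x*, D) by averaging categorical distributions
    from stochastic forward passes; keeping dropout active plays the
    role of sampling theta^(m) from the posterior p(theta | D)."""
    model.train()  # keep dropout stochastic at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )  # (num_samples, batch, K): an ensemble of points on the simplex
    return probs.mean(dim=0)  # expected distribution over the ensemble

# hypothetical toy classifier (8 features, K=3 classes) with dropout
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 3))
p = mc_predictive(model, torch.randn(4, 8))
```

Each of the `num_samples` passes yields one point $\mu^{(m)}$ on the simplex; the returned mean is the expected distribution whose entropy is discussed next.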
Given an ensemble drawn from such an implicit distribution, the entropy of the expected distribution indicates uncertainty in the model prediction, but it is impossible to determine from the predictive distribution alone whether that uncertainty stems from heavy class overlap or from the input lying far from the training data space. Though prior work has proposed measures such as mutual information to determine the uncertainty source, they are very hard to use in practice due to the difficulty of selecting an appropriate prior distribution and the expensive Monte-Carlo estimation required in deep neural networks.
In order to explicitly separate these two sources of uncertainty, the prior network was proposed to explicitly parameterize model-level uncertainty with a distribution over distributions on a simplex $\mathcal{S}_K$. It reformulates the predictive posterior as follows:

$$P(y \mid x^*, \mathcal{D}) = \int P(y \mid \mu)\, p(\mu \mid x^*, \hat{\theta})\, d\mu$$

where the model uncertainty is collapsed into $p(\mu \mid x^*, \hat{\theta})$ by using a point estimate $\hat{\theta}$ of the model parameters. Ideally, the prior network should yield a sharp distribution at a corner of the simplex when the network is confident about its prediction (known-known). For noisy input data with heavy class overlap (data uncertainty), the network should yield a sharp distribution at the center of the simplex (known-unknown), while for out-of-domain images the prior network is supposed to yield a flat distribution over the simplex (unknown-unknown). That is to say, the parameters $\hat{\theta}$ should encapsulate the knowledge about the boundary that separates in-distribution from out-of-distribution data. Such a prior network is realized in practice by a Dirichlet distribution due to its tractable statistical properties, and the probability density function (PDF) of the Dirichlet prior over all possible values of the K-dimensional categorical distribution $\mu$ is written as:

$$\mathrm{Dir}(\mu \mid \alpha) = \frac{1}{B(\alpha)} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1}$$
where $\alpha = (\alpha_1, \dots, \alpha_K)$ with $\alpha_0 = \sum_{k=1}^{K} \alpha_k$ is the concentration parameter of the Dirichlet distribution and $B(\alpha) = \prod_{k=1}^{K} \Gamma(\alpha_k) / \Gamma(\alpha_0)$ is the normalization factor. In practice, the Dirichlet prior network is realized by a neural network function $f$ with parameters $\theta$, which takes the unseen image $x^*$ as input and generates the K-dimensional concentration vector:

$$\alpha = f(x^*; \theta)$$
In the prior network, the entropy-based uncertainty measure is shown to separate model-level uncertainty from data-level uncertainty and is computationally efficient due to its closed-form solution:

$$\mathcal{H}[\mathrm{Dir}(\mu \mid \alpha)] = \log B(\alpha) + (\alpha_0 - K)\,\psi(\alpha_0) - \sum_{k=1}^{K} (\alpha_k - 1)\,\psi(\alpha_k)$$

where $\alpha_0 = \sum_{k=1}^{K} \alpha_k$ denotes the sum over all K dimensions and $\psi$ is the digamma function. Please note that we refer to the confidence measure as the negative entropy $-\mathcal{H}$.
3 Degenerated Prior Network
Unlike the Deep Prior Network, which proposes a multi-task training loss to train a prior network from scratch (refer to the original paper for details), our method degenerates the prior network into a softmax-based neural network to avoid the re-training effort.
During training, the prior network is optimized to maximize the empirical marginal likelihood on a given dataset as follows:

$$\max_{\theta} \sum_{i=1}^{N} \log P(y_i \mid x_i, \theta) = \sum_{i=1}^{N} \log \int P(y_i \mid \mu)\, \mathrm{Dir}(\mu \mid \alpha^{(i)})\, d\mu = \sum_{i=1}^{N} \log \frac{\alpha^{(i)}_{y_i}}{\alpha^{(i)}_0} \quad (5)$$

Recall that in widely used softmax-based neural networks, the cross-entropy objective is described as follows:

$$\max_{\theta} \sum_{i=1}^{N} \log \frac{\exp(z^{(i)}_{y_i})}{\sum_{k=1}^{K} \exp(z^{(i)}_{k})} \quad (6)$$

where $z^{(i)}$ is the last-layer output of the deep neural network. It can easily be observed from Equation 5 and Equation 6 that the Dirichlet objective is aligned with the softmax-based cross-entropy if the following holds:

$$\alpha^{(i)}_{k} = C \cdot \exp(z^{(i)}_{k})$$

where $C$ is a scale constant (for simplicity, we set $C = 1$ in our experiments to avoid fine-tuning this scaling hyper-parameter; the marginal likelihood above is invariant to $C$). Therefore, if the exponential output of a pre-trained DNN is used as the concentration parameters of the prior network, training a softmax-based neural network is equivalent to training a Dirichlet prior network, and we can directly obtain the predictive uncertainty measure as $\mathcal{H}[\mathrm{Dir}(\mu \mid \exp(z))]$.
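The degenerated prior network thus reduces to a few lines on top of any pre-trained classifier. The sketch below is an assumption-light illustration (not the authors' released code): it exponentiates the logits into Dirichlet concentrations and scores confidence as negative Dirichlet entropy.

```python
import torch
from torch.distributions import Dirichlet

def dirichlet_confidence(logits):
    """Treat exp(logits) of a pre-trained classifier as the Dirichlet
    concentration alpha (the degenerated prior network, with C = 1)
    and score confidence as negative Dirichlet differential entropy."""
    alpha = torch.exp(logits)           # alpha_k = exp(z_k)
    return -Dirichlet(alpha).entropy()  # closed-form, no sampling needed

# sharp logits (confident prediction) vs. flat logits (uncertain input)
sharp = torch.tensor([[8.0, 0.0, 0.0]])
flat = torch.tensor([[1.0, 1.0, 1.0]])
# a sharper Dirichlet concentrates mass at a simplex corner, so its
# entropy is lower and its confidence score higher
```

No backward pass or architectural knowledge is needed, which is what makes the measure compatible with the black-box assumption.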
While this degenerated prior network is sufficient in relatively simple cases, we observe compromised detection accuracy on the larger-scale dataset (CIFAR100). The uncertainty measure becomes so sensitive and erratic to noise that it can hardly provide an accurate estimate of model-level uncertainty, and overall performance is only slightly higher than Baseline. We visualize the distribution of in- and out-of-distribution data under this confidence measure in Figure 4. We conclude that the compromised performance is caused by the well-known over-fitting issue in pre-trained neural networks, where the model becomes over-confident about its prediction even when the prediction is wrong. More specifically, the classification model greatly emphasizes certain dimensions of the concentration parameter regardless of the input, which causes both data sources to have indistinguishably high confidence.
4 Concentration Perturbation
In order to enhance the robustness of the established entropy-based uncertainty measure, we take inspiration from fast-sign perturbation [11, 14] and design a concentration perturbation mechanism. Unlike [11, 14], which require gradient back-propagation, our proposed method needs no backward pass and therefore better fits the black-box assumption. The other difference is that our method operates on the concentration parameters $\alpha$ rather than on the raw input image $x$. As illustrated in Figure 5, we first experiment with a fast-sign perturbation algorithm of the form:

$$\tilde{\alpha} = \alpha - \eta \cdot \mathrm{sign}\!\left(\nabla_{\alpha}\, \mathcal{H}[\mathrm{Dir}(\mu \mid \alpha)]\right)$$
Through our experiments, such a gradient-based perturbation algorithm yields trivial gains or even damages detection accuracy. We conclude that the assumption made in ODIN no longer holds in our scenario: the in- and out-of-distribution concentration parameters lie in regions of equivalent sharpness or flatness, so the hill-climbing perturbation has a similar impact on both inputs and fails to better separate in- from out-of-distribution images (depicted in green in Figure 5, where both climb an equal height on the contour). Therefore, we set out to find a more sophisticated mechanism to separate the two data sources (depicted in red in Figure 5, where the out-of-distribution confidence drops much faster than the in-distribution confidence after adding the noise). Based on this philosophy, we design a parameterized perturbation function with parameters $W$, which takes the concentration parameter $\alpha$ as input and generates a noise $\epsilon$ that widens the uncertainty gap between in- and out-of-distribution images without affecting the model prediction.
Here, we particularly investigate the simplest linear perturbation function $\epsilon = W\alpha$, with $W$ denoting the learnable perturbation matrix. In order to obtain such a perturbation matrix $W$, we propose a discriminative loss function that aims to enlarge the gap between in- and out-of-distribution images.
On the one hand, the magnitude of the perturbation matrix is encouraged to be small so that the generated noise does not affect the model's output landscape; therefore we enforce a first constraint on the norm of the perturbation noise, $\|\epsilon\| \le \eta \|\alpha\|$, with $\eta$ denoting the maximum allowed perturbation ratio. On the other hand, the perturbed concentration $\alpha + \epsilon$ should still lie in the support space $\alpha_k > 0$, hence we enforce a second constraint on the positivity of the noise: $\epsilon \ge 0$. Since it is impractical to assume access to an out-of-distribution (OOD) dataset, we propose to use adversarial examples generated by FGSM as synthesized OOD examples. The optimal perturbation matrix is then described as follows:

$$W^{*} = \arg\max_{W}\; \mathbb{E}_{x \sim \text{OOD}}\,\mathcal{H}[\mathrm{Dir}(\mu \mid \alpha + \epsilon)] - \mathbb{E}_{x \sim \text{IND}}\,\mathcal{H}[\mathrm{Dir}(\mu \mid \alpha + \epsilon)] \quad \text{s.t.}\ \|\epsilon\| \le \eta \|\alpha\|,\ \epsilon \ge 0$$
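A minimal sketch of the FGSM step used to synthesize OOD examples is shown below. Since no labels are available at this stage, this sketch attacks the model's own predicted class, and `epsilon` is a hypothetical step size rather than a value from the paper:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, epsilon=0.01):
    """One-step fast gradient sign method; the perturbed images serve
    as synthesized OOD samples for fitting the perturbation matrix."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    # no ground-truth labels needed: attack the model's own prediction
    loss = F.cross_entropy(logits, logits.argmax(dim=-1))
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

# hypothetical stand-in for the black-box classifier
torch.manual_seed(0)
model = torch.nn.Linear(4, 3)
x = torch.randn(2, 4)
adv = fgsm(model, x)
```

Note that this backward pass touches only the synthesis of training data for $W$; the detector itself never back-propagates through the black-box model.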
Here we propose to optimize $W$ with a gradient ascent algorithm. The first constraint is realized by re-scaling any noise whose norm exceeds $\eta \|\alpha\|$, while the second constraint is realized by simply adding a ReLU activation to the perturbation weight $W$.
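The perturbation function and its two constraints can be sketched as follows. The class and parameter names (`ConcentrationPerturbation`, `ratio`) are illustrative, and the discriminative loss is one plausible reading of "enlarging the gap" between the two sources, not the paper's exact objective:

```python
import torch
from torch.distributions import Dirichlet

class ConcentrationPerturbation(torch.nn.Module):
    """Linear perturbation epsilon = ReLU(W) alpha with the two
    constraints from the text: ReLU on the weight keeps the noise
    non-negative, and rescaling caps ||epsilon|| at ratio * ||alpha||."""

    def __init__(self, num_classes, ratio=0.05):
        super().__init__()
        # zero initialization, matching the experimental setup
        self.weight = torch.nn.Parameter(torch.zeros(num_classes, num_classes))
        self.ratio = ratio

    def forward(self, alpha):
        eps = alpha @ torch.relu(self.weight).T            # non-negative noise
        budget = self.ratio * alpha.norm(dim=-1, keepdim=True)
        norm = eps.norm(dim=-1, keepdim=True).clamp(min=1e-12)
        eps = eps * torch.clamp(budget / norm, max=1.0)    # rescale if over budget
        return alpha + eps                                 # perturbed concentration

def discriminative_loss(pert, alpha_ind, alpha_ood):
    """One plausible gap objective: push synthesized-OOD entropy up
    while keeping in-distribution entropy low."""
    h_ind = Dirichlet(pert(alpha_ind)).entropy()
    h_ood = Dirichlet(pert(alpha_ood)).entropy()
    return (h_ind - h_ood).mean()  # minimizing this widens the gap
```

Minimizing `discriminative_loss` over `pert.weight` (e.g. with Adam) corresponds to gradient ascent on the gap; since both constraints are enforced inside `forward`, no separate projection step is needed.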
5 Experiments

The established uncertainty measure will be used to predict which distribution the samples come from. Finally, several evaluation metrics are used to measure and compare how well different detection methods separate the two distributions.
| Data Source | Dataset | Content (classes) | #Train | #Test |
|---|---|---|---|---|
| In-Distribution | CIFAR-10 | 10 classes: Airplane, Truck, Bird, etc. | 60,000 | 10,000 |
| | CIFAR-100 | 100 classes: Mammals, Fish, Flower, etc. | 60,000 | 10,000 |
| Out-of-Distribution | iSUN | 908 classes: Airport, Abbey, etc. | - | 8,925 |
| | LSUN | 10 classes: Bedrooms, Churches, etc. | - | 10,000 |
| | | 1000 classes: Plant, Natural object, Sports, etc. | - | 10,000 |
| | SVHN | 10 classes: The Street View House Numbers | - | 26,032 |
5.1 Datasets and Implementation
Here we list all the datasets used in Table 1, which are available on GitHub (https://github.com/ShiyuLiang/odin-pytorch). In order to make fair comparisons with previous out-of-distribution detection algorithms, we replicate the same setting as . For both the CIFAR10 and CIFAR100 datasets, we pre-train VGG13, ResNet18, ResNet34, WideResNet (depth=28, widening factor=10), and ResNeXt (depth=29, widening factor=8) with publicly available code (https://github.com/bearpaw/pytorch-classification), and then use the converged model as the black box. We adopt a publicly available implementation (https://github.com/1Konny/FGSM) to generate FGSM examples. For the concentration perturbation matrix $W$, we initialize all weights to zero and then optimize them with the Adam optimizer (lr=1e-3, weight-decay=5e-4). We experimented with different hyper-parameter setups and found a setting that yields generally promising results. Our method is implemented in PyTorch; all code and trained models will be released on GitHub (https://github.com/wenhuchen/).
5.2 Experimental results
We measure the quality of out-of-distribution detection using the metrics established for this task.
FPR at 95% TPR (lower is better): Measures the false positive rate (FPR) when the true positive rate (TPR) is equal to 95%. Note that TNR = 1 - FPR.
Detection Error (lower is better): Measures the minimum possible misclassification probability over all confidence thresholds $\tau$, i.e., $\min_{\tau}\, \{0.5\,(1 - \mathrm{TPR}(\tau)) + 0.5\,\mathrm{FPR}(\tau)\}$, assuming both distributions are equally likely. Note that Detection Accuracy = 1 - Detection Error.
AUROC (larger is better): Measures the Area Under the Receiver Operating Characteristic curve. The Receiver Operating Characteristic (ROC) curve plots the relationship between TPR and FPR.
AUPR (larger is better): Measures the Area Under the Precision-Recall (PR) curve, where AUPR-In refers to using in-distribution as positive class and AUPR-Out refers to using out-of-distribution as positive class.
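Under the definitions above, all four metrics can be computed from per-example confidence scores. The sketch below uses scikit-learn for AUROC/AUPR and the common simplification of evaluating FPR and detection error at the threshold where TPR = 95%:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def ood_metrics(conf_ind, conf_ood):
    """OOD-detection metrics from per-example confidence scores,
    where higher confidence means 'more in-distribution'."""
    # FPR at 95% TPR: accept the 95% most confident IND samples and
    # count how many OOD samples clear the same threshold.
    tau = np.percentile(conf_ind, 5)
    fpr95 = float(np.mean(conf_ood >= tau))
    # Detection error at the same operating point, assuming equally
    # likely positive/negative sets.
    det_err = 0.5 * (1 - 0.95) + 0.5 * fpr95
    labels = np.r_[np.ones_like(conf_ind), np.zeros_like(conf_ood)]
    scores = np.r_[conf_ind, conf_ood]
    auroc = roc_auc_score(labels, scores)
    aupr_in = average_precision_score(labels, scores)        # IND as positive
    aupr_out = average_precision_score(1 - labels, -scores)  # OOD as positive
    return fpr95, det_err, auroc, aupr_in, aupr_out
```

A perfectly separated pair of score sets yields FPR = 0, detection error = 0.025, and AUROC = AUPR = 1.0, which is a handy sanity check when wiring up a detector.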
| IND/OOD | Model | Method | FPR@TPR95 | Detection Error | AUROC |
|---|---|---|---|---|---|
| CIFAR10/iSUN | VGG13 | Baseline | 43.8 | 11.4 | 94 |
| CIFAR10/LSUN | VGG13 | Baseline | 41.9 | 11.5 | 94 |
| CIFAR10/Tiny-ImgNet | VGG13 | Baseline | 43.8 | 12 | 93.5 |
| CIFAR10/Tiny-ImgNet | ResNet18 | Baseline | 59.0 | 15.1 | 91.1 |
| CIFAR10/LSUN | ResNet18 | Baseline | 50.2 | 12.3 | 93.1 |
| CIFAR10/SVHN | ResNet18 | Baseline | 49.5 | 13.3 | 92.0 |
Here we first present our experimental results on CIFAR10, where neural networks pre-trained on the CIFAR10 dataset are used to detect whether unseen inputs are from in-distribution. We experiment with two neural network architectures, VGG13 (see Table 2) and ResNet18 (see Table 3). In Table 2, we mainly compare against Baseline, ODIN, Confidence, and Semantic under the VGG13 architecture. We can easily observe that our proposed method significantly outperforms these competing algorithms across all metrics. In Table 3, we mainly compare against ODIN, DPN, and Mahalanobis under the ResNet18 architecture. The Mahalanobis algorithm performs extremely well on the FPR (TPR=95%) and detection error metrics, but our method is superior in terms of the AUROC metric.
| IND/OOD | Model | Method | FPR@TPR95 | Detection Error | AUROC |
|---|---|---|---|---|---|
| CIFAR-100/iSUN | ResNet34 | ODIN | 61.3 | 23.7 | 83.6 |
| CIFAR-100/LSUN | ResNet34 | ODIN | 76.8 | 42.4 | 78.9 |
| CIFAR-100/Tiny-ImgNet | ResNet34 | ODIN | 63.9 | 25.2 | 82.3 |

| Model | OOD | Method | FPR@TPR95 | Detection Error | AUROC | AUPR In | AUPR Out |
|---|---|---|---|---|---|---|---|
Here we experiment with the larger-scale CIFAR100 dataset to further investigate the effectiveness of our proposed algorithm. In Table 4, we mainly compare against ODIN, Semantic, and Mahalanobis under the ResNet34 architecture. We observe trends very similar to Table 3: both our method and Mahalanobis significantly outperform the competing algorithms, and though Mahalanobis achieves surprisingly strong FPR and detection error scores, it lags behind ours in terms of the AUROC measure. We also provide more experimental results in Table 5, where we observe consistently promising results across different model architectures. Our proposed methodology is simple and easy to implement, yet very effective in defending against out-of-distribution examples.
The advantage of our method against DPN lies in its full exploitation of the pre-trained neural network, which makes our model much better suited for large-scale real-world applications. Unlike Mahalanobis, which needs to handcraft a large ensemble of low-level features, our method only needs the last-layer output of the black-box model, saving feature engineering effort while achieving almost equivalent performance on different datasets. We visualize the perturbation ratio distribution in Figure 7 and observe that the ratio concentrates at very small values, which confirms our intuition of designing a mild perturbation that does not violate the model's output landscape.
5.3 Ablation Study
In this section, we are particularly interested in the impact of our concentration perturbation algorithm on the final OOD detection metrics. We first visualize our results for CIFAR10 and CIFAR100 in Figure 6. From these two diagrams, we observe a very significant increase across different metrics and network architectures, especially on TNR (TPR=95%) and Detection Accuracy. Another trend is that our perturbation seems to yield less improvement on CIFAR10 than on CIFAR100, which reflects our assumption that the quality of the Dirichlet uncertainty is highly related to classification accuracy: when the pre-trained model has weak classification capability, the uncertainty measure is highly inaccurate and very prone to misclassified examples. A further interesting observation is that the uncertainty measure is sensitive to the model architecture, especially the scale of the output layer. With the VGG13 architecture (see Figure 6), the uncertainty measure yields very promising OOD detection accuracy even without concentration perturbation, whereas ResNet18, though it achieves better classification accuracy on CIFAR10, has much lower out-of-distribution detection accuracy than VGG13.
5.4 Impact of Concentration Perturbation
In this section, we study the linear perturbation matrix $W$ to understand its essence. First, we visualize the matrix in Figure 8. As can be seen, the diagonal elements overwhelm the off-diagonal elements in magnitude due to our norm control on the perturbation noise. We then visualize the perturbed concentration in Figure 9, from which we can see that the pre-perturbation concentration is rather sharp over some dimensions (classes), reflecting the known over-confidence issue of pre-trained neural networks. After adding the perturbation noise, the whole spectrum becomes much noisier than before, but the highlighted specks remain unchanged. The insight behind this perturbation noise is to bring the model's unreasonably high confidence into a more rational range by slightly injecting uncertainty into the model's prediction. We then compare the confidence shift caused by concentration perturbation and fast-sign perturbation in Figure 10, from which we find that our perturbation noise remarkably increases the confidence on in-distribution images while reducing the confidence on out-of-distribution examples, which greatly helps separate in-distribution from out-of-distribution examples. In comparison, fast-sign perturbation increases the confidence measure equally for both in- and out-of-distribution inputs and thus fails to separate the two image sources.
Besides, we also visualize the discriminative training process in Figure 11; we observe that the training loss approximated on synthesized out-of-domain data aligns well with the detection metrics computed on real out-of-domain data. We conclude that the synthesized adversarial examples have good generalization ability, which lays the foundation for our methodology.
In this paper, we design a simple yet effective out-of-distribution detection algorithm to increase the robustness of existing neural networks. Though our method requires the least knowledge and introduces only minor complexity during training, it yields very strong performance, especially on the large-scale dataset. However, our method is sensitive to different neural architectures, which can sometimes lead to inferior performance. In future work, we plan to study these architectural differences and investigate their causes. Besides, since the generalization ability of adversarial examples is the cornerstone of our method, it is interesting to study how adversarial examples generated by different algorithms influence detection accuracy and why these differences arise.
-  Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137–1155, 2003.
-  J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 248–255. IEEE, 2009.
-  T. DeVries and G. W. Taylor. Learning confidence for out-of-distribution detection in neural networks. arXiv preprint arXiv:1802.04865, 2018.
-  Y. Gal. Uncertainty in deep learning. University of Cambridge, 2016.
-  H. Hassan, A. Aue, C. Chen, V. Chowdhary, J. Clark, C. Federmann, X. Huang, M. Junczys-Dowmunt, W. Lewis, M. Li, et al. Achieving human parity on automatic chinese to english news translation. arXiv preprint arXiv:1803.05567, 2018.
-  K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
-  D. Hendrycks and K. Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136, 2016.
-  D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
-  A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
-  A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
-  A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
-  K. Lee, H. Lee, K. Lee, and J. Shin. Training confidence-calibrated classifiers for detecting out-of-distribution samples. arXiv preprint arXiv:1711.09325, 2017.
-  K. Lee, K. Lee, H. Lee, and J. Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. arXiv preprint arXiv:1807.03888, 2018.
-  S. Liang, Y. Li, and R. Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv preprint arXiv:1706.02690, 2017.
-  A. Malinin and M. Gales. Predictive uncertainty estimation via prior networks. arXiv preprint arXiv:1802.10501, 2018.
-  T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
-  S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582, 2016.
-  V. Nair and G. E. Hinton. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10), pages 807–814, 2010.
-  Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS workshop on deep learning and unsupervised feature learning, volume 2011, page 5, 2011.
-  A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in pytorch. 2017.
-  P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang. Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
-  G. Shalev, Y. Adi, and J. Keshet. Out-of-distribution detection using multiple semantic label representations. arXiv preprint arXiv:1808.06664, 2018.
-  K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
-  J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba. Sun database: Large-scale scene recognition from abbey to zoo. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 3485–3492. IEEE, 2010.
-  S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pages 5987–5995. IEEE, 2017.
-  F. Yu, A. Seff, Y. Zhang, S. Song, T. Funkhouser, and J. Xiao. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
-  S. Zagoruyko and N. Komodakis. Wide residual networks. In Proceedings of the British Machine Vision Conference 2016, BMVC 2016, York, UK, September 19-22, 2016, 2016.