Enhancing the Robustness of Prior Network in Out-of-Distribution Detection

11/18/2018 ∙ by Wenhu Chen, et al. ∙ The Regents of the University of California ∙ SAMSUNG

With the recent surge of interest in deep neural networks, more real-world applications are starting to adopt them in practice. However, deep neural networks are known to have limited control over their predictions on unseen images. Such weakness can potentially threaten society and cause undesirable consequences in real-world scenarios. To address this issue, a popular task called out-of-distribution detection was proposed, which aims at separating out-of-distribution images from in-distribution images. In this paper, we propose a perturbed prior network architecture, which can efficiently separate model-level uncertainty from data-level uncertainty via prior entropy. To further enhance the robustness of the proposed entropy-based uncertainty measure, we propose a concentration perturbation algorithm, which adaptively adds noise to the concentration parameters so that in- and out-of-distribution images are better separable. Our method can directly rely on a pre-trained deep neural network without re-training it, and requires no knowledge about the network architecture or out-of-distribution examples. Such simplicity makes our method more suitable for real-world AI applications. Through comprehensive experiments, our method demonstrates its superiority by achieving state-of-the-art results on many datasets.




1 Introduction

Neural networks have become more popular than ever in addressing real-world computer vision (CV) [2, 10] and natural language processing (NLP) problems [16, 1]. With supervised training strategies, deep neural networks are able to achieve human-level performance on various tasks like image classification [2, 10], reading comprehension [21] and machine translation [5]. However, deep neural networks are also known to have little control over their output distribution under unseen scenarios, which can potentially cause very concerning problems if applied to real-world applications. Even worse, such models are prone to adversarial attacks and raise concerns in AI safety [11, 17]. In order to resolve this issue and build more robust deep learning frameworks, there has recently been rising interest in studying the out-of-distribution detection problem, or a high-dimensional and large-scale anomaly/novelty detection problem, which aims at separating in-distribution (IND) real images from out-of-distribution (OOD) real images.

Figure 1: An overview of current approaches: Baseline [7], ODIN [14], Mahalanobis [13], Adv-Train [12], Confidence [3], Semantic [22], DPN [15].

This detection problem has drawn much attention from the community and many research studies have been published [3, 12, 13, 22, 15, 14]. These methodologies vary widely in their settings and assumptions. Here we first taxonomize the out-of-distribution detection algorithms and position the previous research works in Figure 1 with respect to their complexity and required knowledge. The vertical axis considers the complexity at the model level, and the horizontal axis denotes the knowledge required for training the detection algorithm:

  • Retraining: it has a new loss function and requires the neural network to be re-trained.

  • White-Box: it does not need re-training, but the detailed network architecture is known.

  • Black-Box: it does not need re-training and the network architecture is hidden.

  • OOD Val: requires out-of-distribution validation set.

  • IND Train: requires the in-distribution training set.

  • IND Val: requires in-distribution validation set.

In real-world applications, especially for large-scale and distributed AI systems, the re-training assumption is too strong and hardly practical due to the introduced complexities. The white-box assumption is stronger than the black-box assumption because knowing the network architecture makes its protection easier. However, the black-box model is more practical since many real-world AI systems only expose interfaces to their output while keeping the low-level architecture hidden. In terms of data knowledge, OOD Val is the strongest assumption in practice because it is unrealistic to know the source of out-of-distribution images in advance. IND Train is still a strong assumption in real-world applications, especially with large-scale distributed AI systems. In comparison, IND Val is much more practical since only a small-scale held-out dataset is required. Overall, as the required knowledge (horizontal) and complexity (vertical) increase, the approaches become less and less practical in real-world applications due to the increasing implementation difficulty and the reduced availability of that knowledge.

The Baseline [7] has the least application difficulty as it only requires the highest softmax score for representing the model’s confidence. Mahalanobis [13] assumes a white-box model and proposes to use its low-level features to compute the Mahalanobis distance as evidence to train an ensemble discriminator for out-of-distribution detection. ODIN [14] requires no knowledge about the model architecture, but it uses an out-of-distribution validation set to fine-tune its hyper-parameters (temperature and perturbation magnitude). Finally, Confidence Learning [3], Semantic Label [22], Adversarial Training [12] and Deep Prior Network (DPN) [15] make the strongest assumption: they require re-training the model on the training set with a newly proposed loss function.

In this paper, we propose a very practical method whose application difficulty exceeds only that of Baseline [7]. We assume the classification model is a black-box and that only small-scale held-out in-domain validation data exists. Such a weak assumption makes our method well suited for real-world applications. In a nutshell, our proposed method is built on the Dirichlet prior network [15], which we first propose to degenerate into a softmax-based neural network. Thus, we can directly rely on pre-trained neural networks to compute the predictive uncertainty measure. To further tackle the over-confidence issue of the pre-trained classifier, which compromises detection accuracy, we next design a concentration perturbation algorithm that enhances the robustness of the established uncertainty measure by learning an adaptive perturbation function to better separate in- and out-of-distribution images. We demonstrate our methodology in Figure 2, where we first add an EXP operator to the pre-trained classifier, which serves as the degenerated Dirichlet prior network, to compute the Dirichlet concentration parameters, which are fed into the perturbation function to generate noise. We add the noise to the output of the original prior network to increase the robustness of the established uncertainty measure. A threshold-based detector is then used to tell whether an input image is from in- or out-of-distribution.

Figure 2: Illustration of our architecture.

Our main contributions are listed below:

  • We degenerate the Dirichlet prior network into a softmax-based classification model and directly use a pre-trained classifier to estimate the uncertainty measure.

  • We are the first to propose the concentration perturbation algorithm and use it to greatly enhance the current predictive uncertainty measure.

  • Our proposed method is able to achieve or approach state-of-the-art results across different datasets and architectures.

2 Background

Here we consider the image classification problem, where $x$ corresponds to the images and $y$ corresponds to object labels. Given the training data $\mathcal{D}$, a Bayesian framework depicts the predictive uncertainty $P(y|x^*, \mathcal{D})$ over an unseen image $x^*$ as follows:

$$P(y|x^*, \mathcal{D}) = \int P(y|x^*, \theta)\, p(\theta|\mathcal{D})\, d\theta \qquad (1)$$

where data (aleatoric) uncertainty is described by the label-level posterior $P(y|x^*, \theta)$ and model (epistemic) uncertainty is described by the model-level posterior $p(\theta|\mathcal{D})$. The integral in Equation 1 is intractable in deep neural networks, thus a Monte-Carlo sampling algorithm is used to approximate it as follows:

$$P(y|x^*, \mathcal{D}) \approx \frac{1}{M} \sum_{i=1}^{M} P(y|x^*, \theta^{(i)}), \quad \theta^{(i)} \sim p(\theta|\mathcal{D}) \qquad (2)$$

where each $P(y|x^*, \theta^{(i)})$ is a categorical distribution $\mu$ over the simplex. This ensemble is a collection of points on the simplex as depicted in Figure 3, which can be viewed as an implicit distribution induced by the posterior over the model parameters $p(\theta|\mathcal{D})$.

Figure 3: Illustration of ensembles and the induced implicit distribution on the simplex.

Given an ensemble from such an implicit distribution, the entropy of the expected distribution indicates uncertainty in the model prediction, but it is impossible to determine from such a predictive distribution whether the uncertainty comes from a high class overlap or from an input that is far from the training data space. Though [4] has proposed different measures like mutual information to determine the uncertainty source, this is very hard in practice due to the difficulty of selecting an appropriate prior distribution and the expensive computation needed for Monte-Carlo estimation in deep neural networks.
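The distinction drawn above can be made concrete with a small sketch. It is not taken from the paper: the function names and the toy ensembles are hypothetical, and it only illustrates the standard decomposition in which mutual information is the entropy of the averaged prediction minus the average entropy of the individual members.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def ensemble_uncertainty(member_probs):
    """Split the uncertainty of an ensemble of categorical predictions.

    member_probs: list of M probability vectors, one per sampled model.
    Returns (total, expected_data, mutual_information).
    """
    m = len(member_probs)
    k = len(member_probs[0])
    # Monte-Carlo estimate of the expected predictive distribution.
    mean = [sum(p[c] for p in member_probs) / m for c in range(k)]
    total = entropy(mean)
    expected_data = sum(entropy(p) for p in member_probs) / m
    return total, expected_data, total - expected_data


# Members agree on a flat prediction: class overlap, no disagreement.
overlap = [[0.5, 0.5], [0.5, 0.5]]
# Members disagree confidently: the same averaged prediction, large disagreement.
disagree = [[0.99, 0.01], [0.01, 0.99]]

print(ensemble_uncertainty(overlap))
print(ensemble_uncertainty(disagree))
```

Both toy ensembles have the same entropy of the expected distribution, which is why that quantity alone cannot tell the two sources of uncertainty apart; only the mutual-information term differs.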

In order to explicitly separate these two sources of uncertainty, the prior network [15] was proposed to explicitly parameterize model-level uncertainty with a distribution over distributions on a simplex. In [15], the authors reformulate the predictive posterior as follows:

$$P(y|x^*, \mathcal{D}) = \int\!\!\int P(y|\mu)\, p(\mu|x^*, \theta)\, p(\theta|\mathcal{D})\, d\mu\, d\theta \approx \int P(y|\mu)\, p(\mu|x^*, \hat{\theta})\, d\mu \qquad (3)$$

where the model uncertainty is collapsed into $p(\mu|x^*, \hat{\theta})$ by using a point estimate $\hat{\theta}$. Ideally, the prior network should yield a sharp distribution at a corner of the simplex when the network is confident about its prediction (known-known). For noisy input data with heavy class overlap (data uncertainty), the network should yield a sharp distribution in the center of the simplex (known-unknown). For out-of-domain images, the prior network is supposed to yield a flat distribution on the simplex (unknown-unknown). That is to say, the parameters $\hat{\theta}$ should encapsulate the knowledge about the boundary which separates in-distribution from out-of-distribution data. Such a prior network is realized by a Dirichlet distribution in practice due to its tractable statistical properties, and the probability density function (PDF) of the Dirichlet prior distribution over all possible values of the K-dimensional categorical distribution $\mu$ is written as:

$$\mathrm{Dir}(\mu|\alpha) = \frac{1}{B(\alpha)} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1} \qquad (4)$$

where $\alpha = (\alpha_1, \dots, \alpha_K)$ with $\alpha_k > 0$ is the concentration parameter of the Dirichlet distribution and $B(\alpha)$ is the normalization factor. In practice, the Dirichlet prior network is realized by a neural network function with parameters $\theta$, which takes as input the unseen image $x^*$ and then generates the K-dimensional vector $\alpha$.

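In the prior network [15], the entropy-based uncertainty measure is shown to separate model-level uncertainty from data-level uncertainty, and it is computationally efficient due to its closed-form solution:

$$H(\alpha) = \log B(\alpha) + (\alpha_0 - K)\, \psi(\alpha_0) - \sum_{k=1}^{K} (\alpha_k - 1)\, \psi(\alpha_k)$$

where $\alpha_0 = \sum_{k=1}^{K} \alpha_k$ denotes the sum over all K dimensions, $B(\alpha)$ is the Dirichlet normalization factor and $\psi$ is the digamma function. Please note that we refer to the confidence measure as the negative of the entropy value, $-H(\alpha)$.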
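As an illustration of the closed-form measure, the following sketch evaluates the standard differential entropy of a Dirichlet distribution with the Python standard library only. The function names are hypothetical, and the digamma routine is a generic recurrence-plus-asymptotic-series approximation chosen for self-containedness, not something specified by the paper.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward until the asymptotic expansion is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_entropy(alpha):
    """Differential entropy of Dir(alpha), computed in closed form."""
    k = len(alpha)
    alpha0 = sum(alpha)
    # log of the normalization factor B(alpha).
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(alpha0)
    return (log_b
            + (alpha0 - k) * digamma(alpha0)
            - sum((a - 1.0) * digamma(a) for a in alpha))


def confidence(alpha):
    """Confidence is taken as the negative entropy, as in the text."""
    return -dirichlet_entropy(alpha)


print(confidence([1.0, 1.0, 1.0]))   # flat prior over the simplex
print(confidence([50.0, 1.0, 1.0]))  # mass concentrated near one corner
```

A flat Dirichlet has the highest entropy (lowest confidence), while a concentration vector dominated by one class gives a sharper distribution and therefore higher confidence.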

3 Degenerated Prior Network

Unlike Deep Prior Network [15] which proposes a multi-task training loss function to train prior network from scratch (refer to the original paper for details), our method proposes to degenerate prior network as a softmax-based neural network to save the re-training efforts.

During training, prior network is optimized to maximize the empirical marginal likelihood on a given dataset as follows:


Recall that in widely used softmax-based neural networks, the cross-entropy objective is described as follows:


Here the cross-entropy is computed from the last-layer output of the deep neural network. It can be easily observed from Equation 5 and Equation 6 that the Dirichlet objective function is aligned with the softmax-based cross-entropy if the following holds:


Here the scale constant is a hyper-parameter (footnote 2: for simplicity, we fix it during our experiments to avoid fine-tuning the scaling hyper-parameter). Therefore, if the exponential output of a pre-trained DNN is used as the concentration parameters of the prior network, then training a softmax-based neural network is equivalent to training a Dirichlet prior network. Consequently, we can easily obtain the predictive uncertainty measure directly from the output of the pre-trained network.
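The alignment argument can be checked numerically with a minimal sketch. It assumes that the concentration is the element-wise exponential of the logits multiplied by a positive constant; the name `scale`, its default of 1.0 and the way it enters the formula are illustrative assumptions, since the exact expression is not recoverable from the text. The check relies only on the fact that the mean of a Dirichlet distribution is the concentration divided by its sum.

```python
import math


def concentration_from_logits(logits, scale=1.0):
    """Assumed mapping: alpha_k = scale * exp(z_k).

    No max-shift is applied, so very large logits may overflow.
    """
    return [scale * math.exp(z) for z in logits]


def dirichlet_mean(alpha):
    """Expected categorical distribution under Dir(alpha)."""
    alpha0 = sum(alpha)
    return [a / alpha0 for a in alpha]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, -1.0, 0.5]
print(softmax(logits))
print(dirichlet_mean(concentration_from_logits(logits)))
print(dirichlet_mean(concentration_from_logits(logits, scale=3.0)))
```

Under this assumption the expected class probabilities match the softmax output for any positive scale, so the scale only changes how sharp the Dirichlet is, not which class it predicts.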

While this degenerated prior network is sufficient in some relatively simple cases, we observe compromised detection accuracy on larger-scale datasets (CIFAR100). The uncertainty measure becomes so sensitive to noise and so erratic that it can hardly provide an accurate estimate of model-level uncertainty, and the overall performance is only slightly higher than Baseline [7]. We visualize the distribution of in- and out-of-distribution data under this confidence measure in Figure 4. We conclude that the compromised performance is caused by the known over-fitting issue in the pre-trained neural network, where the model is over-confident about its prediction even when it misclassifies the input. More specifically, the classification model greatly emphasizes certain dimensions in the concentration parameter regardless of the form of the inputs, which causes both data sources to have indistinguishably high confidence.

Figure 4: Illustration of confidence distribution for CIFAR100 (IND) data and iSUN (OOD) data under ResNet34 [6], WideResNet [27] and ResNext [25] architectures.

4 Concentration Perturbation

In order to enhance the robustness of the established entropy-based uncertainty measure, we take inspiration from fast-sign perturbation [11, 14] and design a concentration perturbation mechanism. Unlike [11, 14], which require gradient back-propagation, our proposed method does not require a backward propagation operation and better fits the black-box assumption. The other difference is that our method operates on top of the concentration parameters rather than the raw input image. As illustrated by Figure 5, we first experiment with the fast-sign perturbation algorithm as follows:


In our experiments, such a gradient-based perturbation algorithm yields trivial gains or even damages detection accuracy. We conclude that the assumption made in ODIN [14] no longer holds in our scenario: the in- and out-of-distribution concentration parameters lie in regions of equivalent sharpness or flatness, so the hill-climbing perturbation has a similar impact on both inputs and fails to better separate in- from out-of-distribution images (as depicted in green in Figure 5, both the in- and out-of-distribution concentration parameters climb an equal height in the contour). Therefore, we set out to find a more sophisticated mechanism to separate these two data sources (as depicted in red in Figure 5, the out-of-distribution point drops much faster than the in-distribution point after adding noise). Based on this philosophy, we design a parameterized perturbation function, which takes as input the concentration parameter and generates noise in such a way that it widens the uncertainty difference between in- and out-of-distribution images but does not affect the model prediction.


Here, we investigate the simplest perturbation function, a linear transform of the concentration parameters whose weights form the learnable perturbation matrix.

Figure 5: Illustration of concentration parameter perturbation in the Prior Network for in- and out-of-distribution data points.

In order to obtain such a perturbation matrix, we propose a discriminative loss function, which aims at enlarging the gap between in- and out-of-distribution images.


On the one hand, the magnitude of the perturbation matrix is encouraged to be small so that the generated noise does not affect the model’s output landscape. Therefore we enforce a first constraint on the norm of the perturbation noise, bounding it by a maximum allowed perturbation ratio. On the other hand, the perturbed concentration should still lie in the support space of the Dirichlet distribution, hence we enforce a second constraint on the positivity of the noise. Since it is impractical to assume access to an out-of-distribution (OOD) dataset, we propose to use adversarial examples generated by FGSM [11] as synthesized OOD examples. Thus, the optimal perturbation matrix is described as follows:


We optimize the perturbation matrix with a gradient ascent algorithm. The first constraint is realized by re-scaling any noise whose norm exceeds the allowed maximum, while the second constraint is realized by simply adding a ReLU [18] activation to the perturbation weights.
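The two constraints described above can be sketched as a forward pass. This is a hedged illustration rather than the authors' implementation: the function name `perturb`, the argument `max_ratio`, the use of the L2 norm, and the reading of the bound as a ratio between the noise norm and the concentration norm are all assumptions made for the example. Learning the matrix itself is not shown.

```python
import math


def relu(value):
    """Clamp negative weights to zero so the noise stays non-negative."""
    return value if value > 0.0 else 0.0


def perturb(alpha, weight, max_ratio):
    """Return alpha + noise, with noise = ReLU(W) @ alpha and a capped norm.

    alpha:     concentration parameters (all positive).
    weight:    K x K perturbation matrix (hypothetical, learned elsewhere).
    max_ratio: assumed upper bound on ||noise|| / ||alpha|| (L2 norm).
    """
    k = len(alpha)
    noise = [sum(relu(weight[i][j]) * alpha[j] for j in range(k))
             for i in range(k)]
    noise_norm = math.sqrt(sum(n * n for n in noise))
    alpha_norm = math.sqrt(sum(a * a for a in alpha))
    limit = max_ratio * alpha_norm
    if noise_norm > limit:
        # Re-scale oversized noise back onto the allowed norm.
        factor = limit / noise_norm
        noise = [n * factor for n in noise]
    return [a + n for a, n in zip(alpha, noise)]


alpha = [10.0, 1.0, 1.0]
identity_like = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
print(perturb(alpha, identity_like, max_ratio=0.1))
```

A zero-initialized matrix leaves the concentration unchanged, which matches the idea of starting from the unperturbed prior network and only moving away from it as far as the norm cap allows.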

5 Experiments

We follow the previous papers [7, 14] and replicate their experimental setups. For each sample fed into the neural network, we calculate the uncertainty measure based on the output concentration, which is used to predict which distribution the sample comes from. Finally, several different evaluation metrics are used to measure and compare how well different detection methods can separate the two distributions.

| Data Source | Dataset | Content (classes) | #Train | #Test |
|---|---|---|---|---|
| In-Distribution | CIFAR-10 [9] | 10 classes: Airplane, Truck, Bird, etc. | 60,000 | 10,000 |
| In-Distribution | CIFAR-100 [9] | 100 classes: Mammals, Fish, Flower, etc. | 60,000 | 10,000 |
| Out-of-Distribution | iSUN [24] | 908 classes: Airport, Abbey, etc. | - | 8,925 |
| Out-of-Distribution | LSUN [26] | 10 classes: Bedrooms, Churches, etc. | - | 10,000 |
| Out-of-Distribution | Tiny-ImgNet | 1000 classes: Plant, Natural object, Sports, etc. | - | 10,000 |
| Out-of-Distribution | SVHN [19] | 10 classes: The Street View House Numbers | - | 26,032 |

Table 1: Overview of in- and out-of-distribution datasets

5.1 Datasets and Implementation

All the datasets used are listed in Table 1 and are available on GitHub (https://github.com/ShiyuLiang/odin-pytorch). In order to make fair comparisons with previous out-of-distribution detection algorithms, we replicate the same setting as [7]. For both the CIFAR10 and CIFAR100 datasets, we pre-train VGG13 [23], ResNet18 [6], ResNet34 [6], WideResNet [27] (depth=28, widening factor=10) and ResNext [25] (depth=29, widening factor=8) with publicly available code (https://github.com/bearpaw/pytorch-classification), and then use the converged model as the black-box. We adopt the publicly available implementation on GitHub (https://github.com/1Konny/FGSM) to generate FGSM [11] examples. For the concentration perturbation matrix, we initialize all the weights to zero and then optimize it via the Adam optimizer [8] with lr=1e-3 and weight-decay=5e-4. We experimented with different setups of the hyper-parameter and chose a setting that yields generally promising results. Our method is implemented with PyTorch [20] on top of publicly available code; all the code and trained models will be released on GitHub (https://github.com/wenhuchen/).
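To make the role of FGSM examples concrete, the sketch below applies one fast-gradient-sign step to a toy linear softmax classifier. It is not the referenced implementation and does not use the networks from the experiments: the function `fgsm`, the bias-free linear model and the value of `epsilon` are hypothetical, chosen so the gradient can be written out by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fgsm(x, label, weights, epsilon):
    """One fast-gradient-sign step for a linear softmax classifier.

    weights[c][d] maps input feature d to the logit of class c (no bias).
    The input moves by epsilon per feature in the direction that
    increases the cross-entropy loss of the given label.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    probs = softmax(logits)
    # Gradient of cross-entropy w.r.t. logit c is p_c - 1[c == label].
    dlogits = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    # Chain rule through the linear layer gives the input gradient.
    grad = [sum(dlogits[c] * weights[c][d] for c in range(len(weights)))
            for d in range(len(x))]
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, signs)]


weights = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.0]
x_adv = fgsm(x, label=0, weights=weights, epsilon=0.1)
print(x_adv)
```

With image inputs one would normally also clip the result to the valid pixel range; that step is omitted here because the toy input has no such range.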

5.2 Experimental results

We measure the quality of out-of-distribution detection using the established metrics for this task [7].

  • FPR at 95% TPR (lower is better): Measures the false positive rate (FPR) when the true positive rate (TPR) is equal to 95%. Note that TNR = 1 - FPR.

  • Detection Error (lower is better): Measures the minimum possible misclassification probability over all detection thresholds. Note that Detection Accuracy = 1 - Detection Error.

  • AUROC (larger is better): Measures the Area Under the Receiver Operating Characteristic curve. The Receiver Operating Characteristic (ROC) curve plots the relationship between TPR and FPR.

  • AUPR (larger is better): Measures the Area Under the Precision-Recall (PR) curve, where AUPR-In refers to using in-distribution as positive class and AUPR-Out refers to using out-of-distribution as positive class.
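The threshold-based metrics listed above can be sketched in a few lines. The helper names and toy scores are hypothetical; in-distribution is treated as the positive class, a sample is called in-distribution when its confidence is at or above the threshold, and the detection error assumes equal priors for the two sources, a common convention that the text here does not spell out.

```python
def roc_points(ind_scores, ood_scores):
    """(FPR, TPR) pairs for every threshold, from strictest to loosest."""
    thresholds = sorted(set(ind_scores) | set(ood_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in ind_scores) / len(ind_scores)
        fpr = sum(s >= t for s in ood_scores) / len(ood_scores)
        points.append((fpr, tpr))
    return points


def fpr_at_tpr(points, target=0.95):
    """Smallest FPR among thresholds whose TPR reaches the target."""
    return min(fpr for fpr, tpr in points if tpr >= target)


def detection_error(points):
    """Minimum of 0.5 * (1 - TPR) + 0.5 * FPR (equal priors assumed)."""
    return min(0.5 * (1.0 - tpr) + 0.5 * fpr for fpr, tpr in points)


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        area += (f1 - f0) * (t0 + t1) / 2.0
    return area


ind = [0.9, 0.8, 0.7, 0.6]     # confidence on in-distribution samples
ood = [0.65, 0.3, 0.2, 0.1]    # confidence on out-of-distribution samples
pts = roc_points(ind, ood)
print(fpr_at_tpr(pts), detection_error(pts), auroc(pts))
```

In this toy case a single out-of-distribution score overlaps the in-distribution range, so reaching 95% TPR forces that sample to be accepted as in-distribution.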

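| IND/OOD | Model | Method | FPR@TPR95 | Detection Error | AUROC |
|---|---|---|---|---|---|
| CIFAR10/iSUN | VGG13 | Baseline | 43.8 | 11.4 | 94 |
| CIFAR10/iSUN | VGG13 | ODIN | 22.4 | 10.2 | 95.8 |
| CIFAR10/iSUN | VGG13 | Confidence | 16.3 | 8.5 | 97.5 |
| CIFAR10/iSUN | VGG13 | Semantic | 23.2 | 10.2 | 96.4 |
| CIFAR10/iSUN | VGG13 | Ours | 10.7 | 7.4 | 97.7 |
| CIFAR10/LSUN | VGG13 | Baseline | 41.9 | 11.5 | 94 |
| CIFAR10/LSUN | VGG13 | ODIN | 20.2 | 9.8 | 95.9 |
| CIFAR10/LSUN | VGG13 | Confidence | 16.4 | 8.3 | 97.5 |
| CIFAR10/LSUN | VGG13 | Semantic | 22.9 | 13.9 | 96.0 |
| CIFAR10/LSUN | VGG13 | Ours | 10.3 | 7.4 | 97.8 |
| CIFAR10/Tiny-ImgNet | VGG13 | Baseline | 43.8 | 12 | 93.5 |
| CIFAR10/Tiny-ImgNet | VGG13 | ODIN | 24.3 | 11.3 | 95.7 |
| CIFAR10/Tiny-ImgNet | VGG13 | Confidence | 18.4 | 9.4 | 97 |
| CIFAR10/Tiny-ImgNet | VGG13 | Semantic | 19.8 | 10.1 | 96.5 |
| CIFAR10/Tiny-ImgNet | VGG13 | Ours | 13.8 | 7.9 | 97.5 |

Table 2: Experimental results on the VGG13 [23] architecture, where Confidence refers to [3] and Semantic refers to [22]. Most results are copied from the original papers.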
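| IND/OOD | Model | Method | FPR@TPR95 | Detection Error | AUROC |
|---|---|---|---|---|---|
| CIFAR10/Tiny-ImgNet | ResNet18 | Baseline | 59.0 | 15.1 | 91.1 |
| CIFAR10/Tiny-ImgNet | ResNet18 | ODIN | 32.1 | 11.2 | 94.9 |
| CIFAR10/Tiny-ImgNet | ResNet18 | DPN | - | - | 93.0 |
| CIFAR10/Tiny-ImgNet | ResNet18 | Mahalanobis | 2.9 | 0.6 | 96.3 |
| CIFAR10/Tiny-ImgNet | ResNet18 | Ours | 17.1 | 8.7 | 96.8 |
| CIFAR10/LSUN | ResNet18 | Baseline | 50.2 | 12.3 | 93.1 |
| CIFAR10/LSUN | ResNet18 | ODIN | 17.9 | 8.4 | 96.9 |
| CIFAR10/LSUN | ResNet18 | DPN | - | - | 90.2 |
| CIFAR10/LSUN | ResNet18 | Mahalanobis | 1.2 | 0.3 | 97.5 |
| CIFAR10/LSUN | ResNet18 | Ours | 7.7 | 5.9 | 98.3 |
| CIFAR10/SVHN | ResNet18 | Baseline | 49.5 | 13.3 | 92.0 |
| CIFAR10/SVHN | ResNet18 | ODIN | 29.7 | 15.1 | 91.7 |
| CIFAR10/SVHN | ResNet18 | DPN | - | - | 95.9 |
| CIFAR10/SVHN | ResNet18 | Mahalanobis | 12.2 | 2.3 | 92.6 |
| CIFAR10/SVHN | ResNet18 | Ours | 28.7 | 13.6 | 93.2 |

Table 3: Experimental results on the ResNet18 [6] architecture, where Mahalanobis refers to [13] and DPN refers to [15]. Most results are copied from the original papers.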

CIFAR10 experiments

Here we first present our experimental results on CIFAR10, where neural networks pre-trained on the CIFAR10 dataset are used to detect whether unseen image inputs are from in-distribution. We experiment with two neural network architectures, VGG13 [23] (see Table 2) and ResNet18 [6] (see Table 3). In Table 2, we mainly compare against Baseline [7], ODIN [14], Confidence [3] and Semantic [22] under the VGG13 architecture. Our proposed method outperforms these competing algorithms across all metrics. In Table 3, we mainly compare against ODIN [14], DPN [15] and Mahalanobis [13] under the ResNet18 architecture. The Mahalanobis algorithm performs extremely well on the FPR (TPR=95%) and detection error metrics, but our method is superior in terms of the AUROC metric.

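| IND/OOD | Model | Method | FPR@TPR95 | Detection Error | AUROC |
|---|---|---|---|---|---|
| CIFAR-100/iSUN | ResNet34 | ODIN | 61.3 | 23.7 | 83.6 |
| CIFAR-100/iSUN | ResNet34 | Semantic | 58.4 | 21.4 | 85.2 |
| CIFAR-100/iSUN | ResNet34 | Mahalanobis | 18.7 | 11.6 | 94.1 |
| CIFAR-100/iSUN | ResNet34 | Ours | 19.8 | 12.2 | 94.2 |
| CIFAR-100/LSUN | ResNet34 | ODIN | 76.8 | 42.4 | 78.9 |
| CIFAR-100/LSUN | ResNet34 | Semantic | 79.5 | 42.2 | 79.0 |
| CIFAR-100/LSUN | ResNet34 | Mahalanobis | 14.9 | 9.0 | 95.4 |
| CIFAR-100/LSUN | ResNet34 | Ours | 14.5 | 9.6 | 95.9 |
| CIFAR-100/Tiny-ImgNet | ResNet34 | ODIN | 63.9 | 25.2 | 82.3 |
| CIFAR-100/Tiny-ImgNet | ResNet34 | Semantic | 62.4 | 24.4 | 83.1 |
| CIFAR-100/Tiny-ImgNet | ResNet34 | Mahalanobis | 12.0 | 9.1 | 96.5 |
| CIFAR-100/Tiny-ImgNet | ResNet34 | Ours | 16.3 | 10.3 | 95.3 |

Table 4: Experimental results on the ResNet34 architecture on the CIFAR100 dataset, where Semantic refers to [22] and Mahalanobis refers to [13].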
| Model | OOD | Method | FPR@TPR95 | Detection Error | AUROC | AUPR In | AUPR Out |
|---|---|---|---|---|---|---|---|
| WideResNet (CIFAR-100) | iSUN | Base/ODIN/Ours | 82.7/57.3/18.0 | 43.9/31.1/11.1 | 72.8/86.6/95.5 | 74.2/85.9/95.5 | 69.2/84.9/95.6 |
| WideResNet (CIFAR-100) | LSUN | Base/ODIN/Ours | 82.2/56.5/13.3 | 43.6/30.8/8.8 | 73.9/86.0/97.1 | 75.7/86.2/97.1 | 69.2/84.9/97.2 |
| WideResNet (CIFAR-100) | TinyImgNet | Base/ODIN/Ours | 79.2/55.9/16.8 | 42.1/30.4/10.2 | 72.2/84.0/96.2 | 70.4/82.8/95.7 | 70.8/84.4/96.4 |
| ResNeXt-29 (CIFAR-100) | iSUN | Base/ODIN/Ours | 82.2/61.6/18.4 | 31.0/21.4/11.2 | 74.5/86.4/94.9 | 79.8/89.1/95.3 | 67.7/82.7/94.0 |
| ResNeXt-29 (CIFAR-100) | LSUN | Base/ODIN/Ours | 82.2/62.4/13.6 | 31.8/22.1/8.7 | 73.6/85.9/96.5 | 77.4/87.8/96.8 | 69.5/83.9/95.8 |
| ResNeXt-29 (CIFAR-100) | TinyImgNet | Base/ODIN/Ours | 79.6/60.2/16.9 | 31.0/21.5/9.9 | 75.1/86.5/96.5 | 78.4/88.2/96.8 | 71.6/84.8/95.8 |

Table 5: More experimental results on the CIFAR100 dataset with the WideResNet and ResNeXt architectures.
Figure 6: Ablation study of the impact of concentration perturbation algorithm on CIFAR10/CIFAR100 dataset.

CIFAR100 experiments

Here we experiment with the larger-scale CIFAR100 dataset to further investigate the effectiveness of our proposed algorithm. In Table 4, we mainly compare against ODIN [14], Semantic [22] and Mahalanobis [13] under the ResNet34 architecture. We observe trends very similar to Table 3: both our method and Mahalanobis significantly outperform the other competing algorithms. Although Mahalanobis achieves strong FPR and detection error scores, it lags behind our method in AUROC on iSUN and LSUN. We also provide more experimental results in Table 5, where we observe consistently promising empirical results across different model architectures. Our proposed methodology is simple and easy to implement, yet effective in defending against out-of-distribution examples.

The advantage of our method over DPN [15] lies in its full exploitation of the pre-trained neural network, which makes our model much better suited for large-scale real-world applications. Unlike Mahalanobis [13], which needs to handcraft a large number of low-level ensemble features, our method only needs the last-layer output from the black-box model, which saves feature engineering effort and achieves almost equivalent performance on different datasets. We visualize the perturbation ratio distribution in Figure 7 and observe that the ratio is mainly concentrated on very small values, which confirms our intuition of designing a mild perturbation that does not violate the model output landscape.

Figure 7: Distribution of perturbation norm ratio on the ResNet [6] and WideResNet [27] architectures.

5.3 Ablation Study

In this section, we are particularly interested in understanding the impact of our concentration perturbation algorithm on the final OOD detection metrics. We first visualize our results for CIFAR10 and CIFAR100 in Figure 6. From these two diagrams, we observe a significant increase across different metrics and network architectures, especially on TNR (TPR=95%) and Detection Accuracy. The other trend we observe is that our perturbation seems to yield less improvement on CIFAR10 than on CIFAR100, which reflects our assumption that the quality of the Dirichlet uncertainty is highly related to classification accuracy: when the pre-trained model has very weak classification capability, the uncertainty measure is highly inaccurate and very prone to misclassified examples. Another interesting observation is that the uncertainty measure is sensitive to the model architecture, especially the scale of the output layer. With the VGG13 [23] architecture (see Figure 6), the uncertainty measure is able to yield very promising OOD detection accuracy without concentration perturbation. ResNet18 [6], in contrast, achieves better classification accuracy on CIFAR10, yet its out-of-distribution detection accuracy is much lower than that of VGG13.

Figure 8: Visualization of trained perturbation matrix under different neural networks in CIFAR100 experiments.
Figure 9: Visualization of concentration parameters and perturbed concentration parameters.

5.4 Impact of Concentration Perturbation

In this section, we study the linear perturbation matrix to understand its essence. First of all, we visualize the matrix in Figure 8. As can be seen, the diagonal elements dominate the non-diagonal elements in magnitude due to our norm control on the perturbation noise. We then visualize the perturbed concentration in Figure 9, from which we can see that the concentration before perturbation is rather sharp over some dimensions (classes), which reflects the known over-confidence issue in pre-trained neural networks. After adding the perturbation noise, the whole spectrum becomes much noisier than before but the highlighted specks remain unchanged. The insight behind such perturbation noise is to bring the model’s unreasonably high confidence down to a more rational range by slightly increasing the uncertainty in the model’s prediction. Next, we compare the confidence shift caused by concentration perturbation and fast-sign perturbation [14] in Figure 10, from which we find that our perturbation noise remarkably increases the confidence on in-distribution images while reducing the confidence on out-of-distribution examples, which greatly helps separate in-distribution from out-of-distribution examples. In comparison, the fast-sign perturbation increases the confidence measure for both in- and out-of-distribution images equally, and thus fails to help separate these two image sources.

Figure 10: The confidence distribution shift after applying concentration perturbation and fast-sign perturbation for IND and OOD images under ResNet [6] architecture.

We also visualize the discriminative training process in Figure 11. We observe that the training loss approximated by synthesized out-domain data is well aligned with the detection metrics computed on real out-domain data. We conclude that the synthesized adversarial examples [11] have good generalization ability, which lays the foundation for our methodology.

Figure 11: The discriminative objective curve and FPR curve on OOD detection (CIFAR100/iSUN).

6 Conclusion

In this paper, we aim at designing a simple yet effective out-of-distribution detection algorithm to increase the robustness of existing neural networks. Although our method requires the least knowledge and introduces only minor complexity during training, it yields strong performance, especially on the larger-scale dataset. However, our method is sensitive to different neural architectures, which could sometimes lead to inferior performance. In future work, we plan to study the architectural differences further and investigate their causes. Besides, the generalization ability of adversarial examples is the cornerstone of our method, so it would be interesting to see how adversarial examples generated by different algorithms influence the detection accuracy and why these differences arise.