Detecting Anomalous Inputs to DNN Classifiers By Joint Statistical Testing at the Layers

by Jayaram Raghuram, et al.

Detecting anomalous inputs, such as adversarial and out-of-distribution (OOD) inputs, is critical for classifiers deployed in real-world applications, especially deep neural network (DNN) classifiers, which are known to be brittle on such inputs. We propose an unsupervised statistical testing framework for detecting such anomalous inputs to a trained DNN classifier based on its internal layer representations. By calculating test statistics at the input and intermediate-layer representations of the DNN, conditioned individually on the predicted class and on the true class of labeled training data, the method characterizes their class-conditional distributions on natural inputs. Given a test input, its extent of non-conformity with respect to the training distribution is captured using p-values of the class-conditional test statistics across the layers, which are then combined using a scoring function designed to score high on anomalous inputs. We focus on adversarial inputs, which are an important class of anomalous inputs, and also demonstrate the effectiveness of our method on general OOD inputs. The proposed framework also provides an alternative class prediction that can be used to correct the DNN's prediction on (detected) adversarial inputs. Experiments on well-known image classification datasets with strong adversarial attacks, including a custom attack method that uses the internal layer representations of the DNN, demonstrate that our method outperforms or performs comparably with five state-of-the-art detection methods.
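The layer-wise scoring idea described above can be illustrated with a minimal sketch. The abstract does not specify the test statistic or the scoring function, so everything concrete here is an assumption made for illustration: the per-layer statistics are taken as given scalars, p-values are estimated empirically from statistics computed on natural inputs of the same class, and they are combined with Fisher's method. The names `empirical_p_value`, `fisher_score`, and `anomaly_score` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def empirical_p_value(null_stats, t):
    """Right-tailed empirical p-value of statistic t, estimated from
    statistics computed on natural inputs (add-one smoothing avoids p = 0)."""
    null_stats = np.asarray(null_stats, dtype=float)
    return (1.0 + np.sum(null_stats >= t)) / (1.0 + null_stats.size)

def fisher_score(p_values):
    """Fisher's method (an assumed choice of scoring function):
    -2 * sum(log p). Small p-values give a large score."""
    p = np.clip(np.asarray(p_values, dtype=float), 1e-12, 1.0)
    return -2.0 * np.sum(np.log(p))

def anomaly_score(layer_stats, null_by_layer_class, predicted_class):
    """Combine per-layer p-values, each conditioned on the predicted class.

    layer_stats: one test statistic per layer for the test input.
    null_by_layer_class: list (per layer) of dicts mapping a class label
        to the statistics observed on natural inputs of that class.
    """
    p_values = [
        empirical_p_value(null_by_layer_class[layer][predicted_class], t)
        for layer, t in enumerate(layer_stats)
    ]
    return fisher_score(p_values)

# Synthetic stand-in for statistics collected on natural training inputs:
# 3 layers, 2 classes, 500 natural samples per (layer, class).
rng = np.random.default_rng(0)
null = [{c: rng.normal(size=500) for c in (0, 1)} for _ in range(3)]

natural_input = [0.0, 0.1, -0.2]    # statistics typical of natural inputs
anomalous_input = [3.0, 3.5, 4.0]   # statistics far in the right tail

score_natural = anomaly_score(natural_input, null, predicted_class=0)
score_anomalous = anomaly_score(anomalous_input, null, predicted_class=0)
print(score_natural, score_anomalous)
```

In this sketch an input would be flagged as anomalous when its score exceeds a threshold chosen on held-out natural inputs; the threshold choice, like the rest of the sketch, is an assumption rather than a detail stated in the abstract.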



Anomaly Detection of Test-Time Evasion Attacks using Class-conditional Generative Adversarial Networks

Deep Neural Networks (DNNs) have been shown vulnerable to adversarial (T...

DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows

Despite much recent work, detecting out-of-distribution (OOD) inputs and...

MOOD: Multi-level Out-of-distribution Detection

Out-of-distribution (OOD) detection is essential to prevent anomalous in...

Testing Deep Neural Network based Image Classifiers

Image classification is an important task in today's world with many app...

Towards Explanation of DNN-based Prediction with Guided Feature Inversion

While deep neural networks (DNN) have become an effective computational ...

p-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations

The lack of well-calibrated confidence estimates makes neural networks i...

Code Repositories


Code and experiments for the adversarial detection paper
