On the (Statistical) Detection of Adversarial Examples

02/21/2017
by Kathrin Grosse, et al.

Machine learning (ML) models are applied to a variety of tasks, such as network intrusion detection or malware classification. Yet these models are vulnerable to a class of malicious inputs known as adversarial examples: slightly perturbed inputs that the ML model classifies incorrectly. Mitigating these adversarial inputs remains an open problem. As a step towards understanding adversarial examples, we show that they are not drawn from the same distribution as the original data and can therefore be detected using statistical tests. Using this knowledge, we introduce a complementary approach for identifying specific inputs that are adversarial. Specifically, we augment our ML model with an additional output class, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast gradient sign and saliency map methods) with several datasets. The statistical test confidently flags sample sets containing adversarial inputs at sample sizes between 10 and 100 data points. Furthermore, our augmented model either detects adversarial examples as outliers with high accuracy (> 80%) or increases the adversary's cost - the perturbation added - by more than 150%. In this way, we show that the statistical properties of adversarial examples are essential to their detection.
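The batch-level detection idea above can be illustrated with a minimal sketch. The paper relies on a kernel-based two-sample test (maximum mean discrepancy); as a simpler stand-in, the sketch below uses SciPy's Kolmogorov-Smirnov two-sample test on flattened inputs, with synthetic Gaussian "images" and an FGSM-style sign perturbation standing in for real data and a real attack. The function name, threshold, and data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy import stats


def flag_adversarial_batch(clean_ref, candidate, alpha=0.01):
    """Two-sample test: is `candidate` drawn from the same distribution
    as `clean_ref`?  Flattens both sets and applies a Kolmogorov-Smirnov
    test (a simple stand-in for the paper's kernel MMD test)."""
    _, p_value = stats.ks_2samp(clean_ref.ravel(), candidate.ravel())
    return p_value < alpha  # True -> batch flagged as adversarial


rng = np.random.default_rng(0)
# Synthetic stand-in for clean data: 100 flattened 28x28 "images".
clean = rng.normal(0.0, 1.0, size=(100, 28 * 28))
# FGSM-style perturbation: a fixed-magnitude step in a sign direction,
# which shifts the input distribution away from the clean one.
adversarial = clean + 0.5 * np.sign(rng.normal(size=clean.shape))

print(flag_adversarial_batch(clean, clean[:50]))        # clean subset
print(flag_adversarial_batch(clean, adversarial[:50]))  # perturbed batch
```

Note that this test operates on sets of inputs, not individual ones, which is why the abstract reports detection at sample sizes of 10 to 100 points; per-input detection is what the additional outlier output class is for.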

