Detecting Adversarial Samples from Artifacts

03/01/2017
by   Reuben Feinman, et al.

Deep neural networks (DNNs) are powerful nonlinear architectures that are known to be robust to random perturbations of the input. However, these models are vulnerable to adversarial perturbations: small input changes crafted explicitly to fool the model. In this paper, we ask whether a DNN can distinguish adversarial samples from their normal and noisy counterparts. We investigate model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model. The result is a method for implicit adversarial detection that is oblivious to the attack algorithm. We evaluate this method on a variety of standard datasets including MNIST and CIFAR-10 and show that it generalizes well across different architectures and attacks. We report that an ROC-AUC of 85-93% can be achieved on a number of standard classification tasks with a negative class that consists of both normal and noisy samples.
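The two signals described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the toy linear "network", the dropout rate, the Gaussian kernel bandwidth, and the shift-based "adversarial" point are all illustrative assumptions. Monte Carlo dropout keeps dropout active at test time and treats the variance of repeated stochastic passes as a Bayesian uncertainty estimate; kernel density estimation scores how close a sample's deep features lie to the training data of its predicted class.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_uncertainty(x, weights, n_samples=50, p_drop=0.5):
    """Bayesian uncertainty via Monte Carlo dropout: run many stochastic
    forward passes with dropout active and return the mean predictive
    variance. `weights` is a toy linear layer standing in for a trained DNN."""
    outs = []
    for _ in range(n_samples):
        mask = rng.random(weights.shape[0]) > p_drop   # random dropout mask
        outs.append(x @ (weights * mask[:, None]) / (1 - p_drop))
    outs = np.stack(outs)            # shape: (n_samples, n_classes)
    return outs.var(axis=0).mean()   # higher variance = less confident

def kde_log_density(feat, train_feats, bandwidth=1.0):
    """Gaussian kernel density estimate of a feature vector under the
    training-set features (of the predicted class, in the full method)."""
    d2 = ((train_feats - feat) ** 2).sum(axis=1)
    return np.log(np.mean(np.exp(-d2 / (2 * bandwidth**2))) + 1e-12)

# Toy demonstration: "normal" features cluster near the class manifold;
# the "adversarial" point is shifted off-manifold.
train_feats = rng.normal(0.0, 1.0, size=(200, 8))
normal_x = rng.normal(0.0, 1.0, size=8)
adv_x = normal_x + 4.0               # pushed away from the data manifold
W = rng.normal(size=(8, 3))          # toy 8-feature, 3-class linear layer

for name, x in [("normal", normal_x), ("adversarial", adv_x)]:
    u = mc_dropout_uncertainty(x, W)
    d = kde_log_density(x, train_feats)
    print(f"{name}: uncertainty={u:.3f}  log-density={d:.3f}")
```

In the paper's setup these two scores are fed to a simple logistic-regression detector; the sketch above only shows why they separate the classes, since off-manifold points receive a much lower kernel density than normal ones.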

Related research

12/11/2020  Closeness and Uncertainty Aware Adversarial Examples Detection in Adversarial Machine Learning
Deep neural network (DNN) architectures are considered to be robust to r...

02/08/2021  Exploiting epistemic uncertainty of the deep learning models to generate adversarial samples
Deep neural network architectures are considered to be robust to random ...

10/24/2017  One pixel attack for fooling deep neural networks
Recent research has revealed that the output of Deep Neural Networks (DN...

05/01/2022  DDDM: a Brain-Inspired Framework for Robust Classification
Despite their outstanding performance in a broad spectrum of real-world ...

01/22/2019  Sensitivity Analysis of Deep Neural Networks
Deep neural networks (DNNs) have achieved superior performance in variou...

12/14/2018  Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing
Deep neural networks (DNN) have been shown to be useful in a wide range ...

05/05/2017  Detecting Adversarial Samples Using Density Ratio Estimates
Machine learning models, especially based on deep architectures are used...
