Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

12/14/2018
by Jingyi Wang, et al.

Deep neural networks (DNNs) have been shown to be useful in a wide range of applications. However, they are also known to be vulnerable to adversarial samples. By adding carefully crafted perturbations that humans cannot perceive to a normal sample, an attacker can cause even a highly accurate DNN to make wrong decisions. Multiple defense mechanisms have been proposed that aim to hinder the generation of such adversarial samples. However, recent work shows that most of them are ineffective. In this work, we propose an alternative approach: detecting adversarial samples at runtime. Our main observation is that adversarial samples are much more sensitive than normal samples when random mutations are imposed on the DNN. We thus first propose a measure of "sensitivity" and show empirically that normal samples and adversarial samples have distinguishable sensitivity. We then integrate statistical model checking and mutation testing to check, at runtime, whether an input sample is likely to be normal or adversarial by measuring its sensitivity. We evaluated our approach on the MNIST and CIFAR10 datasets. The results show that our approach detects adversarial samples generated by state-of-the-art attack methods efficiently and accurately.
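
To make the idea of sensitivity under model mutation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the sensitivity measure or the mutation operators, so this sketch assumes (1) a toy linear classifier in place of a DNN, (2) mutation by adding Gaussian noise to the weights, and (3) sensitivity measured as the fraction of mutated models whose predicted label differs from the original model's. The names `predict`, `label_change_rate`, `n_mutants`, and `sigma` are hypothetical.

```python
import random


def predict(weights, x):
    """Toy linear classifier: return the index of the largest logit."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return max(range(len(logits)), key=logits.__getitem__)


def label_change_rate(weights, x, n_mutants=200, sigma=0.05, seed=0):
    """Fraction of randomly mutated models that change the label of x.

    Assumption: a "mutation" here is independent Gaussian noise added to
    every weight. The abstract does not specify the mutation operators.
    """
    rng = random.Random(seed)
    original = predict(weights, x)
    changed = 0
    for _ in range(n_mutants):
        # Build one mutant by perturbing every weight of the original model.
        mutant = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
        if predict(mutant, x) != original:
            changed += 1
    return changed / n_mutants


# Two-class toy model whose decision boundary is the line x[0] == x[1].
weights = [[1.0, 0.0], [0.0, 1.0]]
far_from_boundary = [1.0, 0.0]    # logits 1.0 vs 0.0
near_boundary = [0.51, 0.49]      # logits 0.51 vs 0.49

lcr_far = label_change_rate(weights, far_from_boundary)
lcr_near = label_change_rate(weights, near_boundary)
print(f"far from boundary:  {lcr_far:.3f}")
print(f"near the boundary:  {lcr_near:.3f}")
```

In this toy setting, the input close to the decision boundary changes label under far more mutants than the input far from it, which is the kind of gap the abstract describes between adversarial and normal samples. The fixed number of mutants is a simplification: the abstract states that the decision is made with statistical model checking, which this sketch does not implement.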

research
05/14/2018

Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing

Recently, it has been shown that deep neural networks (DNN) are subject ...
research
11/18/2018

The Taboo Trap: Behavioural Detection of Adversarial Samples

Deep Neural Networks (DNNs) have become a powerful tool for a wide range...
research
07/09/2021

GGT: Graph-Guided Testing for Adversarial Sample Detection of Deep Neural Network

Deep Neural Networks (DNN) are known to be vulnerable to adversarial sam...
research
08/23/2020

Ptolemy: Architecture Support for Robust Deep Learning

Deep learning is vulnerable to adversarial attacks, where carefully-craf...
research
03/01/2017

Detecting Adversarial Samples from Artifacts

Deep neural networks (DNNs) are powerful nonlinear architectures that ar...
research
01/25/2023

BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing

Deep neural networks (DNNs) and natural language processing (NLP) system...
research
11/14/2015

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

Deep learning algorithms have been shown to perform extremely well on ma...
