A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models

06/26/2020
by Kaidi Jin, et al.

Deep Neural Networks are well known to be vulnerable to adversarial attacks and backdoor attacks, in which minor modifications to the input can mislead a model into producing wrong results. Although defenses against adversarial attacks have been widely studied, research on mitigating backdoor attacks is still at an early stage. It is unknown whether there are any connections or common characteristics between the defenses against these two attacks. In this paper, we present a unified framework for detecting malicious examples and protecting the inference results of Deep Learning models. The framework is based on our observation that both adversarial examples and backdoor examples exhibit anomalies during the inference process that make them highly distinguishable from benign samples. As a result, we repurpose and revise four existing adversarial defense methods to detect backdoor examples. Extensive evaluations indicate that these approaches provide reliable protection against backdoor attacks, with higher accuracy than when detecting adversarial examples. These solutions also reveal the relations among adversarial examples, backdoor examples, and normal samples in terms of model sensitivity, activation space, and feature space, which deepens our understanding of the inherent features of the two attacks and of the corresponding defense opportunities.
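
The abstract does not name the four repurposed defenses, but one common way to operationalize "anomalies during the inference process" is to score an input by how far its hidden-layer activations fall from the benign activation distribution. The sketch below is purely illustrative and is not the paper's method: it uses synthetic activations, a class-conditional Mahalanobis score with a shared covariance, and a threshold calibrated on benign data; all shapes, names, and the 99th-percentile cutoff are assumptions.

```python
# Illustrative sketch (not the paper's exact method): flag suspicious inputs
# by measuring how far their hidden-layer activations deviate from the benign
# activation distribution, via a class-conditional Mahalanobis score.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for penultimate-layer activations of a trained DNN, collected on a
# clean validation set (n samples per class, d activation units).
n_per_class, d, num_classes = 500, 32, 3
benign_acts = [rng.normal(loc=c * 2.0, scale=1.0, size=(n_per_class, d))
               for c in range(num_classes)]

# Fit one Gaussian per class in activation space (class means, shared covariance).
means = [a.mean(axis=0) for a in benign_acts]
centered = np.vstack([a - m for a, m in zip(benign_acts, means)])
cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(d)
cov_inv = np.linalg.inv(cov)

def mahalanobis_score(x):
    """Squared Mahalanobis distance of activation x to the closest class centroid."""
    return min(float((x - m) @ cov_inv @ (x - m)) for m in means)

# Calibrate a threshold on benign data (here, the 99th percentile of scores),
# then flag any inference-time activation exceeding it as potentially malicious.
benign_scores = [mahalanobis_score(x) for a in benign_acts for x in a]
threshold = np.percentile(benign_scores, 99)

# A benign-looking activation vs. an off-manifold one (a stand-in for an
# adversarial or backdoor example whose activations are anomalous).
clean_x = rng.normal(loc=0.0, scale=1.0, size=d)
malicious_x = rng.normal(loc=10.0, scale=1.0, size=d)
print("clean flagged:    ", mahalanobis_score(clean_x) > threshold)
print("malicious flagged:", mahalanobis_score(malicious_x) > threshold)
```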

