RAID: Randomized Adversarial-Input Detection for Neural Networks

02/07/2020
by   Hasan Ferit Eniser, et al.

In recent years, neural networks have become the default choice for image classification and many other learning tasks, even though they are vulnerable to so-called adversarial attacks. To increase their robustness against these attacks, numerous detection mechanisms have emerged that aim to automatically determine whether an input is adversarial. However, state-of-the-art detection mechanisms either must be tuned for each type of attack or fail to generalize across different attack types. To alleviate these issues, we propose a novel technique for adversarial-image detection, RAID, which trains a secondary classifier to identify differences in neuron activation values between benign and adversarial inputs. Our technique is both more reliable and more effective than the state of the art when evaluated against six popular attacks. Moreover, a straightforward extension of RAID increases its robustness against detection-aware adversaries without affecting its effectiveness.
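The core idea described above, training a secondary binary classifier on neuron activation values to separate benign from adversarial inputs, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the activation vectors are simulated with a synthetic distribution shift, and the detector is a plain logistic regression trained by gradient descent, whereas RAID records real activations from the primary network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in activation vectors from a hidden layer of the primary network.
# In RAID these would be recorded while feeding benign and adversarial
# images through the trained classifier; here we simulate a shift between
# the two activation distributions.
benign_acts = rng.normal(loc=0.0, scale=1.0, size=(200, 32))
adv_acts = rng.normal(loc=0.6, scale=1.2, size=(200, 32))

X = np.vstack([benign_acts, adv_acts])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 1 = adversarial

# Train a simple logistic-regression detector on the activation vectors.
w = np.zeros(X.shape[1])
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
    w -= lr * (X.T @ (p - y)) / len(y)      # gradient of log-loss w.r.t. w
    b -= lr * np.mean(p - y)                # gradient of log-loss w.r.t. b

# Flag an input as adversarial when the detector's probability exceeds 0.5.
preds = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
accuracy = np.mean(preds == y)
print(f"detector training accuracy: {accuracy:.2f}")
```

In practice, the detector would be evaluated on held-out inputs and on attacks it was not trained against; the paper's contribution is precisely that activation-based features generalize across attack types.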

Related research:

- 01/31/2022 · Adversarial Robustness in Deep Learning: Attacks on Fragile Neurons · We identify fragile and robust neurons of deep learning architectures us...
- 05/28/2022 · Contributor-Aware Defenses Against Adversarial Backdoor Attacks · Deep neural networks for image classification are well-known to be vulne...
- 10/27/2018 · Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples · Adversarial sample attacks perturb benign inputs to induce DNN misbehavi...
- 11/22/2019 · Attack Agnostic Statistical Method for Adversarial Detection · Deep Learning based AI systems have shown great promise in various domai...
- 09/23/2020 · Detection of Iterative Adversarial Attacks via Counter Attack · Deep neural networks (DNNs) have proven to be powerful tools for process...
- 05/30/2021 · DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows · Despite much recent work, detecting out-of-distribution (OOD) inputs and...
- 11/28/2018 · A randomized gradient-free attack on ReLU networks · It has recently been shown that neural networks but also other classifie...
