Adversarial Detection without Model Information

02/09/2022
by Abhishek Moitra, et al.

Most prior state-of-the-art adversarial detection works assume that the underlying vulnerable model is accessible, i.e., that the model can be trained or its outputs observed. However, this assumption is often impractical due to factors such as model encryption and the risk of model information leakage. In this work, we propose a model-independent adversarial detection method that uses a simple energy function to distinguish between adversarial and natural inputs. We train a standalone detector, independent of the underlying model, with sequential layer-wise training that increases the energy separation between natural and adversarial inputs; detection is then performed on the resulting energy distributions. Our method achieves state-of-the-art detection performance (ROC-AUC > 0.9) across a wide range of gradient-, score-, and decision-based adversarial attacks on the CIFAR10, CIFAR100, and TinyImageNet datasets. Compared to prior approaches, our method requires 10-100x fewer operations and parameters for adversarial detection. Further, we show that our detection method transfers across datasets and adversarial attacks. For reproducibility, we provide code in the supplementary material.
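The abstract gives only the high-level recipe (a standalone detector plus a simple energy function thresholded on its distribution), so the following is a minimal PyTorch sketch of the detection step, not the paper's implementation. The detector architecture, the logsumexp-style energy, and the thresholding rule are all assumptions chosen to illustrate energy-distribution-based detection.

```python
import torch
import torch.nn as nn

# Hypothetical standalone detector, trained independently of the
# (possibly inaccessible) victim model. Architecture, sizes, and names
# are illustrative assumptions, not taken from the paper.
class EnergyDetector(nn.Module):
    def __init__(self, num_outputs: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

def energy(logits: torch.Tensor) -> torch.Tensor:
    # One common "simple energy function": the negative log-sum-exp of
    # the logits. The sign convention (natural inputs = low energy) is an
    # assumption; the paper defines its own energy function.
    return -torch.logsumexp(logits, dim=1)

@torch.no_grad()
def detect(detector: nn.Module, x: torch.Tensor, threshold: float) -> torch.Tensor:
    # Flag inputs whose energy falls on the "adversarial" side of a
    # threshold calibrated on held-out natural inputs.
    return energy(detector(x)) > threshold
```

In such a setup, the threshold could be calibrated, for instance, as a high percentile of the energies computed on a clean validation set. The sequential layer-wise training that widens the gap between the natural and adversarial energy distributions is described in the paper itself and is not reproduced here.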
