Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples

07/01/2021
by Nelson Manohar-Alers, et al.

We present DeClaW, a system for detecting, classifying, and warning of adversarial inputs presented to a classification neural network. In contrast to current state-of-the-art methods that, given an input, only detect whether it is clean or adversarial, we also aim to identify the type of adversarial attack (e.g., PGD, Carlini-Wagner, or clean). To achieve this, we extract statistical profiles, which we term anomaly feature vectors (AFVs), from a set of latent features. Preliminary findings suggest that AFVs can help distinguish among several types of adversarial attacks (e.g., PGD versus Carlini-Wagner) with close to 93% accuracy on the CIFAR-10 dataset. The results open the door to using AFV-based methods for exploring not only adversarial attack detection but also classification of the attack type and, from there, the design of attack-specific mitigation strategies.
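To make the idea of a statistical profile over latent features concrete, here is a minimal sketch in Python. It is not DeClaW's actual feature set, which the abstract does not specify. The function name `anomaly_feature_vector`, the four statistics chosen, the 3-sigma threshold, and the synthetic Gaussian "latents" are all assumptions made for illustration.

```python
import random
import statistics

def anomaly_feature_vector(latent, ref_mean, ref_std, eps=1e-8):
    """Summarize one input's latent features as a short statistical profile.

    Hypothetical illustration: each latent value is standardized against
    per-dimension statistics estimated from clean inputs, and the resulting
    z-scores are reduced to a few summary statistics.
    """
    z = [(x - m) / (s + eps) for x, m, s in zip(latent, ref_mean, ref_std)]
    abs_z = [abs(v) for v in z]
    return [
        statistics.fmean(z),                   # average deviation from clean behavior
        statistics.pstdev(z),                  # spread of the deviations
        max(abs_z),                            # most extreme single deviation
        sum(v > 3.0 for v in abs_z) / len(z),  # fraction of dimensions beyond 3 sigma (assumed threshold)
    ]

# Synthetic stand-in for latent features of clean inputs (assumption: 64 dimensions).
random.seed(0)
dim = 64
clean_latents = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(500)]
columns = list(zip(*clean_latents))
ref_mean = [statistics.fmean(c) for c in columns]
ref_std = [statistics.pstdev(c) for c in columns]

# One clean-looking input and one whose latent features are shifted.
clean_afv = anomaly_feature_vector(
    [random.gauss(0.0, 1.0) for _ in range(dim)], ref_mean, ref_std)
shifted_afv = anomaly_feature_vector(
    [random.gauss(2.0, 1.0) for _ in range(dim)], ref_mean, ref_std)
```

In a pipeline of the kind the abstract describes, vectors like these would be passed to a second classifier that predicts a label such as clean, PGD, or Carlini-Wagner. That classifier is omitted here because the abstract gives no detail about it.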


