Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality

by Xingjun Ma, et al.

Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial examples, which are carefully crafted instances that can mislead DNNs into making errors during prediction. To better understand such attacks, a characterization is needed of the properties of the regions (the so-called 'adversarial subspaces') in which adversarial examples lie. In particular, effective measures are required to discriminate adversarial examples from normal examples in such regions. We tackle this challenge by characterizing the dimensional properties of adversarial regions, via the use of Local Intrinsic Dimensionality (LID). LID assesses the space-filling capability of the region surrounding a reference example, based on the distance distribution of the example to its neighbors. We first explain how adversarial perturbation can affect the LID characteristic of adversarial regions, and then show empirically that LID characteristics can facilitate the detection of adversarial examples generated using state-of-the-art attacks. We show that when applied to adversarial detection, an LID-based method can outperform several state-of-the-art detection measures by large margins for five attack strategies across three benchmark datasets. Our analysis of the LID characteristic for adversarial regions not only motivates new directions for effective adversarial defense, but also opens up more challenges for developing new attacks to better understand the vulnerabilities of DNNs.
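The abstract does not say how LID is estimated from the neighbor distances. As a minimal sketch, one commonly used choice is a maximum-likelihood estimator over the distances to the k nearest neighbors of the reference example. The function name `lid_mle`, the default `k`, and the handling of zero or tied distances are assumptions made here for illustration and are not taken from the text above.

```python
import math


def lid_mle(reference, neighbors, k=20):
    """Estimate Local Intrinsic Dimensionality at `reference`.

    Hypothetical helper: a maximum-likelihood estimate computed from the
    distances between `reference` and its k nearest points in `neighbors`.
    """
    # Euclidean distance to every candidate; zero distances (exact
    # duplicates of the reference) are skipped because log(0) is undefined.
    dists = sorted(
        d for d in (math.dist(reference, p) for p in neighbors) if d > 0.0
    )[:k]
    if len(dists) < 2:
        raise ValueError("need at least two neighbors at nonzero distance")

    r_max = dists[-1]  # distance to the farthest of the k neighbors
    mean_log_ratio = sum(math.log(d / r_max) for d in dists) / len(dists)

    # All neighbors equidistant: the estimate diverges.
    if mean_log_ratio == 0.0:
        return math.inf
    return -1.0 / mean_log_ratio


# Neighbors at distances 1, 2 and 4 from the reference on a line.
estimate = lid_mle([0.0], [[1.0], [2.0], [4.0]], k=3)
print(round(estimate, 4))  # 1.4427, i.e. 1 / ln(2)
```

In a detection setting, a score like this would be computed per example against a batch of other examples and passed to a classifier as a feature. That pipeline is outside what the abstract specifies, so it is not sketched here.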



On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples

Understanding and characterizing the subspaces of adversarial examples a...

Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation

Deep Neural Networks (DNNs) have been widely applied in various recognit...

When Explainability Meets Adversarial Learning: Detecting Adversarial Examples using SHAP Signatures

State-of-the-art deep neural networks (DNNs) are highly effective in sol...

Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain

Deep neural networks (DNNs) have been shown to be vulnerable against adv...

Spatially transformed adversarial examples

Recent studies show that widely used deep neural networks (DNNs) are vul...

Learning To Characterize Adversarial Subspaces

Deep Neural Networks (DNNs) are known to be vulnerable to the maliciousl...

Note: An alternative proof of the vulnerability of k-NN classifiers in high intrinsic dimensionality regions

This document proposes an alternative proof of the result contained in a...
