The Manifold Assumption and Defenses Against Adversarial Perturbations

11/21/2017
by Xi Wu, et al.

In the adversarial-perturbation problem of neural networks, an adversary starts with a neural network model F and a point x that F classifies correctly, and identifies another point x', near x, that F classifies incorrectly. In this paper we consider a defense method that is based on the semantics of F. Our starting point is the common manifold assumption, which states that natural data points lie on separate low-dimensional manifolds for different classes. We then make a further postulate: a good model F is confident on natural points on the manifolds but has low confidence on points outside of them, where a natural measure of confidence is ‖F(x)‖_∞ (i.e., how confident F is about its prediction). Under this postulate, an adversarial example is a point that lies outside the low-dimensional manifolds F has learned, yet is still close to at least one manifold under some distance metric. Defending against adversarial perturbations therefore becomes embedding an adversarial point back onto the nearest manifold from which natural points are drawn. We propose algorithms that formalize this intuition and perform a preliminary evaluation. Noting that the effectiveness of our method depends both on how well F satisfies the postulate and on how effectively we can carry out the embedding, we use a model recently trained by Madry et al. as the base model, and use gradient-based optimization, such as the Carlini-Wagner attack (now used for defense rather than attack), as the embedding procedure. Our preliminary results are encouraging: the base model wrapped with the embedding procedure achieves an almost perfect success rate in defending against attacks that the base model fails on, while retaining the good generalization behavior of the base model.
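
Although no code accompanies this abstract, the embedding procedure it describes can be illustrated with a short gradient-based search: starting from a (possibly adversarial) input, increase the model's confidence ‖F(x)‖_∞ while penalizing the distance to the input, and classify the resulting point. The PyTorch sketch below is only illustrative; the model F (assumed to return logits over classes), the squared-L2 distance penalty, the [0, 1] pixel clamp, and all hyperparameters are assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of an "embed back to the manifold" defense (not the authors' code).
# Idea: find a point near x_adv on which F is highly confident, then classify that
# point instead of x_adv. F is assumed to map a batch of inputs to class logits.
import torch

def embed_to_manifold(F, x_adv, steps=100, lr=0.01, dist_weight=1.0):
    """Gradient-based embedding: maximize the model's top softmax probability
    (a proxy for ||F(x)||_inf) while staying close to the original input."""
    x_emb = x_adv.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_emb], lr=lr)
    for _ in range(steps):
        probs = torch.softmax(F(x_emb), dim=-1)
        confidence = probs.max(dim=-1).values                 # top class probability per example
        dist = ((x_emb - x_adv) ** 2).flatten(1).sum(dim=1)   # squared L2 distance to the input
        loss = (-confidence + dist_weight * dist).sum()       # trade off confidence vs. proximity
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x_emb.clamp_(0.0, 1.0)                            # keep pixels in a valid range (assumed [0, 1])
    return x_emb.detach()

# Usage sketch: classify the embedded point rather than the raw input.
# y_hat = F(embed_to_manifold(F, x_adv)).argmax(dim=-1)
```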

Related research

12/09/2017 - NAG: Network for Adversary Generation
Adversarial perturbations can pose a serious threat for deploying machin...

03/01/2023 - Adversarial Examples Exist in Two-Layer ReLU Networks for Low Dimensional Data Manifolds
Despite a great deal of research, it is still not well-understood why tr...

03/01/2022 - Side-effects of Learning from Low Dimensional Data Embedded in an Euclidean Space
The low dimensional manifold hypothesis posits that the data found in ma...

03/05/2019 - Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search
A plethora of recent work has shown that convolutional networks are not ...

03/03/2019 - A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations
The linear and non-flexible nature of deep convolutional models makes th...

08/23/2016 - On Clustering and Embedding Mixture Manifolds using a Low Rank Neighborhood Approach
Samples from intimate (non-linear) mixtures are generally modeled as bei...
