Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)

by   Woohyung Chun, et al.

We propose the scheme that mitigates an adversarial perturbation ϵ on the adversarial example X_adv (=X±ϵ) by subtracting the estimated perturbation ϵ̂ from X+ϵ and adding ϵ̂ to X-ϵ. The estimated perturbation ϵ̂ comes from the difference between X_adv and its moving-averaged outcome W_avg*X_adv where W_avg is N × N moving average kernel that all the coefficients are one. Usually, the adjacent samples of an image are close to each other such that we can let X≈W_avg*X (naming this relation after X-MAS[X minus Moving Averaged Sample]). Since the X-MAS relation is approximately zero, the estimated perturbation can be less than the adversarial perturbation. The scheme is also extended to do the multi-level mitigation by configuring the mitigated adversarial example X_adv±ϵ̂ as a new adversarial example to be mitigated. The multi-level mitigation gets X_adv closer to X with a smaller (i.e. mitigated) perturbation than original unmitigated perturbation by setting W_avg * X_adv (<X+W_avg*ϵ if X≈W_avg*X) as the boundary condition that the multi-level mitigation cannot cross over (i.e. decreasing ϵ cannot go below and increasing ϵ cannot go beyond). With the multi-level mitigation, we can get high prediction accuracies even in the adversarial example having a large perturbation (i.e. ϵ≥16). The proposed scheme is evaluated with adversarial examples crafted by the Iterative FGSM (Fast Gradient Sign Method) on ResNet-50 trained with ImageNet dataset.



There are no comments yet.



Perturbation Analysis of Gradient-based Adversarial Attacks

After the discovery of adversarial examples and their adverse effects on...

Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system

Adversarial training, in which a network is trained on both adversarial ...

Beating Attackers At Their Own Games: Adversarial Example Detection Using Adversarial Gradient Directions

Adversarial examples are input examples that are specifically crafted to...

Repairing Adversarial Texts through Perturbation

It is known that neural networks are subject to attacks through adversar...

Most ReLU Networks Suffer from ℓ^2 Adversarial Perturbations

We consider ReLU networks with random weights, in which the dimension de...

Multi-Step Adversarial Perturbations on Recommender Systems Embeddings

Recommender systems (RSs) have attained exceptional performance in learn...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.