Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)

12/19/2019
by Woohyung Chun, et al.

We propose a scheme that mitigates an adversarial perturbation ϵ on an adversarial example X_adv (= X ± ϵ) by subtracting an estimated perturbation ϵ̂ from X + ϵ and adding ϵ̂ to X − ϵ. The estimate ϵ̂ comes from the difference between X_adv and its moving-averaged outcome W_avg*X_adv, where W_avg is an N × N moving-average kernel whose coefficients are all one. Adjacent samples of an image are usually close to each other, so we can let X ≈ W_avg*X; we name this relation X-MAS (X minus Moving Averaged Samples). Since the X-MAS relation is approximately zero, the estimated perturbation can be smaller than the adversarial perturbation. The scheme is extended to multi-level mitigation by treating the mitigated adversarial example X_adv ∓ ϵ̂ as a new adversarial example to be mitigated further. The multi-level mitigation brings X_adv closer to X with a smaller (i.e., mitigated) perturbation than the original unmitigated one, using W_avg*X_adv (≈ X + W_avg*ϵ when X ≈ W_avg*X and X_adv = X + ϵ) as a boundary condition that the mitigation cannot cross: a decreasing ϵ cannot go below it, and an increasing ϵ cannot go beyond it. With the multi-level mitigation, we obtain high prediction accuracy even on adversarial examples with a large perturbation (i.e., ϵ ≥ 16). The proposed scheme is evaluated on adversarial examples crafted by the Iterative FGSM (Fast Gradient Sign Method) against ResNet-50 trained on the ImageNet dataset.
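The abstract compresses two reasoning steps that are worth spelling out. As a worked sketch in the abstract's own notation, take the X-MAS difference itself as the estimate and consider the X_adv = X + ϵ case (the X − ϵ case is symmetric); this is our reading, and the exact construction of ϵ̂ is given in the full text:

\begin{align*}
\hat{\epsilon} &= X_{\mathrm{adv}} - W_{\mathrm{avg}} * X_{\mathrm{adv}}
  = \underbrace{(X - W_{\mathrm{avg}} * X)}_{\approx\,0\ \text{(X-MAS)}} + (\epsilon - W_{\mathrm{avg}} * \epsilon)
  \approx \epsilon - W_{\mathrm{avg}} * \epsilon, \\
X_{\mathrm{adv}} - \hat{\epsilon} &\approx X + W_{\mathrm{avg}} * \epsilon \approx W_{\mathrm{avg}} * X_{\mathrm{adv}}.
\end{align*}

The X-MAS term vanishes, so only the smoothed residual ϵ − W_avg*ϵ is estimated, which can be smaller than ϵ itself, and the mitigated example moves toward W_avg*X_adv, which is exactly the boundary that the multi-level mitigation is not allowed to cross.

A minimal code sketch of the estimation step, assuming W_avg is realized as a normalized N × N box (moving-average) filter via SciPy's uniform_filter; this is illustrative only, not the authors' implementation:

import numpy as np
from scipy.ndimage import uniform_filter

def estimate_perturbation(x_adv: np.ndarray, n: int = 3) -> np.ndarray:
    """Illustrative X-MAS estimate: difference between X_adv and W_avg * X_adv.

    x_adv: 2-D grayscale adversarial image (apply per channel for color inputs).
    n:     side length of the N x N moving-average window W_avg (assumed normalized).
    """
    x_avg = uniform_filter(x_adv.astype(np.float64), size=n)  # W_avg * X_adv
    return x_adv - x_avg  # approx. eps - W_avg * eps when X ~= W_avg * X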

