Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)
We propose a scheme that mitigates an adversarial perturbation ϵ on an adversarial example X_adv (= X±ϵ) by subtracting an estimated perturbation ϵ̂ from X+ϵ and adding ϵ̂ to X−ϵ. The estimate ϵ̂ is the difference between X_adv and its moving-averaged version W_avg*X_adv, where W_avg is an N × N moving-average kernel whose coefficients are all one. Adjacent samples of an image are usually close to each other, so we can let X ≈ W_avg*X; we call this the X-MAS (X minus Moving Averaged Samples) relation. Because X − W_avg*X is approximately zero, the estimate for X_adv = X+ϵ reduces to roughly ϵ − W_avg*ϵ, so the estimated perturbation can be smaller than the adversarial perturbation. The scheme also extends to multi-level mitigation, in which the mitigated adversarial example X_adv±ϵ̂ is treated as a new adversarial example to be mitigated again. Multi-level mitigation brings X_adv closer to X, leaving a smaller (i.e. mitigated) perturbation than the original unmitigated one, by using W_avg*X_adv (<X+W_avg*ϵ if X≈W_avg*X) as a boundary that the mitigation cannot cross: a decreasing ϵ cannot go below it and an increasing ϵ cannot go beyond it. With multi-level mitigation, we obtain high prediction accuracy even for adversarial examples with a large perturbation (i.e. ϵ≥16). The proposed scheme is evaluated on adversarial examples crafted by the Iterative FGSM (Fast Gradient Sign Method) against ResNet-50 trained on the ImageNet dataset.
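The estimation step can be illustrated with a minimal single-channel sketch in plain Python. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the kernel sum is divided by N² so that W_avg acts as an average, N is odd, borders are handled by edge replication, and the function names and the `step` parameter are hypothetical. Subtracting the signed estimate elementwise covers both cases at once (it lowers pixels that were pushed up and raises pixels that were pushed down), and a full step (`step=1.0`) lands exactly on W_avg*X_adv, the boundary named above.

```python
def moving_average(img, n=3):
    """N x N box filter over a 2-D list of floats (n odd, edges replicated).

    Assumption: the all-ones kernel is normalised by n*n so it averages.
    """
    h, w = len(img), len(img[0])
    r = n // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), h - 1)  # clamp row index to the image
                    jj = min(max(j + dj, 0), w - 1)  # clamp column index to the image
                    total += img[ii][jj]
            out[i][j] = total / (n * n)
    return out


def estimate_perturbation(x_adv, n=3):
    """Signed estimate: eps_hat = X_adv - W_avg * X_adv."""
    avg = moving_average(x_adv, n)
    return [[x - a for x, a in zip(row, avg_row)]
            for row, avg_row in zip(x_adv, avg)]


def mitigate(x_adv, n=3, step=1.0):
    """Move X_adv toward its moving average by a fraction `step` of eps_hat.

    `step` is a hypothetical knob; step=1.0 returns W_avg * X_adv itself.
    """
    eps_hat = estimate_perturbation(x_adv, n)
    return [[x - step * e for x, e in zip(row, e_row)]
            for row, e_row in zip(x_adv, eps_hat)]


if __name__ == "__main__":
    eps = 16.0
    # Flat clean image (so X = W_avg * X holds exactly) plus a +/-eps checkerboard.
    clean = [[100.0] * 6 for _ in range(6)]
    x_adv = [[clean[i][j] + (eps if (i + j) % 2 == 0 else -eps)
              for j in range(6)] for i in range(6)]
    mitigated = mitigate(x_adv)
    residual = max(abs(mitigated[i][j] - clean[i][j])
                   for i in range(6) for j in range(6))
    print("max residual perturbation:", residual)
```

On this toy input the residual at interior pixels is ϵ/9 ≈ 1.78 instead of 16, because each 3 × 3 window of a checkerboard holds five samples of one sign and four of the other. Real images are only approximately smooth and real perturbations are not checkerboards, so this shows the mechanism rather than the accuracy gains reported for the scheme.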