Low Frequency Adversarial Perturbation

09/24/2018

∙

Recently, machine learning security has received significant attention. Many computer vision and speech recognition systems have been compromised by adversarially but imperceptibly perturbed input. To identify potential perturbations, attackers search the high dimensional input space to find directions in which the model lacks robustness. The exponential number of such directions makes the existence of these adversarial perturbations likely, but also creates significant challenges in the black-box setting: First, in the absence of gradient information the search problem becomes expensive, resulting in high query complexity. Second, the constructed perturbations are typically high-frequency in nature and can be successfully defended against through denoising transformations. In this paper we propose to restrict the search for adversarial images to a low frequency domain. This approach is compatible with existing white-box and black-box attacks, and has remarkable benefits in the latter setting. In particular, we achieve state-of-the-art black-box query efficiency and improve over prior work by an order of magnitude. Further, we can circumvent image transformation defenses even when both the model and the defense strategy are unknown. Finally, we demonstrate the efficacy of this technique by fooling the Google Cloud Vision platform with an unprecedented low number of model queries.

READ FULL TEXT

Low Frequency Adversarial Perturbation

Sign in with Google

Consider DeepAI Pro