Query-Efficient Black-Box Attack by Active Learning

09/13/2018
by Pengcheng Li, et al.

Deep neural networks (DNNs), as popular machine learning models, have been found vulnerable to adversarial attacks. Such an attack constructs adversarial examples by adding small perturbations to raw inputs: the results appear unmodified to the human eye yet are misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting, where attackers have almost no access to the underlying model. A popular approach to black-box attack trains a substitute model on information queried from the target DNN; the substitute can then be attacked with existing white-box methods, and the resulting adversarial examples are used against the target DNN. Despite encouraging results, this approach suffers from poor query efficiency: attackers usually need to issue a huge number of queries to collect enough information to train an accurate substitute model. To this end, we first use state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy that significantly reduces the number of queries needed. We also propose a diversity criterion to avoid sampling bias. Extensive experiments on MNIST and CIFAR-10 show that the proposed method reduces the number of queries by more than 90% while preserving attack success rates, and yields a substitute model that agrees with the target oracle on more than 85% of inputs.
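To make the query loop concrete, below is a minimal sketch of substitute training with active query selection, assuming PyTorch. The oracle target_query, the tiny Substitute network, FGSM as the candidate-generating white-box attack, the margin-based uncertainty score, and the distance-threshold diversity filter are all illustrative stand-ins under these assumptions, not the paper's exact components.

```python
# Minimal sketch (illustrative, not the paper's exact method):
# train a substitute for a black-box target while spending queries
# only on uncertain, mutually diverse candidate inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Substitute(nn.Module):
    """Tiny CNN stand-in for the substitute model (MNIST-sized inputs)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 28 * 28, num_classes)

    def forward(self, x):
        h = F.relu(self.conv(x))
        return self.fc(h.flatten(1))

def fgsm_candidates(model, x, y, eps=0.1):
    """Propose query candidates by perturbing already-labeled inputs
    with FGSM on the current substitute (white-box attack stand-in)."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def select_queries(model, pool, k, min_dist=1.0):
    """Active selection: rank candidates by top-2 probability margin
    (small margin = uncertain), then greedily keep only candidates far
    enough from earlier picks -- a simple diversity filter."""
    with torch.no_grad():
        probs = F.softmax(model(pool), dim=1)
        top2 = probs.topk(2, dim=1).values
        margin = top2[:, 0] - top2[:, 1]
    feats = pool.flatten(1)
    chosen = []
    for i in margin.argsort().tolist():  # most uncertain first
        if len(chosen) == k:
            break
        if chosen and torch.cdist(feats[i:i+1], feats[chosen]).min() < min_dist:
            continue  # too close to an already-selected candidate
        chosen.append(i)
    return pool[chosen]

# One augmentation round (hypothetical target_query is the black-box oracle):
#   cands = fgsm_candidates(sub, x_known, y_known)
#   new_x = select_queries(sub, cands, k=64)
#   new_y = target_query(new_x)   # queries spent only on selected points
#   ...retrain the substitute on the enlarged labeled set, then repeat.
```

Spending queries only on uncertain-but-diverse candidates is what drives the query savings in this sketch: redundant points near already-selected examples are filtered out before the oracle is ever called.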

Related research:

05/30/2018 · AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks
Recent studies have shown that adversarial examples in state-of-the-art ...

08/27/2020 · Adversarial Eigen Attack on Black-Box Models
Black-box adversarial attack has attracted a lot of research interests f...

08/31/2022 · Unrestricted Adversarial Samples Based on Non-semantic Feature Clusters Substitution
Most current methods generate adversarial examples with the L_p norm spe...

02/16/2023 · Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data
We study black-box model stealing attacks where the attacker can query a...

07/13/2023 · Microbial Genetic Algorithm-based Black-box Attack against Interpretable Deep Learning Systems
Deep learning models are susceptible to adversarial samples in white and...

09/09/2021 · Multi-granularity Textual Adversarial Attack with Behavior Cloning
Recently, the textual adversarial attack models become increasingly popu...

11/13/2018 · Deep Q learning for fooling neural networks
Deep learning models are vulnerable to external attacks. In this paper, ...
