Sign-OPT: A Query-Efficient Hard-label Adversarial Attack

09/24/2019
by Minhao Cheng, et al.

We study the most practical problem setup for evaluating the adversarial robustness of a machine learning system with limited access: the hard-label black-box attack setting for generating adversarial examples, where only a limited number of model queries are allowed and only the decision (predicted label) is returned for a queried input. Several algorithms have been proposed for this problem, but they typically require a huge number of queries (>20,000) to attack a single example. Among them, one state-of-the-art approach (Cheng et al., 2019) showed that the hard-label attack can be modeled as an optimization problem whose objective function can be evaluated by binary search with additional model queries, so that a zeroth-order optimization algorithm can be applied. In this paper, we adopt the same optimization formulation but propose to directly estimate the sign of the gradient in any given direction instead of the gradient itself, which requires only a single query. Using this single-query oracle for retrieving the sign of a directional derivative, we develop a novel query-efficient Sign-OPT approach for hard-label black-box attack. We provide a convergence analysis of the new algorithm and conduct experiments on several models on MNIST, CIFAR-10, and ImageNet. We find that the Sign-OPT attack consistently requires 5X to 10X fewer queries than current state-of-the-art approaches, and it usually converges to an adversarial example with a smaller perturbation.
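The single-query sign oracle described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `predict` is a hypothetical toy hard-label classifier, and the scaling constants of the gradient estimator are omitted. The idea is that if moving the current boundary distance g(θ) along a slightly perturbed direction θ + εu already crosses the decision boundary, then g decreased along u, so the sign of the directional derivative is negative; otherwise it is positive. Each sign estimate costs exactly one model query.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(x):
    # Hypothetical toy hard-label model: class 1 iff sum(x) > 1.
    # Stands in for the real black-box classifier, which returns only a label.
    return int(x.sum() > 1.0)

def sign_of_directional_derivative(x, y, theta, g_theta, u, eps=1e-3):
    """Single-query estimate of sign(g(theta + eps*u) - g(theta)).

    x: original input, y: its label, theta: current search direction,
    g_theta: distance from x to the boundary along theta (found once,
    e.g. by binary search), u: random probe direction.
    """
    new_dir = theta + eps * u
    new_dir = new_dir / np.linalg.norm(new_dir)
    # One query: did stepping g(theta) along the perturbed direction
    # already cross the decision boundary?
    crossed = predict(x + g_theta * new_dir) != y
    return -1.0 if crossed else 1.0

def estimate_gradient(x, y, theta, g_theta, num_probes=20, eps=1e-3):
    # Average sign * u over random Gaussian probe directions; one query
    # per probe, versus a full binary search per probe in the prior work.
    grad = np.zeros_like(theta)
    for _ in range(num_probes):
        u = rng.standard_normal(theta.shape)
        grad += sign_of_directional_derivative(x, y, theta, g_theta, u, eps) * u
    return grad / num_probes
```

For the toy model above with x at the origin and θ = (1, 0), the boundary distance is g(θ) = 1, and probing with u = (0, 1) versus u = (0, −1) returns opposite signs, matching the intuition that tilting the direction changes the distance to the boundary.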

Related research:

- There are No Bit Parts for Sign Bits in Black-Box Attacks (02/19/2019)
- Query-Efficient Physical Hard-Label Attacks on Deep Learning Visual Classification (02/17/2020)
- Boundary Attack++: Query-Efficient Decision-Based Adversarial Attack (04/03/2019)
- Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach (07/12/2018)
- Toward Finding The Global Optimal of Adversarial Examples (09/10/2019)
- Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks (10/08/2020)
- Standard detectors aren't (currently) fooled by physical adversarial stop signs (10/09/2017)
