Certifiable Black-Box Attack: Ensuring Provably Successful Attack for Adversarial Examples

04/10/2023
by   Hanbin Hong, et al.

Black-box adversarial attacks have shown strong potential to subvert machine learning models. Existing black-box attacks craft adversarial examples by iteratively querying the target model and/or leveraging the transferability of a local surrogate model. However, whether such an attack will succeed remains unknown to the adversary at design time. In this paper, to the best of our knowledge, we take the first step toward a new paradigm of adversarial attacks: a certifiable black-box attack that can guarantee the attack success rate of the crafted adversarial examples. Specifically, we revise randomized smoothing to establish novel theories for certifying the attack success rate of adversarial examples. To craft adversarial examples with a certifiable attack success rate (CASR) guarantee, we design several novel techniques: a randomized method for querying the target model, an initialization method based on smoothed self-supervised perturbation for deriving certifiable adversarial examples, and a geometric shifting method that reduces the perturbation size of the certifiable adversarial examples for better imperceptibility. We comprehensively evaluate the certifiable black-box attack on the CIFAR10 and ImageNet datasets against defenses of different strengths. Both theoretical and experimental results validate the effectiveness of the proposed certifiable attack.
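To make the certification idea concrete, below is a minimal sketch of how a CASR lower bound could be estimated under Gaussian randomized querying, in the style of the Monte Carlo certification used in randomized smoothing (Cohen et al., 2019). All names, defaults, and the `model` interface here are illustrative assumptions, not the paper's actual API or procedure.

```python
import numpy as np
from scipy.stats import beta

def casr_lower_bound(model, x_adv, true_label, sigma=0.25, n=1000, alpha=0.001):
    """Estimate a high-confidence lower bound on the attack success rate
    (CASR) of an adversarial example under Gaussian randomized querying.

    Hypothetical interface: `model` maps a batch of inputs to hard labels.
    """
    # Randomized query: sample n Gaussian-perturbed copies of the
    # adversarial example and query the black-box target model on each.
    noise = np.random.randn(n, *x_adv.shape) * sigma
    preds = model(x_adv[None] + noise)            # shape (n,), predicted labels

    # Count successful (untargeted) attacks, i.e. misclassified copies.
    k = int(np.sum(preds != true_label))

    # One-sided Clopper-Pearson lower confidence bound on the success
    # probability, so the CASR guarantee holds with probability >= 1 - alpha.
    if k == 0:
        return 0.0
    return beta.ppf(alpha, k, n - k + 1)
```

Under these assumptions, if the returned bound exceeds a target rate (say 0.9), the smoothed adversarial example would carry a certified success-rate guarantee with confidence at least 1 - alpha; otherwise one would need more samples or a stronger perturbation. The paper's exact certification theory may differ.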


Related research

08/17/2017
Machine Learning as an Adversarial Service: Learning Black-Box Adversarial Examples
Neural networks are known to be vulnerable to adversarial examples, inpu...

09/05/2021
Training Meta-Surrogate Model for Transferable Adversarial Attack
We consider adversarial attacks to a black-box model when no queries are...

05/01/2019
POBA-GA: Perturbation Optimized Black-Box Adversarial Attacks via Genetic Algorithm
Most deep learning models are easily vulnerable to adversarial attacks. ...

12/24/2018
Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks
We consider adversarial examples in the black-box decision-based scenari...

09/03/2021
A Synergetic Attack against Neural Network Classifiers combining Backdoor and Adversarial Examples
In this work, we show how to jointly exploit adversarial perturbation an...

02/15/2021
CAP-GAN: Towards Adversarial Robustness with Cycle-consistent Attentional Purification
Adversarial attack is aimed at fooling the target classifier with imperc...

01/31/2022
MEGA: Model Stealing via Collaborative Generator-Substitute Networks
Deep machine learning models are increasingly deployed in the wild for pr...
