A Certified Radius-Guided Attack Framework to Image Segmentation Models

04/05/2023
by   Wenjie Qu, et al.

Image segmentation is an important problem in many safety-critical applications. Recent studies show that modern image segmentation models are vulnerable to adversarial perturbations, yet existing attack methods mainly follow ideas developed for attacking image classification models. We argue that image segmentation and classification have inherent differences, and design an attack framework specifically for image segmentation models. Our framework is inspired by the certified radius, which was originally used by defenders to certify classification models against adversarial perturbations. We are the first, from the attacker's perspective, to leverage the properties of the certified radius, and we propose a certified radius-guided attack framework against image segmentation models. Specifically, we first adapt randomized smoothing, the state-of-the-art certification method for classification models, to derive each pixel's certified radius. We then focus on disrupting pixels with relatively small certified radii and design a pixel-wise certified radius-guided loss that, when plugged into any existing white-box attack, yields our certified radius-guided white-box attack. Next, we propose the first black-box attack on image segmentation models via bandits. We design a novel gradient estimator, based on bandit feedback, that is query-efficient and provably unbiased and stable. We use this estimator to design a projected bandit gradient descent (PBGD) attack as well as a certified radius-guided PBGD (CR-PBGD) attack, and we prove that both achieve asymptotically optimal attack performance at an optimal rate. We evaluate our certified radius-guided white-box and black-box attacks on multiple modern image segmentation models and datasets. Our results validate the effectiveness of our certified radius-guided attack framework.
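The core idea can be sketched in a few lines: estimate each pixel's certified radius by randomized smoothing (vote over Gaussian-perturbed copies of the image and apply the Cohen-style bound R = σ/2 · (Φ⁻¹(p_A) − Φ⁻¹(p_B)) per pixel), then reweight a per-pixel attack loss so that small-radius pixels dominate. This is a minimal illustrative sketch, not the paper's implementation; `predict_fn`, `cr_guided_loss`, and the exponential weighting with temperature `tau` are assumptions chosen for clarity.

```python
import numpy as np
from statistics import NormalDist  # stdlib Gaussian quantile, avoids scipy


def pixel_certified_radius(predict_fn, x, num_classes, sigma=0.25, n=200, seed=0):
    """Estimate each pixel's certified radius via randomized smoothing.

    Draws n Gaussian-perturbed copies of x, tallies hard per-pixel label
    votes, and applies the per-pixel bound
        R = sigma / 2 * (Phi^-1(p_A) - Phi^-1(p_B)),
    where p_A, p_B are the top-two empirical vote frequencies.
    `predict_fn` maps an image to an integer label map of the same H x W
    (a hypothetical segmentation model interface).
    """
    rng = np.random.default_rng(seed)
    h, w = x.shape[:2]
    votes = np.zeros((h, w, num_classes))
    rows, cols = np.indices((h, w))
    for _ in range(n):
        labels = predict_fn(x + rng.normal(0.0, sigma, size=x.shape))
        votes[rows, cols, labels] += 1
    probs = np.sort(votes / n, axis=-1)            # ascending per pixel
    eps = 1e-6                                     # keep quantiles finite at 0/1 votes
    ppf = np.vectorize(NormalDist().inv_cdf)
    p_a = np.clip(probs[..., -1], eps, 1 - eps)    # top class frequency
    p_b = np.clip(probs[..., -2], eps, 1 - eps)    # runner-up frequency
    return sigma / 2 * (ppf(p_a) - ppf(p_b))


def cr_guided_loss(per_pixel_loss, radius, tau=1.0):
    """Reweight a per-pixel attack loss so pixels with small certified
    radius (cheapest to flip) contribute most to the attack gradient."""
    weights = np.exp(-radius / tau)
    return float((weights * per_pixel_loss).sum() / weights.sum())
```

As a sanity check, a toy thresholding "segmenter" on a grayscale image gives a near-zero radius for a pixel sitting exactly on the decision boundary and a large radius for pixels far from it, which is exactly the signal the attack exploits.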

research
07/12/2021

EvoBA: An Evolution Strategy as a Strong Baseline for Black-Box Adversarial Attacks

Recent work has shown how easily white-box adversarial attacks can be ap...
research
02/18/2020

Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent

Despite the great achievements of the modern deep neural networks (DNNs)...
research
06/15/2020

Efficient Black-Box Adversarial Attack Guided by the Distribution of Adversarial Perturbations

This work studied the score-based black-box adversarial attack problem, ...
research
08/03/2022

Multiclass ASMA vs Targeted PGD Attack in Image Segmentation

Deep learning networks have demonstrated high performance in a large var...
research
04/19/2018

Attacking Convolutional Neural Network using Differential Evolution

The output of Convolutional Neural Networks (CNN) has been shown to be d...
research
10/13/2021

Adversarial Attack across Datasets

It has been observed that Deep Neural Networks (DNNs) are vulnerable to ...
research
11/25/2022

Invariance-Aware Randomized Smoothing Certificates

Building models that comply with the invariances inherent to different d...
