Certified Zeroth-order Black-Box Defense with Robust UNet Denoiser

04/13/2023
by Astha Verma, et al.

Certified defense methods against adversarial perturbations have recently been investigated in the black-box setting from a zeroth-order (ZO) perspective. However, these methods suffer from high model variance and low performance on high-dimensional datasets because of ineffective denoiser design, and they make limited use of ZO techniques. To this end, we propose a certified ZO preprocessing technique that removes adversarial perturbations from the attacked image in the black-box setting using only model queries. We propose a robust UNet denoiser (RDUNet) that ensures the robustness of black-box models trained on high-dimensional datasets. We further propose a novel black-box denoised smoothing (DS) defense mechanism, ZO-RUDS, which prepends RDUNet to the black-box model, and ZO-AE-RUDS, in which RDUNet followed by an autoencoder (AE) is prepended to the black-box model. We perform extensive experiments on four classification datasets (CIFAR-10, CIFAR-100, Tiny ImageNet, and STL-10) and on the MNIST dataset for image reconstruction. Our proposed defenses, ZO-RUDS and ZO-AE-RUDS, outperform the state of the art by margins of 35% and 9% on the low-dimensional CIFAR-10 dataset and by 20.61% and 23.51% on the high-dimensional STL-10 dataset, respectively.
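The two ideas the abstract combines, estimating gradients from model queries alone and smoothing a black-box model behind a prepended denoiser, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `zo_gradient` and `smoothed_predict`, the parameters `mu`, `sigma`, and `n`, and the coordinate-wise finite-difference estimator are all assumptions chosen for illustration, and the abstract does not say which ZO estimator the paper uses.

```python
import random
from collections import Counter


def zo_gradient(loss, z, mu=1e-3):
    """Estimate the gradient of `loss` at `z` from function queries only.

    Assumption: a coordinate-wise forward finite difference, one common
    zeroth-order estimator. It needs len(z) + 1 queries and no backprop.
    """
    base = loss(z)
    grad = []
    for i in range(len(z)):
        shifted = list(z)
        shifted[i] += mu  # perturb a single coordinate
        grad.append((loss(shifted) - base) / mu)
    return grad


def smoothed_predict(black_box, denoiser, x, sigma=0.25, n=100, seed=0):
    """Majority vote of black_box(denoiser(x + noise)) over n Gaussian draws.

    `black_box` and `denoiser` are hypothetical callables standing in for
    the query-only classifier and the prepended denoiser.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[black_box(denoiser(noisy))] += 1
    return votes.most_common(1)[0][0]
```

In this sketch the denoiser would be trained with gradients obtained through something like `zo_gradient`, since the black-box model exposes only its outputs. The number of queries grows with the input dimension, which is one reason high-dimensional datasets are the harder case.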


