PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

08/20/2021
by Chong Xiang, et al.

The adversarial patch attack against image classification models aims to inject adversarially crafted pixels within a localized, restricted image region (i.e., a patch) to induce model misclassification. This attack can be realized in the physical world by printing the patch and attaching it to the victim object, and it thus poses a real-world threat to computer vision systems. To counter this threat, we propose PatchCleanser, a certifiably robust defense against adversarial patches that is compatible with any image classifier. In PatchCleanser, we perform two rounds of pixel masking on the input image to neutralize the effect of the adversarial patch. In the first round of masking, we apply a set of carefully generated masks to the input image and evaluate the model prediction on every masked image. If the model predictions on all one-masked images reach a unanimous agreement, we output the agreed prediction label. Otherwise, we perform a second round of masking to settle the disagreement, in which we evaluate model predictions on two-masked images to robustly recover the correct prediction label. Notably, we can prove that our defense will always make correct predictions on certain images against any adaptive white-box attacker within our threat model, achieving certified robustness. We extensively evaluate our defense on the ImageNet, ImageNette, CIFAR-10, CIFAR-100, SVHN, and Flowers-102 datasets and demonstrate that it achieves clean accuracy comparable to state-of-the-art classification models while significantly improving certified robustness over prior work. Notably, our defense can achieve 83.8% certified robust accuracy against a 2%-pixel square patch on the 1000-class ImageNet dataset.
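The two-round masking procedure described in the abstract lends itself to a compact implementation. Below is a minimal Python sketch, not the authors' released code: the classifier `model` (returning a predicted label), the mask set `masks`, and the zero-valued occlusion in `apply_mask` are all assumptions made for illustration. Generating a mask set that provably covers every possible patch location is a separate step described in the paper and is taken as given here.

from collections import Counter

def apply_mask(image, mask):
    # Occlude the masked region; zeroing the pixels is an assumption here
    # (any fixed occlusion value works for the algorithm's logic).
    return image * (1 - mask)

def double_masking(image, masks, model):
    # First round: evaluate the classifier on every one-masked image.
    first_round = [model(apply_mask(image, m)) for m in masks]
    majority = Counter(first_round).most_common(1)[0][0]

    # Case 1: unanimous agreement -- no single mask changes the
    # prediction, so output the agreed label.
    if all(pred == majority for pred in first_round):
        return majority

    # Second round: for each first-round mask whose prediction disagreed
    # with the majority, add every second mask on top. If all two-masked
    # predictions agree with that disagreer, its first-round mask likely
    # covered the patch, so return its label.
    for m1, pred1 in zip(masks, first_round):
        if pred1 == majority:
            continue
        second_round = [model(apply_mask(apply_mask(image, m1), m2))
                        for m2 in masks]
        if all(pred == pred1 for pred in second_round):
            return pred1

    # Otherwise fall back to the first-round majority label.
    return majority

At inference time, double_masking(image, masks, model) simply replaces a plain model(image) call; the unanimity check, rather than a bare majority vote, is what the paper's certification argument relies on, since a sufficiently rich mask set guarantees that at least one mask fully covers the patch.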


