PatchGuard: Provable Defense against Adversarial Patches Using Masks on Small Receptive Fields

05/17/2020
by Chong Xiang, et al.

Localized adversarial patches aim to induce misclassification in machine learning models by arbitrarily modifying pixels within a restricted region of an image. Such attacks can be realized in the physical world by attaching the adversarial patch to the object to be misclassified. In this paper, we propose a general defense framework that achieves both high clean accuracy and provable robustness against localized adversarial patches. The cornerstone of our defense framework is a convolutional network with small receptive fields, which bounds the number of features an adversarial patch can corrupt. We further present the robust masking defense, which robustly detects and masks corrupted features for secure feature aggregation. We evaluate our defense against the most powerful white-box untargeted adaptive attacker and achieve 92.3% accuracy on a 10-class subset of ImageNet against a 31x31 adversarial patch (2% of the pixels), 57.4% accuracy on 1000-class ImageNet against a 31x31 patch (2% of the pixels), and 61.3% accuracy [...]. Notably, our provable defenses achieve state-of-the-art provable robust accuracy on ImageNet and CIFAR-10.
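To make the receptive-field argument concrete, the sketch below counts how many feature positions along one axis can overlap a square patch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the receptive field size (17), stride (8), and input size (224x224) are example values chosen here rather than figures taken from the abstract. It assumes each feature sees a contiguous window of `rf` pixels and that neighboring features are `stride` pixels apart.

```python
import math


def corrupted_window(patch: int, rf: int, stride: int) -> int:
    """Upper bound on features per axis whose receptive field overlaps the patch.

    A feature whose receptive field starts at pixel k * stride overlaps a patch
    starting at pixel a iff  a - rf + 1 <= k * stride <= a + patch - 1.
    That interval holds patch + rf - 1 integers, so at most
    ceil((patch + rf - 1) / stride) of them are multiples of stride.
    """
    return math.ceil((patch + rf - 1) / stride)


def patch_pixel_fraction(patch: int, image: int) -> float:
    """Fraction of a square image covered by a square patch."""
    return (patch * patch) / (image * image)


if __name__ == "__main__":
    # Assumed example values (not from the abstract): rf=17, stride=8, 224x224 input.
    w = corrupted_window(patch=31, rf=17, stride=8)
    print(f"at most {w} x {w} feature positions can be corrupted")
    print(f"patch covers {patch_pixel_fraction(31, 224):.1%} of the pixels")
```

Under these assumptions, a smaller receptive field shrinks the corrupted window, which in turn leaves more uncorrupted features for the aggregation step to rely on after masking.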


research
04/26/2021

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

An adversarial patch can arbitrarily manipulate image pixels within a re...
research
08/20/2021

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

The adversarial patch attack against image classification models aims to...
research
11/19/2021

Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

Adversarial patch attack aims to fool a machine learning model by arbitr...
research
10/27/2021

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

Adversarial patch attacks that craft the pixels in a confined region of ...
research
06/22/2023

Revisiting Image Classifier Training for Improved Certified Robust Defense against Adversarial Patches

Certifiably robust defenses against adversarial patches for image classi...
research
03/16/2022

Towards Practical Certifiable Patch Defense with Vision Transformer

Patch attacks, one of the most threatening forms of physical attack in a...
research
01/19/2021

On Provable Backdoor Defense in Collaborative Learning

As collaborative learning allows joint training of a model using multipl...
