Revisiting Image Classifier Training for Improved Certified Robust Defense against Adversarial Patches

by Aniruddha Saha, et al.

Certifiably robust defenses against adversarial patches for image classifiers ensure correct prediction against any changes to a constrained neighborhood of pixels. PatchCleanser (arXiv:2108.09135 [cs.CV]), the state-of-the-art certified defense, uses a double-masking strategy for robust classification. The success of this strategy relies heavily on the model's invariance to image pixel masking. In this paper, we take a closer look at model training schemes to improve this invariance. Instead of using Random Cutout (arXiv:1708.04552v2 [cs.CV]) augmentations as PatchCleanser does, we introduce the notion of worst-case masking, i.e., selecting the masked images that maximize classification loss. However, finding worst-case masks requires an exhaustive search, which can be prohibitively expensive to perform on the fly during training. To solve this problem, we propose a two-round greedy masking strategy (Greedy Cutout) that finds an approximate worst-case mask location with much less compute. We show that models trained with Greedy Cutout improve certified robust accuracy over the Random Cutout training used in PatchCleanser across a range of datasets and architectures. Certified robust accuracy on ImageNet with a ViT-B16-224 model increases from 58.1% to 62.3% against a 3% square patch applied anywhere on the image.
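The abstract does not spell out the two-round procedure, so the following is only a minimal sketch of one plausible reading: round one picks the single mask location with the highest loss, and round two keeps that mask fixed and picks the second location that maximizes loss for the pair. The function names, the zero-fill masking, the fixed grid of candidate corners, and the toy `loss_fn` are all assumptions made for illustration, not the authors' implementation. An exhaustive pair search is included only to show the difference in the number of loss evaluations.

```python
import itertools
import numpy as np


def apply_masks(image, corners, size):
    """Return a copy of `image` with a size x size square zeroed at each corner.

    Zero-filling is an assumption; the fill value is not given in the abstract.
    """
    out = image.copy()
    for r, c in corners:
        out[r:r + size, c:c + size] = 0.0
    return out


def greedy_two_masks(image, loss_fn, corners, size):
    """Hypothetical two-round greedy search: about 2 * len(corners) loss calls."""
    # Round 1: the single mask location with the highest loss.
    first = max(corners, key=lambda p: loss_fn(apply_masks(image, [p], size)))
    # Round 2: keep the first mask, pick the second location with the highest loss.
    second = max(
        corners, key=lambda p: loss_fn(apply_masks(image, [first, p], size))
    )
    return first, second


def exhaustive_two_masks(image, loss_fn, corners, size):
    """Reference search over all unordered pairs: quadratic in len(corners)."""
    pairs = itertools.combinations_with_replacement(corners, 2)
    return max(pairs, key=lambda pair: loss_fn(apply_masks(image, list(pair), size)))


# Toy setup: a 6x6 "image" and a stand-in loss that grows as more pixel mass is
# masked out. A real training loop would call the classifier's loss here instead.
image = np.arange(36, dtype=float).reshape(6, 6)
corners = [(r, c) for r in range(0, 6, 2) for c in range(0, 6, 2)]
calls = {"n": 0}


def toy_loss(masked):
    calls["n"] += 1
    return -float(masked.sum())


greedy_pair = greedy_two_masks(image, toy_loss, corners, size=2)
greedy_calls = calls["n"]

calls["n"] = 0
exhaustive_pair = exhaustive_two_masks(image, toy_loss, corners, size=2)
exhaustive_calls = calls["n"]
```

With nine candidate locations, the greedy search in this sketch makes 18 loss evaluations and the exhaustive pair search makes 45. On this additive toy loss the two searches reach the same loss value, but with a real classifier the greedy result is only an approximation of the worst case.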




PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

The adversarial patch attack against image classification models aims to...

PatchGuard: Provable Defense against Adversarial Patches Using Masks on Small Receptive Fields

Localized adversarial patches aim to induce misclassification in machine...

Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

Adversarial patch attack aims to fool a machine learning model by arbitr...

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

Adversarial patch attacks that craft the pixels in a confined region of ...

Defending Against Image Corruptions Through Adversarial Augmentations

Modern neural networks excel at image classification, yet they remain vu...

Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness

This work provides theoretical and empirical evidence that invariance-in...

Increasing the robustness of DNNs against image corruptions by playing the Game of Noise

The human visual system is remarkably robust against a wide range of nat...
