Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder

03/27/2023
by Tao Sun, et al.

Deep neural networks are vulnerable to backdoor attacks, where an adversary maliciously manipulates model behavior by overlaying images with special triggers. Existing backdoor defense methods often require access to a small set of validation data and to the model parameters, which is impractical in many real-world applications, e.g., when the model is provided as a cloud service. In this paper, we address the practical task of blind backdoor defense at test time, in particular for black-box models: the true label of every test image must be recovered on the fly from the hard-label predictions of a suspicious model. Heuristic trigger search in image space, however, does not scale to complex triggers or high image resolutions. We circumvent this barrier by leveraging generic image generation models and propose Blind Defense with Masked AutoEncoder (BDMAE), a framework that uses structural similarity and label consistency between the test image and its MAE restorations to detect possible triggers. The detection result is then refined by considering the topology of triggers, and a purified test image is obtained from the restorations for making the final prediction. Our approach is blind to the model architecture, trigger patterns, and image benignity. Extensive experiments on multiple datasets with different backdoor attacks validate its effectiveness and generalizability. Code is available at https://github.com/tsun/BDMAE.
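The sketch below illustrates the mask-and-restore idea described in the abstract at a high level. It is a minimal sketch, not the authors' implementation: the callables `mae_restore` and `classify_hard_label`, the patch-level scoring rule, and the simple thresholding used in place of the paper's topology-based refinement are all assumptions made here for illustration.

```python
# Minimal sketch of a mask-and-restore test-time defense loop.
# Assumes H and W are multiples of `patch`; `mae_restore` and
# `classify_hard_label` are hypothetical placeholders, not the authors' code.
import numpy as np
from skimage.metrics import structural_similarity


def detect_and_purify(image, mae_restore, classify_hard_label,
                      num_rounds=4, patch=16, mask_ratio=0.5, thresh=0.5):
    """Score image regions as possible trigger locations, then purify.

    image: HxWx3 float array in [0, 1].
    mae_restore(image, mask): returns an MAE reconstruction of the masked regions.
    classify_hard_label(image): returns the suspicious model's hard-label prediction.
    """
    h, w, _ = image.shape
    gh, gw = h // patch, w // patch
    score = np.zeros((gh, gw))            # higher = more likely part of a trigger
    orig_label = classify_hard_label(image)

    for _ in range(num_rounds):
        # Randomly mask a subset of patches and restore them with the MAE.
        mask = np.random.rand(gh, gw) < mask_ratio
        pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)
        restored = mae_restore(image, pixel_mask)

        # Label consistency: if the prediction flips after restoration,
        # the masked regions likely covered (part of) the trigger.
        label_flip = classify_hard_label(restored) != orig_label

        # Structural similarity: a generic MAE does not reproduce trigger
        # pixels, so restored trigger regions differ from the original image.
        for i in range(gh):
            for j in range(gw):
                if not mask[i, j]:
                    continue
                a = image[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
                b = restored[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
                sim = structural_similarity(a, b, channel_axis=-1, data_range=1.0)
                score[i, j] += (1.0 - sim) + (1.0 if label_flip else 0.0)

    # Crude stand-in for the paper's topology-based refinement:
    # threshold the score map to obtain a binary trigger estimate.
    if score.max() > 0:
        trigger = score >= thresh * score.max()
    else:
        trigger = np.zeros((gh, gw), dtype=bool)

    # Purify: replace suspected trigger regions with their MAE restorations.
    pixel_trigger = np.repeat(np.repeat(trigger, patch, axis=0), patch, axis=1)
    purified = mae_restore(image, pixel_trigger)
    final = np.where(pixel_trigger[..., None], purified, image)
    return final, classify_hard_label(final)
```

In the paper, the trigger score map is additionally refined using the topology of typical triggers before purification; the thresholding above is only a placeholder for that step.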


Related research

03/27/2023
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
Deep neural networks are proven to be vulnerable to backdoor attacks. De...

09/10/2023
DAD++: Improved Data-free Test Time Adversarial Defense
With the increasing deployment of deep neural networks in safety-critica...

02/07/2023
SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency
Deep neural networks (DNNs) are vulnerable to backdoor attacks, where ad...

03/27/2022
How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective
The lack of adversarial robustness has been recognized as an important i...

06/16/2022
Backdoor Attacks on Vision Transformers
Vision Transformers (ViT) have recently demonstrated exemplary performan...

02/05/2022
Memory Defense: More Robust Classification via a Memory-Masking Autoencoder
Many deep neural networks are susceptible to minute perturbations of ima...

01/26/2023
Distilling Cognitive Backdoor Patterns within an Image
This paper proposes a simple method to distill and detect backdoor patte...
