Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries

09/14/2022
by Alexander Cann et al.

The widespread adoption of deep neural networks in computer vision applications has brought forth significant interest in adversarial robustness. Existing research has shown that maliciously perturbed inputs tailored to a given model (i.e., adversarial examples) transfer successfully to other independently trained models, inducing prediction errors. Moreover, this transferability of adversarial examples has been attributed to features derived from predictive patterns in the data distribution. We are therefore motivated to investigate the following question: can adversarial defenses, like adversarial examples, be successfully transferred to other independently trained models? To this end, we propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE). After examining the theoretical motivation and implications, we experimentally show that our method can provide adversarial robustness to multiple independently pre-trained classifiers that are otherwise vulnerable to an adaptive white-box adversary. Furthermore, we show that RTFEs can even provide one-shot adversarial robustness to models independently trained on different datasets.
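The abstract describes the RTFE only at a high level, so the sketch below is purely illustrative: it shows, in PyTorch, the general shape of a learned pre-processing defense that is trained once and then prepended to frozen, independently pre-trained classifiers. The class and function names, the small encoder-decoder architecture, and the residual design are all assumptions for illustration, not the paper's actual method.

```python
import torch
import torch.nn as nn


class RTFE(nn.Module):
    """Hypothetical sketch of a robust transferable feature extractor:
    an image-to-image pre-processing network that, once trained (e.g.,
    adversarially), can be shared across independently trained classifiers.
    The architecture here is an assumption; the abstract does not specify it.
    """

    def __init__(self, channels: int = 3, width: int = 64):
        super().__init__()
        # Small convolutional encoder-decoder mapping images to "robustified"
        # images of the same shape, so any downstream classifier can consume them.
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps clean inputs roughly unchanged while the
        # learned correction suppresses adversarial perturbations.
        return torch.clamp(x + self.net(x), 0.0, 1.0)


def defend(classifier: nn.Module, rtfe: RTFE, x: torch.Tensor) -> torch.Tensor:
    """Wrap a frozen, pre-trained classifier with the shared extractor."""
    classifier.eval()
    for p in classifier.parameters():
        p.requires_grad_(False)  # the downstream model itself is never retrained
    return classifier(rtfe(x))


# Hypothetical usage: one trained RTFE defending an off-the-shelf classifier.
# from torchvision.models import resnet18
# logits = defend(resnet18(weights="DEFAULT"), RTFE(), images)
```

The key design point the sketch tries to capture is that the defense lives entirely in the pre-processing module: the same extractor can be placed in front of multiple frozen classifiers, which is what would make the robustness "transferable" in the sense the abstract describes.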

Related research

04/10/2018
On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses
Neural networks are known to be vulnerable to adversarial examples. In t...

02/01/2019
The Efficacy of SHIELD under Different Threat Models
We study the efficacy of SHIELD in the face of alternative threat models...

05/06/2019
Adversarial Examples Are Not Bugs, They Are Features
Adversarial examples have attracted significant attention in machine lea...

06/10/2021
Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training
Deep neural networks (DNNs) are vulnerable to adversarial noise. A range...

10/01/2019
Deep Neural Rejection against Adversarial Examples
Despite the impressive performances reported by deep neural networks in ...

11/27/2019
Can Attention Masks Improve Adversarial Robustness?
Deep Neural Networks (DNNs) are known to be susceptible to adversarial e...

11/23/2022
Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning Few-Shot Meta-Learners
This paper examines the robustness of deployed few-shot meta-learning sy...
