DeepAI AI Chat
Log In Sign Up

Query-Free Adversarial Transfer via Undertrained Surrogates

by   Chris Miller, et al.
Dartmouth College

Deep neural networks have been shown to be highly vulnerable to adversarial examples—minor perturbations added to a model's input which cause the model to output an incorrect prediction. This vulnerability represents both a risk for the use of deep learning models in security-conscious fields and an opportunity to improve our understanding of how deep networks generalize to unexpected inputs. In a transfer attack, the adversary builds an adversarial attack using a surrogate model, then uses that attack to fool an unseen target model. Recent work in this subfield has focused on attack generation methods which can improve transferability between models. We show that optimizing a single surrogate model is a more effective method of improving adversarial transfer, using the simple example of an undertrained surrogate. This method transfers well across varied architectures and outperforms state-of-the-art methods. To interpret the effectiveness of undertrained surrogate models, we represent adversarial transferability as a function of surrogate model loss function curvature and similarity between surrogate and target gradients and show that our approach reduces the presence of local loss maxima which hinder transferability. Our results suggest that finding good single surrogate models is a highly effective and simple method for generating transferable adversarial attacks, and that this method represents a valuable route for future study in this field.


Generating Adversarial Examples withControllable Non-transferability

Adversarial attacks against Deep Neural Networks have been widely studie...

Bad Citrus: Reducing Adversarial Costs with Model Distances

Recent work by Jia et al., showed the possibility of effectively computi...

Understanding and Enhancing the Transferability of Adversarial Examples

State-of-the-art deep neural networks are known to be vulnerable to adve...

Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings

One intriguing property of adversarial attacks is their "transferability...

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

Deep neural networks (DNNs) have been shown to be vulnerable to adversar...

TREND: Transferability based Robust ENsemble Design

Deep Learning models hold state-of-the-art performance in many fields, b...