Adversarial Example Games

07/01/2020
by Avishek Joey Bose, et al.

The existence of adversarial examples capable of fooling trained neural network classifiers calls for a much better understanding of possible attacks in order to guide the development of safeguards against them. This includes attack methods in the highly challenging non-interactive blackbox setting, where adversarial examples are generated without any access to the target model, not even queries. Prior works in this setting have relied mainly on algorithmic innovations derived from empirical observations (e.g., that momentum helps), and the field currently lacks a firm theoretical basis for understanding transferability in adversarial attacks. In this work, we address this gap and lay the theoretical foundations for crafting transferable adversarial examples against entire function classes. We introduce Adversarial Example Games (AEG), a novel framework that models the crafting of adversarial examples as a two-player min-max game between an attack generator and a representative classifier. We prove that the saddle point of an AEG game corresponds to a distribution of adversarial examples effective against an entire function class. Training the generator only requires the ability to optimize a representative classifier from the given hypothesis class, enabling BlackBox transfer to unseen classifiers from the same class. We demonstrate the efficacy of our approach on the MNIST and CIFAR-10 datasets against both undefended and robustified models, achieving competitive performance with state-of-the-art BlackBox transfer approaches.
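To make the min-max structure concrete, the following is a minimal NumPy sketch, not the paper's actual architecture: a linear model stands in for the representative classifier, and a label-conditioned perturbation (bounded by `tanh` to an epsilon-ball) stands in for the attack generator. The toy data, parameterization, and update rule are all illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs in 2D, labels in {-1, +1}.
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)), rng.normal(1.0, 0.5, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

eps = 0.3                       # L-inf perturbation budget
w = rng.normal(size=2) * 0.1    # stand-in for the representative classifier
g = rng.normal(size=2) * 0.1    # stand-in for the generator's parameters

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def perturb(g, y):
    # Label-conditioned perturbation, kept inside the eps-ball by tanh;
    # it pushes each example toward the opposite class.
    return -y[:, None] * eps * np.tanh(g)[None, :]

def logistic_loss(w, Xp, y):
    return np.mean(np.log1p(np.exp(-y * (Xp @ w))))

lr = 0.1
for _ in range(500):
    delta = perturb(g, y)
    z = y * ((X + delta) @ w)
    s = sigmoid(-z)  # per-example loss-gradient factor
    # Min player: the classifier descends the adversarial logistic loss.
    grad_w = -np.mean(s[:, None] * (y[:, None] * X - eps * np.tanh(g)[None, :]),
                      axis=0)
    # Max player: the generator ascends the same loss (simultaneous updates).
    grad_g = eps * np.mean(s) * (1.0 - np.tanh(g) ** 2) * w
    w -= lr * grad_w
    g += lr * grad_g

delta = perturb(g, y)
print("clean loss:      ", logistic_loss(w, X, y))
print("adversarial loss:", logistic_loss(w, X + delta, y))
```

At the end of training, the perturbed loss exceeds the clean loss while every perturbation stays within the epsilon budget; in the paper's framework the analogous generator would transfer to unseen classifiers from the same hypothesis class, which this two-parameter toy does not attempt to show.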

