Machine vs Machine: Minimax-Optimal Defense Against Adversarial Examples

11/12/2017
by Jihun Hamm, et al.

Recently, researchers have discovered that state-of-the-art object classifiers can be fooled easily by small input perturbations that are unnoticeable to human eyes. It is known that an attacker can generate strong adversarial examples if she knows the classifier's parameters. Conversely, a defender can robustify the classifier by retraining on those adversarial examples. The cat-and-mouse nature of attack and defense raises the question of whether equilibria exist in these dynamics. In this paper, we present a neural-network-based attack class that approximates a larger but intractable class of attacks, and we formulate the attacker-defender interaction as a zero-sum leader-follower game. We present sensitivity-penalized optimization algorithms for finding minimax solutions, which are the best worst-case defenses against whitebox attacks. The advantages of learning-based attacks and defenses over gradient-based ones are demonstrated on MNIST and CIFAR-10.
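The minimax structure described above — an inner maximization by a whitebox attacker and an outer minimization by the defender who retrains on the resulting adversarial examples — can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it uses a toy 2-D logistic-regression task in place of MNIST/CIFAR-10, a sign-gradient (FGSM-style) attacker in place of the learned neural-network attack class, and plain alternating gradient steps without the sensitivity penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data (a hypothetical stand-in for MNIST).
X = np.vstack([rng.normal(size=(200, 2)) + 1.5,
               rng.normal(size=(200, 2)) - 1.5])
y = np.concatenate([np.ones(200), np.zeros(200)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(w, b, X, y):
    """Cross-entropy loss and its gradients w.r.t. parameters and inputs."""
    p = sigmoid(X @ w + b)
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    err = p - y
    gw = X.T @ err / len(y)      # gradient w.r.t. weights
    gb = err.mean()              # gradient w.r.t. bias
    gx = np.outer(err, w)        # per-sample gradient w.r.t. inputs
    return loss, gw, gb, gx

eps = 0.3          # attacker's perturbation budget (L-infinity)
lr = 0.5
w, b = np.zeros(2), 0.0

for step in range(300):
    # Inner maximization: the whitebox attacker, who knows (w, b),
    # perturbs each input to increase the defender's loss.
    _, _, _, gx = loss_and_grads(w, b, X, y)
    X_adv = X + eps * np.sign(gx)
    # Outer minimization: the defender retrains on the adversarial examples.
    _, gw, gb, _ = loss_and_grads(w, b, X_adv, y)
    w -= lr * gw
    b -= lr * gb

# The minimax-trained classifier is judged by its worst-case loss.
clean_loss, _, _, gx = loss_and_grads(w, b, X, y)
adv_loss, _, _, _ = loss_and_grads(w, b, X + eps * np.sign(gx), y)
print(clean_loss, adv_loss)
```

After the alternating updates converge, the adversarial loss stays bounded even though the attacker moves every point by the full budget — the defender has optimized against its own worst case rather than against the clean data.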
