Adversarial Logit Pairing

03/16/2018
by Harini Kannan, et al.

In this paper, we develop improved techniques for defending against adversarial examples at scale. First, we implement the state of the art version of adversarial training at unprecedented scale on ImageNet and investigate whether it remains effective in this setting - an important open scientific question (Athalye et al., 2018). Next, we introduce enhanced defenses using a technique we call logit pairing, a method that encourages logits for pairs of examples to be similar. When applied to clean examples and their adversarial counterparts, logit pairing improves accuracy on adversarial examples over vanilla adversarial training; we also find that logit pairing on clean examples only is competitive with adversarial training in terms of accuracy on two datasets. Finally, we show that adversarial logit pairing achieves the state of the art defense on ImageNet against PGD white box attacks, with an accuracy improvement from 1.5% to 27.9%. Adversarial logit pairing also successfully damages the current state of the art defense against black box attacks on ImageNet (Tramer et al., 2018), dropping its accuracy from 66.6% to 47.1%. With this new accuracy drop, adversarial logit pairing ties with Tramer et al. (2018) for the state of the art on black box attacks on ImageNet.
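The pairing idea described above amounts to a small additive term in the training loss. The following is a minimal PyTorch sketch of the adversarial variant, assuming a PGD attacker to produce each clean example's adversarial counterpart; the function names (pgd_attack, alp_loss), the pairing weight, and the PGD hyperparameters are illustrative assumptions, not the authors' released settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=16/255, step_size=2/255, steps=10):
    """Generate PGD adversarial counterparts of a clean batch.
    All hyperparameters here are illustrative, not the paper's exact values."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + step_size * grad.sign()       # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)      # project to L-inf ball
        x_adv = x_adv.clamp(0.0, 1.0)                 # keep valid pixel range
    return x_adv.detach()

def alp_loss(model, x, y, pairing_weight=0.5):
    """Adversarial logit pairing objective (sketch): adversarial-training
    cross-entropy plus an L2 penalty that pulls the logits of each clean
    example and its adversarial counterpart together."""
    x_adv = pgd_attack(model, x, y)
    logits_clean = model(x)
    logits_adv = model(x_adv)
    # Adversarial-training term on a mixed clean/adversarial batch
    # (one of several reasonable formulations, assumed here).
    ce = F.cross_entropy(logits_clean, y) + F.cross_entropy(logits_adv, y)
    # Logit pairing term: mean squared distance between paired logits.
    pairing = F.mse_loss(logits_adv, logits_clean)
    return ce + pairing_weight * pairing
```

The clean-examples-only variant mentioned in the abstract applies the same squared-distance penalty to logits of pairs of clean training examples instead of clean/adversarial pairs, with no attack step required.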


