Adversarial Machine Learning at Scale

11/04/2016
by Alexey Kurakin, et al.

Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black-box attacks without knowledge of the target model's parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet. Our contributions include: (1) recommendations for how to successfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than single-step attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.
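The canonical single-step attack here is the fast gradient sign method (FGSM), which perturbs an input by eps in the direction of the sign of the loss gradient, and adversarial training amounts to mixing such examples into each training batch. The sketch below is a minimal PyTorch illustration of these two ideas, not the paper's implementation; the fgsm and adversarial_training_step helpers, the eps value, and the adv_frac mixing ratio are illustrative assumptions. Consistent with the abstract's diagnosis of label leaking, the attack is crafted against the model's own predicted labels rather than the ground-truth labels.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Single-step fast gradient sign attack:
    x_adv = x + eps * sign(grad_x loss(model(x), y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    # Clamping assumes inputs are scaled to [0, 1].
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimizer, x, y,
                              eps=8 / 255, adv_frac=0.5):
    """One minibatch of adversarial training: replace a fraction of the
    batch with single-step adversarial examples, then take an ordinary
    gradient step on the mixed batch."""
    # To avoid label leaking, craft the attack against the model's own
    # predictions rather than the true labels.
    model.eval()
    with torch.no_grad():
        y_guess = model(x).argmax(dim=1)
    k = int(adv_frac * x.size(0))
    x_mixed = x.clone()
    x_mixed[:k] = fgsm(model, x[:k], y_guess[:k], eps)

    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_mixed), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Keeping part of each batch clean is a common design choice in this setting: it helps the model retain accuracy on unperturbed inputs while still learning robustness to the single-step attack.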


