Purifying Adversarial Perturbation with Adversarially Trained Auto-encoders

05/26/2019
by Hebi Li, et al.

Machine learning models are vulnerable to adversarial examples. Iterative adversarial training has shown promising results against strong white-box attacks. However, adversarial training is very expensive, and this expensive training must be repeated every time a new model needs to be protected. In this paper, we propose applying the iterative adversarial training scheme to an external auto-encoder, which, once trained, can be used to protect other models directly. We empirically show that our model outperforms other purification-based methods against white-box attacks, and that it transfers well to directly protect base models with different architectures.

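The abstract describes placing an adversarially trained auto-encoder ("purifier") in front of a fixed classifier, so that adversarial inputs are mapped back toward clean data before classification. Below is a minimal PyTorch sketch of that idea, assuming a frozen pretrained `classifier`, an illustrative `Purifier` auto-encoder, and standard PGD hyper-parameters; all names, architectures, and settings are placeholders chosen for illustration and are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Purifier(nn.Module):
    """Small convolutional auto-encoder that maps a (possibly adversarial)
    image back toward the clean data manifold. Architecture is illustrative."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """White-box PGD attack computed through the composed model
    (purifier followed by classifier)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv

def train_purifier(purifier, classifier, loader, epochs=10, device="cuda"):
    """Adversarially train only the purifier; the base classifier stays frozen."""
    classifier.eval()
    for p in classifier.parameters():
        p.requires_grad_(False)
    composed = nn.Sequential(purifier, classifier).to(device)
    opt = torch.optim.Adam(purifier.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(composed, x, y)          # attack through the purifier
            loss = F.cross_entropy(composed(x_adv), y)  # classify the purified input
            opt.zero_grad()
            loss.backward()  # gradients update the purifier only
            opt.step()
    return purifier
```

Because the classifier's parameters are frozen, only the auto-encoder is updated during adversarial training; in this sketch, that is what would allow the trained purifier to be reused in front of other base models without retraining them.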