Adversarial Defense by Latent Style Transformations

06/17/2020
by   Shuo Wang, et al.

Machine learning models have demonstrated vulnerability to adversarial attacks, more specifically the misclassification of adversarial examples. In this paper, we investigate an attack-agnostic defense against adversarial attacks on high-resolution images by detecting suspicious inputs. The intuition behind our approach is that the essential characteristics of a normal image are generally robust to non-essential style transformations, e.g., slightly changing the facial expression of a human portrait, whereas adversarial examples are generally sensitive to such transformations. To detect adversarial instances, we propose an inVertible Autoencoder based on the StyleGAN2 generator via Adversarial training (VASA) to invert images into disentangled latent codes that reveal hierarchical styles. We then build a set of edited copies with non-essential style transformations by performing latent shifting and reconstruction, based on the correspondences between latent codes and style transformations. The classification consistency of these edited copies is used to distinguish adversarial instances from normal ones.
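The detection loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `encode`, `decode`, and `classify` are hypothetical stand-ins for the VASA inverter, the StyleGAN2-based reconstruction, and the target classifier, and the Gaussian latent shift is a simplified proxy for the paper's style-specific edits.

```python
import numpy as np

def detect_adversarial(image, encode, decode, classify,
                       num_copies=8, shift_scale=0.5, threshold=0.75, seed=0):
    """Flag an input whose prediction is unstable under small latent style shifts.

    `encode`, `decode`, and `classify` are assumed interfaces standing in for
    the VASA inverter, generator-based reconstruction, and target classifier.
    """
    rng = np.random.default_rng(seed)
    w = encode(image)                     # disentangled latent code
    base_label = classify(decode(w))      # label of the plain reconstruction
    agree = 0
    for _ in range(num_copies):
        # Non-essential style edit, approximated as a small random latent shift
        w_shifted = w + shift_scale * rng.standard_normal(w.shape)
        if classify(decode(w_shifted)) == base_label:
            agree += 1
    consistency = agree / num_copies      # fraction of copies agreeing
    return consistency < threshold        # True => suspicious input
```

A normal input, whose prediction lies far from the decision boundary, keeps its label across the edited copies and passes the check; an adversarial input, which sits near the boundary the attack exploited, tends to flip labels under the shifts and is flagged.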

Related research

- 02/03/2020  Defending Adversarial Attacks via Semantic Feature Manipulation
  Machine learning models have demonstrated vulnerability to adversarial a...
- 02/11/2022  Adversarial Attacks and Defense Methods for Power Quality Recognition
  Vulnerability of various machine learning methods to adversarial example...
- 01/06/2020  Generating Semantic Adversarial Examples via Feature Manipulation
  The vulnerability of deep neural networks to adversarial attacks has bee...
- 01/27/2021  Detecting Adversarial Examples by Input Transformations, Defense Perturbations, and Voting
  Over the last few years, convolutional neural networks (CNNs) have prove...
- 11/20/2019  Fine-grained Synthesis of Unrestricted Adversarial Examples
  We propose a novel approach for generating unrestricted adversarial exam...
- 07/10/2019  Metamorphic Detection of Adversarial Examples in Deep Learning Models With Affine Transformations
  Adversarial attacks are small, carefully crafted perturbations, impercep...
- 02/01/2019  Natural and Adversarial Error Detection using Invariance to Image Transformations
  We propose an approach to distinguish between correct and incorrect imag...
