Generating Semantic Adversarial Examples via Feature Manipulation

01/06/2020
by Shuo Wang, et al.
The vulnerability of deep neural networks to adversarial attacks, most notably adversarial example attacks, has been widely demonstrated. Traditional attacks apply unstructured, pixel-wise perturbations to fool the classifier. An alternative is to perturb the latent space, but such perturbations are hard to control because the latent representation lacks interpretability and disentanglement. In this paper, we propose a more practical adversarial attack that designs structured perturbations with semantic meaning. Our technique manipulates the semantic attributes of images via disentangled latent codes. The intuition is that images from similar domains share common, theme-independent semantic attributes, e.g., the thickness of lines in handwritten digits, that can be bidirectionally mapped to disentangled latent codes. We generate adversarial perturbations by manipulating a single latent code or a combination of them, and we propose two unsupervised semantic manipulation approaches, vector-based and feature map-based disentangled representations, which differ in the complexity of the latent codes and the smoothness of the reconstructed images. We conduct extensive experimental evaluations on real-world image data to demonstrate the power of our attacks against black-box classifiers. We further demonstrate the existence of a universal, image-agnostic semantic adversarial example.
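
To make the attack pipeline concrete, below is a minimal sketch of the vector-based latent-code manipulation described in the abstract. It is an assumption-laden illustration, not the authors' released code: the encoder, decoder, black-box classifier, and the chosen semantic dimensions are hypothetical stand-ins for pretrained components (e.g., a beta-VAE encoder/decoder and a target network queried only through its outputs).

    # Hypothetical sketch (not the paper's implementation): walk along selected
    # disentangled latent dimensions until the black-box classifier's label flips.
    import torch

    def semantic_attack(x, encoder, decoder, classifier, dims, step=0.1, max_steps=50):
        """Search the latent dimensions in `dims` for a semantic adversarial example.

        x:    a single image as a (1, C, H, W) tensor.
        dims: indices of the disentangled latent codes to manipulate
              (e.g., a dimension controlling line thickness in digits).
        """
        with torch.no_grad():
            y_orig = classifier(x).argmax(dim=1).item()   # original black-box prediction
            z = encoder(x)                                # disentangled latent code, shape (1, d)
            for k in range(1, max_steps + 1):
                for sign in (1.0, -1.0):                  # try both directions along the attribute
                    z_adv = z.clone()
                    z_adv[:, dims] += sign * step * k     # structured, semantic perturbation
                    x_adv = decoder(z_adv)                # reconstruct the manipulated image
                    if classifier(x_adv).argmax(dim=1).item() != y_orig:
                        return x_adv                      # label flipped: semantic adversarial example
        return None                                       # no flip within the search budget

In the feature map-based variant described in the abstract, z would be a spatial feature map rather than a vector and selected channels would be perturbed instead, but the same flip-search loop would apply.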

Related research

02/03/2020  Defending Adversarial Attacks via Semantic Feature Manipulation
Machine learning models have demonstrated vulnerability to adversarial a...

06/19/2019  SemanticAdv: Generating Adversarial Examples via Attribute-conditional Image Editing
Deep neural networks (DNNs) have achieved great success in various appli...

04/17/2019  Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers
Deep neural networks have been shown to exhibit an intriguing vulnerabil...

06/17/2020  Adversarial Defense by Latent Style Transformations
Machine learning models have demonstrated vulnerability to adversarial a...

05/10/2021  Robust Training Using Natural Transformation
Previous robustness approaches for deep learning models such as data aug...

09/14/2023  Semantic Adversarial Attacks via Diffusion Models
Traditional adversarial attacks concentrate on manipulating clean exampl...

01/18/2021  Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation
With the wide use of deep neural networks (DNN), model interpretability ...