Adversarial Manipulation of Deep Representations

by Sara Sabour et al.

We show that the representation of an image in a deep neural network (DNN) can be manipulated to mimic that of a different natural image, with only minor, imperceptible perturbations to the original image. Previous methods for generating adversarial images focused on perturbations designed to produce erroneous class labels; we instead target the internal layers of the DNN representation. Our new class of adversarial images therefore differs qualitatively from others: while an adversary is perceptually similar to one image, its internal representation appears remarkably similar to that of a different image, one from a different class, bearing little if any apparent similarity to the input. Moreover, these adversarial images appear generic and consistent with the space of natural images. This phenomenon raises questions about DNN representations, as well as the properties of natural images themselves.
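The manipulation the abstract describes can be framed as an optimization: perturb a source image so that its feature representation moves toward that of a target image, while constraining the perturbation to stay small. The sketch below illustrates this idea on a toy scale. It is not the paper's method or network: the "feature extractor" is a random linear layer with a ReLU, and all names, sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a DNN feature extractor: one linear layer + ReLU.
# (The paper uses internal layers of a real trained network; this is
# only a minimal sketch of the optimization.)
W = rng.standard_normal((16, 8)) * 0.5

def features(x):
    return np.maximum(W @ x, 0.0)

def feature_attack(x_src, x_tgt, eps=0.1, lr=0.05, steps=200):
    """Perturb x_src so its features mimic those of x_tgt, keeping the
    perturbation inside an L-infinity ball of radius eps around x_src."""
    f_tgt = features(x_tgt)
    x = x_src.copy()
    for _ in range(steps):
        diff = features(x) - f_tgt
        # Gradient of 0.5 * ||features(x) - f_tgt||^2 w.r.t. x,
        # using the ReLU subgradient (active where W @ x > 0).
        grad = W.T @ (diff * (W @ x > 0))
        x = x - lr * grad
        # Project back onto the allowed perturbation ball.
        x = np.clip(x, x_src - eps, x_src + eps)
    return x

x_src = rng.standard_normal(8)   # "original" image (toy vector)
x_tgt = rng.standard_normal(8)   # "target" image whose features we mimic
x_adv = feature_attack(x_src, x_tgt)
```

After the loop, `x_adv` stays within `eps` of `x_src` in every coordinate, yet its feature vector is closer to the target's than the source's was. The projection step is what keeps the perturbation imperceptible in the full-scale setting.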


Semantic Adversarial Examples

Deep neural networks are known to be vulnerable to adversarial examples,...

Aesthetics of Neural Network Art

This paper proposes a way to understand neural network artworks as juxta...

Understanding invariance via feedforward inversion of discriminatively trained classifiers

A discriminatively trained neural net classifier achieves optimal perfor...

Why are images smooth?

It is a well observed phenomenon that natural images are smooth, in the ...

Disentangling visual and written concepts in CLIP

The CLIP network measures the similarity between natural text and images...

A New Defense Against Adversarial Images: Turning a Weakness into a Strength

Natural images are virtually surrounded by low-density misclassified reg...

Adversarial Defense Through Network Profiling Based Path Extraction

Recently, researchers have started decomposing deep neural network model...