1 Introduction
We nowadays see an increasing adoption of deep learning techniques in production systems, some of them even safety-relevant (as exemplified in [1]). Hence, it is not surprising that the discovery of adversarial spaces for neural networks [2] has sparked a lot of interest. A growing community has quickly focused on different ways to reach those spaces [3, 4, 5], understand their properties [1, 6, 7], and protect vulnerable models from their malicious nature [8, 9, 10].
Most prevalent methods exploiting adversarial spaces [3, 4, 5, 10] use gradients as their main starting point to search for perturbations surrounding a clean sample. Due to the intractability of the transformations modeled by neural networks and the limited amount of change that is allowed for perturbations to be considered adversarial, gradients from a classifier expose just enough information about which parts of the input correlate highly with the label the model associates with it. With this in mind, defenses against adversarial attacks have been devised upon those same gradients in two fundamental ways: 1) They complement traditional optimization schemes with a twofold objective that minimizes the overall prediction cost while maximizing the perturbation space around clean images that classifiers can withstand [10, 11, 12]. 2) Gradients are blocked or obfuscated in such a way that attacking algorithms can no longer use them to find effective adversarial perturbations [13, 14, 15]. Type 1 methods enjoy mathematical rigor and hence provide formal guarantees with respect to the kind of perturbations they are robust to. However, note that this can also be disadvantageous, since networks become attack-dependent: any other strategy for finding perturbations could circumvent such a defense mechanism [16]. While effective for small-scale problems such as MNIST and CIFAR, we found no empirical evidence that these methods scale to larger problems such as ImageNet. It has even been shown that defenses against adversarial attacks tested on small datasets do not scale well when applied to bigger problems [6].
Currently, large-scale state-of-the-art defenses rely on the second use of gradients: suppression [17] and blockage [14, 18]. As defined by Athalye et al. [16], gradient obfuscation is the result of instabilities from vanishing or exploding gradients, or of the use of stochastic or non-differentiable preprocessing steps. All these alternatives can be modeled as lossy identities where the original signal contained in the input is preserved while the adversarial perturbation is destroyed. The usefulness of this principle fits well with findings from a recent study showing that image classifiers only use a small fraction of the entire signal within the original input; therefore, a portion of its information can indeed be dropped without affecting performance [19].
In this paper, we propose an alternative defense that affects the information contained in gradients by reforming their class-related signal into a structural one. Intuitively, we learn an identity function that encodes structure and decodes only the structural parts of the input necessary for classification, dropping everything else. To this end, an autoencoder (AE) is trained to approximate the identity function that preserves only the part of the signal that is useful for a target classifier. The structural information is preserved by training both encoder and decoder unsupervised, and fine-tuning only the decoder with gradients coming from an existing classifier. Because the resulting function only looks at structure, its gradients are devoid of any class-related information, thereby invalidating the fundamental assumptions about gradients that attackers rely on. We call this defense a Structure-to-Signal Network (S2SNet).
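The role of the defense as a signal-preserving proxy can be illustrated with a toy numpy sketch. Here, a hypothetical linear classifier and an idealized defense modeled as a projection onto the signal subspace the classifier uses stand in for the real networks; the actual S2SNet is a deep AE, described in Section 3.1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "classifier" f(x) = argmax(W @ x). The only signal it can
# possibly use lies in the row space of W.
W = rng.normal(size=(10, 64))

def f(x):
    return int(np.argmax(W @ x))

# Idealized defense h: an identity on the signal used by f (orthogonal
# projection onto the row space of W) that drops everything else.
P = W.T @ np.linalg.pinv(W.T)

def h(x):
    return P @ x

x = rng.normal(size=64)

# Zero compromise on clean inputs: predictions through h agree with f.
assert f(h(x)) == f(x)

# h is a lossy identity: part of the input signal is discarded.
assert np.linalg.norm(h(x)) <= np.linalg.norm(x)

# A perturbation orthogonal to the used signal is removed entirely.
rho = x - P @ x
assert np.allclose(h(x + 5 * rho), h(x))
```

The projection removes any perturbation component the classifier cannot see, mirroring the intuition that the defense preserves only the signal the classifier actually uses.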
1.0.1 Formal Definitions
Let $f$ be an image classifier, and $\hat{x}$ the portion of the signal in the original input $x$ that is effectively being used by $f$. In other words, $f(\hat{x}) = f(x)$ subject to $I(\hat{x}) \leq I(x)$, where $I$ is a measure of information, e.g., the normalized mutual information [20]. Let $P_{x,p}$ be the space of all adversarial and non-adversarial perturbations of $x$ under a given norm $p$. We also define the subspace $A^f_{x,p} \subseteq P_{x,p}$ as the set of adversarial perturbations that are reachable from the input gradients $\nabla_x J(f(x), y)$, where $J$ represents the cost function and $y$ is the ground truth used for training. Note that said spaces depend on $x$ and $p$, but not on the class $y$ or the cost function $J$. Hence, unless noted otherwise, we use the simplified notation $A^f$ from now on. Adversarial perturbations are elements $\rho \in A^f$ such that $f(x + \rho) \neq f(x)$.¹ As attacks use gradients to compute adversarial perturbations, and such gradients can only correspond to parts of the signal that are used by the model, it follows that $A^f$ is determined by $\hat{x}$. An S2SNet can hence be defined as a function $h$ such that $f(h(x)) = f(x)$. This way, $h$ defends $f$ by serving as a proxy for incoming, potentially malicious inputs, as well as exposing gradients via the function composition $f \circ h$. The relation between $\hat{x}$ and $x$ concerning information implies that $h$ produces lossy reconstructions of the input space. We train $h$ in a way that its gradients lie in a different space than those leading attackers to $A^f$ (Figure 1). In other words, we train $h$ such that $|A^{f \circ h} \cap A^f|$ is minimized. This, in turn, will cause the intersection to be smaller, resulting in perturbations $\rho \in A^{f \circ h}$ that are non-adversarial for $f$ (i.e., $f(x + \rho) = f(x)$). Further details of the architecture and training of $h$ and $f \circ h$ are discussed in Section 3.1.
¹ Strictly speaking, adversarial perturbations can be reached through other domains that do not depend on gradients (as shown by Nguyen et al. [21]), but so far, all instances of strong adversarial attack methods base their entire strategy on the information of gradients.
The architecture of this defense offers important advantages when dealing with adversarial attacks:

Zero compromise for safe use-cases: there is no drop in performance when the defense is deployed but the network is not under attack. As the transformation done by the S2SNet preserves all of the signal required by the classifier, using either the classifier alone or the defended composition has no impact on predictions when clean images are used.

Removing the S2SNet is a defense strategy: a novel graybox defense (i.e., when the attacker knows the classifier but not the defense mechanism) works by exposing the gradients of an S2SNet to the attacker and then removing it, so that adversarial attacks (based on gradients from the S2SNet) are applied to the original classifier instead.

Attack agnostic: S2SNets rely on the same information used by adversarial attacks but not on the attacks themselves. Therefore, S2SNets do not require any assumptions with respect to the specific way an attack works.

Posthoc implementation: our defense uses gradients from a trained network, and can be used to defend models that are already in production. Likewise, no special considerations need to be made when training a new classifier from scratch.

Compatibility with other defenses: due to the compositional nature of this approach, any additional defense strategies that work with the original classifier can be implemented on top of the ensemble of S2SNet and classifier.
We test S2SNets on two high-performing image classifiers (ResNet 50 [22] and Inception v3 [23]) against three attack methods (Fast Gradient Sign Method (FGSM) [3], Basic Iterative Method (BIM) [24], and Carlini-Wagner (CW) [5]) on the large-scale ImageNet [25] dataset. Experiments are conducted on both classifiers under whitebox and graybox conditions. An evaluation of the effectiveness of S2SNets with respect to regular AEs (e.g., as proposed in [26]) is presented in Section 3.1, empirically showing that S2SNets are a better approximation of the signal used by a classifier. Furthermore, an evaluation of the gradient spaces of the defended and undefended classifiers is conducted, showing that their intersection does indeed approach the empty set.
The main contributions of this paper are threefold: First, we propose a novel way to interpret adversarial perturbations, namely in terms of the effective part of the input signal that classifiers use. Second, we introduce a robust and flexible defense against large-scale adversarial attacks based on S2SNets. And third, we provide a comprehensive baseline evaluation of adversarial attacks for several state-of-the-art models on a large dataset.
2 Related Work
The fast-growing interest in the phenomenon of adversarial attacks has gained momentum since its discovery [2] and has had three main areas of focus. The first area seeks new and more effective ways of reaching adversarial spaces. In [3], a first comprehensive analysis of the extent of adversarial spaces was presented, proposing a fast method to compute perturbations based on the sign of gradients. An iterative version of this method was later introduced [24] and shown to work significantly better, even when applied to images that were physically printed and digitized again. A prominent exception to gradient-based attacks succeeded using evolutionary algorithms [21]. Nevertheless, this has not become a widespread practical method, mostly due to its computational cost. Papernot et al. [1] showed how effective adversarial attacks can be, even with very few assumptions about the attacked model. Finally, there are methods that go beyond a greedy iteration over the gradient space and perform different kinds of optimization that maximize misclassification while minimizing the norm of the perturbation [4, 5].
The second area focuses on understanding the properties of adversarial perturbations. The work of Goodfellow et al. [3] already pointed at the linear nature of neural networks as the main enabler of adversarial attacks. This stood in opposition to what was initially theorized, namely that non-linearities were the main vulnerability. Not only was it possible to perturb natural-looking images to be classified as something completely different, but it was also possible to get models issuing high-confidence predictions for noise images or highly artificial patterns [2, 21]. The transferability of adversarial perturbations was shown by crafting attacks on one network and using them to fool a second classifier [27]. However, transferable attacks are limited to the simpler methods, as iterative ones tend to exploit particularities of each model and hence lose power when used on different architectures [6]. As it turns out, not only is adversarial noise transferable between models, but it is also possible to apply a single universal adversarial perturbation to all samples in a dataset and achieve high misclassification rates [7]. Such perturbations can even be applied to physical objects to bias a model towards a specific class [28].
The third and arguably the most popular area of research has focused on how networks can be protected against such attacks. Strategies include changing the optimization objective to account for possible adversarial spaces [10, 11], detection [9], dataset augmentation that includes adversarial examples [3, 12], suppressing perturbations [8, 15, 17, 26, 29], or obfuscating the gradients to prevent attackers from estimating an effective perturbation [13, 14, 18, 30, 31].
In this work, we build on the idea of using AEs as a compressed representation of the input [15, 26], but tailored towards a specific characteristic of adversarial perturbations [17], using the notion of useful input signal (i.e., the signal effectively used by a classifier) [19]. Furthermore, we explore the nature of adversarial perturbations and their relationship with network capacity in terms of used signal.
3 Methods
This section explains in detail the architecture of an S2SNet and its particular signal-preserving training scheme, followed by an empirical evaluation of the gradients it provides. We start by testing the robustness of S2SNets in a whitebox setting, and compare it to a simple baseline using regular AEs. To recreate realistic attack conditions, we further test the ensemble network against a simulated reparametrization technique similar to [16], aimed at circumventing the defense, and explore further strategies to cope with this attack. Next, we examine the performance of S2SNets in a graybox scenario. Finally, we evaluate the transferability of single-step attacks between different models, and relate their overlap in terms of used input signal to the effectiveness of adversarial perturbations.
3.1 StructuretoSignal Networks
S2SNets start out as plain AEs that are trained on the large-scale YFCC100m dataset [32]. Only a single pass is required, as its millions of images are more than sufficient to train the underlying SegNet architecture [33] to convergence.² This pretrained SegNet AE is able to reproduce the input signals required by a diverse set of classifiers such that their top-1 accuracy stays within a few percentage points of the original classification performance [19].
² We choose a large architecture as an upper bound to the ideal identity function, because it was proven capable of encoding the semantics of the deep image classifiers tested in this work.
To model the effective input signal used by a trained classifier, Palacio et al. propose to further fine-tune the decoder of the AE using gradients from the classifier itself. This allows the AE to learn a way to decode the input that retains the signal required by the classifier. This fine-tuned variant, the actual S2SNet, reconstructs images such that the original top-1 accuracy of the classifier is preserved, while the amount of information in the reconstructed image (measured as normalized mutual information) decreases with respect to the original sample.
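The second stage, in which only the decoder receives gradients from the classifier, can be sketched with a toy numpy example. The linear encoder, decoder, and classifier, as well as the dimensions and learning rate, are illustrative assumptions, not the SegNet/ResNet setup of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins: frozen encoder E, frozen classifier W, and a
# trainable decoder D. Only D is updated, mirroring the two-stage scheme.
E = 0.1 * rng.normal(size=(16, 64))   # encoder (stage 1, frozen here)
W = 0.1 * rng.normal(size=(10, 64))   # classifier under defense (frozen)
D = 0.1 * rng.normal(size=(64, 16))   # decoder (fine-tuned in stage 2)

x = rng.normal(size=64)
y = 3                                  # assumed ground-truth class

def ce_loss(D):
    z = W @ (D @ (E @ x))              # logits of the composed network
    z = z - z.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[y]), p

loss_before, _ = ce_loss(D)
c = E @ x                              # fixed code produced by the encoder
for _ in range(300):                   # plain gradient descent on D only
    _, p = ce_loss(D)
    grad_logits = p.copy()
    grad_logits[y] -= 1.0              # d(cross-entropy)/d(logits)
    D -= 0.05 * np.outer(W.T @ grad_logits, c)   # chain rule down to D
loss_after, _ = ce_loss(D)

# The decoder adapts to the classifier while the encoder stays class-agnostic.
assert loss_after < loss_before
```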
Note that, since the encoder of an S2SNet was trained unsupervised, any intermediate representations produced by the encoder are entirely class-agnostic. This means that, during backpropagation through an S2SNet, the gradients that can be read at the shallowest layer correspond only to information about structure. Intuitively, gradients from the S2SNet point to parts of the image that can be changed to influence the reconstruction error. In the following, we measure the extent to which gradients shift when images are forwarded through these networks. Furthermore, we explore and quantify their emerging resilience to adversarial attacks in Section 4.
3.2 Properties of Gradient Distributions
To verify that the Structure-to-Signal training scheme produces large shifts in the distribution of gradients, we forward images through a classifier, a pretrained SegNet, and the fine-tuned counterpart, and compare their gradients. We use the magnitude of the gradients instead of raw values to stress the differences of their spatial distribution. A large change in the position where gradients originally occur within the image is a good indicator that the information conveyed by gradients has changed.
Concretely, we quantify the structural similarity [34] (SSIM; a locally normalized mean square error measured in a sliding window) of the gradients obtained from the same image when passed through a ResNet 50 alone, an S2SNet fine-tuned to defend the same ResNet 50, a plain SegNet AE coupled with ResNet 50, the fine-tuned S2SNet without the classifier, and the plain SegNet AE alone. While the first three models require gradients to be computed with respect to a class label, the last two are produced by measuring the reconstruction error. Also, note that for the fine-tuned S2SNet the true reconstruction target cannot be directly obtained, as it is implicitly defined by the classifier it was fine-tuned on. In this case, reconstruction gradients are computed by comparing its output to the original input.
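The comparison measure can be sketched in numpy. For brevity, this toy uses a single global SSIM window instead of the sliding window used in the paper, and random arrays as hypothetical stand-ins for gradient-magnitude maps:

```python
import numpy as np

def ssim(a, b, data_range=1.0):
    """Global SSIM (the paper applies it in a sliding window; a single
    global window is used here for brevity)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

rng = np.random.default_rng(2)

# Stand-ins for gradient-magnitude maps: identical maps score 1, while
# spatially unrelated maps score near 0, which is how a large shift in
# where gradients occur shows up in Table 1.
g_cls = rng.random((64, 64))   # e.g., |gradients| from a classifier
g_ae = rng.random((64, 64))    # e.g., |gradients| from an AE

assert abs(ssim(g_cls, g_cls) - 1.0) < 1e-9
assert ssim(g_cls, g_ae) < 0.5
```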
Table 1 reports the mean SSIM of the gradients over ImageNet’s validation set for all combinations of network pairs. Note that the dissimilarity between the fine-tuned S2SNet and the plain SegNet AE indicates that the Structure-to-Signal training scheme has indeed changed the reconstruction process, i.e., the identity function being computed based on the input. Similarly, comparing the SSIM values of the two ensembles (classifier plus fine-tuned S2SNet versus classifier plus plain AE) reveals a difference in the information contained in their gradients. Most importantly, the similarity between the gradients coming from the classifier alone and either of the two ensembles is considerably smaller than for any combination involving AEs exclusively (Table 1). This disparity indicates how much the position of gradients changes (and hence the information contained within) when they are passed through an S2SNet.
Further evidence for the class-agnostic nature of gradients propagated through AEs can be found by comparing the SSIM of gradient magnitudes when different target labels are used to compute the gradients. Let $x$ be some input image and $y$ its true label. We then randomly select a different label $\bar{y} \neq y$ and compare the gradient magnitudes obtained with $y$ and $\bar{y}$ for all $x$ in ImageNet’s validation set. The measured mean SSIM values show that the influence of the label is smaller when gradients are propagated through the AE, but also emphasize how dissimilar the gradients of just ResNet 50 are compared to any AE (Table 1).
Figure 2 visualizes this phenomenon. Here, the gradient magnitudes observed for the AEs alone (plain and fine-tuned) predominantly highlight edges as a source of error. This is expected, since it is more difficult to accurately reproduce the high frequencies required by sharp edges than the lower frequencies of blobs. Magnitudes extracted through the AE-classifier ensembles based on the classification error show similar structures. Some coincidental overlap between ResNet 50 and the AE variants is unavoidable, since edges are also important for classification [35]. However, the classifier on its own differs considerably from all other patterns. Overall, SSIM is at least twice as high between AE variants as between the classifier and any of the AE configurations.
4 Experiments
This section presents experiments quantifying the robustness of S2SNets when used as a defense against adversarial attacks (Figure 3). The experimental setup closely follows the conditions from Guo et al. [14] in order to facilitate comparability:

Dataset: We use ImageNet to test S2SNets in a challenging, largescale scenario. Classifiers are trained on its training set and evaluations are carried out on the full validation set (50000 images).

Image classifier under attack: we use ResNet 50 and Inception v3, both pretrained on ImageNet, as target models. The classifiers have been trained under clean conditions, i.e., no special considerations with respect to adversarial attacks were made during training.

Defense: we train an S2SNet for each of the classifiers under attack, following the scheme described in Section 3.1, yielding one dedicated defense for ResNet 50 and one for Inception v3.

Perturbation Magnitude: we use the normalized $L_2$ norm $\|x - x'\|_2 / \|x\|_2$ between a clean sample $x$ and its adversary $x'$, as defined by Guo et al. [14]. Epsilon values for each attack are listed below.

Defense Strength Metric: vulnerability to adversarial attacks is measured in terms of the number of newly misclassified samples. More precisely, for any given attack to classifier $f$, we calculate $|\{x \in T : f(x') \neq f(x)\}| \,/\, |T|$, where $T$ is the set of true positives and $x'$ is the adversarial example generated by the attack based on $x$.

Attack Methods: protected models are tested against a single-step method, an iterative variant, and an optimization-based alternative. Note that, to replicate realistic threat conditions, all resulting adversarial samples are cast to the discrete RGB range [0, 255].

Fast Gradient Sign Method (FGSM) [3]: a simple yet effective attack that also works when transferred to different models. Epsilon values used for this method are .

Basic Iterative Method (BIM) [24]: an iterative version of FGSM that is more effective as an attack but transfers less well. Epsilon values used for BIM are . The number of iterations is fixed at .

Carlini-Wagner (CW) [5]: an optimization-based method that has proven effective even against hardened models. Note that this attack issues perturbations that lie in a continuous domain. Epsilon values used for CW are ; the number of iterations is fixed at 100, and .


Attack Conditions: we test the proposed defenses under two conditions.

Whitebox setting: the attacker has knowledge about the classifier and the defense. This includes read access to the predictions of the classifier, intermediate activations, and backpropagated gradients. The attacker is forced to forward valid images through the defended network, i.e., through the composition of the corresponding S2SNet and classifier.

Graybox setting: the attacker has access to a classification model but is unaware of its defense strategy.
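The protocol above (an FGSM-style attack, the normalized $L_2$ dissimilarity, and the defense-strength metric) can be sketched end-to-end with a toy linear classifier. The model and data are hypothetical stand-ins; the real experiments use ResNet 50 and Inception v3 on ImageNet:

```python
import numpy as np

rng = np.random.default_rng(3)

W = rng.normal(size=(10, 32))          # hypothetical linear classifier
X = rng.normal(size=(200, 32))         # toy "validation" samples
Y = np.argmax(X @ W.T, axis=1)         # labels this classifier gets right

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgsm(x, y, eps):
    p = softmax(W @ x)
    p[y] -= 1.0                        # d(cross-entropy)/d(logits)
    grad_x = W.T @ p                   # input gradient of the loss
    return x + eps * np.sign(grad_x)

# By construction, every sample here is a true positive.
eps = 0.1
X_adv = np.stack([fgsm(x, y, eps) for x, y in zip(X, Y)])

# Defense-strength metric: fraction of newly misclassified true positives.
fooling_rate = np.mean(np.argmax(X_adv @ W.T, axis=1) != Y)
assert 0.0 <= fooling_rate <= 1.0

# Normalized L2 dissimilarity between clean and adversarial samples.
dissim = np.mean(np.linalg.norm(X_adv - X, axis=1)
                 / np.linalg.norm(X, axis=1))
assert dissim > 0.0
```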

4.1 WhiteBox
For this setting, input images flow first through the S2SNet before reaching the original classifier. Similarly, gradients are read from the shallowest layer of the S2SNet. For comparison, attacks are also run on the unprotected versions of ResNet 50 and Inception v3. Results are summarized in Figure 4.
Under these conditions, plain models are only able to moderately resist FGSM attacks, and fail completely against the more capable BIM and CW attacks. In contrast, S2SNets provide high levels of protection. CW, being an optimization attack and generally the most capable in our comparison, is expected to be more effective at breaking through the S2SNet defense. However, it does so by introducing large perturbations even for small values of $\epsilon$. In fact, none of the attacking configurations was able to entirely fool any of our defended classifiers, even for the highest levels of $\epsilon$ in our tests. Note that these whitebox results are, to our surprise, already comparable with some state-of-the-art graybox defenses [14, 18] (i.e., known classifier but unknown defense). Despite the less favorable conditions of a whitebox defense, S2SNets already match alternative state-of-the-art protections that were tested under the more permissive assumptions allowed in graybox settings.
For completeness, we also evaluate the defense using a pretrained SegNet coupled with ResNet 50 instead of an S2SNet (shown in Figure 4 as a light solid green line). This is most similar to the defense proposed by Meng et al. [26]. We can confirm that an AE alone does not suffice to guard the classifier against adversarial attacks, and its performance is consistently lower than that of S2SNets. For a visual analysis of the attacks and their reconstructions by S2SNets, we refer the reader to the supplementary material.
4.1.1 Bypassing S2SNets through Reparametrization
In order to push the limits of S2SNets, we now simulate a more hostile scenario where an attacker tries to actively circumvent the defense mechanism. Based on the work of Athalye et al. [16], a reparametrization of the input space can be implemented for defenses that operate via function composition. This works by defining the original input as a differentiable function $g$ of a hidden state $z$, such that $x = g(z)$ and the defense leaves $g(z)$ unchanged; attacks can then be computed with respect to $z$. Note that the main motivation behind reparametrization is to alleviate instabilities of gradients caused by defenses that rely on said instability. Although S2SNets are very deep architectures, the resilience of this defense lies in the directed change that has been induced in the information contained in gradients. In addition, there is no trivial way to come up with a suitable $g$ that does not end up inducing the same transformation in the gradients as done by S2SNets.
Despite all these concerns, we assume for this experiment that such a $g$ can be found, and that attacking it does indeed circumvent S2SNets altogether. We simulate the potential strength of this attack by using gradients of the original (unprotected) classifier and applying them to the input directly. The resulting adversarial sample is then passed through the hardened classifier. We refer to this hypothetical scenario as WhiteBox. Results are shown in Figure 5.
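This simulation can be sketched with a toy linear classifier and an idealized projection defense (both hypothetical stand-ins): the loss gradient of the unprotected model lies entirely in the signal subspace such a defense preserves, so a gradient-direction perturbation passes through the defense untouched.

```python
import numpy as np

rng = np.random.default_rng(4)

W = rng.normal(size=(10, 64))          # unprotected linear classifier
P = W.T @ np.linalg.pinv(W.T)          # idealized defense: keeps only
                                       # the signal subspace used by W

x = rng.normal(size=64)
y = int(np.argmax(W @ x))

# Attack computed on the UNPROTECTED model (the simulated bypass).
z = W @ x
p = np.exp(z - z.max()); p /= p.sum()
p[y] -= 1.0
grad_x = W.T @ p                       # lies in the row space of W

# The raw gradient is preserved exactly by the projection defense.
assert np.allclose(P @ grad_x, grad_x)

# Hence a gradient-direction perturbation survives the defense, which is
# why hardened models revert to unprotected behavior in this scenario.
rho = 0.1 * grad_x / np.linalg.norm(grad_x)
assert np.allclose(P @ (x + rho) - P @ x, rho)
```

Note that FGSM's sign step generally leaves this subspace; the sketch uses the raw gradient direction to make the preservation exact.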
Under these conditions, we observe that hardened models revert back to the behavior shown by their corresponding unprotected versions for FGSM and BIM. In general, this outcome is expected, and confirms once more that S2SNets are trained to preserve the information that is useful to the classifier. Gradients collected directly from a vulnerable model have, by definition, only information that is useful for classification; hence, perturbations based on those gradients will be preserved by S2SNets. Interestingly, while CW was most successful in the previous whitebox experiment, its highly optimized perturbations are less effective here. We believe that CW “overfits” more strongly on the adversarial signal of the original model than what S2SNets find useful to preserve.
At this point, we can further exploit the benefits that S2SNets enjoy as a defense by adding another layer of protection and showing how it affects the effectiveness of an attack. As shown by Palacio et al. [19], AEs following a Structure-to-Signal training scheme exhibit strong resilience to random noise, as opposed to traditionally trained AEs. We demonstrate that it is straightforward to add a layer of protection based on random noise. This stochastic strategy is applied after the adversarial image has been computed but before it passes through the S2SNet defense. We experiment with three sources of noise:

Gaussian Noise: $x' = x + \eta$, where $\eta \sim \mathcal{N}(0, \sigma^2)$.

Uniform Noise: $x' = x + \eta$, where $\eta \sim \mathcal{U}(-\sigma, \sigma)$.

Sign Noise: $x' = x + \sigma s$, where $s$ is drawn uniformly from $\{-1, +1\}$.
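A minimal sketch of the three noise sources follows; the parameterization by a single strength value sigma is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

def gaussian_noise(x, sigma):
    return x + rng.normal(0.0, sigma, size=x.shape)

def uniform_noise(x, sigma):
    return x + rng.uniform(-sigma, sigma, size=x.shape)

def sign_noise(x, sigma):
    # Random +/- sigma at every position.
    return x + sigma * rng.choice([-1.0, 1.0], size=x.shape)

x = rng.random((8, 8))
sigma = 0.1

for noisy in (gaussian_noise(x, sigma), uniform_noise(x, sigma),
              sign_noise(x, sigma)):
    assert noisy.shape == x.shape

# Sign noise shifts every pixel by exactly +/- sigma.
assert np.allclose(np.abs(sign_noise(x, sigma) - x), sigma)
```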
Figure 6 shows the results of a BIM attack on ResNet 50 with varying noise levels under the WhiteBox conditions of Section 4.1.1. Overall, as the noise strength increases, the resilience to adversarial attacks improves. The initial degradation in the absence of an adversarial attack depends on the amount of noise, which can be tuned as a trade-off between maximum accuracy and adversarial robustness. We observe that Gaussian and sign noise perform almost identically, while uniform noise offers the least effective protection.
4.2 GrayBox
In contrast to whitebox scenarios, graybox attacks assume limited access to information about the attack target. More specifically, the conditions for a graybox attack specify that the attacker has knowledge about the network but not about the defense strategy.
Given the compositional nature of S2SNets, a novel way to defend a classification network under graybox conditions consists of giving the attacker access to the gradients of the S2SNet. Once an attack is ready, the S2SNet is removed and the perturbed image is processed by the classifier alone. In other words, for a classifier $f$ and corresponding defense $h$, the attack is crafted based on the combined network $f \circ h$ but forwarded to just the classifier $f$. We refer to this type of defense as GrayBox.
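The removal defense can be sketched procedurally with a toy linear classifier and a fixed linear smoothing operator as a hypothetical "structural" stand-in for the S2SNet. The attack is crafted through the composition, but the perturbed image is evaluated on the classifier alone:

```python
import numpy as np

rng = np.random.default_rng(6)

W = rng.normal(size=(10, 64))              # classifier
# Stand-in "structural" defense: a fixed circular smoothing operator.
B = np.eye(64)
for i in range(64):
    B[i, (i - 1) % 64] = B[i, (i + 1) % 64] = 0.5
B /= B.sum(axis=1, keepdims=True)

x = rng.normal(size=64)
y = int(np.argmax(W @ x))

# The attacker only sees the composition: logits are W @ B @ x, so the
# input gradient of the loss is routed through B's transpose.
z = W @ (B @ x)
p = np.exp(z - z.max()); p /= p.sum()
p[y] -= 1.0
rho = 0.1 * np.sign(B.T @ (W.T @ p))       # FGSM step against the composition

# Defense policy: remove the smoothing step, evaluate on the classifier.
pred_clean = int(np.argmax(W @ x))
pred_attacked = int(np.argmax(W @ (x + rho)))  # what Figure 7 measures

assert np.max(np.abs(rho)) <= 0.1 + 1e-12  # perturbation budget respected
assert pred_clean == y
```

Whether `pred_attacked` still equals the clean prediction is exactly the robustness that the GrayBox experiments quantify.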
We run the attacks with the same experimental settings described in Section 4.1, but enforce a defense policy following the GrayBox conditions. Results are presented in Figure 7. Overall, we see that this scenario is consistently robust to all tested adversarial attacks. Perturbations that do fool the classifier are clearly visible and can no longer be considered adversarial. With BIM-based attacks, classifiers gain back roughly half as much accuracy as in the whitebox case. Again, CW is more comparable to FGSM than to BIM, i.e., mostly ineffective, consistent with the observations made in the WhiteBox case.
By combining the results of GrayBox and WhiteBox, we can reason about the relationship between the space of adversarial perturbations reachable from the classifier’s gradients, $A^f$ (Section 1), and its counterpart for the defended ensemble, $A^{f \circ h}$. First, the WhiteBox experiment tells us that attacks crafted for $f$ are valid attacks for $f \circ h$. Following from the definition of these spaces, it holds that $A^f \subseteq A^{f \circ h}$. Furthermore, looking at the GrayBox experiments, taking elements from the gradients of $f \circ h$ to create an attack produces elements in $A^{f \circ h}$. As these do not attack $f$, we conclude that those same elements do not lie in $A^f$. We finally conclude that, with the tested attacks, $A^f$ is not reachable from the gradients of $f \circ h$.
4.3 Classifier Relationships in Adversarial Space
Given the high robustness of S2SNets in the GrayBox scenario, we do not expect further insights from traditional blackbox attacks. Instead, we explore the relationships between the signals that different image classifiers use, as reported in [19]. Intuitively, if the signal used by a classifier $f_1$ also encompasses the signal that another classifier $f_2$ uses, then an adversarial attack crafted for $f_1$ should also fool $f_2$. Note that this relation is directional and may not hold in the opposite direction.
To test this, we construct adversarial samples using FGSM for four reference classifiers: AlexNet [36], VGG 16 [37], ResNet 50, and Inception v3. For each classifier, we run blackbox attacks with perturbations computed on the other three models. The resulting accuracies (in terms of relative drop) are shown in Figure 8.
As expected, adversarial examples from more compatible architectures result in higher fooling ratios (lower accuracy). Overall, adversarial samples created for AlexNet are the least compatible among the four, mainly due to its comparably lower accuracy and the limited amount of input signal it uses. On the other hand, ResNet 50 produces the most compatible perturbations, being the model whose attacks were most effective when applied to other architectures.
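The cross-model protocol can be sketched with toy linear classifiers whose used signal overlaps via a shared feature block (entirely hypothetical stand-ins for the four CNNs):

```python
import numpy as np

rng = np.random.default_rng(7)
shared = rng.normal(size=(6, 32))          # signal shared by all models

# Three toy classifiers mixing shared and private signal.
models = [np.vstack([shared, rng.normal(size=(4, 32))]) for _ in range(3)]

X = rng.normal(size=(100, 32))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgsm(W, x, y, eps=0.2):
    p = softmax(W @ x)
    p[y] -= 1.0
    return x + eps * np.sign(W.T @ p)

# fooling[i, j]: fraction of source-model-i attacks that fool model j,
# measured on model j's own true positives.
fooling = np.zeros((3, 3))
for i, Wi in enumerate(models):
    for j, Wj in enumerate(models):
        yj = np.argmax(X @ Wj.T, axis=1)   # true positives of model j
        yi = np.argmax(X @ Wi.T, axis=1)   # labels used to craft attacks
        Xa = np.stack([fgsm(Wi, x, y) for x, y in zip(X, yi)])
        fooling[i, j] = np.mean(np.argmax(Xa @ Wj.T, axis=1) != yj)

assert fooling.shape == (3, 3)
assert np.all((fooling >= 0.0) & (fooling <= 1.0))
```

Rows of this matrix correspond to the source model of the perturbation and columns to the attacked model, analogous to Figure 8.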
Furthermore, we can also analyze how easy a network is to fool, based on the following criteria:

Fastest drop in accuracy per classifier.

Mean accuracy over adversarial samples coming from other classifiers at the same $\epsilon$.
With either criterion, results clearly show that VGG 16 is the easiest to fool, coming in first place, followed by ResNet 50, AlexNet, and finally Inception v3. This relationship in terms of useful signal aligns with results in [19]. Furthermore, similar experiments in [7] based on universal perturbations indicate that a selection of similar architectures (CaffeNet, VGG 19, ResNet 101, and GoogLeNet) maintains the same relationship.
5 Conclusions & Future Work
We have proposed S2SNets as a new method to defend neural networks against adversarial examples. We model the defense strategy as a transformation of the domain used by attackers, namely the gradients coming from the attacked classifier. Instead of focusing on gradient obfuscation via non-differentiable methods or other instabilities, we purposely induce a transformation on the gradients that strips them of any semantic information. S2SNets work by masking the classifier via function composition. That way, the information of the inputs is preserved for classification, but gradients point to structural changes for reconstructing the original sample. Such a defense is made possible by a novel two-stage Structure-to-Signal training scheme for deep AEs. In the first stage, the AE is trained unsupervised in the traditional way. In the second, only the decoder is fine-tuned with gradients from the model that is to be defended.
We evaluate the proposed defense under whitebox and graybox settings using a large-scale dataset, against three different attack methods, on two high-performing deep image classifiers. A baseline comparison shows that the two-stage training scheme performs better than using regular AEs. Most interestingly, we show that resilience to adversarial noise under whitebox conditions is comparable to state-of-the-art performance under more favorable graybox settings. Furthermore, we show how the properties of S2SNets can be exploited to add further defense mechanisms that maintain robustness even under the harshest, albeit currently hypothetical, conditions where the protection of S2SNets is circumvented. A graybox scenario was also tested where the defense consists of the removal of the S2SNet, showing high levels of robustness against all attacks. Finally, a comparison of the resilience of four well-known deep CNNs is presented, providing further evidence that an order relation exists between these classifiers in terms of the amount of signal they use; this time, in terms of the effectiveness of adversarial noise.
S2SNets are only one way in which such a transformation of the gradient space can be realized. We would like to explore other ways in which this transformation can occur, and whether the intersection between the gradient spaces yielding successful adversarial perturbations can be made effectively empty. The signal-preserving nature of S2SNets makes this defense a potential mechanism to explore and understand the nature of attacks. Comparing the classification consistency of a clean sample before and after being passed through an S2SNet has potential implications for the detection of adversarial attacks by learning abnormal distribution fluctuations.
Acknowledgments
This work was supported by the BMBF project DeFuseNN (Grant 01IW17002) and the NVIDIA AI Lab (NVAIL) program. We thank all members of the Deep Learning Competence Center at the DFKI for their comments and support.
References
 [1] Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: Practical black-box attacks against deep learning systems using adversarial examples. arXiv preprint (2016)
 [2] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)
 [3] Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)

 [4] Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Number EPFL-CONF-218057 (2016)
 [5] Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: Security and Privacy (SP), 2017 IEEE Symposium on, IEEE (2017) 39–57
 [6] Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial machine learning at scale. In: International Conference on Learning Representations. (2017)
 [7] Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O., Frossard, P.: Universal adversarial perturbations. arXiv preprint (2017)
 [8] Papernot, N., McDaniel, P., Wu, X., Jha, S., Swami, A.: Distillation as a defense to adversarial perturbations against deep neural networks. In: Security and Privacy (SP), 2016 IEEE Symposium on, IEEE (2016) 582–597
 [9] Feinman, R., Curtin, R.R., Shintre, S., Gardner, A.B.: Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410 (2017)
 [10] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations. (2018)
 [11] Sinha, A., Namkoong, H., Duchi, J.: Certifiable distributional robustness with principled adversarial training. In: International Conference on Learning Representations. (2018)
 [12] Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: Ensemble adversarial training: Attacks and defenses. International Conference on Learning Representations (2018)
 [13] Buckman, J., Roy, A., Raffel, C., Goodfellow, I.: Thermometer encoding: One hot way to resist adversarial examples. In: International Conference on Learning Representations. (2018)
 [14] Guo, C., Rana, M., Cisse, M., van der Maaten, L.: Countering adversarial images using input transformations. International Conference on Learning Representations (2018) accepted as poster.
 [15] Song, Y., Kim, T., Nowozin, S., Ermon, S., Kushman, N.: PixelDefend: Leveraging generative models to understand and defend against adversarial examples. In: International Conference on Learning Representations. (2018)
 [16] Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420 (2018)
 [17] Liao, F., Liang, M., Dong, Y., Pang, T., Zhu, J., Hu, X.: Defense against adversarial attacks using high-level representation guided denoiser. arXiv preprint arXiv:1712.02976 (2017)
 [18] Xie, C., Wang, J., Zhang, Z., Ren, Z., Yuille, A.: Mitigating adversarial effects through randomization. In: International Conference on Learning Representations. (2018)
 [19] Palacio, S., Folz, J., Hees, J., Raue, F., Borth, D., Dengel, A.: What do deep networks like to see? In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (June 2018)
 [20] Strehl, A., Ghosh, J.: Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of machine learning research 3(Dec) (2002) 583–617

 [21] Nguyen, A., Dosovitskiy, A., Yosinski, J., Brox, T., Clune, J.: Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. In: Advances in Neural Information Processing Systems. (2016) 3387–3395
 [22] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. (2016) 770–778
 [23] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. CoRR abs/1512.00567 (2015)
 [24] Kurakin, A., Goodfellow, I.J., Bengio, S.: Adversarial examples in the physical world. CoRR abs/1607.02533 (2016)
 [25] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115(3) (2015) 211–252
 [26] Meng, D., Chen, H.: MagNet: a two-pronged defense against adversarial examples. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ACM (2017) 135–147
 [27] Liu, Y., Chen, X., Liu, C., Song, D.: Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770 (2016)
 [28] Brown, T.B., Mané, D., Roy, A., Abadi, M., Gilmer, J.: Adversarial patch. arXiv preprint arXiv:1712.09665 (2017)
 [29] Dhillon, G.S., Azizzadenesheli, K., Bernstein, J.D., Kossaifi, J., Khanna, A., Lipton, Z.C., Anandkumar, A.: Stochastic activation pruning for robust adversarial defense. In: International Conference on Learning Representations. (2018)
 [30] Wang, Q., Guo, W., Zhang, K., Ororbia II, A.G., Xing, X., Giles, C.L., Liu, X.: Learning adversary-resistant deep neural networks. arXiv preprint arXiv:1612.01401 (2016)
 [31] Samangouei, P., Kabkab, M., Chellappa, R.: Defense-GAN: Protecting classifiers against adversarial attacks using generative models. International Conference on Learning Representations (2018) accepted as poster.
 [32] Thomee, B., Shamma, D.A., Friedland, G., Elizalde, B., Ni, K., Poland, D., Borth, D., Li, L.: The new data and new challenges in multimedia research. CoRR abs/1503.01817 (2015)
 [33] Badrinarayanan, V., Handa, A., Cipolla, R.: SegNet: A deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling. arXiv preprint arXiv:1505.07293 (2015)
 [34] Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13(4) (April 2004) 600–612
 [35] Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: European conference on computer vision, Springer (2014) 818–833
 [36] Krizhevsky, A.: One weird trick for parallelizing convolutional neural networks. CoRR abs/1404.5997 (2014)
 [37] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)
Appendix 0.A On the Relationship between the Two Perturbation Spaces: Formal Proof
Experiments of Section 3.2 show that the domain of attacks is shifted such that their range no longer lies in the vulnerable space. We conduct a formal analysis of the relationship between the two perturbation spaces and show that one is a subset of the other. A few simplifications are required for the proof, which will be accounted for at the end of this section. First, we use the results of BIM with ε = 0.05 tested on ResNet-50 as a reference. The values for the baseline (no defense), WhiteBox, and GrayBox settings are 0.0, 0.0, and 0.672041 respectively. Computing the relative drop in performance (i.e., normalizing by 0.75004, the accuracy under no attack) yields 0.0, 0.0, and 0.8960. To simplify the handling of fuzzy sets, we assign the membership function for each perturbation set to be either 0 or 1, depending on whether the fooling ratio is below random chance or above 0.8 respectively. With this in mind, we can define four axioms that come from the domain of the adversarial attack and the aforementioned experiments with simplified membership functions. The proof is a simple proof by contradiction with case analysis on the first disjunction.
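The normalization and the crisp membership assignment above amount to the following arithmetic; the helper names and the ImageNet random-chance value of 1/1000 are our own assumptions.

```python
def normalized_accuracy(acc_under_attack, clean_acc=0.75004):
    """Normalize top-1 accuracy under attack by the accuracy under no attack."""
    return acc_under_attack / clean_acc

def crisp_membership(fooling_ratio, random_chance=1 / 1000, high=0.8):
    """Collapse the fuzzy set membership to {0, 1}: 0 below random chance,
    1 above 0.8. Intermediate values are left undefined, as in the text."""
    if fooling_ratio < random_chance:
        return 0
    if fooling_ratio > high:
        return 1
    raise ValueError("membership undefined between random chance and 0.8")
```

For the reference experiment, `normalized_accuracy(0.672041)` reproduces the 0.8960 figure quoted above.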
Proof
[The displayed derivation was not recovered. Its steps were justified, in order, as: the four axioms (definition of the domain of the attack, and the Baseline, WhiteBox, and GrayBox experiments); assumption of the negation of the claim and its Skolemization; universal instantiation of axioms 1, 3, and 4; an assumption on the first case of the disjunction, leading to a contradiction and its discharge (Q.E.A.); disjunctive syllogism forcing the remaining case, again leading to a contradiction and the discharge of the outer assumption (Q.E.A.); and finally distribution, material implication, and the definition of the subset relation, which conclude the claim.]
Going back to the simplified membership function, it follows that different reference experiments (network, attack, and ε) will naturally yield different results. This is especially true if we take the raw accuracy as the membership function instead of the simplified one. However, one can argue that the subset relationship is, in general terms, valid, since most experiments under the simplified membership function yield the same axioms. This is why we use an approximate subset relationship to denote this result.
A similar analysis can be done graphically, as shown in Figure 9 (middle). The first case is covered by the GrayBox experiments, which show that the vulnerable space can only be reached 10.4% of the time; as we now know, this actually corresponds to the other case due to the inclusion. The remaining cases are covered by the WhiteBox experiments. Here, using BIM on ResNet-50 with ε = 0.05 as reference yields a membership of 0.333; that leaves the final case with perturbations in the remaining 0.667.
Likewise, for cases where the domain is the composed model, all perturbations fall into a space which we know lies within the vulnerable one; hence, they all fall into the corresponding case, eliminating samples falling into the remaining ones.
Appendix 0.B Perturbed Images and their Reconstructions
The following figures (10–16) show attack attempts on ResNet-50 with correctly classified images from our random shuffle of the ImageNet validation set. Column 1 shows the clean image and its perturbed variants. Columns 2–7 show combinations of attack method (FGSM, BIM, CW) and gradient source. Below are the perturbations (row 2), reconstructions (row 3), and the perturbation remaining in the reconstructions (row 4). Perturbation images are subject to histogram equalization for increased visibility. These examples further illustrate the structural nature of attacks through S2SNets.
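Histogram equalization for making small perturbations visible can be sketched as follows; this is a minimal grayscale NumPy sketch, and the paper's exact visualization pipeline may differ.

```python
import numpy as np

def equalize(perturbation, bins=256):
    """Histogram-equalize a perturbation image so that tiny adversarial
    changes become visible. Output values lie in [0, 1]."""
    flat = perturbation.ravel()
    hist, edges = np.histogram(flat, bins=bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize CDF to [0, 1]
    # Map each pixel through the CDF of the bin it falls into.
    idx = np.clip(np.digitize(flat, edges[1:-1]), 0, bins - 1)
    return cdf[idx].reshape(perturbation.shape)
```

Because the mapping is the empirical CDF, near-zero perturbation values that would otherwise render as uniform black are spread over the full intensity range.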
Appendix 0.C Raw Experiment Data
This section contains the raw values for all experiments we conducted. A subset was used to create the plots in the paper, but we also provide results we ended up not using in our argumentation. In the following tables, we report the normalized dissimilarity (Equation 1) between a reference image and another image. We also provide the usual infinity norm divided by ε (Equation 2). Settings for attacks besides ε are defined in Section 4.
[Equations 1 and 2, the displayed formulas for the normalized dissimilarity and the ε-scaled infinity norm, were not recovered.]
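The infinity-norm measure can be sketched as follows, assuming the standard definition; the exact form of Equation 2 may differ.

```python
import numpy as np

def linf_ratio(x, x_adv, eps):
    """Largest absolute per-pixel change between a reference image and
    another image, divided by the attack budget eps (our reading of the
    described measure). A value of 1.0 means the budget is fully used."""
    return np.abs(x_adv - x).max() / eps
```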
Tables 2, 3, and 4 contain WhiteBox results for the respective attack methods; they cover the experiments described in Section 4.1. Tables 5, 6, and 7 contain results for attacks where the network that supplied the gradients (source network) differs from the network that is attacked (target network); they cover the WhiteBox and GrayBox experiments described in Sections 4.1 and 4.2. Finally, Table 8 contains results on BIM attacks mitigated by different types and strengths of noise.
[Tables 2–4 list, for each network, the top-1 accuracy (%) on clean inputs and under adversarial input; Tables 5–7 list the same per source/target network pair. The numeric entries were not recovered.]