Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations

04/07/2018
by   Alex Lamb, et al.

Deep networks have achieved impressive results across a variety of important tasks. However, a known weakness is their failure to perform well when evaluated on data that differ from the training distribution, even when these differences are very small, as with adversarial examples. We propose Fortified Networks, a simple transformation of existing networks that fortifies the hidden layers of a deep network by identifying when hidden states are off the data manifold and mapping them back to parts of the manifold where the network performs well. Our principal contribution is to show that fortifying these hidden states improves the robustness of deep networks, and our experiments (i) demonstrate improved robustness to standard adversarial attacks under both black-box and white-box threat models; (ii) suggest that our improvements are not primarily due to gradient masking; and (iii) show the advantage of performing this fortification in the hidden layers rather than in the input space.
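The fortification idea in the abstract can be sketched as a denoising autoencoder (DAE) attached to a hidden layer, whose reconstruction error is added to the task loss. The following is a minimal NumPy illustration, not the paper's implementation: the layer sizes, noise level, activation choices, and the weighting `lam` are assumptions made for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Ordinary hidden layer of the base network (random weights, illustration only).
W_h = rng.normal(scale=0.1, size=(8, 16))

def hidden_layer(x):
    return relu(x @ W_h)

# DAE with tied weights: corrupt the hidden state, encode, then decode.
# This is the component that maps off-manifold hidden states back toward
# regions where the network was trained.
W_dae = rng.normal(scale=0.1, size=(16, 6))

def dae(h, noise_std=0.1):
    h_noisy = h + rng.normal(scale=noise_std, size=h.shape)  # corrupt
    z = np.tanh(h_noisy @ W_dae)                             # encode
    return z @ W_dae.T                                       # decode (tied weights)

x = rng.normal(size=(4, 8))   # batch of 4 inputs
h = hidden_layer(x)           # hidden state to fortify
h_rec = dae(h)                # DAE reconstruction of the hidden state

# Reconstruction loss against the *clean* hidden state; in training this
# term would be added to the task loss with an assumed weight `lam`.
rec_loss = np.mean((h_rec - h) ** 2)
lam = 0.01
# total_loss = task_loss + lam * rec_loss   (task loss omitted in this sketch)
```

In this sketch the DAE sees a noise-corrupted copy of the hidden state and is penalized for failing to reconstruct the clean one, which is what pushes its output back toward the hidden-state manifold at test time.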


Related research

06/13/2018
Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer
Deep networks often perform well on the data manifold on which they are ...

05/26/2019
State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations
Machine learning promises methods that generalize well from finite label...

07/14/2020
Multitask Learning Strengthens Adversarial Robustness
Although deep networks achieve strong accuracy on a range of computer vi...

07/04/2020
Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors
Artificial neural networks can achieve impressive performances, and even...

07/05/2018
Explainable Learning: Implicit Generative Modelling during Training for Adversarial Robustness
We introduce Explainable Learning (ExL), an approach for training neural ...

10/10/2019
Coloring the Black Box: Visualizing neural network behavior with a self-introspective model
The following work presents how autoencoding all the possible hidden act...

03/04/2021
Hard-label Manifolds: Unexpected Advantages of Query Efficiency for Finding On-manifold Adversarial Examples
Designing deep networks robust to adversarial examples remains an open p...
