Random Projections for Improved Adversarial Robustness

by Ginevra Carbone et al.

We propose two training techniques for improving the robustness of neural networks to adversarial attacks, i.e. inputs maliciously manipulated to fool networks into incorrect predictions. Both methods are independent of the chosen attack and leverage random projections of the original inputs, exploiting both dimensionality reduction and characteristic geometrical properties of adversarial perturbations. The first technique, RP-Ensemble, trains an ensemble of networks on multiple projected versions of the original inputs. The second, RP-Regularizer, instead adds a regularization term to the training objective.
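The abstract does not give implementation details for the projections, but the core idea of producing several randomly projected views of the same inputs (as RP-Ensemble requires, one view per ensemble member) can be sketched as follows. This is a minimal illustration assuming Gaussian random projection matrices; the function name `random_projection` and the dimensions are hypothetical, not taken from the paper.

```python
import numpy as np

def random_projection(x, proj_dim, seed):
    """Project flattened inputs x of shape (n_samples, d) onto a random
    proj_dim-dimensional subspace via a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Gaussian entries scaled by 1/sqrt(proj_dim) approximately preserve
    # pairwise distances (Johnson-Lindenstrauss-style projection).
    R = rng.normal(0.0, 1.0 / np.sqrt(proj_dim), size=(d, proj_dim))
    return x @ R

# Several projected versions of the same batch, one per ensemble member;
# each member of an RP-Ensemble-like scheme would train on one view.
x = np.random.rand(32, 784)  # e.g. 32 flattened 28x28 images
views = [random_projection(x, 128, seed=s) for s in range(5)]
```

Each view lives in a lower-dimensional space (here 128 instead of 784), which is where the dimensionality-reduction effect mentioned in the abstract comes from.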


Related papers:

- Revisiting DeepFool: generalization and improvement
- Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks
- Understanding the Logit Distributions of Adversarially-Trained Deep Neural Networks
- Repairing Adversarial Texts through Perturbation
- Generating Realistic Unrestricted Adversarial Inputs using Dual-Objective GAN Training
- Adversarial Attacks against Neural Networks in Audio Domain: Exploiting Principal Components
- Addressing Mistake Severity in Neural Networks with Semantic Knowledge
