Robustness of Bayesian Neural Networks to Gradient-Based Attacks

by   Ginevra Carbone, et al.
University of Oxford

Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks. Experimental results on the MNIST and Fashion MNIST datasets with BNNs trained with Hamiltonian Monte Carlo and Variational Inference support this line of argument, showing that BNNs can display both high accuracy and robustness to gradient based adversarial attacks.


page 5

page 7

page 8


On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Vulnerability to adversarial attacks is one of the principal hurdles to ...

The Effect of Prior Lipschitz Continuity on the Adversarial Robustness of Bayesian Neural Networks

It is desirable, and often a necessity, for machine learning models to b...

Combinatorial Attacks on Binarized Neural Networks

Binarized Neural Networks (BNNs) have recently attracted significant int...

Deep-RBF Networks Revisited: Robust Classification with Rejection

One of the main drawbacks of deep neural networks, like many other class...

Sampled Nonlocal Gradients for Stronger Adversarial Attacks

The vulnerability of deep neural networks to small and even imperceptibl...

Improved Gradient based Adversarial Attacks for Quantized Networks

Neural network quantization has become increasingly popular due to effic...

Gradient Similarity: An Explainable Approach to Detect Adversarial Attacks against Deep Learning

Deep neural networks are susceptible to small-but-specific adversarial p...

Please sign up or login with your details

Forgot password? Click here to reset