Adversarial Phenomenon in the Eyes of Bayesian Deep Learning

11/22/2017
by Ambrish Rawat, et al.

Deep learning models are vulnerable to adversarial examples, i.e., images obtained via deliberate, imperceptible perturbations that cause the model to misclassify them with high confidence. However, class confidence by itself gives an incomplete picture of uncertainty. We therefore use principled Bayesian methods to capture model uncertainty in predictions and study adversarial misclassification through that lens. We provide an extensive study of different Bayesian neural networks attacked in both white-box and black-box setups, comparing the networks' behaviour on random noise, adversarial attacks, and clean test data. We observe that Bayesian neural networks are uncertain in their predictions for adversarial perturbations, behaviour similar to that observed for random Gaussian perturbations. We therefore conclude that Bayesian neural networks can be considered for detecting adversarial examples.
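The detection idea in the abstract can be made concrete with Monte Carlo dropout, one widely used approximate Bayesian inference method (the paper studies several Bayesian neural network variants). Below is a minimal sketch, assuming a PyTorch classifier with dropout layers; the names mc_predict and predictive_entropy, and the flagging threshold, are illustrative assumptions, not taken from the paper.

    # Minimal sketch (not the authors' code) of MC-dropout uncertainty
    # estimation. Inputs with high predictive entropy can be flagged as
    # potentially adversarial.
    import torch
    import torch.nn.functional as F

    def mc_predict(model, x, n_samples=50):
        """Average softmax outputs over stochastic forward passes."""
        model.train()  # keep dropout active at test time
        with torch.no_grad():
            probs = torch.stack(
                [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
            )
        return probs.mean(dim=0)  # shape: (batch, num_classes)

    def predictive_entropy(mean_probs, eps=1e-12):
        """Entropy of the averaged prediction; high values signal uncertainty."""
        return -(mean_probs * (mean_probs + eps).log()).sum(dim=-1)

    # Usage: flag inputs whose uncertainty exceeds a threshold tuned on
    # clean validation data (the 0.5 value here is purely illustrative).
    # mean_probs = mc_predict(model, x_batch)
    # suspicious = predictive_entropy(mean_probs) > 0.5

In this setup the decision boundary between "clean" and "adversarial" is set empirically: one calibrates the entropy threshold on held-out clean data so that, per the paper's observation, adversarial and Gaussian-noise inputs land in the high-uncertainty region.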

Related research

The Limitations of Model Uncertainty in Adversarial Settings (12/06/2018)
Machine learning models are vulnerable to adversarial examples: minor pe...

Who's Afraid of Thomas Bayes? (07/30/2021)
In many cases, neural networks perform well on test data, but tend to ov...

Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings (09/03/2020)
Deep neural networks have been successful in diverse discriminative clas...

Intriguing Properties of Adversarial Examples (11/08/2017)
It is becoming increasingly clear that many machine learning classifiers...

Efficient and Transferable Adversarial Examples from Bayesian Neural Networks (11/10/2020)
Deep neural networks are vulnerable to evasion attacks, i.e., carefully ...

Interpreting Adversarial Examples with Attributes (04/17/2019)
Deep computer vision systems being vulnerable to imperceptible and caref...

Localized Uncertainty Attacks (06/17/2021)
The susceptibility of deep learning models to adversarial perturbations ...
