Defending Variational Autoencoders from Adversarial Attacks with MCMC

03/18/2022
by Anna Kuzina, et al.

Variational autoencoders (VAEs) are deep generative models used in various domains. VAEs can generate complex objects and provide meaningful latent representations, which can be further used in downstream tasks such as classification. As previous work has shown, one can easily fool VAEs into producing unexpected latent representations and reconstructions for a visually slightly modified input. Here, we examine several objective functions for constructing adversarial attacks, suggest metrics to assess model robustness, and propose a solution that alleviates the effect of an attack. Our method applies the Markov Chain Monte Carlo (MCMC) technique at the inference step and is motivated by our theoretical analysis. Thus, it incurs no additional costs during training and does not decrease performance on non-attacked inputs. We validate our approach on a variety of datasets (MNIST, Fashion MNIST, Color MNIST, CelebA) and VAE configurations (β-VAE, NVAE, TC-VAE) and show that it consistently improves model robustness to adversarial attacks.
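To make the idea concrete, below is a minimal, hypothetical sketch of MCMC-refined inference for a trained VAE. The abstract only specifies that MCMC is used at the inference step; the choice of unadjusted Langevin dynamics as the MCMC kernel, the unit-variance Gaussian likelihood, the standard Gaussian prior, and the function and parameter names are all assumptions made here for illustration, not the authors' exact method. The sketch starts the chain at the (possibly attacked) variational mean and refines it toward the true posterior p(z|x) ∝ p(x|z)p(z).

```python
import torch

def mcmc_refined_encode(encoder, decoder, x, n_steps=50, step_size=1e-3):
    """Hypothetical sketch: refine the encoder's latent code with a few
    steps of unadjusted Langevin dynamics targeting p(z|x), assuming a
    standard Gaussian prior and a unit-variance Gaussian likelihood.
    `encoder` is assumed to return the variational mean of q(z|x)."""
    # Initialize the chain at the (possibly attacked) variational mean.
    z = encoder(x).detach().requires_grad_(True)
    for _ in range(n_steps):
        x_rec = decoder(z)
        # log p(x|z) + log p(z), up to additive constants.
        log_joint = (-0.5 * ((x - x_rec) ** 2).sum()
                     - 0.5 * (z ** 2).sum())
        (grad,) = torch.autograd.grad(log_joint, z)
        with torch.no_grad():
            # Langevin update: gradient ascent on log p(z|x) plus noise.
            z = (z + 0.5 * step_size * grad
                 + step_size ** 0.5 * torch.randn_like(z))
        z.requires_grad_(True)
    return z.detach()
```

Because the refinement happens only at inference time (e.g., `x_defended = decoder(mcmc_refined_encode(encoder, decoder, x_attacked))`), training is untouched, which is consistent with the abstract's claim of no added training cost.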


Related research

03/10/2021
Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks
In this work, we explore adversarial attacks on the Variational Autoenco...

07/14/2020
Towards a Theoretical Understanding of the Robustness of Variational Autoencoders
We make inroads into understanding the robustness of Variational Autoenc...

06/12/2018
Adversarial Attacks on Variational Autoencoders
Adversarial attacks are malicious inputs that derail machine-learning mo...

02/07/2020
Assessing the Adversarial Robustness of Monte Carlo and Distillation Methods for Deep Bayesian Neural Network Classification
In this paper, we consider the problem of assessing the adversarial robu...

10/14/2021
The Neglected Sibling: Isotropic Gaussian Posterior for VAE
Deep generative models have been widely used in several areas of NLP, an...

12/01/2016
Adversarial Images for Variational Autoencoders
We investigate adversarial attacks for autoencoders. We propose a proced...
