Autoencoding beyond pixels using a learned similarity metric

We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder with a generative adversarial network, we can use learned feature representations in the GAN discriminator as the basis for the VAE reconstruction objective. Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance to, e.g., translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
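The core idea above can be sketched numerically: encode an input, decode a reconstruction, and measure the reconstruction error in the feature space of an intermediate discriminator layer rather than pixel space. The following is a minimal NumPy sketch under assumptions of my own; the linear "networks", their names, and all shapes are illustrative stand-ins, not the paper's actual architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear stand-ins for the three networks (illustrative, not from the paper):
D_in, D_z, D_feat = 16, 4, 8
W_enc = rng.normal(size=(D_in, 2 * D_z)) * 0.1   # encoder -> (mu, log_var)
W_dec = rng.normal(size=(D_z, D_in)) * 0.1       # decoder / generator
W_dis = rng.normal(size=(D_in, D_feat)) * 0.1    # discriminator feature layer

def encode(x):
    h = x @ W_enc
    return h[:, :D_z], h[:, D_z:]                # mu, log_var

def decode(z):
    return np.tanh(z @ W_dec)

def dis_features(x):
    # An intermediate discriminator layer, used as the learned
    # feature space for the reconstruction error.
    return np.maximum(0.0, x @ W_dis)

x = rng.normal(size=(5, D_in))
mu, log_var = encode(x)
z = mu + np.exp(0.5 * log_var) * rng.normal(size=mu.shape)  # reparameterization
x_tilde = decode(z)

# Prior term: KL(q(z|x) || N(0, I))
l_prior = 0.5 * np.mean(np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1))

# Feature-wise reconstruction error, replacing the element-wise pixel loss
l_feat = np.mean((dis_features(x) - dis_features(x_tilde)) ** 2)

# Latent arithmetic for an attribute (e.g. "wearing glasses") would, in the
# same spirit, decode z + (mean z of faces with the attribute - mean z without).
print(l_prior, l_feat)
```

In a real model these losses would be combined with a GAN term and backpropagated through trained networks; the sketch only shows where the feature-wise error sits relative to the standard VAE prior term.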




Implicit Discriminator in Variational Autoencoder

Recently generative models have focused on combining the advantages of v...

Earballs: Neural Transmodal Translation

As is expressed in the adage "a picture is worth a thousand words", when...

Improving Variational Autoencoder with Deep Feature Consistent and Generative Adversarial Training

We present a new method for improving the performances of variational au...

Multiscale Metamorphic VAE for 3D Brain MRI Synthesis

Generative modeling of 3D brain MRIs presents difficulties in achieving ...

Text Generation Based on Generative Adversarial Nets with Latent Variable

In this paper, we propose a model using generative adversarial net (GAN)...

Code Repositories


Variational Autoencoder using a similarity metric learned by a generative adversarial network



Keras / TensorFlow implementation of Larsen et al., "Autoencoding beyond pixels using a learned similarity metric"

