
Data Generation in Low Sample Size Setting Using Manifold Sampling and a Geometry-Aware VAE

by Clément Chadebec, et al.

While much effort has been devoted to improving Variational Autoencoders through richer posterior and prior distributions, little attention has been paid to amending the way the data is generated. In this paper, we develop two generation procedures that do not depend on the prior and are instead based on the geometry of the latent space, seen as a Riemannian manifold. The first consists in sampling along geodesic paths, which is a natural way to explore the latent space, while the second consists in sampling from the inverse of the metric volume element, which is easier to use in practice. Both methods are compared to prior-based methods on various data sets and appear well suited to a limited data regime. Finally, the latter method is used to perform data augmentation in a small sample size setting and is validated across various standard and real-life data sets. In particular, this scheme greatly improves classification results on the OASIS database, where balanced accuracy improves on a baseline of 80.7 when the classifier is trained only with the synthetic data generated by our method. Such results were also observed on 4 standard data sets.
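The second procedure, sampling from the inverse of the metric volume element, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the toy inverse metric (an RBF-weighted identity around hypothetical centroids plus a small regulariser `lam`), the bounding box that keeps the density normalisable, and the use of a plain random-walk Metropolis sampler. The target density is proportional to det(G(z))^{-1/2}, so samples concentrate where the metric volume element is small.

```python
import numpy as np

def inv_metric(z, centroids, temperature=0.5, lam=1e-3):
    """Toy inverse metric G^{-1}(z): RBF-weighted identity plus lam * I (assumed form)."""
    sq_dists = np.sum((centroids - z) ** 2, axis=1)
    weight = np.exp(-sq_dists / temperature**2).sum()
    return (weight + lam) * np.eye(z.shape[0])

def log_target(z, centroids, bound=3.0):
    """Log of the unnormalised density det(G(z))^{-1/2}, restricted to a box."""
    if np.any(np.abs(z) > bound):
        return -np.inf  # outside the assumed box: zero density
    _, logdet_inv = np.linalg.slogdet(inv_metric(z, centroids))
    return 0.5 * logdet_inv  # det(G)^{-1/2} equals det(G^{-1})^{1/2}

def sample_latents(centroids, n_samples=500, step=0.3, seed=0):
    """Random-walk Metropolis chain targeting det(G(z))^{-1/2}."""
    rng = np.random.default_rng(seed)
    z = centroids[0].copy()
    logp = log_target(z, centroids)
    samples = []
    for _ in range(n_samples):
        proposal = z + step * rng.normal(size=z.shape)
        logp_new = log_target(proposal, centroids)
        # Accept with probability min(1, p(proposal) / p(z)).
        if np.log(rng.uniform()) < logp_new - logp:
            z, logp = proposal, logp_new
        samples.append(z.copy())
    return np.array(samples)

# Hypothetical latent centroids standing in for encoded training points.
centroids = np.array([[-1.0, 0.0], [1.0, 0.0]])
samples = sample_latents(centroids)
```

In a full pipeline, the sampled latent points would then be passed through a trained decoder to obtain synthetic data; that step is omitted here because it depends on the model.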




Data Augmentation in High Dimensional Low Sample Size Setting Using a Geometry-Based Variational Autoencoder

In this paper, we propose a new method to perform data augmentation in a...

Geometry-Aware Hamiltonian Variational Auto-Encoder

Variational auto-encoders (VAEs) have proven to be a well suited tool fo...

A Geometric Perspective on Variational Autoencoders

This paper introduces a new interpretation of the Variational Autoencode...

A prior-based approximate latent Riemannian metric

Stochastic generative models enable us to capture the geometric structur...

Exemplar VAEs for Exemplar based Generation and Data Augmentation

This paper presents a framework for exemplar based generative modeling, ...

AriEL: volume coding for sentence generation

Mapping sequences of discrete data to a point in a continuous space make...