BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders

03/06/2020
by Kaspar Märtens, et al.

Variational Autoencoders (VAEs) provide a flexible and scalable framework for non-linear dimensionality reduction. However, in application domains such as genomics, where data sets are typically tabular and high-dimensional, a black-box approach to dimensionality reduction does not provide sufficient insight. Common data analysis workflows additionally use clustering techniques to identify groups of similar features. This usually leads to a two-stage process; it would be desirable instead to construct a joint modelling framework for simultaneous dimensionality reduction and clustering of features. In this paper, we propose to achieve this through the BasisVAE: a combination of the VAE and a probabilistic clustering prior, which lets us learn a one-hot basis function representation as part of the decoder network. Furthermore, for scenarios where not all features are aligned, we develop an extension to handle translation-invariant basis functions. We show how a collapsed variational inference scheme leads to scalable and efficient inference for BasisVAE, and we demonstrate it on various toy examples as well as on single-cell gene expression data.
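
To make the decoder idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional latent variable, a small two-layer network that outputs K shared basis functions, a fixed assignment matrix `phi` (one row per feature, one-hot or on the simplex), and an optional per-feature `shift` standing in for the translation-invariant extension. All names (`basis_functions`, `decode`, `phi`, `shift`) are hypothetical, and inference over the assignments (the collapsed variational scheme) is not shown.

```python
import numpy as np

def basis_functions(z, W1, b1, W2, b2):
    """Map latent z of shape (N, D) to K basis function values, shape (N, K)."""
    h = np.tanh(z @ W1 + b1)   # hidden layer, shape (N, H)
    return h @ W2 + b2         # one output column per basis function

def decode(z, params, phi, shift=None):
    """Decoder mean of shape (N, P) for P features.

    phi   : (P, K) assignment weights; a one-hot row ties a feature to one basis.
    shift : optional (P,) per-feature translation of the latent (assumed 1-D here).
    """
    N = z.shape[0]
    P, K = phi.shape
    if shift is None:
        B = basis_functions(z, *params)   # (N, K), shared by all features
        return B @ phi.T                  # each feature mixes (or picks) basis functions
    mean = np.empty((N, P))
    for j in range(P):
        # Evaluate the same basis functions at a feature-specific translated latent.
        B_j = basis_functions(z + shift[j], *params)
        mean[:, j] = B_j @ phi[j]
    return mean

# Toy usage with random (untrained) weights: 4 features, 3 basis functions.
rng = np.random.default_rng(0)
D, H, K = 1, 8, 3
params = (rng.normal(size=(D, H)), rng.normal(size=H),
          rng.normal(size=(H, K)), rng.normal(size=K))
z = rng.normal(size=(5, D))
phi = np.eye(K)[[0, 2, 2, 1]]   # features 1 and 2 share basis function 2
out = decode(z, params, phi)
```

With one-hot rows in `phi`, features assigned to the same basis function share an identical decoder mean, which is what makes the assignment interpretable as a feature-level clustering. In this sketch the shift simply moves a feature's curve along the latent axis without changing its shape.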


research
11/03/2015

PCA-Based Out-of-Sample Extension for Dimensionality Reduction

Dimensionality reduction methods are very common in the field of high di...
research
06/25/2020

Neural Decomposition: Functional ANOVA with Variational Autoencoders

Variational Autoencoders (VAEs) have become a popular approach for dimen...
research
08/23/2017

Variational autoencoders for tissue heterogeneity exploration from (almost) no preprocessed mass spectrometry imaging data

The paper presents the application of Variational Autoencoders (VAE) for...
research
09/14/2022

Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs

Single-cell RNA-seq datasets are growing in size and complexity, enablin...
research
04/28/2022

Representative period selection for power system planning using autoencoder-based dimensionality reduction

Power sector capacity expansion models (CEMs) that are used for studying...
research
12/22/2019

Interpretable Embeddings From Molecular Simulations Using Gaussian Mixture Variational Autoencoders

Extracting insight from the enormous quantity of data generated from mol...
research
06/27/2023

Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs

Variational Autoencoders and their many variants have displayed impressi...
