Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning

09/27/2020
by Tim Sainburg, et al.

We propose Parametric UMAP, a parametric variation of the UMAP (Uniform Manifold Approximation and Projection) algorithm. UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we replace the second step of UMAP with a deep neural network that learns a parametric relationship between data and embedding. We demonstrate that our method performs similarly to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then show that UMAP loss can be extended to arbitrary deep learning applications, for example constraining the latent distribution of autoencoders, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data. Our code is available at https://github.com/timsainb/ParametricUMAP_paper.
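
To make the two steps concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. All function names are hypothetical. Step (1) is replaced by a simplified k-nearest-neighbor graph with a fixed bandwidth (UMAP itself tunes a bandwidth per point), the "network" is a single linear map W, the low-dimensional similarity uses a = b = 1, and the loss is the fuzzy-set cross-entropy summed over all pairs rather than estimated with edge and negative sampling. The sketch only evaluates the loss for two candidate encoders; a real implementation would minimize it by gradient descent over the network weights.

```python
import numpy as np

def fuzzy_knn_graph(X, k=3):
    """Simplified stand-in for UMAP step 1: a symmetric weighted kNN graph."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-neighbors
    nbrs = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbors
    P = np.zeros((n, n))
    for i in range(n):
        rho = d[i, nbrs[i, 0]]             # distance to the nearest neighbor
        # Assumption: fixed bandwidth of 1.0 instead of a tuned per-point sigma.
        P[i, nbrs[i]] = np.exp(-(d[i, nbrs[i]] - rho) / 1.0)
    return P + P.T - P * P.T               # probabilistic t-conorm symmetrization

def low_dim_similarity(Z, a=1.0, b=1.0):
    """Pairwise similarity q_ij = 1 / (1 + a * ||z_i - z_j||^(2b))."""
    d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return 1.0 / (1.0 + a * d2 ** b)

def umap_cross_entropy(P, Q, eps=1e-6):
    """Fuzzy-set cross-entropy between graph weights P and embedding similarities Q."""
    Q = np.clip(Q, eps, 1.0 - eps)
    off_diag = ~np.eye(P.shape[0], dtype=bool)
    attract = -P * np.log(Q)                   # pulls connected points together
    repel = -(1.0 - P) * np.log(1.0 - Q)       # pushes unconnected points apart
    return float(np.sum((attract + repel)[off_diag]))

def encoder(X, W):
    """Hypothetical parametric mapping: one linear layer standing in for a deep network."""
    return X @ W

# Toy data: two well-separated clusters of five points each in 4 dimensions.
rng = np.random.default_rng(0)
X = 0.1 * rng.normal(size=(10, 4))
X[5:, 0] += 10.0

P = fuzzy_knn_graph(X, k=3)

W_good = np.eye(4)[:, :2]        # keeps the axis that separates the clusters
W_zero = np.zeros((4, 2))        # collapses every point onto the origin

loss_good = umap_cross_entropy(P, low_dim_similarity(encoder(X, W_good)))
loss_zero = umap_cross_entropy(P, low_dim_similarity(encoder(X, W_zero)))
print("loss (structure-preserving encoder):", loss_good)
print("loss (collapsed encoder):           ", loss_zero)

# Because the mapping is parametric, a new point is embedded with one forward pass.
x_new = rng.normal(size=(1, 4))
z_new = encoder(x_new, W_good)
print("embedding of new point:", z_new)
```

The last lines illustrate the benefit the abstract highlights: once the mapping is learned, embedding unseen data needs no further optimization, only an evaluation of the encoder.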


