Unexplainable Explanations: Towards Interpreting tSNE and UMAP Embeddings

06/20/2023
by Andrew Draganov, et al.

It has become standard to explain neural network latent spaces with attraction/repulsion dimensionality reduction (ARDR) methods like tSNE and UMAP. This relies on the premise that structure in the 2D representation is consistent with the structure in the model's latent space. However, this is an unproven assumption – we are unaware of any convergence guarantees for ARDR algorithms. We work towards answering this question by relating ARDR methods to classical dimensionality reduction techniques. Specifically, we show that one can fully recover a PCA embedding by applying attractions and repulsions to a randomly initialized dataset. We also show that, with a small change, Locally Linear Embeddings (LLE) can reproduce ARDR embeddings. Finally, we formalize a series of conjectures that, if true, would allow one to attribute structure in the 2D embedding back to the input distribution.
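As a rough illustration of the PCA claim, the sketch below runs gradient descent from a random initialization on the Gram-matrix objective 1/4 * ||G - Y Y^T||_F^2, where G is the centered Gram matrix of the input. This objective is an assumption chosen for illustration and is not necessarily the construction used in the paper. Its update has a data-driven term (G @ Y) and an embedding-only term (Y @ Y.T @ Y), which can loosely be read as attraction-like and repulsion-like forces. The function names, step size, and iteration count are all hypothetical.

```python
import numpy as np


def pca_embedding(X, k=2):
    """Reference PCA embedding: projection onto the top-k principal axes."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]


def gram_descent_embedding(X, k=2, steps=3000, seed=0):
    """Gradient descent on 1/4 * ||G - Y Y^T||_F^2 from a random start.

    Assumption for illustration only: this objective stands in for the
    attraction/repulsion formulation; it is not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    G = Xc @ Xc.T                         # centered Gram matrix of the input
    lr = 0.1 / np.linalg.eigvalsh(G)[-1]  # step size scaled by the top eigenvalue
    Y = rng.normal(scale=1e-2, size=(X.shape[0], k))  # small random initialization
    for _ in range(steps):
        data_term = G @ Y             # driven by input similarities
        embed_term = Y @ (Y.T @ Y)    # depends only on the current embedding
        Y = Y + lr * (data_term - embed_term)  # negative gradient step
    return Y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data with a clear gap between the second and third components.
    X = rng.normal(size=(60, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2])
    P = pca_embedding(X)
    Y = gram_descent_embedding(X)
    # Compare Gram matrices, since embeddings can only match up to rotation/reflection.
    rel_err = np.linalg.norm(Y @ Y.T - P @ P.T) / np.linalg.norm(P @ P.T)
    print(f"relative Gram-matrix error vs. PCA: {rel_err:.2e}")
```

The comparison is made on Gram matrices because the descent result can match PCA only up to a rotation or reflection of the 2D plane. The synthetic data is given a clear spectral gap; with a small gap, convergence of this sketch would be slower.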

research
04/22/2021

On Geodesic Distances and Contextual Embedding Compression for Text Classification

In some memory-constrained settings like IoT devices and over-the-networ...
research
06/13/2022

Fiberwise dimensionality reduction of topologically complex data with vector bundles

Datasets with non-trivial large scale topology can be hard to embed in l...
research
08/25/2022

Supervised Dimensionality Reduction and Image Classification Utilizing Convolutional Autoencoders

The joint optimization of the reconstruction and classification error is...
research
08/11/2017

Simple and Effective Dimensionality Reduction for Word Embeddings

Word embeddings have become the basic building blocks for several natura...
research
09/17/2023

Detecting covariate drift in text data using document embeddings and dimensionality reduction

Detecting covariate drift in text data is essential for maintaining the ...
research
09/22/2020

Stochastic Neighbor Embedding with Gaussian and Student-t Distributions: Tutorial and Survey

Stochastic Neighbor Embedding (SNE) is a manifold learning and dimension...
research
06/20/2022

GiDR-DUN; Gradient Dimensionality Reduction – Differences and Unification

TSNE and UMAP are two of the most popular dimensionality reduction algor...
