Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE

02/13/2020
by Isaac Robinson, et al.

t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in speeding up t-SNE and obtaining finer-grained structure, we combine the two to create tree-SNE, a hierarchical clustering and visualization algorithm based on stacked one-dimensional t-SNE embeddings. We also introduce alpha-clustering, which recommends the optimal cluster assignment, without prior knowledge of the number of clusters, based on cluster stability across multiple scales. We demonstrate the effectiveness of tree-SNE and alpha-clustering on images of handwritten digits, mass cytometry (CyTOF) data from blood cells, and single-cell RNA-sequencing (scRNA-seq) data from retinal cells. Furthermore, to demonstrate the validity of the visualization, we use alpha-clustering to obtain unsupervised clustering results competitive with the state of the art on several image data sets. Software is available at https://github.com/isaacrob/treesne.
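
The idea of choosing a cluster assignment by its stability across scales can be illustrated with a small sketch. The abstract does not define the stability criterion, so the version below assumes one plausible reading: the preferred assignment is the one whose cluster count persists over the longest run of consecutive levels. The function name `most_stable_cluster_count` and the example counts are hypothetical and are not taken from the treesne software.

```python
from itertools import groupby


def most_stable_cluster_count(counts_per_level):
    """Find the cluster count that persists over the most consecutive levels.

    ASSUMPTION: "stability" is read here as the length of a run of
    consecutive levels with the same number of clusters. This is an
    illustrative stand-in, not the paper's definition of alpha-clustering.

    Returns (cluster_count, run_length, start_index).
    """
    if not counts_per_level:
        raise ValueError("need at least one level")

    best = None
    index = 0
    # groupby collapses consecutive equal values into runs
    for count, group in groupby(counts_per_level):
        run_length = len(list(group))
        # strict '>' keeps the earliest run when two runs tie
        if best is None or run_length > best[1]:
            best = (count, run_length, index)
        index += run_length
    return best


# Hypothetical cluster counts for successive levels, coarse to fine.
levels = [1, 2, 2, 5, 5, 5, 5, 7, 9, 12]
count, run_length, start = most_stable_cluster_count(levels)
print(count, run_length, start)
```

In this made-up example the count of 5 survives four consecutive levels, so it would be the recommended assignment under the stated assumption; no target number of clusters is supplied in advance.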


