Self Organizing Nebulous Growths for Robust and Incremental Data Visualization

12/09/2019
by Damith Senanayake, et al.

Non-parametric dimensionality reduction techniques, such as t-SNE and UMAP, produce effective visualizations of fixed or static datasets, but they cannot incrementally map and insert new data points into an existing visualization. We present Self-Organizing Nebulous Growths (SONG), a parametric nonlinear dimensionality reduction technique that supports incremental data visualization, i.e., the incremental addition of new data while preserving the structure of the existing visualization. In addition, SONG can handle new data increments whether their distribution is similar to or heterogeneous with respect to the existing observations. We test SONG on a variety of real and simulated datasets. The results show that SONG is superior to Parametric t-SNE, t-SNE and UMAP in incremental data visualization. Specifically, for heterogeneous increments, SONG improves over Parametric t-SNE by 14.98% in cluster quality, as measured by Adjusted Mutual Information (AMI) scores. On similar or homogeneous increments, the improvements are 8.36% and 42.26%, with performance comparable to UMAP and superior to t-SNE. We also demonstrate that the algorithmic foundations of SONG make it more tolerant to noise than UMAP and t-SNE, and therefore more useful for data with high variance, heavily mixed clusters, or noise.
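The abstract reports cluster quality as Adjusted Mutual Information (AMI) scores. As a rough illustration of what that metric measures, the sketch below computes AMI between a reference labeling and a cluster assignment using only the Python standard library. The function names and the toy labels are assumptions made for this example, not the authors' evaluation code, and the normalization by the arithmetic mean of the two entropies is one common convention among several. In practice a library routine such as scikit-learn's `adjusted_mutual_info_score` would typically be used instead.

```python
from collections import Counter
from math import comb, log


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(true_labels, pred_labels):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(true_labels)
    a = Counter(true_labels)
    b = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    return sum(
        (nij / n) * log(n * nij / (a[i] * b[j]))
        for (i, j), nij in joint.items()
    )


def expected_mutual_information(true_labels, pred_labels):
    """Expected MI of random labelings with the same cluster sizes
    (hypergeometric model of the contingency table)."""
    n = len(true_labels)
    a_sizes = list(Counter(true_labels).values())
    b_sizes = list(Counter(pred_labels).values())
    emi = 0.0
    for ai in a_sizes:
        for bj in b_sizes:
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                # Probability that the two clusters share exactly nij points.
                p = comb(ai, nij) * comb(n - ai, bj - nij) / comb(n, bj)
                emi += p * (nij / n) * log(n * nij / (ai * bj))
    return emi


def adjusted_mutual_information(true_labels, pred_labels):
    """AMI with arithmetic-mean entropy normalization (an assumed convention)."""
    mi = mutual_information(true_labels, pred_labels)
    emi = expected_mutual_information(true_labels, pred_labels)
    mean_h = (entropy(true_labels) + entropy(pred_labels)) / 2
    denom = mean_h - emi
    if abs(denom) < 1e-12:
        # Degenerate case, e.g. both labelings put every point in one cluster.
        return 1.0
    return (mi - emi) / denom


if __name__ == "__main__":
    # Hypothetical toy labels: ground-truth classes vs. clusters found in an embedding.
    truth = [0, 0, 0, 1, 1, 1]
    clusters = [0, 0, 1, 1, 2, 2]
    print(adjusted_mutual_information(truth, clusters))
```

A score near 1 means the clusters recovered from a visualization agree with the reference classes, while a score near 0 means the agreement is no better than chance. Because AMI ignores how clusters are named, it can compare embeddings whose cluster identifiers differ.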


