Visualizing the geometry of labeled high-dimensional data with spheres

07/01/2021
by Andrew D Zaharia, et al.

Data visualizations summarize high-dimensional distributions in two or three dimensions. Dimensionality reduction entails a loss of information, and what is preserved differs between methods. Existing methods preserve either the local or the global geometry of the points, and most techniques do not consider labels. Here we introduce "hypersphere2sphere" (H2S), a new method that aims to visualize not the points, but the relationships between the labeled distributions. H2S fits a hypersphere to each labeled set of points in a high-dimensional space and visualizes each hypersphere as a sphere in 3D (or a circle in 2D). H2S perfectly captures the geometry of up to 4 hyperspheres in 3D (or 3 in 2D) and approximates the geometry for larger numbers of distributions, matching the sizes (radii), the pairwise separations (between-center distances), and the overlaps (along the center-connection line). The resulting visualizations are robust to sampling imbalances. Leveraging labels and the sphere as the simplest geometrical primitive, H2S provides an important addition to the toolbox of visualization techniques.
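The idea of fitting one hypersphere per label and placing the spheres in 3D can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the estimator, so the sketch assumes the center is the class mean and the radius is the root-mean-square distance to that mean, and it swaps in classical multidimensional scaling (MDS) to place the centers. The function names `fit_hypersphere`, `embed_centers`, and `h2s_sketch` are hypothetical. With at most 4 classes, the centers span at most 3 dimensions, so the between-center distances are reproduced exactly in 3D, which is consistent with the exactness claim above.

```python
import numpy as np


def fit_hypersphere(points):
    """Assumed estimator (not from the paper): mean center, RMS radius."""
    center = points.mean(axis=0)
    radius = np.sqrt(((points - center) ** 2).sum(axis=1).mean())
    return center, radius


def embed_centers(centers, dim=3):
    """Classical MDS: place the centers in `dim` dimensions.

    Between-center distances are preserved exactly whenever the centers
    span at most `dim` dimensions, e.g. up to 4 centers for dim=3.
    """
    centers = np.asarray(centers, dtype=float)
    centered = centers - centers.mean(axis=0)
    gram = centered @ centered.T  # Gram matrix of the centered centers
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]  # largest eigenvalues first
    top = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top)


def h2s_sketch(data, labels, dim=3):
    """Return class labels, low-dimensional centers, and radii."""
    classes = np.unique(labels)
    fits = [fit_hypersphere(data[labels == c]) for c in classes]
    centers = np.stack([center for center, _ in fits])
    radii = np.array([radius for _, radius in fits])
    return classes, embed_centers(centers, dim), radii
```

Each returned center and radius describes one sphere to draw. For more than 4 classes this sketch only approximates the between-center distances, and it makes no attempt to match overlaps, which the method described above does.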


