In search of the most efficient and memory-saving visualization of high dimensional data

02/27/2023
by Bartosz Minch, et al.

Interactive exploration of large, multidimensional datasets plays an important role in many scientific fields. It makes it possible not only to identify important structural features and forms, such as clusters of vertices and their connection patterns, but also to evaluate their interrelationships in terms of position, distance, shape and connection density. We argue that the visualization of multidimensional data is well approximated by the problem of two-dimensional embedding of undirected nearest-neighbor graphs. The size of complex networks is a major challenge for today's computer systems and still calls for more efficient data embedding algorithms. Existing reduction methods are too slow and do not allow interactive manipulation. We show that high-quality embeddings can be produced with minimal time and memory complexity. We present very efficient IVHD algorithms (CPU and GPU) and compare them with the latest and most popular dimensionality reduction methods. We show that their memory and time requirements are dramatically lower than those of the baseline codes. At the cost of a slight degradation in embedding quality, IVHD preserves the main structural properties of the data well within a much smaller time budget. We also present a meta-algorithm that allows any unsupervised data embedding method to be used in a supervised manner.
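
To make the notion of an undirected nearest-neighbor graph concrete, the following is a minimal sketch of how such a graph might be built from a set of points. It is not the IVHD algorithm and is not taken from the paper: the function name `knn_graph`, the brute-force search, the Euclidean metric and the toy data are all assumptions chosen for illustration. A practical system would likely use an approximate nearest-neighbor search instead.

```python
from math import dist


def knn_graph(points, k):
    """Return an undirected k-nearest-neighbor graph as a set of edges (i, j), i < j.

    Hypothetical helper for illustration only; brute force, O(n^2 log n) time.
    """
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            # Store each edge once, ignoring direction, so the graph is undirected.
            edges.add((min(i, j), max(i, j)))
    return edges


# Toy data (assumed): two well-separated groups in 3-D.
points = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.0, 0.2, 0.0),
    (5.0, 5.0, 5.0),
    (5.1, 5.0, 5.0),
]
edges = knn_graph(points, k=1)
print(sorted(edges))
```

On this toy input no edge connects the two groups, which is the kind of structure a two-dimensional embedding of the graph would be expected to keep visible.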

research
02/04/2019

2-D Embedding of Large and High-dimensional Data with Minimal Memory and Computational Time Requirements

In the advent of big data era, interactive visualization of large data s...
research
03/24/2022

Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction

Dimensionality reduction is crucial both for visualization and preproces...
research
01/03/2022

Scalable semi-supervised dimensionality reduction with GPU-accelerated EmbedSOM

Dimensionality reduction methods have found vast application as visualiz...
research
10/15/2021

SGEN: Single-cell Sequencing Graph Self-supervised Embedding Network

Single-cell sequencing has a significant role to explore biological proc...
research
12/25/2017

Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding

t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for ...
research
02/19/2017

Compressive Embedding and Visualization using Graphs

Visualizing high-dimensional data has been a focus in data analysis comm...
research
08/16/2017

Visualizing and Exploring Dynamic High-Dimensional Datasets with LION-tSNE

T-distributed stochastic neighbor embedding (tSNE) is a popular and priz...
