A GPU-Oriented Algorithm Design for Secant-Based Dimensionality Reduction

07/10/2018
by Henry Kvinge, et al.

Dimensionality-reduction techniques are a fundamental tool for extracting useful information from high-dimensional data sets. Because secant sets encode manifold geometry, they are useful for designing meaningful data-reduction algorithms. In one such approach, the goal is to construct a projection that maximally avoids secant directions, which ensures that distinct data points are not mapped too close together in the reduced space. This type of algorithm is based on a mathematical framework inspired by the constructive proof of Whitney's embedding theorem from differential topology. Computing all (unit) secants for a set of points is inherently expensive, since the number of secants grows quadratically with the number of points, which makes these algorithms natural candidates for GPU acceleration. We present a polynomial-time data-reduction algorithm that produces a meaningful low-dimensional representation of a data set by iteratively constructing improved projections within the framework described above. Key to our algorithm design and implementation is the use of GPUs, which, among other things, minimizes the computational time required to calculate all secants. One goal of this report is to share ideas with GPU experts and to discuss a class of mathematical algorithms that may be of interest to the broader GPU community.
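To make the secant criterion concrete, the following is a minimal CPU sketch in plain Python, not the authors' GPU implementation. It computes every unit secant of a small point set and scores a candidate projection by the smallest norm any unit secant retains after projection; a score near zero means two distinct points nearly collapse in the reduced space. The function names `unit_secants` and `min_projected_norm`, the toy data, and the two candidate bases are assumptions made for illustration only.

```python
import math
from itertools import combinations


def unit_secants(points):
    """Return the unit secant (x - y) / ||x - y|| for every pair of distinct points."""
    secants = []
    for x, y in combinations(points, 2):
        diff = [a - b for a, b in zip(x, y)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # coincident points define no secant direction
            secants.append([d / norm for d in diff])
    return secants


def min_projected_norm(secants, basis):
    """Smallest norm of a unit secant after projecting onto the rows of `basis`.

    `basis` is assumed to have orthonormal rows (an assumption of this sketch).
    """
    smallest = float("inf")
    for s in secants:
        coords = [sum(b_i * s_i for b_i, s_i in zip(b, s)) for b in basis]
        smallest = min(smallest, math.sqrt(sum(c * c for c in coords)))
    return smallest


# Toy data: the origin and the three standard basis vectors in R^3.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
secants = unit_secants(points)

# Candidate 1: project onto the xy-plane. The secant along the z-axis is lost.
xy_plane = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
score_xy = min_projected_norm(secants, xy_plane)

# Candidate 2: project onto the plane orthogonal to (1, 1, 1).
r2, r6 = math.sqrt(2.0), math.sqrt(6.0)
diagonal_plane = [(1 / r2, -1 / r2, 0.0), (1 / r6, 1 / r6, -2 / r6)]
score_diag = min_projected_norm(secants, diagonal_plane)

print(len(secants), score_xy, score_diag)
```

The pairwise loop in `unit_secants` visits n(n-1)/2 pairs, and each pair is independent of the others, which is the kind of workload that maps naturally onto GPU threads. In this toy case the second projection scores higher because it sends no secant to zero.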

research
08/05/2018

Too many secants: a hierarchical approach to secant-based dimensionality reduction on large data sets

A fundamental question in many data analysis settings is the problem of ...
research
11/13/2021

Efficient Binary Embedding of Categorical Data using BinSketch

In this work, we present a dimensionality reduction algorithm, aka. sket...
research
12/07/2018

Approximate Calculation of Tukey's Depth and Median With High-dimensional Data

We present a new fast approximate algorithm for Tukey (halfspace) depth ...
research
10/31/2018

The Price of Fair PCA: One Extra Dimension

We investigate whether the standard dimensionality reduction technique o...
research
10/27/2018

Monitoring the shape of weather, soundscapes, and dynamical systems: a new statistic for dimension-driven data analysis on large data sets

Dimensionality-reduction methods are a fundamental tool in the analysis ...
research
09/07/2022

Dimensionality Reduction using Elastic Measures

With the recent surge in big data analytics for hyper-dimensional data t...
research
11/26/2019

FCA2VEC: Embedding Techniques for Formal Concept Analysis

Embedding large and high dimensional data into low dimensional vector sp...
