Linearly-scalable learning of smooth low-dimensional patterns with permutation-aided entropic dimension reduction

06/17/2023
by   Illia Horenko, et al.

In many data science applications, the objective is to extract appropriately ordered, smooth low-dimensional patterns from high-dimensional data sets. This is challenging because common sorting algorithms primarily aim at finding monotonic orderings in low-dimensional data, whereas typical dimension reduction and feature extraction algorithms are not primarily designed to extract smooth low-dimensional patterns. We show that when Euclidean smoothness is selected as the pattern quality criterion, both problems (finding the optimal 'crisp' data permutation and extracting the sparse set of permuted low-dimensional smooth patterns) can be solved efficiently and numerically as a single unsupervised, entropy-regularized iterative optimization problem. We formulate and prove the conditions for monotonicity and convergence of this numerical procedure, which scales linearly in the dimension and has an iteration cost of 𝒪(DT^2), where T is the size of the data statistics and D is the feature space dimension. The efficacy of the proposed method is demonstrated on synthetic examples and on a real-world application: identifying smooth transition patterns that minimize bankruptcy risk from high-dimensional economic data. The results show that the statistical properties of the method's overall time complexity exhibit linear scaling in the dimensionality D within the specified confidence intervals.
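To make the smoothness criterion concrete, here is a minimal sketch of how the Euclidean smoothness of an ordering could be measured. It is not the paper's entropy-regularized procedure. The function name `roughness`, the toy data, and the choice of summed squared distances between consecutive points are all assumptions made for illustration. The sketch only shows that a well-chosen permutation of the data rows gives a much smaller roughness than a scrambled order.

```python
import random


def roughness(X, perm):
    """Sum of squared Euclidean distances between consecutive rows of X,
    visited in the order given by perm (hypothetical smoothness measure)."""
    total = 0.0
    for a, b in zip(perm, perm[1:]):
        total += sum((xa - xb) ** 2 for xa, xb in zip(X[a], X[b]))
    return total


# Toy data (assumption): T points in D dimensions lying on a straight line,
# parameterised by a latent variable that increases evenly from 0 to 1.
T, D = 50, 200
rng = random.Random(0)
latent = [t / (T - 1) for t in range(T)]
weights = [rng.uniform(-1.0, 1.0) for _ in range(D)]
smooth = [[w * s for w in weights] for s in latent]

# The rows are observed in a scrambled order: X[k] = smooth[shuffle[k]].
shuffle = list(range(T))
rng.shuffle(shuffle)
X = [smooth[i] for i in shuffle]

# Roughness of the data as observed versus under the inverse permutation,
# which restores the latent ordering.
as_observed = list(range(T))
recover = sorted(range(T), key=lambda k: shuffle[k])

r_scrambled = roughness(X, as_observed)
r_recovered = roughness(X, recover)
print(f"scrambled: {r_scrambled:.3f}  recovered: {r_recovered:.3f}")
```

Evaluating this measure for one fixed ordering costs 𝒪(DT) operations. Searching over orderings is the hard part, and that is the problem the abstract says is addressed by the entropy-regularized formulation.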

research
03/11/2021

Modern Dimension Reduction

Data are not only ubiquitous in society, but are increasingly complex bo...
research
06/23/2023

On the Convergence Rate of Gaussianization with Random Rotations

Gaussianization is a simple generative model that can be trained without...
research
06/24/2009

On landmark selection and sampling in high-dimensional data analysis

In recent years, the spectral analysis of appropriately defined kernel m...
research
03/27/2018

Fast Computation of Robust Subspace Estimators

Dimension reduction is often an important step in the analysis of high-d...
research
09/29/2019

Capacity Preserving Mapping for High-dimensional Data Visualization

We provide a rigorous mathematical treatment to the crowding issue in da...
research
02/19/2018

Entropy-Isomap: Manifold Learning for High-dimensional Dynamic Processes

Scientific and engineering processes produce massive high-dimensional da...
research
10/15/2018

Optimally rotated coordinate systems for adaptive least-squares regression on sparse grids

For low-dimensional data sets with a large amount of data points, standa...
