Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation

07/08/2021
by Alexander Ivanov, et al.

The manifold hypothesis states that data points in a high-dimensional space actually lie in the close vicinity of a manifold of much lower dimension. In many cases this hypothesis has been empirically verified and used to enhance unsupervised and semi-supervised learning. Here we present a new approach to checking the manifold hypothesis and estimating the dimension of the underlying manifold. To do so, we apply two very different methods simultaneously, one geometric and one probabilistic, and check whether they give the same result. Our geometric method is a modification, for sparse data, of the well-known box-counting algorithm for calculating the Minkowski dimension. The probabilistic method is new. Although it exploits the standard nearest-neighbor distance, it differs from methods previously used in such situations. This method is robust and fast, and it includes a special preliminary data transformation. Experiments on real datasets show that the suggested approach, based on combining the two methods, is powerful and effective.
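For readers unfamiliar with box counting, the following is a minimal sketch of the textbook algorithm only. It is not the authors' sparse-data modification, which the abstract does not spell out. The function name `box_counting_dimension`, the choice of scales, and the synthetic line-segment data are all assumptions made for illustration. The idea is to count the occupied grid cells N(eps) at several cell sizes eps and to take the slope of log N(eps) against log(1/eps) as the dimension estimate.

```python
import numpy as np


def box_counting_dimension(points, scales):
    """Estimate the Minkowski (box-counting) dimension of a point cloud.

    Hypothetical helper: fits the slope of log N(eps) versus log(1/eps),
    where N(eps) is the number of grid cells of side eps that contain
    at least one point.
    """
    points = np.asarray(points, dtype=float)
    scales = np.asarray(scales, dtype=float)
    # Shift the cloud so that every coordinate starts at zero.
    shifted = points - points.min(axis=0)
    counts = []
    for eps in scales:
        # Integer index of the grid cell containing each point.
        cells = np.floor(shifted / eps).astype(np.int64)
        # Number of distinct occupied cells at this scale.
        counts.append(len(np.unique(cells, axis=0)))
    # Least-squares line through (log(1/eps), log N(eps)); slope = estimate.
    slope, _intercept = np.polyfit(np.log(1.0 / scales), np.log(counts), 1)
    return slope


# Synthetic example (an assumption, not data from the paper):
# a 1-D line segment embedded in 5-D space.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=20000)
direction = np.array([1.0, 2.0, -1.0, 0.5, 3.0])
line = t[:, None] * direction

scales = [0.4, 0.2, 0.1, 0.05]
est = box_counting_dimension(line, scales)
print(f"estimated dimension of the line segment: {est:.2f}")
```

On real, sparse data the estimate depends strongly on the range of scales: cells that are too small each hold a single point, and the count saturates at the sample size. That sensitivity is presumably what motivates a sparse-data modification such as the one the abstract mentions.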


research
02/02/2021

Manifold Repairing, Reconstruction and Denoising from Scattered Data in High-Dimension

We consider a problem of great practical interest: the repairing and rec...
research
07/01/2018

Heuristic Framework for Multi-Scale Testing of the Multi-Manifold Hypothesis

When analyzing empirical data, we often find that global linear models o...
research
01/02/2023

Estimating Distributions with Low-dimensional Structures Using Mixtures of Generative Models

There has been a growing interest in statistical inference from data sat...
research
09/19/2018

Aligning Manifolds of Double Pendulum Dynamics Under the Influence of Noise

This study presents the results of a series of simulation experiments th...
research
06/22/2016

Manifold Approximation by Moving Least-Squares Projection (MMLS)

In order to avoid the curse of dimensionality, frequently encountered in...
research
07/27/2020

Normal-bundle Bootstrap

Probabilistic models of data sets often exhibit salient geometric struct...
research
01/06/2022

Contrastive Neighborhood Alignment

We present Contrastive Neighborhood Alignment (CNA), a manifold learning...
