Representation of big data by dimension reduction

01/31/2017
by A. G. Ramm, et al.

Suppose the data consist of a set S of points x_j, 1 ≤ j ≤ J, distributed in a bounded domain D ⊂ R^N, where N and J are large numbers. This paper proposes an algorithm that checks whether there exists a low-dimensional manifold M near which many of the points of S lie and, if such an M exists, finds it. Many dimension reduction algorithms already exist, both linear and non-linear. The proposed algorithm is simple to implement and has some advantages over the known algorithms. If there is a low-dimensional manifold near which most of the data points lie, the proposed algorithm will find it. Numerical results illustrate the algorithm and compare its performance with classical PCA (principal component analysis) and Isomap.
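For orientation, here is a minimal sketch of the classical PCA baseline named in the abstract. It is not the algorithm proposed in the paper, which the abstract does not describe. The sketch estimates how much of the variance of a point set is captured by its leading principal directions. A ratio close to 1 for a few directions suggests that the points lie near a low-dimensional linear subspace. The function names (`covariance_matrix`, `top_eigenpair`, `explained_variance_ratios`), the power-iteration approach, and the synthetic data are all assumptions made for illustration.

```python
import random

def covariance_matrix(points):
    """Sample covariance (N x N) of J points in R^N, each a list of floats."""
    J, N = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / J for i in range(N)]
    centered = [[p[i] - mean[i] for i in range(N)] for p in points]
    return [[sum(c[a] * c[b] for c in centered) / (J - 1) for b in range(N)]
            for a in range(N)]

def top_eigenpair(C, iters=500, seed=0):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix,
    approximated by power iteration (an illustrative choice, not from the paper)."""
    rng = random.Random(seed)
    N = len(C)
    v = [rng.uniform(-1, 1) for _ in range(N)]
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(N)) for a in range(N)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # zero matrix: no variance left to explain
            return 0.0, v
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(N)) for a in range(N))
    return lam, v

def explained_variance_ratios(points, k):
    """Fraction of total variance captured by each of the top k principal directions.
    Assumes the data are not all identical (total variance > 0)."""
    C = covariance_matrix(points)
    N = len(C)
    total = sum(C[i][i] for i in range(N))
    ratios = []
    for _ in range(k):
        lam, v = top_eigenpair(C)
        ratios.append(lam / total)
        # Deflate: remove the component just found so the next pass finds the next one.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(N)] for a in range(N)]
    return ratios

# Synthetic example (assumed data): 200 points near a line in R^3.
rng = random.Random(1)
direction = (1.0, 2.0, 3.0)
sample = []
for _ in range(200):
    t = rng.uniform(-1, 1)
    sample.append([t * d + rng.gauss(0, 0.01) for d in direction])

ratios = explained_variance_ratios(sample, 2)
print(ratios)
```

Because PCA is linear, this check only detects flat structure. Points lying near a curved manifold can spread their variance over many directions, which is the situation that non-linear methods such as Isomap are designed to handle.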


research
08/27/2021

FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental data preprocessing t...
research
09/24/2022

Fractal dimension, approximation and data sets

The purpose of this paper is to study the fractal phenomena in large dat...
research
10/21/2021

Autonomous Dimension Reduction by Flattening Deformation of Data Manifold under an Intrinsic Deforming Field

A new dimension reduction (DR) method for data sets is proposed by auton...
research
06/22/2016

Manifold Approximation by Moving Least-Squares Projection (MMLS)

In order to avoid the curse of dimensionality, frequently encountered in...
research
01/05/2018

Principal component analysis for big data

Big data is transforming our world, revolutionizing operations and analy...
research
01/06/2021

Smile and Laugh Expressions Detection Based on Local Minimum Key Points

In this paper, a smile and laugh facial expression is presented based on...
research
03/31/2021

Dimension reduction of open-high-low-close data in candlestick chart based on pseudo-PCA

The (open-high-low-close) OHLC data is the most common data form in the ...
