Principal Graphs and Manifolds

09/02/2008
by Alexander N. Gorban, et al.

In many physical, statistical, biological and other investigations it is desirable to approximate a system of points by objects of lower dimension and/or complexity. For this purpose, Karl Pearson invented principal component analysis in 1901 and found 'lines and planes of closest fit to systems of points'. The well-known k-means algorithm solves the same approximation problem, but with finite sets of points instead of lines and planes. This chapter gives a brief practical introduction to methods for constructing general principal objects, i.e. objects embedded in the 'middle' of a multidimensional data set. The unifying framework is mean squared distance approximation of finite datasets. Principal graphs and manifolds are constructed as generalisations of principal components and k-means principal points. For this purpose, the family of expectation/maximisation algorithms and their nearest generalisations is presented. Construction of principal graphs with controlled complexity is based on the graph grammar approach.
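To make the shared framework concrete, here is a minimal sketch in plain Python of the two classical special cases named in the abstract: k-means centres and a first principal line, each fitted by alternating a projection step with an optimisation step so that the mean squared distance to the data does not increase. It is an illustration under stated assumptions, not the chapter's own implementation: the function names, the explicit initial centres, and the fixed starting direction are all hypothetical choices made for this example.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def k_means(data, initial_centres, iterations=100):
    """Alternate nearest-centre assignment and centre update (Lloyd-style).

    The initial centres are passed in explicitly (an assumption of this
    sketch) so that the result is deterministic.
    """
    centres = [list(c) for c in initial_centres]
    for _ in range(iterations):
        # Projection step: attach every point to its nearest centre.
        clusters = [[] for _ in centres]
        for p in data:
            j = min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
            clusters[j].append(p)
        # Optimisation step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        updated = [mean_point(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return centres

def msd_to_centres(data, centres):
    """Mean squared distance from each point to its nearest centre."""
    return sum(min(sq_dist(p, c) for c in centres) for p in data) / len(data)

def first_principal_line(data, iterations=100):
    """Fit the line centre + t * direction by the same kind of alternation.

    Assumes the data are not all identical and that the starting direction
    is not orthogonal to the data; both are assumptions of this sketch.
    """
    dim = len(data[0])
    centre = mean_point(data)
    centred = [[x - m for x, m in zip(p, centre)] for p in data]
    direction = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit start vector
    for _ in range(iterations):
        # Projection step: coordinate of each point along the current line.
        t = [sum(x * d for x, d in zip(p, direction)) for p in centred]
        denom = sum(ti * ti for ti in t)
        if denom == 0.0:
            break
        # Optimisation step: least-squares direction for fixed projections.
        direction = [sum(ti * p[i] for ti, p in zip(t, centred)) / denom
                     for i in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
    return centre, direction

def msd_to_line(data, centre, direction):
    """Mean squared distance from the points to the line (unit direction)."""
    total = 0.0
    for p in data:
        v = [x - m for x, m in zip(p, centre)]
        along = sum(x * d for x, d in zip(v, direction))
        total += sum(x * x for x in v) - along * along
    return total / len(data)
```

Both routines minimise the same kind of functional and differ only in the approximating object, a finite set of points in one case and a line in the other. Like any alternating scheme of this kind, the k-means routine can stop at a local minimum that depends on the initial centres.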

