Genetic Programming for Manifold Learning: Preserving Local Topology

08/23/2021
by Andrew Lensen, et al.

Manifold learning methods are an invaluable tool in today's world of increasingly huge datasets. Manifold learning algorithms can discover a much lower-dimensional representation (embedding) of a high-dimensional dataset through non-linear transformations that preserve the most important structure of the original data. State-of-the-art manifold learning methods directly optimise an embedding without providing a mapping between the original space and the discovered embedded space. This makes interpretability, a key requirement in exploratory data analysis, nearly impossible. Recently, genetic programming has emerged as a very promising approach to manifold learning by evolving functional mappings from the original space to an embedding. However, genetic programming-based manifold learning has struggled to match the performance of other approaches. In this work, we propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount. We compare our proposed approach with various baseline manifold learning methods and find that it often outperforms other methods, including a clear improvement over previous genetic programming approaches. These results are particularly promising, given the potential interpretability and reusability of the evolved mappings.
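
To make the two central ideas of the abstract concrete, the sketch below shows (a) what an explicit functional mapping from the original feature space to a 2-D embedding could look like, and (b) one simple way to quantify how well local neighbourhoods survive that mapping. Everything here is an illustrative assumption: `example_mapping` is a hand-written stand-in for an evolved expression, and `neighbourhood_preservation` is a generic k-nearest-neighbour overlap score, not the fitness function or evaluation measure used in the paper.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbours of every row of X."""
    # Pairwise squared Euclidean distances between all rows.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # A point is never counted as its own neighbour.
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]


def neighbourhood_preservation(X, Y, k=5):
    """Mean fraction of each point's k neighbours in X that are kept in Y.

    Hypothetical helper: 1.0 means every local neighbourhood is retained,
    0.0 means none of the original neighbours remain neighbours.
    """
    nx = knn_indices(X, k)
    ny = knn_indices(Y, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nx, ny)]
    return float(np.mean(overlap)) / k


def example_mapping(X):
    """Hand-written stand-in for an evolved mapping (assumption, not from the paper).

    Each embedding dimension is an explicit expression over the original
    features, which is what makes such a mapping inspectable and reusable
    on unseen data.
    """
    e1 = X[:, 0] + X[:, 1] * X[:, 2]
    e2 = np.maximum(X[:, 3], X[:, 4]) - X[:, 5]
    return np.column_stack([e1, e2])


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # synthetic 10-dimensional data
Y = example_mapping(X)           # 2-dimensional embedding

score = neighbourhood_preservation(X, Y, k=5)
identity_score = neighbourhood_preservation(X, X, k=5)
print(f"neighbourhood preservation of example mapping: {score:.3f}")
print(f"neighbourhood preservation of identity: {identity_score:.3f}")
```

Because the mapping is an ordinary function, it can be applied to new samples without re-running any optimisation, which is the reusability property the abstract refers to. A search procedure such as genetic programming would compare many candidate expressions using a score of this general kind; the arbitrary expression above is not expected to score well.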


research
01/05/2020

Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality

Manifold learning techniques have become increasingly valuable as data c...
research
02/08/2019

Can Genetic Programming Do Manifold Learning Too?

Exploratory data analysis is a fundamental aspect of knowledge discovery...
research
01/27/2020

Genetic Programming for Evolving a Front of Interpretable Models for Data Visualisation

Data visualisation is a key tool in data mining for understanding big da...
research
10/22/2019

Genetic Programming for Evolving Similarity Functions for Clustering: Representations and Analysis

Clustering is a difficult and widely-studied data mining task, with many...
research
06/26/2019

No Pressure! Addressing the Problem of Local Minima in Manifold Learning Algorithms

Nonlinear embedding manifold learning methods provide invaluable visual ...
research
03/05/2023

CAMEL: Curvature-Augmented Manifold Embedding and Learning

A novel method, named Curvature-Augmented Manifold Embedding and Learnin...
research
06/17/2019

rna2rna: Predicting lncRNA-microRNA-mRNA Interactions from Sequence with Integration of Interactome and Biological Annotation Data

Long non-coding RNA, microRNA, and messenger RNA enable key regulations ...
