Symmetry in Data Mining and Analysis: A Unifying View based on Hierarchy

05/18/2008
by Fionn Murtagh, et al.

Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked, as a form of representation, to an observational or otherwise empirical domain of interest. "Structure" has long been understood as symmetry, which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Beginning with the role of number theory in expressing data, we show how we can naturally proceed to hierarchical structures. We show how this both encapsulates traditional paradigms in data analysis and opens up new perspectives on issues of current interest, including data mining of massive, high-dimensional, heterogeneous data sets. Linkages with other fields, including computational logic and symbolic dynamics, are also discussed. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
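
As a small illustration of the link between p-adic numbers and ultrametric topology mentioned above, the following sketch computes the standard p-adic distance between integers and checks the strong triangle inequality d(x, z) <= max(d(x, y), d(y, z)) on a small set of points. The function names are hypothetical and chosen for this example only; this is not code from the paper.

```python
from fractions import Fraction
from itertools import permutations


def p_adic_valuation(n: int, p: int) -> int:
    """Return the largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the p-adic valuation of 0 is infinite")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def p_adic_distance(x: int, y: int, p: int) -> Fraction:
    """Standard p-adic distance on integers: p**(-v_p(x - y)), or 0 if x == y."""
    if x == y:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(x - y, p))


def is_ultrametric(points, p: int) -> bool:
    """Check the strong triangle inequality on every ordered triple of points."""
    return all(
        p_adic_distance(x, z, p)
        <= max(p_adic_distance(x, y, p), p_adic_distance(y, z, p))
        for x, y, z in permutations(points, 3)
    )


if __name__ == "__main__":
    # 0 and 8 differ by 2**3, so they are close in the 2-adic sense.
    print(p_adic_distance(0, 8, 2))
    print(is_ultrametric(range(16), 2))
```

Integers that agree on more low-order base-p digits are closer under this distance, which is one way to read the distance as depth of a shared branch in a hierarchy.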

research
05/14/2010

Hierarchical Clustering for Finding Symmetries and Other Patterns in Massive, High Dimensional Datasets

Data analysis and data mining are concerned with unsupervised pattern fi...
research
10/19/2010

Mining Knowledge in Astrophysical Massive Data Sets

Modern scientific data mainly consist of huge datasets gathered by a ver...
research
03/07/2023

Toward NeuroDM: Where Computational Neuroscience Meets Data Mining

At the intersection of computational neuroscience (CN) and data mining (...
research
11/08/2021

A Novel Data Pre-processing Technique: Making Data Mining Robust to Different Units and Scales of Measurement

Many existing data mining algorithms use feature values directly in thei...
research
02/04/2019

Distances between Data Sets Based on Summary Statistics

The concepts of similarity and distance are crucial in data mining. We c...
research
04/22/2002

Sampling Strategies for Mining in Data-Scarce Domains

Data mining has traditionally focused on the task of drawing inferences ...
research
11/22/2019

Unsupervised Features Learning for Sampled Vector Fields

In this paper we introduce a new approach to computing hidden features o...
