Semantic distillation: a method for clustering objects by their contextual specificity

10/05/2007
by Thomas Sierocinski, et al.

Computer scientists working on information retrieval (IR) developed techniques for data mining, latent semantic analysis, and contextual search of databases long ago. Experimental scientists from all disciplines, who must analyse large collections of raw experimental data (astronomical, physical, biological, etc.), have developed powerful methods for statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement that unifies the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that, when formulated at an abstract level, problems from IR, statistical data analysis, and physical measurement theories are very similar and hence can profitably be cross-fertilised; and second, to propose a novel method of fuzzy hierarchical clustering, termed semantic distillation and strongly inspired by the theory of quantum measurement, which we developed to analyse raw data coming from various types of experiments on DNA arrays. We illustrate the method by analysing DNA array experiments and clustering the genes of the array according to their specificity.


