Information-theoretic Interestingness Measures for Cross-Ontology Data Mining

04/29/2015
by Prashanti Manda, et al.

Community annotation of biological entities with concepts from multiple bio-ontologies has created large and growing repositories of ontology-based annotation data, in which relationships among orthogonal ontologies are implicitly embedded. The development of efficient data mining methods to discover these relationships, and of metrics to assess their quality, has not kept pace with the growth of annotation data. In this study, we present a data mining method that uses ontology-guided generalization to discover relationships across ontologies, along with a new interestingness metric based on information theory. As a preliminary proof of concept, we apply our data mining algorithm and interestingness measures to datasets from the Gene Expression Database at Mouse Genome Informatics to mine relationships between developmental stages in the mouse anatomy ontology and Gene Ontology concepts (biological process, molecular function and cellular component). In addition, we compare our interestingness metric to four existing metrics. Ontology-based annotation datasets provide a valuable resource for the discovery of relationships across ontologies. The use of efficient data mining methods and appropriate interestingness metrics enables the identification of high-quality relationships.
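The idea of ontology-guided generalization followed by an information-theoretic score can be illustrated with a minimal sketch. This is not the authors' algorithm or their metric, neither of which is specified in the abstract. The toy ontologies, term identifiers, and gene annotations below are hypothetical, and pointwise mutual information (PMI) is used only as a generic stand-in for an information-theoretic interestingness measure.

```python
import math

# Hypothetical toy ontologies: each term maps to its direct parents.
# "GO:*" stands in for Gene Ontology terms, "TS:*" for developmental stages.
PARENTS = {
    "GO:b1": ["GO:b"],
    "GO:b2": ["GO:b"],
    "GO:b": [],
    "TS:early": ["TS:any"],
    "TS:late": ["TS:any"],
    "TS:any": [],
}

# Hypothetical annotation data: gene -> directly annotated terms.
GENES = {
    "g1": {"GO:b1", "TS:early"},
    "g2": {"GO:b2", "TS:early"},
    "g3": {"GO:b1", "TS:late"},
    "g4": {"GO:b2", "TS:early"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def generalize(annotations):
    """Extend a set of annotations with every ancestor term."""
    generalized = set()
    for term in annotations:
        generalized |= ancestors(term)
    return generalized


def support(terms, transactions):
    """Fraction of transactions that contain every term in `terms`."""
    hits = sum(1 for t in transactions if terms <= t)
    return hits / len(transactions)


def pmi(a, b, transactions):
    """Pointwise mutual information of two terms, in bits (stand-in metric)."""
    p_ab = support({a, b}, transactions)
    if p_ab == 0:
        return float("-inf")
    return math.log2(p_ab / (support({a}, transactions) * support({b}, transactions)))


# One generalized "transaction" per gene.
TRANSACTIONS = [generalize(terms) for terms in GENES.values()]

for go_term, stage in [("GO:b", "TS:early"), ("GO:b2", "TS:early")]:
    s = support({go_term, stage}, TRANSACTIONS)
    score = pmi(go_term, stage, TRANSACTIONS)
    print(f"{go_term} <-> {stage}: support={s:.2f}, PMI={score:.3f}")
```

In this toy data the generalized pair (`GO:b`, `TS:early`) has the highest support but a PMI of zero, because every gene is annotated to `GO:b`, whereas the more specific pair (`GO:b2`, `TS:early`) has lower support and a positive PMI. This is the kind of trade-off between frequency and informativeness that an interestingness metric is meant to capture.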


