OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction

04/29/2018
by   Fatima Zohra Smaili, et al.

Motivation: Ontologies are widely used in biology for data annotation, integration, and analysis. In addition to formally structured axioms, ontologies contain meta-data in the form of annotation axioms, which provide valuable information that characterizes ontology classes. Annotations commonly used in ontologies include class labels, descriptions, and synonyms. Despite being a rich source of semantic information, ontology meta-data are generally left unexploited by ontology-based analysis methods such as semantic similarity measures.

Results: We propose a novel method, OPA2Vec, to generate vector representations of biological entities in ontologies by combining formal ontology axioms with annotation axioms from the ontology meta-data. We apply a Word2Vec model that has been pre-trained on PubMed abstracts to produce feature vectors from the collected data. We validate our method in two ways. First, we use the vector representations of proteins as a similarity measure to predict protein-protein interactions (PPI) on two different datasets. Second, we evaluate our method on predicting gene-disease associations based on phenotype similarity: we generate vector representations of genes and diseases using a phenotype ontology and apply the resulting vectors to predict gene-disease associations. These two experiments illustrate only some of the possible applications of our method. OPA2Vec can be used to produce vector representations of any biomedical entity given any type of biomedical ontology.

Availability: https://github.com/bio-ontology-research-group/opa2vec

Contact: robert.hoehndorf@kaust.edu.sa and xin.gao@kaust.edu.sa
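The similarity-based prediction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the measure (the abstract does not name one) and uses made-up three-dimensional toy vectors; the function names and entity names are hypothetical and are not taken from the OPA2Vec code base.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rank_candidates(query, vectors):
    """Rank every other entity by similarity to `query`, most similar first."""
    q = vectors[query]
    scored = [
        (name, cosine_similarity(q, vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented for illustration only (not real OPA2Vec output).
toy_vectors = {
    "protein_A": [1.0, 0.0, 0.0],
    "protein_B": [0.9, 0.1, 0.0],
    "protein_C": [0.0, 1.0, 0.0],
}

ranking = rank_candidates("protein_A", toy_vectors)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a setting like the one described, the highest-ranked entities would be treated as the most likely interaction partners (or associated diseases) for the query entity; how the ranking is thresholded or evaluated is left open in this sketch.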


