Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases

03/07/2021
by   Junheng Hao, et al.

The wide spread of the coronavirus has led to a worldwide pandemic with a high mortality rate. Currently, the knowledge accumulated from different studies about this virus is very limited. Leveraging a wide range of biological knowledge, such as gene ontology (GO) and protein-protein interaction (PPI) networks from other closely related species, presents a vital approach to inferring the molecular impact of a new species. In this paper, we propose Bio-JOIE, a transferred multi-relational embedding model that captures the knowledge of gene ontology and PPI networks and demonstrates superb capability in modeling SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model components. The knowledge model encodes the relational facts from the protein and GO domains into separate embedding spaces, using a hierarchy-aware encoding technique for the GO terms. On top of that, the transfer model learns a non-linear transformation to transfer the knowledge of PPIs and gene ontology annotations across their embedding spaces. By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species. Furthermore, we demonstrate the potential of using the learned representations to cluster proteins with enzymatic function into enzyme commission families. Finally, we show that Bio-JOIE can accurately identify PPIs between SARS-CoV-2 proteins and human proteins, providing valuable insights for advancing research on this new disease.
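
The two jointly trained components described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring function, loss, or dimensions, so the translational (TransE-style) score, the margin loss, the tanh transfer map, the weighting factor `alpha`, and all names and sizes below are assumptions chosen only to show the shape of the idea. The hierarchy-aware encoding of GO terms is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM_PROTEIN = 8  # hypothetical size of the protein embedding space
DIM_GO = 6       # hypothetical size of the GO embedding space


def translational_score(head, relation, tail):
    """Assumed TransE-style score: a smaller distance means a more plausible triple."""
    return float(np.linalg.norm(head + relation - tail))


def transfer(protein_vec, weight, bias):
    """Assumed non-linear map from the protein space into the GO space."""
    return np.tanh(weight @ protein_vec + bias)


def margin_loss(positive_score, negative_score, margin=1.0):
    """Hinge loss: push true triples at least `margin` below corrupted ones."""
    return max(0.0, margin + positive_score - negative_score)


def joint_loss(knowledge_loss, transfer_loss, alpha=0.5):
    """Assumed joint objective: knowledge term plus a weighted transfer term."""
    return knowledge_loss + alpha * transfer_loss


# Toy embeddings (random, for illustration only).
protein_a = rng.normal(size=DIM_PROTEIN)
protein_b = rng.normal(size=DIM_PROTEIN)
protein_c = rng.normal(size=DIM_PROTEIN)      # used as a corrupted tail
interaction = rng.normal(size=DIM_PROTEIN)    # embedding of one PPI type
go_term = rng.normal(size=DIM_GO)             # GO term annotating protein_a
go_term_wrong = rng.normal(size=DIM_GO)       # corrupted annotation

weight = rng.normal(size=(DIM_GO, DIM_PROTEIN))
bias = np.zeros(DIM_GO)

# Knowledge model: score a PPI triple against a corrupted one.
k_loss = margin_loss(
    translational_score(protein_a, interaction, protein_b),
    translational_score(protein_a, interaction, protein_c),
)

# Transfer model: the mapped protein should land near its annotated GO term.
mapped = transfer(protein_a, weight, bias)
t_loss = margin_loss(
    float(np.linalg.norm(mapped - go_term)),
    float(np.linalg.norm(mapped - go_term_wrong)),
)

total = joint_loss(k_loss, t_loss)
print(f"knowledge loss={k_loss:.3f} transfer loss={t_loss:.3f} total={total:.3f}")
```

In a real training loop, the embeddings, `weight`, and `bias` would be updated by gradient descent on the joint loss over many sampled triples and annotations; here they are fixed random values so the sketch stays self-contained.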


research
02/08/2023

DDeMON: Ontology-based function prediction by Deep Learning from Dynamic Multiplex Networks

Biological systems can be studied at multiple levels of information, inc...
research
01/23/2022

OntoProtein: Protein Pretraining With Gene Ontology Embedding

Self-supervised protein language models have proved their effectiveness ...
research
08/30/2023

Inferring Compensatory Kinase Networks in Yeast using Prolog

Signalling pathways are conserved across different species, therefore ma...
research
02/07/2022

Prompt-Guided Injection of Conformation to Pre-trained Protein Model

Pre-trained protein models (PTPMs) represent a protein with one fixed em...
research
04/26/2018

MPGM: Scalable and Accurate Multiple Network Alignment

Protein-protein interaction (PPI) network alignment is a canonical opera...
research
11/12/2017

A Sequence-Based Mesh Classifier for the Prediction of Protein-Protein Interactions

The worldwide surge of multiresistant microbial strains has propelled th...
research
09/07/2018

On2Vec: Embedding-based Relation Prediction for Ontology Population

Populating ontology graphs represents a long-standing problem for the Se...
