MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction

08/08/2021
by Thi Ngan Dong, et al.

Growing evidence from recent studies implies that microRNAs (miRNAs) could serve as biomarkers in various complex human diseases. Since wet-lab experiments are expensive and time-consuming, computational techniques for miRNA-disease association prediction have attracted a lot of attention in recent years. Data scarcity is one of the major challenges in building reliable machine learning models. Combined with the use of pre-calculated, hand-crafted input features, it has led to problems of overfitting and data leakage. We overcome the limitations of existing works by proposing a novel multi-task, convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from four heterogeneous biological information sources (interactions between miRNAs and protein-coding genes (PCGs), interactions between diseases and PCGs, miRNA family information, and disease ontology) in a multi-task setting, a novel perspective that has not been studied before. The use of multi-channel convolutions allows us to extract expressive representations while keeping the model linear and, therefore, simple. To effectively test the generalization capability of our model, we construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies. MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on the HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen miRNAs and diseases over state-of-the-art approaches. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/cmtt.
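
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a multi-channel convolution with kernel size 1 is used to merge several feature channels (which keeps the operation linear), that each task scores a pair of entities with a sigmoid over a dot product, and that the tasks are combined as a weighted sum of binary cross-entropy losses. The names `conv1x1`, `score`, and `multitask_loss`, the task names, and the task weights are all hypothetical; the abstract does not specify the kernel size, scoring function, or loss.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[Sequence[float], Sequence[float], int]  # (embedding_a, embedding_b, label)


def conv1x1(channels: Sequence[Sequence[float]],
            weights: Sequence[float],
            bias: float = 0.0) -> List[float]:
    """Kernel-size-1 convolution over stacked channels.

    Each output position is a weighted sum of the same position in every
    input channel, so the operation is linear (assumed design, see lead-in).
    """
    if len(channels) != len(weights):
        raise ValueError("one weight per channel is required")
    length = len(channels[0])
    if any(len(c) != length for c in channels):
        raise ValueError("all channels must have the same length")
    return [bias + sum(w * c[i] for w, c in zip(weights, channels))
            for i in range(length)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical association score: sigmoid of the dot product."""
    return sigmoid(sum(x * y for x, y in zip(a, b)))


def bce(p: float, y: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one prediction, clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def multitask_loss(batches: Dict[str, List[Pair]],
                   task_weights: Dict[str, float]) -> float:
    """Weighted sum of the mean per-task losses (weights are assumptions)."""
    total = 0.0
    for task, batch in batches.items():
        mean_loss = sum(bce(score(a, b), y) for a, b, y in batch) / len(batch)
        total += task_weights[task] * mean_loss
    return total


# Usage: merge two feature channels for one miRNA, then combine the main
# task with one side task that shares the same embedding.
mirna = conv1x1([[1.0, 2.0], [3.0, 4.0]], weights=[0.5, 0.5])
batches = {
    "mirna_disease": [(mirna, [0.0, 0.0], 1)],
    "mirna_pcg": [(mirna, [0.0, 0.0], 1)],
}
loss = multitask_loss(batches, {"mirna_disease": 0.5, "mirna_pcg": 0.5})
```

Because the side task reuses the same miRNA representation as the main task, gradients from the auxiliary data would shape the shared embedding in a trained model; that sharing is how a multi-task setup can compensate for scarce association labels.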


