Feature extraction using Spectral Clustering for Gene Function Prediction

03/25/2022
by Miguel Romero, et al.

Gene annotation addresses the problem of predicting unknown associations between genes and functions (e.g., biological processes) of a specific organism. Despite recent advances, the cost and time demanded by annotation procedures that rely largely on in vivo biological experiments remain prohibitively high. This paper presents a novel in silico approach to the annotation problem that combines cluster analysis and hierarchical multi-label classification (HMC). The approach uses spectral clustering to extract new features from the gene co-expression network (GCN) and enrich the prediction task. HMC is used to build multiple estimators that consider the hierarchical structure of gene functions. The proposed approach is applied to a case study on Zea mays, one of the most dominant and productive crops in the world. The results illustrate how in silico approaches are key to reducing the time and cost of gene annotation. More specifically, they highlight the importance of: (i) building new features that represent the structure of gene relationships in GCNs to annotate genes; and (ii) taking into account the structure of biological processes to obtain consistent predictions.
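
To make the feature-extraction idea concrete, the following is a minimal sketch of how spectral features might be derived from a gene co-expression network. It is not the authors' implementation: the function name `spectral_features`, the toy six-gene adjacency matrix, and the edge weights are all assumptions made for illustration. The sketch computes only the spectral embedding (eigenvectors of the symmetric normalized Laplacian) and splits genes by the sign of the second eigenvector; a full spectral clustering pipeline would typically run k-means on the embedding instead.

```python
import numpy as np


def spectral_features(adjacency, n_components):
    """Return one row of spectral features per gene (hypothetical helper).

    The features are the eigenvectors of the symmetric normalized Laplacian
    L = I - D^{-1/2} A D^{-1/2} with the smallest eigenvalues.
    """
    a = np.asarray(adjacency, dtype=float)
    degree = a.sum(axis=1)
    # Isolated genes (degree 0) get a zero scaling factor to avoid dividing by zero.
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    laplacian = np.eye(len(a)) - inv_sqrt[:, None] * a * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, :n_components]


# Toy co-expression network (assumed data): two tightly co-expressed groups of
# three genes, joined by a single weak edge between gene 2 and gene 3.
adjacency = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

features = spectral_features(adjacency, n_components=2)

# Simplification: split genes by the sign of the second eigenvector rather
# than running k-means. Which group is labelled 0 or 1 is arbitrary.
groups = (features[:, 1] > 0).astype(int)
print(groups)
```

Each row of `features` could then be appended to a gene's existing feature vector before training the classifiers, which is the sense in which such features "enrich" the prediction task.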

research
03/23/2022

A Top-down Supervised Learning Approach to Hierarchical Multi-label Classification in Networks

Node classification is the task of inferring or predicting missing node ...
research
06/01/2015

Formal Concept Analysis for Knowledge Discovery from Biological Data

Due to rapid advancement in high-throughput techniques, such as microarr...
research
01/26/2017

Nonlinear network-based quantitative trait prediction from transcriptomic data

Quantitatively predicting phenotype variables by the expression changes ...
research
05/09/2012

Using the Gene Ontology Hierarchy when Predicting Gene Function

The problem of multilabel classification when the labels are related thr...
research
01/06/2021

Classification of chemical compounds based on the correlation between in vitro gene expression profiles

Toxicity evaluation of chemical compounds has traditionally relied on an...
research
12/17/2014

Gene Similarity-based Approaches for Determining Core-Genes of Chloroplasts

In computational biology and bioinformatics, the manner to understand ev...
research
12/12/2019

Pathway Activity Analysis and Metabolite Annotation for Untargeted Metabolomics using Probabilistic Modeling

Motivation: Untargeted metabolomics comprehensively characterizes small ...
