Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks

04/10/2015
by Hyunghoon Cho, et al.

Complex biological systems have been successfully modeled by biochemical and genetic interaction networks, typically gathered from high-throughput (HTP) data. These networks can be used to infer functional relationships between genes or proteins. Using the intuition that the topological role of a gene in a network relates to its biological function, local or diffusion-based "guilt-by-association" and graph-theoretic methods have had success in inferring gene functions. Here we seek to improve function prediction by integrating diffusion-based methods with a novel dimensionality reduction technique to overcome the incomplete and noisy nature of network data. We introduce diffusion component analysis (DCA), a framework that plugs in a diffusion model and learns a low-dimensional vector representation of each node to encode the topological properties of a network. As a proof of concept, we demonstrate DCA's substantial improvement over state-of-the-art diffusion-based approaches in predicting protein function from molecular interaction networks. Moreover, the DCA framework can integrate multiple networks from heterogeneous sources, consisting of genomic information, biochemical experiments and other resources, to further improve function prediction. Performance improves again when the DCA framework is combined with support vector machines that take our node vector representations as features. Overall, DCA provides a novel representation of nodes in a network that can be plugged into other machine learning algorithms to decipher topological properties of, and obtain novel insights into, interactomes.
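
The two stages named in the abstract, a diffusion model followed by a low-dimensional node representation, can be illustrated with a minimal sketch. The abstract does not specify either stage, so everything here is an assumption: the diffusion model is taken to be a random walk with restart, the dimensionality reduction is a truncated SVD of the log-transformed diffusion states (a stand-in, not necessarily the authors' method), and the function names, restart probability and toy network are hypothetical.

```python
import numpy as np


def random_walk_with_restart(adj, restart=0.5):
    """Row i of the result is the stationary distribution of a random walk
    that starts at node i and jumps back to i with probability `restart`.
    Assumption: this is one possible choice of diffusion model."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalize the adjacency matrix into a transition matrix P.
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    trans = adj / row_sums
    # Closed form of the fixed point: S = restart * (I - (1 - restart) * P)^-1
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * trans)


def embed_nodes(diffusion, dim=2, eps=None):
    """Compress diffusion states into `dim`-dimensional node vectors.
    Assumption: truncated SVD of the log-transformed states, used here
    only as a simple stand-in for a dimensionality reduction step."""
    n = diffusion.shape[0]
    if eps is None:
        eps = 1.0 / n  # small offset so the logarithm stays finite
    log_states = np.log(diffusion + eps)
    u, s, _ = np.linalg.svd(log_states, full_matrices=False)
    # Scale the leading left singular vectors by the root singular values.
    return u[:, :dim] * np.sqrt(s[:dim])


# Hypothetical toy network: two triangles joined by a single edge (2-3).
adj = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
diffusion = random_walk_with_restart(adj, restart=0.5)
embedding = embed_nodes(diffusion, dim=2)
```

Each row of `embedding` is a feature vector for one node and could, for example, be passed to a support vector machine for function prediction, as the abstract describes for the node vector representations.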

research
04/15/2019

Disease gene prioritization using network topological analysis from a sequence based human functional linkage network

Sequencing large number of candidate disease genes which cause diseases ...
research
04/22/2022

Gene Function Prediction with Gene Interaction Networks: A Context Graph Kernel Approach

Predicting gene functions is a challenge for biologists in the post geno...
research
08/20/2023

SBSM-Pro: Support Bio-sequence Machine for Proteins

Proteins play a pivotal role in biological systems. The use of machine l...
research
03/15/2021

SEMgraph: An R Package for Causal Network Analysis of High-Throughput Data with Structural Equation Models

With the advent of high-throughput sequencing (HTS) in molecular biology...
research
06/17/2019

rna2rna: Predicting lncRNA-microRNA-mRNA Interactions from Sequence with Integration of Interactome and Biological Annotation Data

Long non-coding RNA, microRNA, and messenger RNA enable key regulations ...
research
10/04/2022

A Structural Characterisation of the Mitogen-Activated Protein Kinase Network in Cancer

Gene regulatory networks represent collections of regulators that intera...
research
04/21/2020

Inferring Degrees from Incomplete Networks and Nonlinear Dynamics

Inferring topological characteristics of complex networks from observed ...
