Bayesian Hidden Markov Tree Models for Clustering Genes with Shared Evolutionary History

08/18/2018
by Yang Li, et al.
Determining the functions of poorly characterized genes is crucial for understanding biological processes and studying human diseases. Functionally associated genes are often gained and lost together through evolution, so identifying co-evolving genes can help predict functional gene-gene associations. Here we describe the full statistical model and computational strategies underlying the original algorithm, CLustering by Inferred Models of Evolution (CLIME 1.0), which we recently reported [Li et al., 2014]. CLIME 1.0 employs a mixture of tree-structured hidden Markov models for the gene evolution process, together with a Bayesian model-based clustering algorithm, to detect gene modules with shared evolutionary histories (termed evolutionarily conserved modules, or ECMs). A Dirichlet process prior was adopted for estimating the number of gene clusters, and a Gibbs sampler was developed for posterior sampling. We further developed an extended version, CLIME 1.1, that incorporates uncertainty in the evolutionary tree structure. Through simulation studies and benchmarks on real data sets, we show that CLIME 1.0 and CLIME 1.1 outperform traditional methods that use simple metrics (e.g., the Hamming distance or Pearson correlation) to measure co-evolution between pairs of genes.
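For readers unfamiliar with the baseline metrics named above, the following is a minimal Python sketch of how the Hamming distance and Pearson correlation might be computed between two genes' presence/absence profiles across species. It is not the CLIME algorithm and is not taken from the paper; the function names, the gene names, and the toy profiles are all hypothetical and chosen only for illustration.

```python
import math


def hamming_distance(a, b):
    """Count the species in which two presence/absence profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same species")
    return sum(x != y for x, y in zip(a, b))


def pearson_correlation(a, b):
    """Pearson correlation between two equal-length numeric profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same species")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A gene present (or absent) in every species has no variance,
        # so the correlation is undefined.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy data: 1 = gene present in a species, 0 = absent.
# The column order is an arbitrary list of eight species.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 1, 0, 1],
    "geneB": [1, 1, 0, 0, 1, 1, 0, 0],  # differs from geneA in one species
    "geneC": [0, 0, 1, 1, 0, 0, 1, 0],  # exact complement of geneA
}

d_ab = hamming_distance(profiles["geneA"], profiles["geneB"])
d_ac = hamming_distance(profiles["geneA"], profiles["geneC"])
r_ab = pearson_correlation(profiles["geneA"], profiles["geneB"])
r_ac = pearson_correlation(profiles["geneA"], profiles["geneC"])
```

Both metrics treat each species as an independent column and make no use of the evolutionary tree relating the species, which is the information the tree-structured model in CLIME is designed to exploit.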

research
11/25/2021

SPAGETI: Stabilizing Phylogenetic Assessment with Gene Evolutionary Tree Indices

The standard approach to estimate species trees is to align a selected s...
research
01/01/2023

Inferring multiple consensus trees and supertrees using clustering: a review

Phylogenetic trees (i.e. evolutionary trees, additive trees or X-trees) ...
research
06/27/2012

Gene Expression Time Course Clustering with Countably Infinite Hidden Markov Models

Most existing approaches to clustering gene expression time course data ...
research
11/15/2021

Machine Learning for Genomic Data

This report explores the application of machine learning techniques on s...
research
08/09/2021

Small Parsimony for Natural Genomes in the DCJ-Indel Model

Reconstructing ancestral gene orders is an important step towards unders...
research
04/26/2021

Efficient Evolutionary Models with Digraphons

We present two main contributions which help us in leveraging the theory...
research
08/10/2017

Jumping across biomedical contexts using compressive data fusion

Motivation: The rapid growth of diverse biological data allows us to con...
