Increased peak detection accuracy in over-dispersed ChIP-seq data with supervised segmentation models

12/12/2020
by Arnaud Liehrmann, et al.

Motivation: Histone modifications are a basic mechanism for the regulation of gene expression. In the early 2000s, a powerful technique emerged that couples chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq). This technique provides a direct survey of the DNA regions associated with these modifications. To realize its full potential, increasingly sophisticated statistical algorithms have been developed or adapted to analyze the massive amount of data it generates. Many of these algorithms were built around natural assumptions, such as a Poisson model for the noise in the count data. In this work we start from these natural assumptions and show that it is possible to improve upon them.

Results: Our comparisons on seven reference datasets of histone modifications (H3K36me3 and H3K4me3) suggest that these natural assumptions are not always realistic under application conditions. We show that the unconstrained multiple changepoint detection model, with alternative noise assumptions and a suitable setup, reduces the over-dispersion exhibited by count data and detects peaks more accurately than algorithms that rely on these natural assumptions.
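As a rough illustration of what over-dispersion means for count data, the sketch below computes the ratio of sample variance to sample mean, which is expected to be close to 1 if the counts follow a Poisson distribution. The function name `dispersion_index` and the count values are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean.

    Under a Poisson model the variance equals the mean, so a value
    well above 1 suggests over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical read counts from a background region (illustrative values only).
background = [2, 0, 3, 1, 9, 0, 2, 14, 1, 0, 4, 12]
ratio = dispersion_index(background)
print(f"dispersion index: {ratio:.2f}")
```

A ratio far above 1 on real coverage data would be one simple sign that a Poisson noise assumption is too restrictive, which is the situation the abstract describes.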

