Multi-scale Poisson process approaches for differential expression analysis of high-throughput sequencing data

06/25/2021
by Heejung Shim, et al.

Estimating and testing for differences in molecular phenotypes (e.g., gene expression, chromatin accessibility, transcription factor binding) across conditions is an important part of understanding the molecular basis of gene regulation. These phenotypes are commonly measured using high-throughput sequencing assays (e.g., RNA-seq, ATAC-seq, ChIP-seq), which provide high-resolution count data that reflect how the phenotypes vary along the genome. Multiple methods have been proposed to exploit these high-resolution measurements for differential expression analysis. However, these methods ignore the count nature of the data and instead rely on normal approximations, which work well only for data with large sample sizes or high counts. Here we develop count-based methods to address this problem. We model the data for each sample using an inhomogeneous Poisson process with a spatially structured underlying intensity function. Then, building on multi-scale models for the Poisson process, we estimate and test for differences in the underlying intensity function across samples (or groups of samples). Using both simulated and real ATAC-seq data, we show that our method outperforms previous normal-based methods, especially in situations with small sample sizes or low counts.
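
To make the multi-scale idea concrete, the following is a minimal sketch of the standard multi-scale decomposition of Poisson counts, not the authors' implementation. It relies on a well-known property: if two adjacent bins hold independent Poisson counts, then given their sum, the left count is binomial with success probability equal to the left bin's share of the combined intensity. Applying this recursively turns a count vector into one total plus a tree of (left, parent) pairs, and differences in intensity shape between groups appear as differences in those binomial proportions. The function names, the power-of-two length requirement, and the example counts are assumptions made for illustration.

```python
import numpy as np


def multiscale_decompose(counts):
    """Split a length-2^J count vector into a grand total and, for each
    scale, the pairs (left-child count, parent count).

    Hypothetical helper for illustration; not from the paper's software.
    """
    x = np.asarray(counts, dtype=np.int64)
    n = x.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    while x.size > 1:
        left = x[0::2]            # counts in the left half of each pair
        parent = left + x[1::2]   # pair totals form the next coarser scale
        levels.append((left, parent))
        x = parent
    return int(x[0]), levels[::-1]  # coarsest scale first


def multiscale_reconstruct(total, levels):
    """Invert the decomposition using only the total and the left counts."""
    x = np.array([total], dtype=np.int64)
    for left, _parent in levels:
        out = np.empty(2 * left.size, dtype=np.int64)
        out[0::2] = left
        out[1::2] = x - left      # right child = parent - left child
        x = out
    return x


# Made-up read counts in 8 bins along a genomic region (illustrative only).
counts = [3, 0, 1, 2, 0, 0, 5, 1]
total, levels = multiscale_decompose(counts)
restored = multiscale_reconstruct(total, levels)
```

Under this representation, each `(left, parent)` pair can be treated as a binomial observation, so a test for a difference between conditions reduces to asking whether the binomial proportions at any scale and location depend on the condition.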


research
10/03/2022

A flexible model for correlated count data, with application to analysis of gene expression differences in multi-condition experiments

Detecting differences in gene expression is an important part of RNA seq...
research
12/12/2020

Increased peak detection accuracy in over-dispersed ChIP-seq data with supervised segmentation models

Motivation: Histone modification constitutes a basic mechanism for the g...
research
02/12/2021

Contrastive latent variable modeling with application to case-control sequencing experiments

High-throughput RNA-sequencing (RNA-seq) technologies are powerful tools...
research
06/14/2023

MIXALIME: MIXture models for ALlelic IMbalance Estimation in high-throughput sequencing data

Modern high-throughput sequencing assays efficiently capture not only ge...
research
11/06/2018

NExUS: Bayesian simultaneous network estimation across unequal sample sizes

Network-based analyses of high-throughput genomics data provide a holist...
research
05/21/2021

High Throughput Soybean Pod-Counting with In-Field Robotic Data Collection and Machine-Vision Based Data Analysis

We report promising results for high-throughput on-field soybean pod cou...
research
09/04/2020

Investigation of the Cyprus donkey milk bacterial diversity by 16SrDNA high-throughput sequencing in a Cyprus donkey farm

The interest in milk originating from donkeys is growing worldwide due t...
