Generalized Functional Pruning Optimal Partitioning (GFPOP) for Constrained Changepoint Detection in Genomic Data

09/29/2018
by   Toby Dylan Hocking, et al.

We describe a new algorithm and R package for peak detection in genomic data sets using constrained changepoint algorithms. These detect changes from background to peak regions by imposing the constraint that the mean should alternately increase then decrease. An existing algorithm for this problem gives state-of-the-art accuracy, but it is computationally expensive when the number of changes is large. We propose the GFPOP algorithm, which jointly estimates the number of peaks and their locations by minimizing a cost function consisting of a data-fitting term and a penalty for each changepoint. Empirically, this algorithm has a cost that is O(N log N) for analyzing data of length N. We also propose a sequential search algorithm that finds the best solution with K segments in O(log(K) N log N) time, which is much faster than the previous O(K N log N) algorithm. We show that our disk-based implementation in the PeakSegDisk R package can be used to quickly compute constrained optimal models with many changepoints, which are needed to analyze typical genomic data sets that have tens of millions of observations.
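To make the penalized cost concrete, here is a minimal Python sketch of the general idea: minimize a data-fitting term plus a fixed penalty per changepoint. It is not the GFPOP algorithm and not the PeakSegDisk implementation. It assumes a square loss, leaves out the up/down constraint on the segment means, and uses a plain quadratic-time dynamic program rather than functional pruning. The function name `optimal_partitioning` and its return format are hypothetical, chosen only for illustration.

```python
def optimal_partitioning(data, penalty):
    """Minimize (square loss + penalty per changepoint) over all segmentations.

    Returns (cost, segment_ends), where segment_ends lists the exclusive end
    index of each segment. Unconstrained, O(N^2) illustration only.
    """
    n = len(data)
    # Cumulative sums let us evaluate the loss of any segment in O(1).
    cum = [0.0]
    cum_sq = [0.0]
    for x in data:
        cum.append(cum[-1] + x)
        cum_sq.append(cum_sq[-1] + x * x)

    def segment_loss(start, end):
        # Square loss of data[start:end] around its own mean.
        total = cum[end] - cum[start]
        return cum_sq[end] - cum_sq[start] - total * total / (end - start)

    # best[t] is the minimal penalized cost of data[:t]. Starting at -penalty
    # means the first segment is not charged for a changepoint.
    best = [-penalty] + [float("inf")] * n
    prev = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + penalty + segment_loss(start, end)
            if cost < best[end]:
                best[end] = cost
                prev[end] = start

    # Backtrack through the stored segment starts to recover the segmentation.
    ends = []
    t = n
    while t > 0:
        ends.append(t)
        t = prev[t]
    ends.reverse()
    return best[n], ends


signal = [1, 1, 1, 10, 10, 10, 1, 1, 1]
small_cost, small_ends = optimal_partitioning(signal, 1.0)
large_cost, large_ends = optimal_partitioning(signal, 1000.0)
```

With the small penalty the toy signal is split into background, peak, background; with the large penalty a single segment is cheaper. This is the trade-off that the penalty controls, and it is why a search over penalty values can be used to reach a model with a desired number of segments.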
