Detecting Mutations by eBWT

05/04/2018
by Nicola Prezza, et al.

In this paper we develop a theory describing how the extended Burrows-Wheeler Transform (eBWT) of a collection of DNA fragments tends to cluster together the copies of nucleotides sequenced from a genome G. Our theory accurately predicts how many copies of any nucleotide are expected inside each such cluster, and shows how an elegant and precise LCP-array-based procedure can locate these clusters in the eBWT. Our findings are very general and can be applied to a wide range of problems. In this paper, we consider the case of alignment-free and reference-free SNP discovery in multiple collections of reads. We note that, in accordance with our theoretical results, SNPs are clustered in the eBWT of the read collection, and we develop a tool that finds SNPs with a simple scan of the eBWT and LCP arrays. Preliminary results show that our method requires much less coverage than state-of-the-art tools while drastically improving precision and sensitivity.
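The clustering idea can be illustrated with a small Python sketch. It is not the authors' tool, only a toy under stated assumptions: the BWT and LCP arrays are built naively by sorting all suffixes of end-marked reads (the common practical variant with `$` terminators, not an efficient eBWT construction), a cluster is taken to be a maximal run of adjacent suffixes whose LCP is at least `k`, and a cluster is flagged when at least two different nucleotides each occur at least `min_count` times in the corresponding BWT range. The names `candidate_clusters`, `k`, and `min_count` are hypothetical and do not come from the paper.

```python
from collections import Counter


def lcp_len(a, b):
    """Length of the common prefix of a and b, never matching across '$'."""
    n = 0
    for x, y in zip(a, b):
        if x != y or x == "$":
            break
        n += 1
    return n


def candidate_clusters(reads, k, min_count):
    """Scan naive BWT/LCP arrays and report clusters with mixed nucleotides.

    Assumption: a cluster is a maximal run of sorted suffixes sharing a
    prefix of length >= k. Returns (shared k-mer context, symbol counts).
    """
    # Every suffix of every end-marked read, sorted lexicographically
    # ('$' sorts before A, C, G, T in ASCII).
    sa = sorted(
        ((read + "$")[pos:], r_idx, pos)
        for r_idx, read in enumerate(reads)
        for pos in range(len(read) + 1)
    )
    # BWT symbol: the character preceding each suffix in its read.
    bwt = [reads[r][p - 1] if p > 0 else "$" for _, r, p in sa]
    # LCP between each suffix and its predecessor in sorted order.
    lcp = [0] + [lcp_len(sa[i - 1][0], sa[i][0]) for i in range(1, len(sa))]

    found = []
    start = 0
    for i in range(1, len(sa) + 1):
        if i == len(sa) or lcp[i] < k:  # the run [start, i) ends here
            if i - start > 1:
                counts = Counter(c for c in bwt[start:i] if c != "$")
                frequent = {c: n for c, n in counts.items() if n >= min_count}
                if len(frequent) >= 2:
                    found.append((sa[start][0][:k], frequent))
            start = i
    return found


# Two copies each of two reads that differ at a single position (C vs T).
reads = ["GGACTTCAG"] * 2 + ["GGATTTCAG"] * 2
print(candidate_clusters(reads, k=4, min_count=2))
# → [('TTCA', {'C': 2, 'T': 2})]
```

In this toy input the four suffixes that begin just after the differing position share the right context `TTCAG`, so they sit next to each other in sorted order, and the BWT symbols in that range are the two alleles. Clusters whose BWT range holds only one nucleotide are ignored.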


