Efficient Computation of Sequence Mappability

07/31/2018
by Mai Alzamel, et al.

Sequence mappability is an important task in genome re-sequencing. In the (k,m)-mappability problem, for a given sequence T of length n, the goal is to compute a table whose ith entry is the number of indices j ≠ i such that the length-m substrings of T starting at positions i and j have at most k mismatches. Previous work on this problem focused either on heuristic approaches that compute a rough approximation of the result or on the case k=1. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that works in O(n min{m^k, log^{k+1} n}) time and O(n) space for k=O(1). It requires a careful adaptation of the technique of Cole et al. [STOC 2004] to avoid counting pairs of substrings multiple times. We also show O(n^2)-time algorithms that compute all results for a fixed m and all k=0,...,m, or for a fixed k and all m=k,...,n-1. Finally, we show that the (k,m)-mappability problem cannot be solved in strongly subquadratic time for k,m = Θ(log n) unless the Strong Exponential Time Hypothesis fails.
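To make the problem definition concrete, the following is a minimal Python sketch of two baseline computations of the (k,m)-mappability table. Neither is the algorithm of the paper: the first is a brute-force O(n^2 m) check taken directly from the definition, and the second is a simple O(n^2) sliding-window pass over each offset d = j - i for one fixed pair (k, m). The function names `mappability_naive` and `mappability_diagonal` are hypothetical, and positions are 0-based here, which is an assumption made for the illustration.

```python
def mappability_naive(t: str, m: int, k: int) -> list[int]:
    """Brute force from the definition: O(n^2 * m) time.

    Entry i counts the positions j != i such that t[i:i+m] and t[j:j+m]
    differ in at most k positions (0-based indices, an assumption here).
    """
    starts = len(t) - m + 1  # number of length-m substrings
    if m <= 0 or starts <= 0:
        return []
    table = [0] * starts
    for i in range(starts):
        for j in range(i + 1, starts):
            mismatches = sum(1 for a, b in zip(t[i:i + m], t[j:j + m]) if a != b)
            if mismatches <= k:
                # The relation is symmetric, so credit both positions.
                table[i] += 1
                table[j] += 1
    return table


def mappability_diagonal(t: str, m: int, k: int) -> list[int]:
    """Same table in O(n^2) time via a sliding window per offset d = j - i."""
    n = len(t)
    starts = n - m + 1
    if m <= 0 or starts <= 0:
        return []
    table = [0] * starts
    for d in range(1, starts):
        # Mismatches between t[0:m] and t[d:d+m].
        mism = sum(1 for p in range(m) if t[p] != t[p + d])
        i = 0
        while True:
            if mism <= k:
                table[i] += 1
                table[i + d] += 1
            if i + d + m >= n:  # j = i + d is already the last start position
                break
            # Slide the window one step: drop column i, add column i + m.
            mism -= (t[i] != t[i + d])
            mism += (t[i + m] != t[i + m + d])
            i += 1
    return table


if __name__ == "__main__":
    print(mappability_naive("abab", 2, 0))     # [1, 0, 1]
    print(mappability_diagonal("abab", 2, 0))  # [1, 0, 1]
```

In "abab" with m=2 and k=0, the substring "ab" occurs at positions 0 and 2, so each of those entries is 1, while "ba" at position 1 has no other match. Both baselines take quadratic time or worse in n, which is the regime the paper's main algorithm improves on for constant k.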


