MARIA: Multiple-alignment r-index with aggregation

09/19/2022
by Adrian Goga, et al.

There now exist compact indexes that can efficiently list all the occurrences of a pattern in a dataset consisting of thousands of genomes, or even all the occurrences of all the pattern's maximal exact matches (MEMs) with respect to the dataset. Unless we are lucky and the pattern is specific to only a few genomes, however, we could be swamped by hundreds of matches – or even hundreds per MEM – only to discover that most or all of the matches are to substrings that occupy the same few columns in a multiple alignment. To address this issue, in this paper we present MARIA, a simple and compact index that stores a multiple alignment such that, given the position and length of one match of a pattern (or of a MEM or other substring of a pattern), we can quickly list all the distinct columns of the multiple alignment where matches start.
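To make the query concrete, the following is a minimal brute-force sketch of the problem the abstract describes, not of MARIA itself: it rescans every row of a small alignment instead of using a compact index. The function names, the `-` gap character, and the toy alignment are all assumptions made for illustration.

```python
from typing import List, Tuple


def ungap(row: str, gap: str = "-") -> Tuple[str, List[int]]:
    """Return the row without gaps, plus the alignment column of each kept character."""
    chars, cols = [], []
    for col, ch in enumerate(row):
        if ch != gap:
            chars.append(ch)
            cols.append(col)
    return "".join(chars), cols


def distinct_start_columns(
    alignment: List[str], row: int, offset: int, length: int, gap: str = "-"
) -> List[int]:
    """List the distinct alignment columns where occurrences of one match start.

    The match is given by its position (row, offset into the ungapped row)
    and its length. This is a naive scan, assumed here only for illustration.
    """
    ungapped = [ungap(r, gap) for r in alignment]
    seq, _ = ungapped[row]
    if length <= 0 or offset < 0 or offset + length > len(seq):
        raise ValueError("match does not fit inside the chosen row")
    pattern = seq[offset:offset + length]

    columns = set()
    for other_seq, cols in ungapped:
        # Find every (possibly overlapping) occurrence in this row.
        start = other_seq.find(pattern)
        while start != -1:
            columns.add(cols[start])  # map text position back to its column
            start = other_seq.find(pattern, start + 1)
    return sorted(columns)


# Hypothetical toy alignment: three rows, nine columns.
EXAMPLE_ALIGNMENT = [
    "ACGT-ACGT",
    "ACGTTACGT",
    "AC-TTACGT",
]

if __name__ == "__main__":
    # The match "ACGT" at the start of row 0 occurs five times across the rows.
    print(distinct_start_columns(EXAMPLE_ALIGNMENT, row=0, offset=0, length=4))
```

In this toy input the substring `ACGT` has five occurrences across the three rows, yet they start in only two distinct columns, which is the kind of redundancy the abstract says a column-level listing avoids.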


