Prefix-free graphs and suffix array construction in sublinear space

06/26/2023
by Andrej Balaz, et al.

A recent paradigm shift in bioinformatics from a single reference genome to a pangenome has brought with it several graph structures. These graph structures must support operations such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem on sequential data, where data structures such as the suffix array and the Burrows-Wheeler transform allow for efficient computation. Attempts to achieve comparably high performance on graphs bring many complications, since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.
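
The abstract names the suffix array and the Burrows-Wheeler transform without defining them. The following is a minimal sketch on a plain string, for orientation only: it uses a naive sort-based construction, not the sublinear-space construction on prefix-free graphs that the paper describes. The function names and the `$` end-of-text sentinel are assumptions made for this illustration.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction by sorting the suffixes directly. This is for
    illustration only and is neither time- nor space-efficient.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(text, sa):
    """Return the Burrows-Wheeler transform of `text`.

    For each suffix in sorted order, take the character that precedes it.
    Assumes `text` ends with a unique sentinel (here "$") that is smaller
    than every other character; text[-1] wraps around to that sentinel.
    """
    return "".join(text[i - 1] for i in sa)


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern` in `text`, in increasing order.

    Two binary searches over the suffix array locate the contiguous block
    of suffixes that begin with `pattern`.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "banana$"
sa = suffix_array(text)
print(sa)
print(bwt_from_suffix_array(text, sa))
print(find_occurrences(text, sa, "ana"))
```

In this toy setting, mapping a read corresponds to calling `find_occurrences` with the read as the pattern. The difficulty the abstract points to is that a pangenome graph has no single text to sort suffixes of.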

research
12/14/2017

Techniques for Constructing Efficient Lock-free Data Structures

Building a library of concurrent data structures is an essential way to ...
research
03/26/2018

Extra Space during Initialization of Succinct Data Structures and Dynamical Initializable Arrays

Many succinct data structures on the word RAM require precomputed tables...
research
04/30/2021

Speeding up Python-based Lagrangian Fluid-Flow Particle Simulations via Dynamic Collection Data Structures

Array-like collection data structures are widely established in Python's...
research
05/31/2018

Tokenized Data Markets

We formalize the construction of decentralized data markets by introduci...
research
03/26/2018

Extra Space during Initialization of Succinct Data Structures and of Dynamical Initializable Arrays

Many succinct data structures on a word RAM require precomputed tables t...
research
01/31/2023

Universal Topological Regularities of Syntactic Structures: Decoupling Efficiency from Optimization

Human syntactic structures are usually represented as graphs. Much resea...
research
02/04/2021

Optimal Construction of Hierarchical Overlap Graphs

Genome assembly is a fundamental problem in Bioinformatics, where for a ...