Data structures to represent sets of k-long DNA sequences

03/29/2019
by Rayan Chikhi, et al.

The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying k-mer sets has emerged as a shared underlying component. Sets of k-mers have unique features and applications that, over the last ten years, have resulted in many specialized approaches for their representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query k-mer sets. We hope this survey will not only serve as a resource for researchers in the field but also make the area more accessible to outsiders.
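
As a rough illustration of what storing and querying a k-mer set means, the Python sketch below collects the k-mers of a sequence into an ordinary hash set and runs a membership query. It is not taken from the survey: the function name `kmers` and the example sequence are assumptions made for illustration, and a plain hash set is only a naive baseline, not one of the specialized representations the survey compares.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings (k-mers) of seq.

    Hypothetical helper for illustration; a plain Python set is a
    naive baseline, not a specialized k-mer data structure.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty range, hence an empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Build the 3-mer set of a short example sequence.
kmer_set = kmers("ACGTACG", 3)

# Membership query: is a given k-mer present in the set?
print("GTA" in kmer_set)  # True
print("TTT" in kmer_set)  # False
```

Note that "ACG" occurs twice in the example sequence but is stored once, since the set records presence rather than counts.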

research
04/12/2018

Fast Prefix Search in Little Space, with Applications

It has been shown in the indexing literature that there is an essential ...
research
06/24/2020

Small Longest Tandem Scattered Subsequences

We consider the problem of identifying tandem scattered subsequences wit...
research
05/25/2018

Strong link between BWT and XBW via Aho-Corasick automaton and applications to Run-Length Encoding

The boom of genomic sequencing makes compression of set of sequences ine...
research
10/07/2019

The Query Translation Landscape: a Survey

Whereas the availability of data has seen a manyfold increase in past ye...
research
07/24/2019

Exhaustive Exact String Matching: The Analysis of the Full Human Genome

Exact string matching has been a fundamental problem in computer science...
research
06/08/2021

Categorical Data Structures for Technical Computing

Many mathematical objects can be represented as functors from finitely-p...
research
07/19/2018

About BIRDS project (Bioinformatics and Information Retrieval Data Structures Analysis and Design)

BIRDS stands for "Bioinformatics and Information Retrieval Data Structur...
