
Fast entropy-bounded string dictionary look-up with mismatches

06/25/2018
by   Paweł Gawrychowski, et al.
Akademia Sztuk Pięknych we Wrocławiu
University of Haifa

We revisit the fundamental problem of dictionary look-up with mismatches. Given a set (dictionary) of d strings of length m and an integer k, we must preprocess it into a data structure to answer the following queries: Given a query string Q of length m, find all strings in the dictionary that are at Hamming distance at most k from Q. Chan and Lewenstein (CPM 2015) showed a data structure for k = 1 with optimal query time O(m/w + occ), where w is the size of a machine word and occ is the size of the output. The data structure occupies O(w d log^{1+ε} d) extra bits of space (beyond the entropy-bounded space required to store the dictionary strings). In this work we give a solution with similar bounds for a much wider range of values k. Namely, we give a data structure that has O(m/w + log^k d + occ) query time and uses O(w d log^k d) extra bits of space.
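
To make the query semantics concrete, here is a minimal brute-force sketch of dictionary look-up with mismatches. It is not the data structure described in the abstract: it does no preprocessing and spends O(d m) time per query, so it serves only as a reference for what a correct answer looks like. The function names `hamming_distance` and `lookup_with_mismatches` are illustrative and do not come from the paper.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def lookup_with_mismatches(dictionary: List[str], query: str, k: int) -> List[str]:
    """Return every dictionary string at Hamming distance at most k from query.

    Naive baseline (assumption: not the paper's method): scans all d strings
    and compares all m characters of each one.
    """
    return [s for s in dictionary if hamming_distance(s, query) <= k]


if __name__ == "__main__":
    words = ["abcd", "abce", "xbcd", "wxyz"]
    # With k = 1, every string differing from "abcd" in at most one position matches.
    print(lookup_with_mismatches(words, "abcd", 1))
```

The point of the entropy-bounded structures discussed above is to avoid this full scan: the query should cost roughly the time needed to read Q in machine words, plus a term depending on k and d, plus the output size.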


04/16/2019

Dynamic Packed Compact Tries Revisited

Given a dynamic set K of k strings of total length n whose characters ar...
08/04/2020

A Data-Structure for Approximate Longest Common Subsequence of A Set of Strings

Given a set of k strings I, their longest common subsequence (LCS) is th...
06/29/2020

Pattern Masking for Dictionary Matching

In the Pattern Masking for Dictionary Matching (PMDM) problem, we are gi...
12/21/2018

Lower bounds for text indexing with mismatches and differences

In this paper we study lower bounds for the fundamental problem of text ...
06/13/2019

On Longest Common Property Preserved Substring Queries

We revisit the problem of longest common property preserving substring q...
07/25/2019

How to Store a Random Walk

Motivated by storage applications, we study the following data structure...
10/05/2022

Internal Longest Palindrome Queries in Optimal Time

Palindromes are strings that read the same forward and backward. Problem...