Fast entropy-bounded string dictionary look-up with mismatches

by Paweł Gawrychowski, et al.
Akademia Sztuk Pięknych we Wrocławiu
University of Haifa

We revisit the fundamental problem of dictionary look-up with mismatches. Given a set (dictionary) of d strings of length m and an integer k, we must preprocess it into a data structure to answer the following queries: given a query string Q of length m, find all strings in the dictionary that are at Hamming distance at most k from Q. Chan and Lewenstein (CPM 2015) showed a data structure for k = 1 with optimal query time O(m/w + occ), where w is the size of a machine word and occ is the size of the output. The data structure occupies O(w d log^{1+ε} d) extra bits of space (beyond the entropy-bounded space required to store the dictionary strings). In this work we give a solution with similar bounds for a much wider range of values k. Namely, we give a data structure that has O(m/w + log^k d + occ) query time and uses O(w d log^k d) extra bits of space.
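To make the query semantics concrete, here is a minimal brute-force sketch of the problem statement only. It is not the data structure from the paper: it does no preprocessing and scans every dictionary string, taking O(dm) time per query. The function names `hamming_distance` and `lookup_with_mismatches` and the sample strings are illustrative assumptions.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def lookup_with_mismatches(dictionary: List[str], query: str, k: int) -> List[str]:
    """Naive baseline (hypothetical helper): return every dictionary string
    within Hamming distance k of the query, by scanning the whole dictionary."""
    return [s for s in dictionary if hamming_distance(s, query) <= k]


# Illustrative dictionary of d = 5 strings of length m = 4.
words = ["abcd", "abce", "xbcd", "xycd", "wxyz"]
print(lookup_with_mismatches(words, "abcd", 1))
```

A scan like this is the baseline that an indexed structure aims to beat: the bounds quoted above replace the dependence on d·m with m/w plus a term polylogarithmic in d, plus the output size.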


