On Sensitivity of Compact Directed Acyclic Word Graphs

03/03/2023
by Hiroto Fujimaru, et al.

Compact directed acyclic word graphs (CDAWGs) [Blumer et al. 1987] are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string T is obtained by merging isomorphic subtrees of the suffix tree [Weiner 1973] of T, which makes the CDAWG a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single-character edit operation (insertion, deletion, or substitution) is performed at the left end of the input string T; that is, we are interested in the worst-case increase in the size of the CDAWG after a left-end edit operation. We prove that if e is the number of edges of the CDAWG for string T, then the number of new edges added to the CDAWG after a left-end edit operation on T is less than e. Further, we present almost matching lower bounds on the sensitivity of CDAWGs for all cases of insertion, deletion, and substitution.
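To make the quantity e concrete, the following is a minimal brute-force sketch in Python, not the authors' construction. It assumes a unique terminator `$` is appended to T (the paper's exact convention may differ) and uses the standard observation that two suffix-tree nodes have isomorphic subtrees exactly when their substrings end at the same set of positions. The function names `cdawg_edge_count` and `left_insertion_growth` are hypothetical, and the cubic running time is only suitable for very short strings.

```python
def cdawg_edge_count(text, terminator="$"):
    """Count the edges of the CDAWG of text + terminator by brute force.

    Assumption: the terminator does not occur in text.
    """
    s = text + terminator
    n = len(s)

    # For every distinct substring (including the empty one), record the
    # set of positions at which one of its occurrences ends (exclusive).
    end_positions = {}
    for i in range(n + 1):
        for j in range(i, n + 1):
            end_positions.setdefault(s[i:j], set()).add(j)

    # Branching suffix-tree nodes are the right-maximal substrings, i.e.
    # those followed by at least two distinct characters, plus the root.
    # Nodes with the same end-position set have isomorphic subtrees and
    # are merged; all leaves merge into a single sink with no out-edges.
    out_degree = {}
    for w, ends in end_positions.items():
        next_chars = {s[e] for e in ends if e < n}
        if w == "" or len(next_chars) >= 2:
            out_degree[frozenset(ends)] = len(next_chars)

    return sum(out_degree.values())


def left_insertion_growth(text, ch):
    """Edges gained when ch is inserted at the left end of text."""
    return cdawg_edge_count(ch + text) - cdawg_edge_count(text)


if __name__ == "__main__":
    t = "abab"
    print(cdawg_edge_count(t), left_insertion_growth(t, "b"))
```

Under these assumptions, prepending `b` to `abab` raises the edge count from 5 to 7, so the growth of 2 stays below e = 5, consistent with the upper bound stated in the abstract for this one small case.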

research
07/19/2021

Sensitivity of string compressors and repetitiveness measures

The sensitivity of a string compression algorithm C asks how much the ou...
research
01/03/2021

Text Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations

In this paper we investigate the approximate string matching problem whe...
research
02/17/2020

DAWGs for parameterized matching: online construction and related indexing structures

Two strings x and y over Σ∪Π of equal length are said to parameterized m...
research
12/02/2018

Sequence Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations

Unbalanced translocations are among the most frequent chromosomal altera...
research
10/28/2018

Near-Linear Time Insertion-Deletion Codes and (1+ε)-Approximating Edit Distance via Indexing

We introduce fast-decodable indexing schemes for edit distance which can...
research
08/04/2023

Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph

In this paper, we present the first study of the computational complexit...
research
07/15/2022

Matching Patterns with Variables Under Edit Distance

A pattern α is a string of variables and terminal letters. We say that α...
