On Sensitivity of Compact Directed Acyclic Word Graphs

03/03/2023
by   Hiroto Fujimaru, et al.
0

Compact directed acyclic word graphs (CDAWGs) [Blumer et al. 1987] are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string T is obtained by merging isomorphic subtrees of the suffix tree [Weiner 1973] of the same string T, thus CDAWGs are a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single character edit operation (insertion, deletion, or substitution) is performed at the left-end of the input string T, namely, we are interested in the worst-case increase in the size of the CDAWG after a left-end edit operation. We prove that if e is the number of edges of the CDAWG for string T, then the number of new edges added to the CDAWG after a left-end edit operation on T is less than e. Further, we present almost matching lower bounds on the sensitivity of CDAWGs for all cases of insertion, deletion, and substitution.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset