Faster Attractor-Based Indexes
String attractors are a novel combinatorial object encompassing most known compressibility measures for highly-repetitive texts. Recently, the first index building on an attractor of size γ of a text T[1..n] was obtained. It uses O(γ log(n/γ)) space and finds the occ occurrences of a pattern P[1..m] in time O(m log n + occ log^ϵ n) for any constant ϵ > 0. We now show how to reduce the search time to O(m + (occ+1) log^ϵ n) within the same space, and ultimately obtain the optimal O(m + occ) time within O(γ log(n/γ) log n) space. Further, we show how to count the number of occurrences of P in time O(m + log^(3+ϵ) n) within O(γ log(n/γ)) space, or the optimal O(m) time within O(γ log(n/γ) log n) space. These turn out to be the first optimal-time indexes within grammar- and Lempel-Ziv-bounded space. As a byproduct of independent interest, we show how to build, in O(n log n) expected time and without knowing the size γ of the smallest attractor, a run-length context-free grammar of size O(γ log(n/γ)) generating (only) T.