On the Decision Tree Complexity of String Matching
String matching is one of the most fundamental problems in computer science. A natural question is how many characters of a string must be queried (i.e., the decision tree complexity) in order to determine whether the string contains a given pattern. Rivest showed that for every pattern p, any deterministic algorithm needs to query at least n - |p| + 1 characters in the worst case, where n is the length of the string and |p| is the length of the pattern. He further conjectured that this bound is tight. Using adversary methods, Tuza disproved this conjecture and showed that more than half of binary patterns are evasive, i.e., any algorithm needs to query all the characters. In this paper, we give a query algorithm which settles the decision tree complexity for almost all patterns. Using the algebraic approach of Rivest and Vuillemin, we give a new sufficient condition for the evasiveness of patterns, which reveals an interesting connection to Skolem's Problem.
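To make the notion of decision tree complexity concrete, the following is a minimal brute-force sketch for binary strings. It is an illustration only, not the query algorithm of the paper: the function name `query_complexity` and its structure are assumptions introduced here, and the exhaustive search is only feasible for very small n.

```python
from functools import lru_cache
from itertools import product


def query_complexity(pattern: str, n: int) -> int:
    """Worst-case number of character queries an optimal deterministic
    algorithm needs to decide whether a binary string of length n
    contains `pattern`. Exhaustive search; small n only."""

    def outcomes(state):
        # Collect the possible answers (pattern present or not) over all
        # ways of filling in the characters that have not been queried.
        unknown = [i for i, c in enumerate(state) if c is None]
        seen = set()
        for bits in product("01", repeat=len(unknown)):
            chars = list(state)
            for i, b in zip(unknown, bits):
                chars[i] = b
            seen.add(pattern in "".join(chars))
        return seen

    @lru_cache(maxsize=None)
    def depth(state):
        # The answer is already determined: no further queries are needed.
        if len(outcomes(state)) == 1:
            return 0
        best = n
        for i, c in enumerate(state):
            if c is None:
                # The adversary picks the worse of the two possible answers.
                worst = max(
                    depth(state[:i] + (b,) + state[i + 1:]) for b in "01"
                )
                best = min(best, 1 + worst)
        return best

    # Start with every position unqueried.
    return depth((None,) * n)
```

For example, `query_complexity("11", 3)` returns 3: at that length every character must be queried in the worst case, whereas the lower bound n - |p| + 1 only guarantees 2.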