1 Introduction
Formal languages have a long history of study in classical theoretical computer science, starting with the study of regular languages going back to Kleene in the 1950s [12]. Roughly speaking, a formal language consists of an alphabet of letters and a set of rules for generating words from those letters. Chomsky's hierarchy is an early attempt to answer the following question: "Given more complex rules, what kinds of languages can we generate?". The most well-known types of languages in that hierarchy are the regular and context-free languages. Modern computational complexity theory is still defined in terms of languages: complexity classes are defined as the sets of formal languages that can be parsed by machines with certain computational powers.
The relationship between the Chomsky hierarchy and other models of computation has been studied extensively in many settings, including Turing machines, probabilistic machines [18], quantum finite automata [4], streaming algorithms [16, 5] and query complexity [2]. Query complexity is also known as the "black-box model": in this setting we only count the number of times that we need to query (i.e. access) the input in order to carry out our computation. It has been observed that quantum models of computation allow for significant improvements in query complexity when quantum oracle access to the input bits is available [9]. We assume the reader is familiar with the basics of quantum computing; one may refer to [17] for a more detailed introduction to this topic. The recent work by Scott Aaronson, Daniel Grier, and Luke Schaeffer [1] is the first to study the relationship between regular languages and quantum query complexity. They give a full characterization of regular languages in the quantum query complexity model. More precisely, they show that every regular language naturally falls into one of three categories:

‘Trivial’ languages, for which membership can be decided by the first and last characters of the input string. For instance, the language describing all binary representations of even numbers is trivial.

Star-free languages, a variant of regular languages where complement is allowed (i.e. "everything not in $A$"), but the Kleene star is not. The quantum query complexity of these languages is $\widetilde{\Theta}(\sqrt{n})$.

All the rest, which have quantum query complexity $\Theta(n)$.
The proof uses the algebraic definition of regular languages (i.e. in terms of monoids). Starting from an aperiodic monoid, Schützenberger's construction builds a star-free language recursively based on the "rank" of the monoid elements involved. [1] uses this decomposition of a star-free language of higher rank into star-free languages of smaller rank to show by induction that any star-free language has quantum query complexity $\widetilde{O}(\sqrt{n})$. However, their proof does not immediately give rise to an algorithm.
One of the star-free languages mentioned in [1] is the Dyck language (with one type of parenthesis) of constant bounded height. The Dyck language is the set of balanced strings of brackets "(" and ")". When at any point the number of opening parentheses may exceed the number of closing parentheses by at most $k$, we denote the language $\mathrm{Dyck}_k$.
The Dyck language is a fundamental example of a context-free language that is not regular. When more types of parentheses are allowed, the famous Chomsky–Schützenberger representation theorem shows that any context-free language is the homomorphic image of the intersection of a Dyck language and a regular language.
Contributions
We give an explicit algorithm (see Theorem 15) for the decision problem of $\mathrm{Dyck}_{k,n}$ with $O(\sqrt{n}(\log n)^{0.5k})$ quantum queries. The algorithm also works when $k$ is not a constant and is better than the trivial upper bound of $n$ queries when $k = o\!\left(\frac{\log n}{\log\log n}\right)$. We note that when $k$ is not a constant, that is, if the height is allowed to depend on the length of the word, $\mathrm{Dyck}_{k,n}$ is not context-free anymore, therefore previous results do not apply. We also obtain lower bounds on the quantum query complexity. We show (Theorem 21) that there exists $c > 1$ such that $Q(\mathrm{Dyck}_{k,n}) = \Omega(c^k \sqrt{n})$. When $k = \Theta(\log n)$, the quantum query complexity is close to linear in $n$, i.e. $\Omega(n^{1-\epsilon})$ for all $\epsilon > 0$, see Theorem 20. Furthermore, when $k \ge \epsilon n$ for some $\epsilon > 0$, we show (Theorem 19) that $Q(\mathrm{Dyck}_{k,n}) = \Omega(n)$. Similar lower bounds were recently independently proven by Ambainis, Balodis, Iraids, Prūsis, and Smotrovs [3], and by Buhrman, Patro and Speelman [8].
Notation
The Dyck language is the set of balanced strings of brackets "(" and ")". When at any point the number of opening parentheses may exceed the number of closing parentheses by at most $k$, we denote the language $\mathrm{Dyck}_k$. $\mathrm{Dyck}_{k,n}$ is the set of words of length $n$ in $\mathrm{Dyck}_k$; $k$ can be a function of $n$. For readability reasons, we identify "(" with $1$ and ")" with $0$. The alphabet is thus $\Sigma = \{0, 1\}$. For all $w \in \Sigma^*$, we define $|w|$ as the length of $w$, and $h(w) = \#_1(w) - \#_0(w)$, where $\#_a(w)$ denotes the number of occurrences of the letter $a$ in $w$. We call $h(w)$ the balance. For all $1 \le i \le j \le |w|$, we define $w[i, j] = w_i w_{i+1} \dots w_j$. Finally, we define $h_{\min}(w) = \min_{1 \le i \le |w|} h(w[1, i])$ and $h_{\max}(w) = \max_{1 \le i \le |w|} h(w[1, i])$. We also define the function $\mathrm{sign}$ such that $\mathrm{sign}(x) = +1$ if $x > 0$, $\mathrm{sign}(x) = 0$ if $x = 0$, and $\mathrm{sign}(x) = -1$ if $x < 0$.
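To make the notation concrete, the balance and membership in $\mathrm{Dyck}_{k,n}$ can be checked classically in a single pass. The sketch below is only an illustration of the definitions (the helper names are ours, not part of the algorithms that follow):

```python
def balance(w: str) -> int:
    """The balance h(w): the number of '(' minus the number of ')'."""
    return w.count("(") - w.count(")")

def in_dyck(w: str, k: int) -> bool:
    """Membership in Dyck_{k,n}: every prefix has balance in [0, k]
    and the whole word has balance 0."""
    h = 0
    for c in w:
        h += 1 if c == "(" else -1
        if h < 0 or h > k:  # the height left the interval [0, k]
            return False
    return h == 0
```

For example, `in_dyck("(())", 2)` holds, while the same word is rejected for the height bound $k = 1$.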
Structure of the paper
In the next section, we give an algorithm of quantum query complexity $O(\sqrt{n}(\log n)^{0.5k})$ for $\mathrm{Dyck}_{k,n}$. In the following section, we show some lower bounds for the case where $k$ is not constant.
2 An algorithm for $\mathrm{Dyck}_{k,n}$
2.1 Substring Search Algorithm
The goal of this section is to describe a quantum algorithm for finding a substring that has a balance of $+d$ or $-d$ for some integer $d$. This algorithm is the basis of our algorithms for the $\mathrm{Dyck}_{k,n}$ languages.
We describe a $\pm 2$-substring search algorithm in Section 2.1.1, a $\pm 3$-substring search algorithm in Section 2.1.2, a $\pm 4$-substring search algorithm in Section 2.1.3, and then we finish with an algorithm for the general case in Section 2.1.4.
2.1.1 $\pm 2$-Substring Search Algorithm
The simplest case is an algorithm that searches for a substring that has a balance of $+2$ or $-2$. The algorithm looks for two sequential equal symbols using Grover's search algorithm [10, 7]. Formally, it is a procedure that accepts the following parameters as inputs and outputs:

Inputs:

an integer $l$ which is a left border for the substring to be searched. Here $1 \le l \le n$.

an integer $r$ which is a right border for the substring to be searched. Here $l \le r \le n$.

a set $s \subseteq \{+1, -1\}$ which represents the admissible signs of the balance for the substring to be searched.


Outputs:

a triple $(l', r', \sigma)$ where $l'$ and $r'$ are the left and right borders of the found substring and where $\sigma$ is the sign of its balance, i.e. $\sigma = \mathrm{sign}(h(w[l', r']))$. If there is no such substring, then the algorithm returns NULL. Furthermore, when there is a satisfying substring, the result is such that $l \le l' \le r' \le r$ and $\sigma \in s$.

The algorithm searches for a substring $w[i, i+1]$ such that $l \le i < r$ and $w_i = w_{i+1}$.
We use Grover's search algorithm as a subroutine that takes as inputs $l$ and $r$ as the left and right borders of the search space and some function $f$. We search for any index $i$, where $l \le i \le r$, such that $f(i) = 1$. The result of the subroutine is either some index $i$ or NULL if it has not found the required $i$.
Lemma 1.
If the segment $[l, r]$ contains a $\pm 2$-substring whose sign of balance belongs to $s$, then Algorithm 1 finds one with running time $O(\sqrt{r - l})$ and constant error probability.
Proof.
The main part of the algorithm is Grover's search algorithm, which has running time $O(\sqrt{r - l})$ and constant error probability. ∎
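The predicate handed to Grover's search can be made explicit with a classical stand-in (the linear scan below is exactly what Grover's algorithm replaces, reducing the cost from $O(r - l)$ to $O(\sqrt{r - l})$ queries; the function name is ours):

```python
def find_pm2_substring(w: str, l: int, r: int):
    """Search the segment [l, r] (0-based, inclusive) for two sequential
    equal symbols, i.e. a substring of balance +2 ('((') or -2 ('))')."""
    def f(i: int) -> bool:
        # The predicate queried by Grover's search.
        return w[i] == w[i + 1]

    for i in range(l, r):  # Grover would search over i in [l, r - 1]
        if f(i):
            sign = +1 if w[i] == "(" else -1
            return (i, i + 1, sign)
    return None  # NULL: no such substring
```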
It will be useful to consider a modification of the algorithm that finds not just any substring, but the one closest to the left border or to the right border. In that case, we use a subroutine Grover_First_One that accepts $l$ and $r$ as the left and right borders of the search space, a function $f$, and a search direction.

If the direction is towards the right border, then we search for the maximal index $j$ such that $f(j) = 1$, where $l \le j \le r$.

If the direction is towards the left border, then we search for the minimal index $j$ such that $f(j) = 1$, where $l \le j \le r$.
The result of the subroutine is either the index $j$ or NULL if it has not found the required $j$. See [13, 14, 15] on how to implement such a subroutine.
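Classically, such a first-one search can be mimicked by scanning segments of exponentially growing size from the preferred border; the quantum versions in [13, 14, 15] replace each scan by Grover's search. A minimal sketch under this doubling strategy (the function name and signature are ours):

```python
def first_one(f, l: int, r: int, leftmost: bool = True):
    """Find the minimal (if leftmost) or maximal index i in [l, r] with
    f(i) true, probing segments of doubling size from the preferred border."""
    size = 1
    while True:
        if leftmost:
            seg = range(l, min(l + size, r + 1))
        else:
            seg = range(max(r - size + 1, l), r + 1)
        hits = [i for i in seg if f(i)]  # stand-in for Grover's search
        if hits:
            return min(hits) if leftmost else max(hits)
        if size >= r - l + 1:
            return None  # NULL: no index satisfies f
        size *= 2
```

Since the segment sizes double, an answer at distance $d$ from the preferred border is reached after $O(\log d)$ rounds, which is where the distance-dependent expected-time bounds of the following lemmas come from.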
Algorithm 2 implements the subroutine. It has the same input and output parameters as Algorithm 1, plus an extra input: the search direction.
Lemma 2.
If the required substring exists, the expected running time of Algorithm 2 is $O(\sqrt{d})$, where $d$ is the distance from the found substring to the farthest border of the search segment. Otherwise, the running time is $O(\sqrt{r - l})$. The error probability is at most a constant.
2.1.2 $\pm 3$-Substring Search Algorithm
We now discuss an algorithm that searches for a substring that has a balance of $+3$ or $-3$. The algorithm searches for two $\pm 2$-substrings $v$ and $u$ such that there are no $\pm 2$-substrings between them. If both substrings $v$ and $u$ are $+2$-substrings, then we get a $+3$-substring in total. If both substrings are $-2$-substrings, then we get a $-3$-substring in total.
Firstly, we discuss a basic procedure for the algorithm that can fail. To search for a substring of length at most $\ell$, we do the following steps, for some integer $\ell$. Assume that the procedure searches for a substring in the segment $[l, r]$, where $1 \le l \le r \le n$.

Randomly pick a position $t \in [l, r]$.

Search for the first $\pm 2$-substring on the right of $t$ at distance at most $\ell$, i.e. in the segment $[t, \min(t + \ell, r)]$. If the algorithm does not find it, then the procedure fails. Otherwise the found segment $[l_1, r_1]$ is a $\pm 2$-substring.

Search for the first $\pm 2$-substring on the left of $l_1$ at distance at most $\ell$, i.e. in the segment $[\max(l_1 - \ell, l), l_1 - 1]$. If the algorithm finds a $\pm 2$-substring $[l_2, r_2]$ and it has the same sign of balance as the first substring, i.e. $\mathrm{sign}(h(w[l_2, r_2])) = \mathrm{sign}(h(w[l_1, r_1]))$, then we keep $[l_2, r_2]$ and $[l_1, r_1]$ and go to Step 5. Otherwise, the algorithm goes to Step 4.

If we have not found a matching $\pm 2$-substring on the left, then we search for the first $\pm 2$-substring on the right of $r_1$ at distance at most $\ell$, i.e. in the segment $[r_1 + 1, \min(r_1 + \ell, r)]$. If we do not find it, then the algorithm fails. Otherwise it outputs the found $\pm 2$-substring together with the first one.

If the algorithm has found two $\pm 2$-substrings $[l_2, r_2]$ and $[l_1, r_1]$ such that $\mathrm{sign}(h(w[l_2, r_2])) = \mathrm{sign}(h(w[l_1, r_1]))$ and there is no $\pm 2$-substring between them, then $[l_2, r_1]$ is the answer; otherwise the algorithm fails.
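Assuming the $\pm 2$-search primitives of Section 2.1.1, the fail-able procedure above can be sketched classically as follows (a simplified stand-in with names of our choosing; the quantum version replaces each first-substring search by the Grover-based Algorithm 2):

```python
import random

def first_pm2(w, lo, hi, leftmost=True):
    """First pair of equal adjacent symbols with left index in [lo, hi - 1],
    scanning from the preferred border; returns (i, i + 1, sign) or None."""
    idx = range(lo, hi) if leftmost else range(hi - 1, lo - 1, -1)
    for i in idx:
        if w[i] == w[i + 1]:
            return (i, i + 1, +1 if w[i] == "(" else -1)
    return None

def try_find_pm3(w, l, r, ell):
    """One fail-able attempt to find a +-3-substring of length <= ell."""
    t = random.randint(l, r)                          # Step 1: random position
    right = first_pm2(w, t, min(t + ell, r))          # Step 2: first +-2 on the right
    if right is None:
        return None
    l1, r1, s1 = right
    left = first_pm2(w, max(l1 - ell, l), max(l1 - 1, l), leftmost=False)  # Step 3
    if left is not None and left[2] == s1:
        return (left[0], r1, s1)                      # Step 5: combine the two
    nxt = first_pm2(w, r1 + 1, min(r1 + 1 + ell, r))  # Step 4: retry on the right
    if nxt is not None and nxt[2] == s1:
        return (l1, nxt[1], s1)
    return None                                       # the attempt failed
```

Each attempt succeeds only when the random position lands near the target substring, which is why the success probability below is bounded by the ratio of the target's length to the size of the search space.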
Algorithm 3 implements this procedure and accepts as inputs:

the borders $l$ and $r$, where $l$ and $r$ are integers such that $1 \le l \le r \le n$.

the position $t$, where $l \le t \le r$.

the maximal distance $\ell$, where $\ell$ is an integer such that $1 \le \ell \le r - l + 1$.

the sign of the balance $\sigma \in \{+1, -1, 0\}$: $\sigma = +1$ is used for searching a $+3$-substring, $\sigma = -1$ for a $-3$-substring, and $\sigma = 0$ for searching both.
Lemma 3.
Assume that the segment $[l, r]$ contains a $\pm 3$-substring of length $\ell' \le \ell$. Then Algorithm 3 finds it with probability at least $\frac{\ell'}{r - l + 1}$ and running time $O(\sqrt{\ell})$.
Proof.
Let us prove the correctness of the algorithm in the case where the randomly picked position $t$ lies inside the searched substring. Let us consider the case of a $+3$-substring; the $-3$ case is similar. Assume that the substring to be searched is $w[i, j]$, formed by two $+2$-substrings $[l_2, r_2]$ and $[l_1, r_1]$ with $r_2 < l_1$ and no $\pm 2$-substring between them. There are two cases:

Assume that $i \le t \le l_2$. Then the first invocation of the first-substring search (Step 2) finds $[l_2, r_2]$. The search on the left (Step 3) finds no $+2$-substring within distance $\ell$, so the third invocation (Step 4) finds $[l_1, r_1]$, and the algorithm outputs the answer.

Assume that $l_2 < t \le l_1$. Then the first invocation finds $[l_1, r_1]$, and the second invocation (Step 3) finds $[l_2, r_2]$, which has the same sign of balance; the algorithm outputs the answer.

In both cases all searches are restricted to distance at most $\ell$ from the picked position, so they stay within their borders. Due to Lemma 2, the running time of each invocation is $O(\sqrt{\ell})$.
The probability of picking $t$ inside the required segment is the length of the segment over the length of the search space, i.e. $\frac{\ell'}{r - l + 1}$. ∎
We now provide an algorithm to search for any $\pm 3$-substring of fixed length with high probability. Algorithm 3 succeeds with probability $\Omega(\ell' / (r - l + 1))$, where $\ell'$ is the length of the target substring. We can use the amplitude amplification algorithm [7], which is a generalization of Grover's search algorithm [10], to boost the success probability to a constant. We should invoke the base algorithm $O(\sqrt{(r - l)/\ell'})$ times, but we do not know $\ell'$, which depends on the unknown target substring. Therefore we invoke it $O(\sqrt{(r - l)/\ell})$ times. Let us call this procedure the fixed-length version of Algorithm 3. It accepts the same parameters as Algorithm 3 except for the position $t$.
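The quantitative point is worth spelling out: a classical procedure that succeeds with probability $p$ must be repeated $\Theta(1/p)$ times for constant overall success, whereas amplitude amplification needs only $\Theta(\sqrt{1/p})$ invocations. A toy classical booster (with a hypothetical interface) for comparison:

```python
import math

def boost(base, p_lower_bound):
    """Repeat a fail-able procedure `base` ceil(1 / p) times, so that the
    overall failure probability drops below a constant.  Amplitude
    amplification achieves the same with only O(sqrt(1 / p)) invocations."""
    tries = math.ceil(1 / p_lower_bound)
    for _ in range(tries):
        res = base()
        if res is not None:
            return res
    return None
```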
We can now write an algorithm to search for any $\pm 3$-substring. We choose the length $\ell$ as a power of $2$ and search for substrings of such length. We start with $\ell = 4$ since the minimal length of $\pm 3$-substrings is $3$ and $4$ is the smallest power of $2$ that is greater than $3$. Algorithm 4 accepts the following parameters:

the borders $l$ and $r$, where $l$ and $r$ are integers such that $1 \le l \le r \le n$.

the sign of the balance $\sigma \in \{+1, -1, 0\}$: $\sigma = +1$ is used for searching a $+3$-substring, $\sigma = -1$ for a $-3$-substring, and $\sigma = 0$ for searching both.
Lemma 4.
Algorithm 4 finds some $\pm 3$-substring with probability at least $1/2$ and running time $O(\sqrt{r - l}\,\log \ell_{\min})$, where $\ell_{\min}$ is the length of the shortest $\pm 3$-substring.
Proof.
Assume that the shortest $\pm 3$-substring is $w[i, j]$ with length $\ell' = j - i + 1$. The first invocation of the fixed-length procedure that can find the substring is the one with $\ell = 2^{\lceil \log_2 \ell' \rceil}$. The working time of this invocation is $O(\sqrt{(r - l)/\ell} \cdot \sqrt{\ell}) = O(\sqrt{r - l})$ due to the complexity of amplitude amplification and Lemma 3.
The total running time is $O(\sqrt{r - l}\,\log \ell')$ because before reaching $\ell$ the algorithm does $O(\log \ell')$ steps of the loop.
We can now estimate the success probability. The number of steps of the amplitude amplification algorithm in the case $\ell = 2^{\lceil \log_2 \ell' \rceil}$ is at most a constant factor more than is required, because the highest success probability is achieved for $\Theta(\sqrt{(r - l)/\ell'})$ steps. That is why we get a constant probability of success. ∎

Now consider the algorithm that finds the first $\pm 3$-substring. The idea of the algorithm is similar to the first-one search algorithm from [13, 14, 15]. We search for a substring in segments whose lengths are powers of $2$, starting from the relevant border. Assume that the answer is at distance $d$ from that border; then the first time the substring can be found is when the segment length reaches $2^{\lceil \log_2 d \rceil}$.
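The doubling over candidate lengths used by Algorithm 4 can be sketched generically: the fixed-length attempt is treated as a black box, and the loop stops at the first scale that can contain the shortest target substring (the function and parameter names are ours):

```python
def find_any(attempt, l: int, r: int, min_len: int = 4):
    """Try candidate lengths ell = min_len, 2 * min_len, ... (powers of two).
    `attempt(ell)` is the (boosted) fixed-length search; it returns a found
    substring or None.  Since the first scale ell >= ell_min can already
    succeed, the loop performs only O(log ell_min) iterations."""
    ell = min_len
    while ell <= 2 * (r - l + 1):
        res = attempt(ell)
        if res is not None:
            return res
        ell *= 2
    return None
```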
Algorithm 5 has the following property.
Lemma 5.
If the required substring exists, the expected running time of Algorithm 5 is $O(\sqrt{d}\,\log d)$, where $d$ is the distance from the found substring to the farthest border of the search segment. Otherwise, the running time is $O(\sqrt{r - l}\,\log(r - l))$. The error probability is at most a constant.
2.1.3 $\pm 4$-Substring Search Algorithm
Let us now discuss an algorithm that searches for a substring that has a balance of $+4$ or $-4$. The algorithm searches for two $\pm 3$-substrings $v$ and $u$ such that there are no $\pm 3$-substrings between them. If both substrings $v$ and $u$ are $+3$-substrings, then we get a $+4$-substring in total. If both substrings are $-3$-substrings, then we get a $-4$-substring in total.
The scheme of the algorithm is similar to the $\pm 3$-substring search algorithm. We briefly review the common parts of the two algorithms.
The basic procedure for the algorithm, which can fail, searches for a $\pm 4$-substring of length at most $\ell$ in the segment $[l, r]$, where $1 \le l \le r \le n$.

Randomly pick a position $t \in [l, r]$.

Search for the first $\pm 3$-substring on the right of $t$ at distance at most $\ell$, i.e. in $[t, \min(t + \ell, r)]$. The result is a $\pm 3$-substring $[l_1, r_1]$ or fail.

Search for the first $\pm 3$-substring on the left of $l_1$ at distance at most $\ell$, i.e. in $[\max(l_1 - \ell, l), l_1 - 1]$. The result is $[l_2, r_2]$. If $\mathrm{sign}(h(w[l_2, r_2])) = \mathrm{sign}(h(w[l_1, r_1]))$, then go to Step 5. Otherwise, the algorithm goes to Step 4.

Search for the first $\pm 3$-substring on the right of $r_1$ at distance at most $\ell$, i.e. in $[r_1 + 1, \min(r_1 + \ell, r)]$. The result is a $\pm 3$-substring or the procedure fails.

If the algorithm has found two $\pm 3$-substrings $[l_2, r_2]$ and $[l_1, r_1]$ such that $\mathrm{sign}(h(w[l_2, r_2])) = \mathrm{sign}(h(w[l_1, r_1]))$ and there is no $\pm 3$-substring between them, then $[l_2, r_1]$ is the answer; otherwise the algorithm fails.
Algorithm 6 implements this procedure and its input parameters are the same as for Algorithm 3.
Let us discuss the property of the algorithm.
Lemma 6.
Assume that the segment $[l, r]$ contains a $\pm 4$-substring of length $\ell' \le \ell$. Then Algorithm 6 finds it with probability at least $\frac{\ell'}{r - l + 1}$ and running time $O(\sqrt{\ell}\,\log \ell)$.
Proof.
Let us prove the correctness of the algorithm in the case where the randomly picked position $t$ lies inside the searched substring. Consider the case of a $+4$-substring; the $-4$ case is similar. Assume that the substring to be searched is formed by two $+3$-substrings $[l_2, r_2]$ and $[l_1, r_1]$ with $r_2 < l_1$ and no $\pm 3$-substring between them. There are two cases:

Assume that $t \le l_2$. Then the first invocation of the first-substring search finds $[l_2, r_2]$, the search on the left finds no matching $+3$-substring, and the third invocation finds $[l_1, r_1]$; the algorithm outputs the answer.

Assume that $l_2 < t \le l_1$. Then the first invocation finds $[l_1, r_1]$, and the second invocation finds $[l_2, r_2]$, which has the same sign of balance; the algorithm outputs the answer.

Due to Lemma 5, the running time of each invocation is $O(\sqrt{\ell}\,\log \ell)$.
The probability of picking $t$ inside the required segment is the length of the segment over the length of the search space, i.e. $\frac{\ell'}{r - l + 1}$. ∎
We now provide an algorithm to search for any $\pm 4$-substring of a fixed length with high probability. Algorithm 6 succeeds with probability $\Omega(\ell' / (r - l + 1))$. As for the $\pm 3$-substring search, we use the amplitude amplification algorithm [7]. We should invoke the base algorithm $O(\sqrt{(r - l)/\ell'})$ times, but we do not know $\ell'$ beforehand. That is why we invoke it $O(\sqrt{(r - l)/\ell})$ times. Let us call this procedure the fixed-length version of Algorithm 6. It accepts the same parameters as Algorithm 6 except for the position $t$.
Finally, we build an algorithm to search for any $\pm 4$-substring. We choose powers of $2$ as the lengths and search for substrings of such lengths. We start with $\ell = 4$ because the minimal length of $\pm 4$-substrings is $4$. Algorithm 7 accepts the same parameters as Algorithm 4.
Let us discuss the property of Algorithm 7.
Lemma 7.
Algorithm 7 finds some $\pm 4$-substring with probability at least $1/2$ and running time $O(\sqrt{r - l}\,\log^2 \ell_{\min})$, where $\ell_{\min}$ is the length of the shortest $\pm 4$-substring.
Proof.
Assume that the shortest $\pm 4$-substring is $w[i, j]$ with length $\ell' = j - i + 1$. The first invocation of the fixed-length procedure that can find the substring is the one with $\ell = 2^{\lceil \log_2 \ell' \rceil}$. The working time of this invocation is $O(\sqrt{(r - l)/\ell} \cdot \sqrt{\ell}\,\log \ell) = O(\sqrt{r - l}\,\log \ell')$ due to the complexity of amplitude amplification and Lemma 6.
The total running time is $O(\sqrt{r - l}\,\log^2 \ell')$ because before reaching $\ell$ the algorithm does $O(\log \ell')$ steps of the loop.
Let us estimate the success probability. The number of steps of the amplitude amplification algorithm in the case $\ell = 2^{\lceil \log_2 \ell' \rceil}$ is at most a constant factor more than is required, because the highest success probability is achieved for $\Theta(\sqrt{(r - l)/\ell'})$ steps. That is why we get a constant probability of success. ∎
Let us consider the algorithm that finds the first $\pm 4$-substring. The idea of the algorithm is similar to the first $\pm 3$-substring search and the first-one search algorithm from [13, 14, 15]. We search for a substring in segments whose lengths are powers of $2$. Assume that the answer is at distance $d$ from the border from which we search; then the first time the substring can be found is when the segment length reaches $2^{\lceil \log_2 d \rceil}$.