 # Quantum Query Complexity of Dyck Languages with Bounded Height

We consider the problem of determining whether a sequence of parentheses is well parenthesized, with nesting depth at most h. We denote this language Dyck_h. We study the quantum query complexity of this problem for different h as a function of the length n of the word. It is known from a recent paper by Aaronson et al. that, for any constant h, since Dyck_h is star-free, it has quantum query complexity Θ̃(√n), where the hidden logarithmic factors in Θ̃ depend on h. Their proof, however, does not give rise to an algorithm. When h is not a constant, Dyck_h is not even context-free. We give an algorithm with O(√n (log n)^(h−1)) quantum queries for Dyck_h for all h. This is better than the trivial upper bound n when h = o(log(n)/log log n). We also obtain lower bounds: we show that for every 0 < ϵ ≤ 0.37, there exists c > 0 such that Q(Dyck_{c log n}(n)) = Ω(n^(1−ϵ)). When h = ω(log n), the quantum query complexity is close to n, i.e. Q(Dyck_h(n)) = ω(n^(1−ϵ)) for all ϵ > 0. Furthermore, when h = Ω(n^ϵ) for some ϵ > 0, Q(Dyck_h(n)) = Θ(n).


## 1 Introduction

Formal languages have a long history of study in classical theoretical computer science, starting with the study of regular languages, which goes back to Kleene in the 1950s. Roughly speaking, a formal language consists of an alphabet of letters and a set of rules for generating words from those letters. Chomsky's hierarchy is an early attempt to answer the following question: "Given more complex rules, what kinds of languages can we generate?" The most well-known types of languages in that hierarchy are the regular and context-free languages. Modern computational complexity theory is still defined in terms of languages: complexity classes are defined as sets of formal languages that can be decided by machines with certain computational powers.

The relationship between the Chomsky hierarchy and other models of computation has been studied extensively in many models, including Turing machines, probabilistic machines, quantum finite automata, streaming algorithms [16, 5] and query complexity. Query complexity is also known as the 'black box model': in this setting, we only count the number of times we need to query (i.e. access) the input in order to carry out our computation. It has been observed that quantum models of computation allow for significant improvements in query complexity when quantum oracle access to the input bits is available. We assume the reader is familiar with the basics of quantum computing, and refer to the literature for a more detailed introduction to this topic.

The recent work by Scott Aaronson, Daniel Grier, and Luke Schaeffer is the first to study the relationship between regular languages and quantum query complexity. They give a full characterization of regular languages in the quantum query complexity model. More precisely, they show that every regular language naturally falls into one of three categories:

• ‘Trivial’ languages, for which membership can be decided by the first and last characters of the input string. For instance, the language describing all binary representations of even numbers is trivial.

• Star-free languages, a variant of regular languages where complement is allowed (i.e. 'everything not in A'), but the Kleene star is not. The quantum query complexity of these languages is Θ̃(√n).

• All the rest, which have quantum query complexity Θ(n).

The proof uses the algebraic definition of regular languages (i.e. in terms of monoids). Starting from an aperiodic monoid, Schützenberger constructs a star-free language recursively based on the "rank" of the monoid elements involved. Aaronson et al. use this decomposition of a star-free language of higher rank into star-free languages of smaller rank to show by induction that any star-free language has quantum query complexity Õ(√n). However, their proof does not immediately give rise to an algorithm.

One of the star-free languages mentioned in that work is the Dyck language (with one type of parenthesis) with height bounded by a constant. The Dyck language is the set of balanced strings of brackets "(" and ")". When, at any point, the number of opening parentheses may exceed the number of closing parentheses by at most h, we denote the language Dyck_h.

The Dyck language is a fundamental example of a context-free language that is not regular. When more types of parentheses are allowed, the famous Chomsky–Schützenberger representation theorem shows that any context-free language is the homomorphic image of the intersection of a Dyck language and a regular language.
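For intuition, membership in Dyck_h is easy to decide classically with a single left-to-right pass; the following sketch (the function name is ours) makes the height condition explicit:

```python
def in_dyck_h(word, h):
    """Check whether `word` lies in Dyck_h: balanced parentheses
    whose nesting depth never exceeds h."""
    depth = 0
    for c in word:
        depth += 1 if c == "(" else -1
        if depth < 0 or depth > h:
            return False  # a prefix is unbalanced or too deep
    return depth == 0     # every bracket is closed at the end
```

The quantum question is how few of the n input positions must be inspected; this classical pass reads all of them.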

##### Contributions

We give an explicit algorithm (see Theorem 15) for the decision problem of Dyck_h(n) with O(√n (log n)^(h−1)) quantum queries. The algorithm also works when h is not a constant and is better than the trivial upper bound n when h = o(log(n)/log log n). We note that when h is not a constant, that is, if the height is allowed to depend on the length of the word, Dyck_h is not context-free anymore, therefore previous results do not apply. We also obtain lower bounds on the quantum query complexity. We show (Theorem 21) that for every 0 < ϵ ≤ 0.37, there exists c > 0 such that Q(Dyck_{c log n}(n)) = Ω(n^(1−ϵ)). When h = ω(log n), the quantum query complexity is close to n, i.e. Q(Dyck_h(n)) = ω(n^(1−ϵ)) for all ϵ > 0, see Theorem 20. Furthermore, when h = Ω(n^ϵ) for some ϵ > 0, we show (Theorem 19) that Q(Dyck_h(n)) = Θ(n). Similar lower bounds were recently independently proven by Ambainis, Balodis, Iraids, Prūsis, and Smotrovs, and by Buhrman, Patro and Speelman.

##### Notation

The Dyck language is the set of balanced strings of brackets "(" and ")". When, at any point, the number of opening parentheses may exceed the number of closing parentheses by at most h, we denote the language Dyck_h. Dyck_h(n) is the set of words of length n in Dyck_h; h can be a function of n. The alphabet is thus Σ = {(, )}. For all words u ∈ Σ*, we write |u| for the length of u, and u[l, r] = u_l … u_r for the substring between positions l and r, where 1 ≤ l ≤ r ≤ |u|. We call the number of opening parentheses of u minus its number of closing parentheses the balance of u, denoted balance(u). Finally, we define the function sign such that sign(x) = +1 if x > 0, sign(x) = 0 if x = 0, and sign(x) = −1 if x < 0.
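The notation can be summarized in code (a small sketch; names follow the definitions above):

```python
def balance(w):
    """Balance of a word: number of "(" minus number of ")"."""
    return w.count("(") - w.count(")")

def sign(x):
    """sign(x) = +1 if x > 0, 0 if x = 0, -1 if x < 0."""
    return (x > 0) - (x < 0)

# The substring u[l, r] (1-indexed, inclusive) is the Python slice u[l-1:r].
```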

##### Structure of the paper

In the next section, we give an algorithm of quantum query complexity O(√n (log n)^(h−1)) for Dyck_h(n). In the following section, we show lower bounds when h is of order log n or larger.

## 2 An algorithm for Dyck_h

### 2.1 ±k-Substring Search algorithm

The goal of this section is to describe a quantum algorithm for finding a substring that has balance ±k for some integer k. This algorithm is the basis of our algorithms for the Dyck_h languages.

We describe a ±2-substring search algorithm in Section 2.1.1, a ±3-substring search algorithm in Section 2.1.2, a ±4-substring search algorithm in Section 2.1.3, and then we finish with an algorithm for the general case in Section 2.1.4.

#### 2.1.1 ±2-Substring Search Algorithm

The simplest case is an algorithm that searches for a substring that has balance ±2. The algorithm looks for two sequential equal symbols using Grover's Search Algorithm [10, 7]. Formally, it is a procedure that accepts the following parameters as inputs and outputs:

• Inputs:

• an integer l which is the left border of the substring to be searched.

• an integer r which is the right border of the substring to be searched.

• a set s ⊆ {+1, −1} which represents the allowed signs of the balance of the substring to be searched.

• Outputs:

• a triple (l′, r′, σ) where l′ and r′ are the left and right borders of the found substring and σ is the sign of its balance, i.e. σ = sign(balance(u[l′, r′])). If there is no such substring, then the algorithm returns NULL. Furthermore, when there is a satisfying substring, the result is such that l ≤ l′ ≤ r′ ≤ r.

The algorithm searches for a substring u[l′, r′] such that |balance(u[l′, r′])| = 2 and sign(balance(u[l′, r′])) ∈ s.

We use Grover's Search Algorithm as a subroutine Grover(l, r, F) that takes l and r as the left and right borders of the search space and some Boolean function F. We search for any index i, where l ≤ i ≤ r, such that F(i) = 1. The result of the subroutine is either some such index i or NULL if it has not found the required i.

In Algorithm 1, we use Grover's search on the function F_s defined by

 F_s(i) = 1 ⇔ (u_i = u_{i+1} or u_i = u_{i−1}) and the following conditions hold:
• if s = {+1}, then u_i = "(".

• if s = {−1}, then u_i = ")".

• if s = {+1, −1}, then u_i = "(" or u_i = ")".
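A classical stand-in for this search illustrates the predicate. The helper names are ours, the check is simplified to the right neighbour only, and the linear scan replaces Grover's search (which needs only O(√(r − l)) queries):

```python
def make_F(u, s):
    """F_s(i) = 1 iff u[i] equals its right neighbour (0-indexed)
    and the sign of the resulting +-2 pair is allowed by s."""
    def F(i):
        if i + 1 >= len(u) or u[i] != u[i + 1]:
            return False
        return (+1 if u[i] == "(" else -1) in s
    return F

def scan(u, l, r, F):
    """Classical replacement for Grover(l, r, F): return any index i
    in [l, r] with F(i) = 1, or None if there is none."""
    for i in range(l, r + 1):
        if F(i):
            return i
    return None
```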

###### Lemma 1.

The running time of Algorithm 1 is O(√(r − l)). The error probability is constant.

###### Proof.

The main part of the algorithm is Grover's Search algorithm, which has running time O(√(r − l)) and constant error probability. ∎

It will be useful to consider a modification of the algorithm that finds not just any substring, but the one closest to the left border or to the right border. In that case, we use a subroutine Grover_First_One that accepts l and r as the left and right borders of the search space, a function F, and a direction d ∈ {left, right}.

• If d = right, then we search for the maximal index i such that F(i) = 1, where l ≤ i ≤ r.

• If d = left, then we search for the minimal index i such that F(i) = 1, where l ≤ i ≤ r.

The result of the subroutine is either this index i or NULL if it has not found the required i. See [13, 14, 15] on how to implement such a function.

Algorithm 2 implements the subroutine. It has the same input and output parameters as Algorithm 1 and an extra input for the direction.

###### Lemma 2.

If the required segment exists, the expected running time of Algorithm 2 is O(√d), where d is the distance between the found segment and the border from which the search starts. Otherwise, the running time is O(√(r − l)). The error probability is at most a constant.

###### Proof.

The main part of the algorithm is the Grover_First_One algorithm [13, 14, 15], which has this expected running time and at most constant error probability. The running time is O(√(r − l)) if there is no required segment. ∎

#### 2.1.2 ±3-Substring Search Algorithm

We now discuss an algorithm that searches for a substring that has balance ±3. The algorithm searches for two ±2-substrings t_1 and t_2 of the same sign such that there are no ±2-substrings between them. If both substrings t_1 and t_2 are +2-substrings, then we get a +3-substring in total. If both substrings are −2-substrings, then we get a −3-substring in total. Indeed, the part between them contains no two sequential equal symbols, so it alternates and its balance is −1, 0 or +1.
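The composition claim can be sanity-checked by brute force: whenever two +2-substrings have no ±2-substring strictly between them, the spanned substring has balance at least +3 (a verification sketch under our own 0-indexed conventions):

```python
from itertools import product

def balance(w):
    return w.count("(") - w.count(")")

# Exhaustively verify the claim on all words of length up to 12.
for n in range(2, 13):
    for chars in product("()", repeat=n):
        u = "".join(chars)
        # 0-indexed starts of +2-substrings, i.e. positions of "((".
        plus2 = [i for i in range(n - 1) if u[i] == u[i + 1] == "("]
        for a in plus2:
            for b in plus2:
                if b <= a:
                    continue
                # No +-2-substring strictly between the two pairs:
                # the region u[a+2 .. b-1] must alternate.
                if all(u[k] != u[k + 1] for k in range(a + 2, b - 1)):
                    assert balance(u[a:b + 2]) >= 3
```

The alternating middle region has balance −1, 0 or +1, so the total is at least 2 − 1 + 2 = 3; the −3 case is symmetric.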

Firstly, we discuss a basic procedure for the algorithm that can fail. To search for a ±3-substring of length at most t, for some integer t, we do the following steps. Assume that the procedure searches for a substring in the segment [l, r], where 1 ≤ l ≤ r ≤ n.

• Randomly pick a position j ∈ [l, r].

• Search for the first ±2-substring to the right of j at distance at most t, i.e. in the segment [j, min(j + t, r)]. If the algorithm does not find it, then the procedure fails. Otherwise, the found segment is the substring t_1.

• Search for the first ±2-substring to the left of t_1 at distance at most t, i.e. in the segment [max(j − t, l), j]. If the algorithm finds such a substring and it has the same sign of balance as the first substring, then we let it be t_1, let the previous substring be t_2, and go to the last step. Otherwise, the algorithm goes to the next step.

• If we have not found a suitable ±2-substring on the left, then we search for the first ±2-substring to the right of t_1 at distance at most t. If we do not find it, then the procedure fails. Otherwise it is the substring t_2.

• If the algorithm has found substrings t_1 and t_2 such that they have the same sign of balance and there is no ±2-substring between them, then the substring from the start of t_1 to the end of t_2 is the answer; otherwise the procedure fails.

Algorithm 3 implements this procedure and accepts as inputs:

• the borders l and r, where l and r are integers such that 1 ≤ l ≤ r ≤ n.

• the position j.

• the maximal distance t, where t is an integer such that 1 ≤ t ≤ r − l.

• the sign of the balance s. s = {+1} is used for searching +3-substrings, s = {−1} is used for searching −3-substrings, s = {+1, −1} is used for searching both.

###### Lemma 3.

If Algorithm 3 picks the starting point inside the searched substring, then it will find it. The expected running time of Algorithm 3 is O(√t). The probability of success is at least m/(r − l + 1), where m is the length of the ±3-substring.

###### Proof.

Let us prove the correctness of the algorithm in the case of picking a point inside the searched substring. Let us consider the case of a +3-substring; the case of a −3-substring is similar. Assume that the substring to be searched is u[l′, r′]. There are two cases:

1. Assume that . This means that and .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

• If , then the first invocation of procedure finds and the third invocation of finds in the case of .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

2. Assume that . This means that and .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

• If , then the first invocation of procedure finds and the third invocation of finds in the case of .

Due to Lemma 2 (Section 2.1.1), the running time of each invocation is O(√t).

The probability of picking j inside the required segment is the length of the segment over the length of the search space, i.e. m/(r − l + 1). ∎

We now provide an algorithm to search for any ±3-substring of fixed length with high probability. Algorithm 3 succeeds with probability at least m/(r − l + 1). We can use the amplitude amplification algorithm, which is a generalization of Grover's search algorithm, to boost the success probability. We should invoke the base algorithm O(√((r − l)/m)) times, but we do not know m, which depends on the unknown substring. Therefore we invoke it O(√((r − l)/t)) times. Let us call this procedure the amplified version of Algorithm 3. It accepts the same parameters as Algorithm 3 except for the position j, which is now picked internally.
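For reference, the amplitude-amplification bound relied on here is the standard one due to Brassard, Høyer, Mosca and Tapp: a procedure with success probability p can be boosted to constant success probability with O(1/√p) invocations, so with p ≥ m/(r − l + 1) the number of invocations is

```latex
O\!\left(\frac{1}{\sqrt{p}}\right)
  \;=\;
  O\!\left(\sqrt{\frac{r-l+1}{m}}\right).
```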

We can now write an algorithm to search for any ±3-substring. We choose the length t as a power of 2 and search for ±3-substrings of such length. We start with t = 4 since the minimal length of a ±3-substring is 3 and 4 is the smallest power of 2 that is greater than 3. Algorithm 4 accepts the following parameters:

• the borders l and r, where l and r are integers such that 1 ≤ l ≤ r ≤ n.

• the sign of the balance s. s = {+1} is used for searching +3-substrings, s = {−1} is used for searching −3-substrings, s = {+1, −1} is used for searching both.

###### Lemma 4.

Algorithm 4 finds some ±3-substring with constant probability and running time O(√(r − l) · log m), where m is the length of the shortest ±3-substring.

###### Proof.

Assume that the shortest ±3-substring is such that its length m satisfies 2^(j−1) < m ≤ 2^j. The first invocation of the amplified procedure that can find the substring is the one with t = 2^j. The working time of this invocation is O(√((r − l)/2^j) · √(2^j)) = O(√(r − l)), due to the complexity of amplitude amplification and Lemma 3.

So the total running time is O(√(r − l) · log m), because before reaching t = 2^j the algorithm will do j = O(log m) steps of the loop.

We can now estimate the success probability. The number of steps of the amplitude amplification algorithm in the case t = 2^j is at most a constant factor more than required, because the highest success probability is achieved for t close to m. That is why we get a constant probability of success. ∎

Now consider the algorithm that finds the first ±3-substring. The idea of the algorithm is similar to the first-one search algorithm from [13, 14, 15]. We search for a ±3-substring in segments whose lengths are powers of 2. Assume that the answer is at distance d from the border and we search for it from that border in segments of length 2, 4, 8, and so on; then the first time we can find the substring is when the segment length 2^j satisfies 2^j ≥ d.

Algorithm 5 implements this procedure with the same arguments as the procedure in Algorithm 2.
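Classically, the doubling idea behind this first-substring search looks as follows (a sketch with our own names; the quantum version replaces the inner scan by the search procedures of [13, 14, 15]):

```python
def rightmost(l, r, F):
    """Find the maximal i in [l, r] with F(i) = 1 by scanning segments
    of doubling length from the right border; None if there is none."""
    length = 1
    while True:
        lo = max(l, r - length + 1)
        hits = [i for i in range(lo, r + 1) if F(i)]
        if hits:
            return hits[-1]
        if lo == l:
            return None
        length *= 2
```

If the answer is at distance d from the border, only O(log d) rounds are needed and the segment lengths stay within a factor of 2 of d, which is what keeps the quantum running time near O(√d).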

Algorithm 5 has the following property.

###### Lemma 5.

The expected running time of Algorithm 5 is O(√d log d), where d is the distance between the found segment and the border from which the search starts; the running time is O(√(r − l) log(r − l)) if there is no required segment. The error probability is at most a constant.

###### Proof.

We can show these properties similarly to [13, 14, 15]. ∎

#### 2.1.3 ±4-Substring Search Algorithm

Let us now discuss an algorithm that searches for a substring that has balance ±4. The algorithm searches for two ±3-substrings t_1 and t_2 of the same sign such that there are no ±3-substrings between them. If both substrings t_1 and t_2 are +3-substrings, then we get a +4-substring in total. If both substrings are −3-substrings, then we get a −4-substring in total.
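For intuition, the object being searched for has a simple classical characterization via prefix balances: a substring has balance p_j − p_i for two prefix balances, so a substring of balance at least +k exists iff some prefix balance exceeds an earlier minimum by k (a one-pass sketch, names ours; the −k case is symmetric):

```python
def has_plus_k_substring(u, k):
    """True iff some substring of u has balance >= +k (and hence,
    since each step changes the balance by +-1, some substring of
    balance exactly +k)."""
    p = 0        # current prefix balance
    lowest = 0   # minimum prefix balance seen so far
    for c in u:
        p += 1 if c == "(" else -1
        if p - lowest >= k:
            return True
        lowest = min(lowest, p)
    return False
```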

The scheme of the algorithm is similar to the -substring search algorithm. We briefly review the common parts of the two algorithms.

The basic procedure for the algorithm, which can fail, searches for a ±4-substring of length at most t in the segment [l, r], where 1 ≤ l ≤ r ≤ n.

• Randomly pick a position j ∈ [l, r].

• Search for the first ±3-substring to the right of j at distance at most t, i.e. in [j, min(j + t, r)]. The result is the substring t_1, or the procedure fails.

• Search for the first ±3-substring to the left of t_1 at distance at most t, i.e. in [max(j − t, l), j]. If the result has the same sign of balance as t_1, then we let it be t_1, let the previous substring be t_2, and go to the last step. Otherwise, the algorithm goes to the next step.

• Search for the first ±3-substring to the right of t_1 at distance at most t. The result is the substring t_2, or the procedure fails.

• If the algorithm has found substrings t_1 and t_2 with the same sign of balance and no ±3-substring between them, then the substring from the start of t_1 to the end of t_2 is the answer; otherwise the procedure fails.

Algorithm 6 implements this procedure and its input parameters are the same as for Algorithm 3.

Let us discuss the properties of the algorithm.

###### Lemma 6.

If Algorithm 6 picks the starting point inside the searched substring, then it will find it. The expected running time of Algorithm 6 is O(√t log t). The probability of success is at least m/(r − l + 1), where m is the length of the ±4-substring.

###### Proof.

Let us prove the correctness of the algorithm in the case of picking a point inside the searched substring. There are different cases when searching for a +4-substring; the case of a −4-substring is similar.

1. Assume that there is and such that and .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

• If , then the first invocation of procedure finds and the third invocation of finds in the case of .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

2. Assume that there is and such that and .

• If , then the first invocation of procedure finds and the second invocation of finds in the case of .

• If , then the first invocation of procedure finds and the third invocation of finds in the case of .

Due to Lemma 5 (Section 2.1.2), the running time of each invocation is O(√t log t).

The probability of picking j inside the required segment is the length of the segment over the length of the search space, i.e. m/(r − l + 1). ∎

We now provide an algorithm to search for any ±4-substring of a fixed length with high probability. Algorithm 6 succeeds with probability at least m/(r − l + 1). As for ±3-substrings, we use the amplitude amplification algorithm. We should invoke the base algorithm O(√((r − l)/m)) times, but we do not know m beforehand. That is why we invoke it O(√((r − l)/t)) times. Let us call this procedure the amplified version of Algorithm 6. It accepts the same parameters as Algorithm 6 except the position j.

Finally, we build an algorithm to search for any ±4-substring. We choose powers of 2 as the lengths t and search for ±4-substrings of such lengths. We start with t = 4 because the minimal length of a ±4-substring is 4. Algorithm 7 accepts the same parameters as Algorithm 4.

Let us discuss the properties of Algorithm 7.

###### Lemma 7.

Algorithm 7 finds some ±4-substring with constant probability and running time O(√(r − l) · log² m), where m is the length of the shortest ±4-substring.

###### Proof.

Assume that the shortest ±4-substring is such that its length m satisfies 2^(j−1) < m ≤ 2^j. The first invocation of the amplified procedure that can find the substring is the one with t = 2^j. The working time of this invocation is O(√((r − l)/2^j) · √(2^j) · log(2^j)) = O(√(r − l) · log m), due to the complexity of amplitude amplification and Lemma 6.

So the total running time is O(√(r − l) · log² m), because before reaching t = 2^j the algorithm will do j = O(log m) steps of the loop.

Let us estimate the success probability. The number of steps of the amplitude amplification algorithm in the case t = 2^j is at most a constant factor more than required, because the highest success probability is achieved for t close to m. That is why we get a constant probability of success. ∎

Let us consider the algorithm that finds the first ±4-substring. The idea of the algorithm is similar to the first ±3-substring search and the first-one search algorithm from [13, 14, 15]. We search for a ±4-substring in segments whose lengths are powers of 2. Assume that the answer is at distance d from the border and we search for it from that border; then the first time we can find the substring is when the segment length 2^j satisfies 2^j ≥ d.

The procedure in Algorithm 8 implements this idea and takes the same arguments as the procedure in Algorithm 5.