1 Introduction
Different tree data structures are well-known and used in many algorithms. At the same time, when we construct algorithms with random behavior, such as randomized and quantum algorithms, we should take the error probability into account. We suggest a general method for updating a tree data structure in the noisy case, and we call the method
Walking tree. For a tree of height , we consider an operation that processes all nodes from the root to a target node. Assume that the running time is , where is the processing time of a node. Then, in the noisy case, our technique allows us to do it in , where is the error probability for the whole operation. By a noisy case, we mean that the navigation-by-the-tree procedures have error probability . Note that the standard way to handle this situation is the success probability boosting technique (repetition of the noisy action), which gives us complexity.
The technique is based on results for the noisy Binary search algorithm from [14]. Note that different algorithms for a noisy tree and graph processing and search were considered in [24, 13, 12, 10, 5, 11].
We apply the technique to two tree data structures. The first one is the Red-Black tree [18, 8], which is an implementation of a self-balanced binary search tree [8]. If the key comparing procedure has an error probability at most for some , then our noisy self-balanced binary search tree allows us to do adding, removing, and searching operations in running time, where is the error probability for a whole operation and is the number of nodes in the tree. In the case of , we have running time. So, in that case, the noisy key comparing procedure does not affect the running time (asymptotically). At the same time, if we use the success probability boosting technique, then the running time is .
The second one is the Segment tree [39, 36]. If the indexes comparing procedure has an error probability at most for some , then our noisy segment tree allows us to do update and request operations in running time, where is the error probability for a whole operation and is the number of leaves. In the case of , we have running time. So, in that case, the noisy indexes comparing procedure does not affect the running time (asymptotically). At the same time, if we use the success probability boosting technique, then the running time is .
We use these two data structures in quantum algorithms for two problems. Quantum computation [3, 42, 1] has been one of the hot topics of the last decades. There are many problems, including graph problems, where we can obtain a quantum speedup. Some of them can be found here [48, 22, 27, 29, 25, 45, 15, 2, 40]. Quantum algorithms have randomized behavior, so it is important to use noisy data structures in this model.
The first problem that we consider is the Exam problem. We have a group of students that came to an exam. We process queries that can have one of three types:

a new student comes to the group (an input is a name of the student);

a student understands that the exam is too hard and leaves the group (an input is a name of the student);

a professor asks a question from the student with the lexicographical minimal name, and this student leaves the group.
We solve the problem using a quantum string comparing algorithm [26, 28, 23, 32, 15] and the noisy self-balanced binary search tree. The running time of the quantum algorithm is , where is the length of the students' names. At the same time, the best classical algorithm, based on a trie (prefix tree) [9, 4, 7, 34], has running time. So, we obtain a quantum speedup in the case of .
The second problem is The Largest File Problem. Assume that we have a file system where a file has a string address. Each moment we have an event of one of three types:

There is not enough space in our data storage, and we want to obtain the largest file in some range of addresses.

A new file is added to the file system.

A file is removed from the file system.
We solve the problem using a quantum string comparing algorithm and noisy segment tree. The running time of the quantum algorithm is , where is the length of addresses.
The third problem is the String Sorting problem. Assume that we have strings of length . It is known [20, 21] that no quantum algorithm can sort arbitrary comparable objects faster than . At the same time, several researchers tried to improve the hidden constant [44, 43]. Other researchers investigated the space-bounded case [33]. We focus on sorting strings. In the classical case, we can use an algorithm that is better than sorting algorithms for arbitrary comparable objects. It is radix sort, which has query complexity [8] for a finite-size alphabet. This is also a lower bound for classical (randomized or deterministic) algorithms, that is . There is a quantum algorithm [26, 28] for the string sorting problem that has query complexity , where does not consider log factors. We suggest a simpler implementation based on the noisy red-black tree. These algorithms show a quantum speedup. Additionally, we show a lower bound for quantum algorithms that is . So, our and the existing quantum solutions are optimal up to a log factor.
The structure of this paper is the following. Section 2 contains preliminaries. We present the main technique in Section 3. Section 4 contains a discussion of the noisy selfbalanced binary search tree and the noisy segment tree. Application of these data structures in quantum algorithms is considered in Section 5. Upper and lower bounds for string sorting problem are presented in Section 6. Conclusion is presented in Section 7.
2 Preliminaries
In this paper, we compare strings in the lexicographical order. For two strings and , the notation means that precedes in the lexicographical order. Let be the length of a string .
2.1 Graph Theory
Let us consider a rooted tree . Let be a set of nodes (vertices), and be a set of edges. Let one fixed node be the root of the tree. Assume that we can obtain it using a procedure .
A path is a sequence of nodes that are connected by edges, i.e. for all . Note that there are no duplicates among . Here, is the length of the path. We use the notation if there is such that . The notation is reasonable because there are no duplicates in a path. Note that for any two nodes and , the path between them is unique because is a tree.
The distance between two nodes and is the length of the path between them. The height of a node is the distance from the root, that is, . Let be the tree's height, which is the length of the path between the root and the farthest node.
For each node we can define a parent node : it is a node such that . Additionally, we can define a set of children .
2.2 Quantum query model
In Section 5 we suggest quantum algorithms as applications of our data structures. We have only one quantum subroutine, and the rest of the algorithm is classical. One of the most popular computation models for quantum algorithms is the query model. We use the standard form of the quantum query model. Let be an variable function. We wish to compute on an input . We are given oracle access to the input , i.e. it is realized by a specific unitary transformation usually defined as , where the register indicates the index of the variable we are querying, is the output register, and is some auxiliary workspace. An algorithm in the query model consists of alternating applications of arbitrary unitaries independent of the input and the query unitary, and a measurement at the end. The smallest number of queries for an algorithm that outputs with probability on all is called the quantum query complexity of the function and is denoted by . We use the term running time instead of query complexity to avoid confusion with the “queries” in the definition of the problems in Section 5.
3 Main Technique. A Walking Tree
In this section, we present a rooted tree that we call a walking tree; it serves as a utility data structure for noisy computation on the main data structure.
In the paper, we use it for the following data structures:

Binary Search Tree. We assume that the element comparing procedure has an error with probability .

Segment Tree. We assume that the indexes (borders of segments) comparing procedure has an error with probability .
Note that the walking tree is a general technique, and it can be used for other tree data structures.
Let us present the general idea of the tree. The technique is motivated by [14].
Assume that we have a rooted tree . We want to do an operation on the tree that is moving from the root to a specific node of the tree. Assume that we have the following procedures:

returns the root node of the tree .

returns the child node of the node that should be reached from the node .

returns if the node is the last node that should be visited in the operation; and returns otherwise.

ProcessANode(v) processes the node in the required way.

IsANodeCorrect(v) returns if the node should be visited during the operation; and returns if the node is visited because of an error.
Assume that the operation has the following form (Algorithm 1).
Let us consider the operation such that “navigation” procedures (that are GoToTheChild and IsANodeCorrect) can return an answer with an error , where , where .
Our goal is to do the operation with an error . Note that in the general case, can be nonconstant and depend on the number of tree nodes.
Let be the height of the tree. The standard technique is boosting the success probability: on each step, we repeat the GoToTheChild procedure times and choose the most frequent answer. In that case, the error probability of the operation is at most , and the running time of the operation is . Our goal is to have running time.
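As a point of comparison, the boosting technique can be sketched in Python. This is a minimal sketch; the helper names and the deterministic "noise" pattern are illustrative, not from the paper:

```python
def majority_vote(noisy_proc, k):
    """Repeat a noisy 0/1 procedure k times and return the most
    frequent answer (the success probability boosting technique)."""
    ones = sum(noisy_proc() for _ in range(k))
    return 1 if 2 * ones > k else 0

def make_faulty(truth, error_every):
    """Illustrative noisy procedure for the demo: returns the wrong
    answer on every `error_every`-th call, the true answer otherwise."""
    calls = [0]
    def proc():
        calls[0] += 1
        return (1 - truth) if calls[0] % error_every == 0 else truth
    return proc
```

With repetitions per step, the per-step error drops exponentially, but each of the steps pays the repetition factor, which is exactly the overhead the walking tree avoids.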
Let us construct a rooted tree by such that the set of nodes of has a one-to-one correspondence with the nodes of , and the same holds for the sets of edges. We call a walking tree. Let and be bijections between these two sets. For simplicity, we define procedures for similar to the procedures for . Suppose ; then
Note that the navigation procedures are noisy (have an error). So, we reduce the error probability to by repetition (using the boosting success probability technique).
Additionally, we associate a counter with each node ; it is a nonnegative integer. Initially, the values of the counters for all nodes are , i.e. for each .
We do a random walk on the walking tree . The walk starts from the root node . If we are in a node , then we do the following steps. Firstly, we check the counter's value . If , then we do the following steps:

Step 1.1. We check whether we are in the node correctly or because of an error on previous steps. We check the value . If the result is , then we go up by changing and stop processing the node. If the node is the root, then we stay in . If the result is , then we go to Step 1.2.

Step 1.2. We check whether we have reached the target node. We check the value . If the result is , then we increase the counter and stop processing the node. If the result is , then we go to Step 1.3.

Step 1.3. We go to the child .
If , then we do the following step:

Step 2.1. We check whether we are in the node correctly or because of an error on previous steps. The condition means that we think the node is the target node. So, the correctness check is checking . If the result is , then we increase the counter . If the result is , then we decrease the counter .
We can say that the counter is a measure of confidence that is the target node. If , then we think we should continue walking. If , then we think that is the target node. If is big, then we are much more confident that it is the target node than if is small.
The walking process stops after steps, that is . The stopping node is the target one. After that, we do the operation with the original tree . We store the path in , such that , , and is the root node of . Then we process all of them one by one using
One step of the walking process on the walking tree is presented in Algorithm 2 as a procedure that accepts the current node and returns the new node.
The whole algorithm is presented in Algorithm 3.
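Assuming the five procedures are supplied as Python callables (the interface below is a hypothetical sketch whose names mirror the text), one step of the walk and the whole walking process can be written as follows; counters are kept in a dictionary:

```python
def walk_step(v, counters, parent, go_to_child, is_target, is_node_correct):
    """One step of the random walk on the walking tree."""
    c = counters.get(v, 0)
    if c == 0:
        if not is_node_correct(v):           # Step 1.1: entered by mistake
            return parent(v) if parent(v) is not None else v
        if is_target(v):                     # Step 1.2: looks like the target
            counters[v] = 1
            return v
        return go_to_child(v)                # Step 1.3: descend
    # Step 2.1: c > 0, so re-test the belief that v is the target node
    # (here the correctness check combines IsANodeCorrect and IsTarget)
    if is_node_correct(v) and is_target(v):
        counters[v] = c + 1
    else:
        counters[v] = c - 1
    return v

def walk(root, steps, parent, go_to_child, is_target, is_node_correct):
    """Run the prescribed number of steps and return the stopping node."""
    counters, v = {}, root
    for _ in range(steps):
        v = walk_step(v, counters, parent, go_to_child,
                      is_target, is_node_correct)
    return v
```

With exact (noise-free) oracles the walk simply descends to the target and then grows its counter, which is the behavior the analysis below perturbs with errors.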
3.1 Analysis
Let us discuss the algorithm and its properties.
At each node, we have two options: we move in the direction of the target node or in the opposite direction.
Assume that the counter of the current node is zero. If we are in the wrong branch, then the direction is to the parent node of the current one. If we are in the correct branch, then the direction is to the correct child node of the current one.
Assume that the counter of the current node is nonzero. If we are in the target node, then the correct direction is increasing the counter, and the wrong direction is decreasing the counter. If we are not in the target node, then the correct direction is decreasing the counter, and the wrong direction is increasing the counter.
So, we can be sure that we move in the correct direction with a probability that is at least and in the wrong direction with a probability that is at most .
Let us show that if , then we do the operation on correctly with error probability .
Theorem 3.1
Proof
Let us consider the walking tree. We emulate the counter by replacing it with a chain of nodes of length . Formally, for a node we add nodes such that , for . The only child of is for , and does not have children.
In that case the increasing of can be emulated by moving from to . The decreasing can be emulated by moving from to . We can assume that is the node itself.
Let be the target node, i.e. . Let us consider the distance between the target node and the current node in the modified tree. The distance
is a random variable. Each step of the walk increases or decreases the distance
by . So, we can present , where is the root node of , , and are independent random variables that represent the th step and indicate an increase or decrease of the distance. Let if we move in the correct direction, and if we move in the wrong direction. Note that the probability of moving in the correct direction () is at least , and the probability of moving in the wrong direction () is at most . From now on, without loss of generality, we assume that and . If , then we are in the node in the modified tree and in the node in the original walking tree . Note that , where by the definition of the height of a tree. Therefore, means . So, the probability of success of the operation is the probability of the event, i.e. .
Let . We treat as independent binary random variables. Let . For such and for any , the following form of the Chernoff bound [41] holds
(1) 
Since then and the inequality (1) becomes
Substituting for we get
From now on without loss of generality we assume that for some . Let and .
In the following steps, we relax the inequality by obtaining less tight bounds for the target probability.
Firstly, we obtain a new lower bound
and hence
Secondly, we obtain a new upper bound
Combining the two obtained bounds we have
and hence
Considering the probability of the opposite event we finally get
4 Noisy Tree Data Structures
Let us apply the technique from Section 3 to different tree data structures.
4.1 Noisy Binary Search Tree
Let us consider a Self-Balanced Search Tree [8]. The data structure is a binary rooted tree . Let be the number of nodes of the tree. We associate a comparable element with a node . We have two properties:

for a node , the elements of all nodes in the left subtree of are less than ; the elements of all nodes in the right subtree of are greater than .

the height of the tree .
These properties mean the tree is a Balanced Search Tree. As an implementation of the Self-Balanced Search Tree, we use the Red-Black Tree [8, 18]. So, the data structure allows us to add and remove a node with a specific value in .
Assume that the comparing procedure of two elements has an error .
Each operation (remove, add, and search an element) of the Red-Black tree uses the comparing procedure only in the search process. Let us discuss the search operation in the Red-Black tree.
Let us associate two elements and with a node . The elements and are left and right bounds for with respect to the ancestor nodes. Formally,

the element is an ancestor of and is less than . If the set is empty, then , where is some constant element that cannot occur as an element of the tree. We assume that it is less than any other element.

the element is an ancestor of and is greater than . If the set is empty, then , where is some constant element that cannot occur as an element of the tree. We assume that it is greater than any other element.
Assume that we have a comparing function for elements that returns

if ;

if ;

if .
The comparing function returns the answer with an error for some .
Let us present each of the required procedures for the search-an-object operation.

returns the root node of the tree .

returns the left child of if ; and returns the right child if .

returns if ; and returns otherwise.

ProcessANode(v) does nothing.

IsANodeCorrect(v) returns if , formally, and ; and returns otherwise.
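The procedures above can be sketched in Python. This is a minimal sketch, assuming each node carries its key and the ancestor bounds, and that `cmp(a, b)` is the (possibly noisy) three-way comparison returning -1, 0, or 1; all names are illustrative:

```python
NEG_INF, POS_INF = object(), object()   # sentinel bounds, never real keys

def three_way(a, b):
    """Exact comparison; a noisy version would sometimes flip the sign."""
    if a is NEG_INF or b is POS_INF:
        return -1
    if a is POS_INF or b is NEG_INF:
        return 1
    return (a > b) - (a < b)

def go_to_child(v, x, cmp=three_way):
    """Descend left or right according to the comparison with the key."""
    return v["left"] if cmp(x, v["key"]) < 0 else v["right"]

def is_target(v, x, cmp=three_way):
    """The search stops when the node's key equals the searched element."""
    return cmp(x, v["key"]) == 0

def is_node_correct(v, x, cmp=three_way):
    """x must lie strictly between the ancestor bounds l(v) and r(v)."""
    return cmp(v["l"], x) < 0 and cmp(x, v["r"]) < 0
```

A noisy comparator plugged in as `cmp` turns these into exactly the noisy navigation procedures that the walking tree expects.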
The presented operations satisfy all requirements. Let us present the complexity result that directly follows from Theorem 3.1.
Theorem 4.1
Suppose the comparing function for elements of a Self-Balanced Search Tree implemented by the Red-Black Tree is noisy and has an error for some . Then, using the walking tree, we can do search, add, and remove operations with running time and an error probability .
If we take , then the “noisy” setting does not affect asymptotic complexity.
Corollary 1
Suppose the comparing function for elements of a Self-Balanced Search Tree implemented by the Red-Black Tree is noisy and has an error for some . Then, using the walking tree, we can do searching, adding, and removing operations with running time and an error probability .
4.2 Noisy Segment Tree
We consider a standard segment tree data structure [39, 36] for an array for some integer . The segment tree is a full binary tree such that each node corresponds to a segment of the array . If a node corresponds to a segment , then we store a value that represents some information about the segment. Let us consider a function such that . A segment of a node is the union of segments that correspond to its two children. Typically, the children correspond to segments and , for . We consider such that it can be computed by values of two children and , where and are left and right children of . Leaves correspond to single elements of the array . As an example, we can consider integer values and sum as the value in a vertex and a corresponding segment .
The data structure allows us to invoke the following requests in running time.

Update. Parameters are an index and an element (). The procedure assigns . For this goal, we assign for the leaf that corresponds to the and update ancestors of .

Request. Parameters are two indexes and (), the procedure computes .
The main part of both operations is the following. For the given root node and an index , we should find the leaf node corresponding to . The main step is as follows. If we are in a node with an associated segment , then we compare with the middle element and choose the left or the right child.
Assume that we have a comparing function for indexes that returns

if ;

if ;

if .
The comparing function returns the answer with an error for some .
Let us present each of the required procedures for searching the leaf with index in a segment tree .

returns the root node of the segment tree .

returns the left child of if ; and returns the right child if for , and the segment associated with .

returns if , formally, and ; and returns otherwise. Here the segment is associated with .

ProcessANode(v) recomputes according to the values of in the left and the right children.

IsANodeCorrect(v) returns if , formally, and ; and returns otherwise. Here the segment is associated with .
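The descent and the reprocessing of ancestors can be sketched in Python. This is a minimal sketch assuming `f` is summation and using an ordinary exact integer comparison where the paper allows a noisy one; the names are illustrative:

```python
def build(arr, lo, hi):
    """Build a segment tree for arr[lo..hi] with f = sum."""
    node = {"lo": lo, "hi": hi, "left": None, "right": None}
    if lo == hi:
        node["val"] = arr[lo]
    else:
        mid = (lo + hi) // 2
        node["left"] = build(arr, lo, mid)
        node["right"] = build(arr, mid + 1, hi)
        node["val"] = node["left"]["val"] + node["right"]["val"]
    return node

def update(root, i, value):
    """Descend to the leaf for index i, then reprocess the path upwards."""
    path, v = [], root
    while v["lo"] != v["hi"]:               # GoToTheChild until IsTarget
        path.append(v)
        mid = (v["lo"] + v["hi"]) // 2
        v = v["left"] if i <= mid else v["right"]
    v["val"] = value                        # the leaf gets the new element
    for u in reversed(path):                # ProcessANode on the ancestors
        u["val"] = u["left"]["val"] + u["right"]["val"]

root = build([1, 2, 3, 4], 0, 3)   # root stores the total sum
update(root, 2, 30)                # array becomes [1, 2, 30, 4]
```

Replacing the `i <= mid` test with a noisy comparator is precisely where the walking tree takes over the navigation.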
The presented operations satisfy all requirements. Let us present the complexity result that directly follows from Theorem 3.1.
Theorem 4.2
Suppose the comparing function for indexes of a segment tree is noisy and has an error for some . Then, using the walking tree, we can do update and request operations with running time and an error probability .
If we take , then the “noisy” setting does not affect asymptotic complexity.
Corollary 2
Suppose the comparing function for indexes of a segment tree is noisy and has an error for some . Then, using the walking tree, we can do update and request operations with running time and an error probability .
4.3 Analysis, Discussion, Modifications
There are different additional operations with a segment tree. One such example is the segment tree with range updates. In this modification, we can update the values in a range. The reader can find more information in [36] and examples of applications in [31, 30]. The main operation with a noisy comparing procedure is the same. So, we can still use the same idea for such modifications of the segment tree.
Remark 1
If the segment tree is constructed for an array , then we can extend it to , where is the closest power of and the added elements are neutral elements for the function . If we have a vertex and two borders and of the segment associated with , then we can always compute the segments for the left and the right children, which are and for . Additionally, we can compute the segment for the parent, which is , where , if the node is the left child of its parent. If the node is the right child of its parent, then the parent's segment is , where . Therefore, we do not need to store the borders of a segment in a node; we can compute them during the walk on the segment tree. Additionally, we do not need to construct the walking tree explicitly. We can keep it in mind and walk on the segment tree itself using only three variables: the left and right borders of the current segment and a counter, if required.
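The border arithmetic of the remark can be sketched as follows, for a full binary tree whose leaves cover a power-of-two range (function names are illustrative):

```python
def child_segments(l, r):
    """Segments of the left and right children of the node for [l, r]."""
    mid = (l + r) // 2
    return (l, mid), (mid + 1, r)

def parent_segment(l, r, is_left_child):
    """Segment of the parent, given the current segment and which side
    of the parent the current node is on."""
    length = r - l + 1                      # the sibling covers as much
    return (l, r + length) if is_left_child else (l - length, r)
```

So a walk up or down the tree needs only the pair of current borders, with no stored segments at all.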
If we have access to the full segment tree, including leaves, then we can do operations without the walking tree. We can use the noisy binary search algorithm [14] for searching the leaf that corresponds to the index and then process all the ancestors of the leaf.
There are at least two useful scenarios for a noisy segment tree.

The first one is as follows. We have access only to the root and have no direct access to leaves.

The second one is the compressed segment tree. If, initially, all elements of the array are empty or neutral for the function , then we can compress a subtree into one node labeled with a segment of empty elements. It is reasonable if is very big and storing the whole tree is very expensive. In that case, we could replace the noisy segment tree with the noisy self-balanced search tree from the previous section: the search tree stores the updated elements in leaves, and we can search for the required index in this data structure. At the same time, the noisy segment tree uses much less memory due to Remark 1. That is why the noisy segment tree is more effective in this case too.
5 Applications
As one of the interesting applications, we suggest applications from quantum computing [42, 3, 1]. As objects with noisy comparing, we use strings. There is an algorithm that compares two strings and in running time [26, 28, 23, 32, 15], where and are the lengths of and , respectively. The algorithm is based on a modification [35, 37, 38, 23] of Grover's search algorithm [17, 6]. The result is the following.
Lemma 1 ([26, 28, 23, 32, 15])
There is a quantum algorithm that compares two strings and of lengths and in the lexicographical order with running time and error probability for .
Let us take and use the procedure as a string comparing procedure. We consider an application of the noisy search tree in Section 5.1 and an application of the noisy segment tree in Section 5.2.
5.1 Exam Problem as an Application of the Noisy Self-Balanced Binary Search Tree
Problem.
Assume that we have a group of students that came to an exam. Each moment we have an event of one of three types:

A professor checks the student with the name that is minimal in the lexicographical order. After that, the student leaves the exam. In that case, we should return the name of this student.

A new student comes to the group. We accept the name of the new student as input.

A student understands that the exam is too hard and leaves the group. We accept the name of this student as input.
Formally, we have queries that are pairs , where is the type of a query and is the input data for the query. A query can be one of three types:

Return the minimal in lexicographical order string and remove it from the group. We assume that is empty in this case.

Add a string to the group.

Remove a string from the group, or ignore the query if there is no such string.
So, we solve the problem using the Self-Balanced Search Tree. We store strings in the tree. In fact, we store indexes of strings in nodes, and we assume that if a node stores an index , then any node from the left subtree has an index such that ; and any node from the right subtree has an index such that .
We implement the queries in the following way:

We return the string such that is stored in the leftmost node, and remove the string from the tree.

We add the new string to the tree.

We remove the requested string from the tree.
For simplicity, assume that all strings have the same length .
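A classical stand-in for the three queries can be sketched with a sorted list (via `bisect`) playing the role of the noisy self-balanced tree; insertion here costs O(n) rather than O(log n), so this sketch only fixes the semantics, and the class name is illustrative:

```python
import bisect

class ExamGroup:
    def __init__(self):
        self.names = []                      # kept sorted lexicographically

    def add(self, name):                     # query type 2
        bisect.insort(self.names, name)

    def remove(self, name):                  # query type 3
        i = bisect.bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            self.names.pop(i)                # ignore the query otherwise

    def pop_min(self):                       # query type 1
        return self.names.pop(0)             # leftmost = minimal name
```

In the quantum algorithm, every comparison hidden inside `insort` and `bisect_left` becomes a call to the noisy string-comparing subroutine.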
The string comparing algorithm is quantum (Lemma 1) and has probabilistic behavior, so the self-balanced binary search tree is noisy. The algorithm for processing queries has the following complexity.
Theorem 5.1
The quantum algorithm with the noisy self-balanced binary search tree for the Exam Problem has running time and error probability .
Proof
The comparing procedure has error probability and running time due to Lemma 1. We use the noisy self-balanced binary search tree with error probability . So, the main operations have running time and error probability due to Theorem 4.1 and Corollary 1. We do add or remove operations. The total running time is . We have independent events of error. So, the total error probability is at most .
5.2 The Largest File Problem as an Application of Noisy Segment Tree
Problem.
Assume that we have a file system where a file has a string address. Each moment we have an event of one of three types:

There is not enough space in our data storage, and we want to obtain the largest file in some range of addresses. The range is given by the strings and . We are interested in the largest file that has an address such that .

A new file is added to the file system. We accept the address and the size of the new file as input.

A file is removed from the file system. We accept the address of the removed file as input.
Formally, we have queries that are pairs , where is the type of a query ; the strings , and a positive integer are the input data for the query. A query can be one of three types:

Return the address and the size of the largest file such that . We assume that is empty in this case.

Add a file of size and address to the file system. If a file with the address exists, then we replace it. We assume that is empty in this case.

Remove the file with the address from the file system. If there is no such file, then we ignore the query. We assume that and are empty in this case.
We assume that all address strings and have length or are empty.
So, we solve the problem using the compressed segment tree. We store the sizes of files as elements of an array and use addresses as indexes. The range of possible addresses is too large, and that is why we use the compressed segment tree. As the function , we use the maximum function. As a neutral element, we use a value that is definitely less than any real-world file size.
Initially, we have only the root of the segment tree. We implement the queries in the following way:

We return the index and the value of an element such that .

We update the element .

We update the element . Note that is the neutral element for our function .
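A naive reference model of the three queries can be sketched with a dictionary plus a linear scan; it matches the semantics but not the segment-tree running time, and the class and method names are illustrative:

```python
class FileSystem:
    def __init__(self):
        self.size = {}                        # address -> file size

    def add(self, addr, size):                # replaces an existing file
        self.size[addr] = size

    def remove(self, addr):                   # ignored if the file is absent
        self.size.pop(addr, None)

    def largest(self, lo, hi):
        """Address and size of the largest file with lo <= addr <= hi."""
        best = None
        for addr, size in self.size.items():
            if lo <= addr <= hi and (best is None or size > self.size[best]):
                best = addr
        return (best, self.size[best]) if best is not None else (None, None)
```

The compressed noisy segment tree replaces the linear scan in `largest` with a logarithmic-depth range-maximum query.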
The string comparing (indexes comparing) algorithm is quantum (Lemma 1) and has probabilistic behavior, so the segment tree is noisy. The algorithm for processing queries has the following complexity.
Theorem 5.2
The quantum algorithm with the noisy segment tree for the Largest File Problem has running time and error probability .
Proof
The indexes comparing procedure has error probability and running time due to Lemma 1. We use the noisy segment tree with error probability . So, the main operations have running time and error probability due to Theorem 4.2 and Corollary 2. We do update and request operations. The total running time is . We have independent events of error. So, the total error probability is at most .
6 Quantum Sort Algorithm for Strings
Problem.
There are strings of size for some positive integers and . The problem is to find a permutation such that , or and for each .
Quantum sorting algorithm for strings was presented in [26, 28]. The running time of the algorithm is .
We can present an algorithm with the same complexity, but in a simpler way. Assume that we have a noisy self-balanced binary search tree with strings as keys and the quantum comparing procedure from Lemma 1. We assume that the comparing procedure compares indexes in the case of equal strings.
In fact, we store indexes of the strings in nodes like in Section 5.1.
Initially, the tree is empty. Let be a function that adds a string to the tree. Let GetMin be a function that returns the index of the minimal string in the tree according to the comparing procedure. After returning the index, the function removes it from the tree.
The final algorithm is presented as Algorithm 4.
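A classical stand-in for Algorithm 4 can be sketched as follows, with a heap playing the role of the Add/GetMin operations on the noisy tree; ties between equal strings are broken by index, as the comparing procedure requires:

```python
import heapq

def sort_strings(strings):
    """Return a permutation sorting the strings lexicographically,
    equal strings ordered by their position in the input."""
    heap = [(s, i) for i, s in enumerate(strings)]   # Add all strings
    heapq.heapify(heap)
    perm = []
    while heap:                                      # repeated GetMin
        perm.append(heapq.heappop(heap)[1])
    return perm
```

In the quantum version, each heap comparison is replaced by the noisy string-comparing subroutine inside the noisy self-balanced search tree.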
We have the following result for the problem.
Theorem 6.1
The quantum running time for sorting strings of size is and .
The upper bound is the complexity of the presented algorithm and of the algorithm from [28]. The proof of the lower bound is presented in the next section. The reader can see that the difference between the upper and lower bounds is just . It shows that the presented algorithm is almost optimal.
6.1 Lower Bound
6.1.1 Permutations and Sorting
Let us formally define a function for sorting.
Definition 1 (Radix sort function)
Let be positive natural numbers.
Let be a function that gets words of zeroes and ones as its input and returns a permutation of integers that is a result of sorting input strings.
Let be input words. Then
where is a permutation, and for every corresponding input words are sorted: . If two input words are equal, then we sort them according to their position in the input list: implies that .
Note that in the case of , the function can be used to compute the majority function. We use to sort words, and the th word is the value of the majority function. Therefore, we expect that the complexity of should be . In the case of , the function is similar to searching for the first one, so we expect that it requires queries.
We use notation for the set . We denote the set of all permutations of numbers from .
Each permutation can be represented as a product of transpositions (cycles with only two elements). Such decomposition is not unique, but the parity of the number of transpositions in all decompositions is the same. Let us define the sign of the permutation as
Note that usually the sign of the permutation is defined to be (cf. [16]), but these definitions are related by .
We denote the spectral norm of the matrix , and denotes the Hadamard product of matrices and : .
6.1.2 Adversary Bound
We prove a lower bound for using the Adversary method.
Theorem 6.2 (Adversary bound, [19])
Let be an arbitrary function, where is a set of outputs.
Let be an arbitrary matrix with rows and columns indexed by input strings, such that if .
Let be a zeroone matrix, such that if and otherwise.
Let us denote
where the maximum is taken over all matrices with nonnegative entries.
Then the twosided bounded error quantum query complexity is lower bounded by
One nice property of the Adversary method is that bounds obtained by it can be composed.
Corollary 3 ([19])
If , subfunctions act on disjoint subsets of input, and every , then
To obtain an upper bound on , the following lemma is helpful.
6.1.3 Query Complexity of Radix Sort
6.1.4 Computing Sign of Radix Sort
First, let us note that sorting a list cannot be easier than computing the sign of the permutation that sorts that list.
Definition 2
Let be positive natural numbers.
Let be a function that gets words of <