1 Introduction
Query complexity, also referred to as decision tree complexity, is one of the most basic models of computation. We aim at learning an unknown object (a secret) by asking queries of a certain type. The cost of the computation is the number of queries made until the secret is unveiled. All other computation is free. Query complexity is one of the standard measures in computational complexity theory.
A related performance measure can be found in the theory of blackbox optimization, where we are asked to optimize a function $f$ without having access to it other than by evaluating (“querying”) the function values $f(x)$ for solution candidates $x$. In blackbox optimization, the performance of an algorithm on a class $\mathcal{F}$ of functions is measured by the number of function evaluations that are needed until, for any unknown function $f \in \mathcal{F}$, an optimal search point is evaluated for the first time.
1.1 Problem Description and Summary of Main Results
In this work, we consider the query complexity of the following problem.
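Let $S_n$ denote the set of permutations of $[n] := \{1,\ldots,n\}$; let $[0..n] := \{0,1,\ldots,n\}$. Our problem is that of learning a hidden permutation $\pi \in S_n$ together with a hidden bitstring $z \in \{0,1\}^n$ through queries of the following type. A query is again a bitstring $x \in \{0,1\}^n$. As answer we receive the length of the longest common prefix of $x$ and $z$ in the order of $\pi$, which we denote by
$$f_{z,\pi}(x) := \max\{\, i \in [0..n] \mid \forall j \le i : z_{\pi(j)} = x_{\pi(j)} \,\}.$$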
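The query answer described above can be computed directly. The following is a minimal sketch, assuming 0-indexed positions (so the permutation is a list over `range(n)`); the function name `score` and the example secret are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def score(z: Sequence[int], pi: Sequence[int], x: Sequence[int]) -> int:
    """Length of the longest common prefix of x and z, reading positions in the order pi.

    Assumption: positions are 0-indexed, so pi is a permutation of range(len(z)).
    """
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break  # first disagreement in the order given by pi
        s += 1
    return s


# Hypothetical secret: position 2 is read first, then 0, 3, 1.
z = [1, 0, 1, 1]
pi = [2, 0, 3, 1]

print(score(z, pi, [0, 1, 1, 0]))  # prints 1: agrees at position 2, differs at position 0
print(score(z, pi, z))             # prints 4: the secret itself has the maximal score n
```

A score of $n$ is returned only for the query that equals the hidden bitstring, which is why the problem can also be read as an optimization problem.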
We call this problem the HiddenPermutation problem. It can be viewed as a guessing game like the well-known Mastermind problem (cf. Section 1.2); however, the secret is now a pair $(z, \pi)$ and not just a string. The problem is a standard benchmark problem in blackbox optimization.
It is easy to see (see Section 2) that learning one part of the secret (either $z$ or $\pi$) is no easier (up to $O(n)$ questions) than learning the full secret. It is also not difficult to see that $O(n \log n)$ queries suffice deterministically to unveil the secret (see Section 3). Doerr and Winzen [DW12] showed that randomization allows one to beat this bound. They gave a randomized algorithm with $O(n \log n / \log\log n)$ expected complexity. The information-theoretic lower bound is only $\Theta(n)$, as the answer to each query is a number between zero and $n$ and hence may reveal as many as $\log(n+1)$ bits. We show that

the deterministic query complexity is $\Theta(n \log n)$, cf. Section 3, and that

the randomized query complexity is $\Theta(n \log\log n)$.
Both upper bound strategies are efficient, i.e., they can be implemented in polynomial time. The lower bound is established by a (standard) adversary argument in the deterministic case and by a potential function argument in the randomized case. We remark that for many related problems (e.g., sorting, Mastermind, and many coin weighing problems) the asymptotic query complexity, or the best known lower bound for it, equals the information-theoretic lower bound. For our problem, deterministic and randomized query complexity differ, and the randomized query complexity exceeds the information-theoretic lower bound. The randomized upper and lower bounds require nontrivial arguments. In Section 2 we derive auxiliary results, some of which are interesting in their own right. For example, it can be decided efficiently whether a sequence of queries and answers is consistent.
A summary of the results presented in this work previously appeared in [AAD13].
1.2 Origin of the Problem and Related Work
The HiddenPermutation problem can be seen as a guessing game. An archetypal and well-studied representative of guessing games is Mastermind, a classic two-player board game from the Seventies. In the Mastermind game, one player chooses a secret code $z$. The role of the second player is to identify this code using as few guesses as possible. For each guess $x$ she receives the number of positions in which $x$ and $z$ agree (and in some variants also the number of additional “colors” which appear in both $x$ and $z$, but in different positions).
The Mastermind game has been studied in different contexts. Among the most prominent works is a study of Erdős and Rényi [ER63], where the 2-color variant of Mastermind is studied in the context of a coin-weighing problem. Their main result showed that the query complexity of this game is $\Theta(n / \log n)$. This bound has subsequently been generalized in various ways, most notably to the Mastermind game with $k = n$ colors. Using similar techniques as Erdős and Rényi, Chvátal [Chv83] showed an upper bound of $O(n \log n)$ for the query complexity of this game. This bound has been improved [DDST16]. The best known lower bound is the information-theoretic linear one. This problem has been open for more than 30 years.
Our permutation-based variant of Mastermind has its origins in the field of evolutionary computation. There, the LeadingOnes function $\mathrm{LO}\colon \{0,1\}^n \to [0..n]$, $x \mapsto \max\{\, i \in [0..n] \mid \forall j \le i : x_j = 1 \,\}$, counting the number of initial ones in a binary string of length $n$, is a commonly used benchmark function both for experimental and theoretical analyses (e.g., [Rud97]). It is studied as one of the simplest examples of a unimodal non-separable function.
An often desired feature of general-purpose optimization algorithms like genetic and evolutionary algorithms is that they should be oblivious of problem representation. In the context of optimizing functions of the form $f\colon \{0,1\}^n \to \mathbb{R}$, this is often formalized as an unbiasedness restriction, which requires that the performance of an unbiased algorithm is identical for all composed functions $f \circ \sigma$ of $f$ with an automorphism $\sigma$ of the $n$-dimensional hypercube. Most standard evolutionary algorithms as well as many other commonly used blackbox heuristics like local search variants, simulated annealing, etc. are unbiased. Most of them query an average number of $\Theta(n^2)$ solution candidates until they find the maximum of LeadingOnes, see, e.g., [DJW02].
It is not difficult to see that
$$\{\, \mathrm{LO} \circ \sigma \mid \sigma \text{ is an automorphism of } \{0,1\}^n \,\} = \{\, f_{z,\pi} \mid z \in \{0,1\}^n,\ \pi \in S_n \,\},$$
showing that HiddenPermutation generalizes the LeadingOnes function by indexing the bits not from left to right, but by an arbitrary permutation, and by swapping the interpretation of $0$ and $1$ for indices $i$ with $z_i = 0$.
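The relation to LeadingOnes can be checked mechanically. The sketch below assumes 0-indexed positions; the helper names `leading_ones`, `score`, and `score_via_leading_ones` are hypothetical. It first marks the positions where the query agrees with the hidden bitstring (a bitwise `x XOR z XOR 1`), then reorders these marks according to the hidden permutation and counts the initial ones.

```python
def leading_ones(bits):
    """Number of initial ones in a 0/1 sequence."""
    count = 0
    for b in bits:
        if b != 1:
            break
        count += 1
    return count


def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def score_via_leading_ones(z, pi, x):
    """Same value, expressed as LeadingOnes of a transformed query."""
    # agree[p] is 1 exactly where x and z coincide: x XOR z XOR 1.
    agree = [xb ^ zb ^ 1 for xb, zb in zip(x, z)]
    # Reorder so that position pi[0] comes first, pi[1] second, and so on.
    return leading_ones([agree[p] for p in pi])


z, pi = [1, 0, 1, 1], [2, 0, 3, 1]
x = [1, 1, 1, 0]
print(score(z, pi, x), score_via_leading_ones(z, pi, x))
```

Both the XOR with a fixed string and the reordering of positions are automorphisms of the hypercube, which is the reason the two functions agree on every input.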
The first to study the query complexity of HiddenPermutation were Droste, Jansen, and Wegener [DJW06], who introduced the notion of blackbox complexity as a measure for the difficulty of blackbox optimization problems. As mentioned, the main objective in blackbox optimization is the identification of an optimal solution using as few function evaluations as possible. Denoting by $T(A, f)$ the number of queries needed by algorithm $A$ until it queries for the first time a search point $x \in \arg\max f$ (this is the so-called runtime or optimization time of $A$), the blackbox complexity of a collection $\mathcal{F}$ of functions is
$$\inf_{A} \sup_{f \in \mathcal{F}} \mathbb{E}[T(A, f)],$$
the best (among all algorithms) worst-case (with respect to all $f \in \mathcal{F}$) expected runtime. Blackbox complexity is essentially query complexity, with a focus on optimization rather than on learning. A survey on this topic can be found in [Doe18].
Droste et al. [DJW06] considered only the $0/1$-invariant class $\{\, f_{z,\mathrm{id}} \mid z \in \{0,1\}^n \,\}$ (where $\mathrm{id}$ denotes the identity permutation). They showed that the blackbox complexity of this function class is $n/2 \pm o(n)$.
The first bound for the blackbox complexity of the full class HiddenPermutation was presented in [LW12] by Lehre and Witt, who showed that any so-called unary unbiased blackbox algorithm needs $\Omega(n^2)$ steps, on average, to solve the HiddenPermutation problem. In [DJK11] it was then shown that already with binary distributions one can imitate a binary search algorithm (similar to the one presented in the proof of Theorem 3.1), thus yielding an $O(n \log n)$ blackbox algorithm for the HiddenPermutation problem. The algorithm achieving the $O(n \log n / \log\log n)$ bound mentioned in Section 1.1 can be implemented as a ternary unbiased one [DW12]. This bound, as mentioned above, was the previously best known upper bound for the blackbox complexity of HiddenPermutation.
In terms of lower bounds, the best known bound was the linear information-theoretic one. However, more recently a stronger lower bound has been proven to hold for a restricted class of algorithms. More precisely, Doerr and Lengler [DL18] recently proved that the so-called elitist blackbox complexity is $\Omega(n^2)$, showing that any algorithm beating this bound has to keep in its memory more than one previously evaluated search point or has to make use of information hidden in non-optimal search points. It is conjectured in [DL18] that the (1+1) memory restriction alone (which allows an algorithm to store only one previously queried search point in its memory and to evaluate in each iteration only one new solution candidate) already causes the $\Omega(n^2)$ bound, but this conjecture stands open. We note that many local search variants, including simulated annealing, are memory-restricted.
2 Preliminaries
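For all positive integers $k$ we define $[k] := \{1,\ldots,k\}$ and $[0..k] := [k] \cup \{0\}$. By $e^n_k$ we denote the $k$th unit vector of length $n$. For a set $I \subseteq [n]$ we define $e^n_I := \bigoplus_{i \in I} e^n_i$, where $\oplus$ denotes the bitwise exclusive-or. We say for two bitstrings $x, y \in \{0,1\}^n$ that we created $y$ from $x$ by flipping $I$ or that we created $y$ from $x$ by flipping the entries in position(s) $I$ if $y = x \oplus e^n_I$. By $S_n$ we denote the set of all permutations of $[n]$. For $r \in \mathbb{R}_{\ge 0}$, let $\lceil r \rceil := \min\{\, k \in \mathbb{N}_0 \mid k \ge r \,\}$ and $\lfloor r \rfloor := \max\{\, k \in \mathbb{N}_0 \mid k \le r \,\}$. To increase readability, we sometimes omit the $\lceil \cdot \rceil$ signs; that is, whenever we write $r$ where an integer is required, we implicitly mean $\lceil r \rceil$.
Let $n \in \mathbb{N}$. For $z \in \{0,1\}^n$ and $\pi \in S_n$ we define
$$f_{z,\pi}\colon \{0,1\}^n \to [0..n], \quad x \mapsto \max\{\, i \in [0..n] \mid \forall j \le i : z_{\pi(j)} = x_{\pi(j)} \,\}.$$
$z$ and $\pi$ are called the target string and target permutation of $f_{z,\pi}$, respectively. We want to identify target string and permutation by asking queries $x^1$, $x^2$, $x^3$, … and evaluating the answers (“scores”) $s^i = f_{z,\pi}(x^i)$. We may stop after $t$ queries if there is only a single pair $(z,\pi) \in \{0,1\}^n \times S_n$ with $s^i = f_{z,\pi}(x^i)$ for $1 \le i \le t$.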
A deterministic strategy for the HiddenPermutation problem is a tree of outdegree $n+1$ in which a query in $\{0,1\}^n$ is associated with every node of the tree. The search starts at the root. If the search reaches a node, the query associated with the node is asked, and the search proceeds to the child selected by the score. The complexity of a strategy on input $(z,\pi)$ is the number of queries required to identify the secret, and the complexity of a deterministic strategy is the worst-case complexity over all inputs. That is, the complexity is the height of the search tree.
A randomized strategy is a probability distribution over deterministic strategies. The complexity of a randomized strategy on input $(z,\pi)$ is the expected number of queries required to identify the secret, and the complexity of a randomized strategy is the worst-case complexity over all inputs. The probability distribution used for our randomized upper bound is a product distribution in the following sense: a probability distribution over $\{0,1\}^n$ is associated with every node of the tree. The search starts at the root. In any node, the query is selected according to the probability distribution associated with the node, and the search proceeds to the child selected by the score.
We remark that knowing $z$ allows us to determine $\pi$ with $n$ queries $z \oplus e^n_i$, $i \in [n]$. Observe that $\pi^{-1}(i)$ equals $f_{z,\pi}(z \oplus e^n_i) + 1$. Conversely, knowing the target permutation $\pi$ we can identify $z$ in a linear number of guesses. If our query $x$ has a score of $k$, all we need to do next is to query the string $x'$ that is created from $x$ by flipping the entry in position $\pi(k+1)$. Thus, learning one part of the secret is no easier (up to $O(n)$ questions) than learning the full secret.
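Both reductions in the preceding remark are short enough to write out. This is a sketch under the assumption of 0-indexed positions, so the score of the query obtained by flipping position `p` of the hidden bitstring is directly the index of `p` in the permutation; the names `recover_permutation`, `recover_bitstring`, and `oracle` are hypothetical.

```python
def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def recover_permutation(z, oracle):
    """Knowing z, learn pi with n queries: flip one position of z at a time."""
    n = len(z)
    pi = [None] * n
    for p in range(n):
        y = list(z)
        y[p] ^= 1
        # y disagrees with z only at p, so the score counts the positions read before p.
        pi[oracle(y)] = p
    return pi


def recover_bitstring(pi, oracle):
    """Knowing pi, learn z with at most n + 1 queries: repair the first wrong bit."""
    n = len(pi)
    x = [0] * n
    s = oracle(x)
    while s < n:
        x[pi[s]] ^= 1  # position pi[s] is the first disagreement
        s = oracle(x)  # the score strictly increases
    return x


secret_z, secret_pi = [1, 0, 1, 1, 0], [3, 0, 4, 1, 2]
oracle = lambda x: score(secret_z, secret_pi, x)
print(recover_permutation(secret_z, oracle))
print(recover_bitstring(secret_pi, oracle))
```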
A simple information-theoretic argument gives an $\Omega(n)$ lower bound for the deterministic query complexity and, together with Yao’s minimax principle [Yao77], also for the randomized complexity. The search space has size $2^n n!$, since the unknown secret is an element of $\{0,1\}^n \times S_n$. A deterministic strategy is a tree with outdegree $n+1$ and $2^n n!$ leaves. The maximal and average depth of any such tree is $\Omega(n)$.
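The counting behind this bound can be spelled out as follows. This is a sketch that assumes each query has at most $n+1$ possible answers and that there are $2^n n!$ possible secrets; the symbol $h$ for the height of the decision tree is introduced only for this illustration, and Stirling's estimate $\log_2 n! = n \log_2 n - O(n)$ is used.

```latex
(n+1)^{h} \;\ge\; 2^{n}\, n!
\quad\Longrightarrow\quad
h \;\ge\; \log_{n+1}\!\bigl(2^{n}\, n!\bigr)
  \;=\; \frac{n + \log_2 n!}{\log_2 (n+1)}
  \;=\; \frac{n \log_2 n - O(n)}{\log_2 (n+1)}
  \;=\; (1 - o(1))\, n .
```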
Let $\mathcal{H} := (x^i, s^i)_{i=1}^{t}$ be a sequence of queries $x^i \in \{0,1\}^n$ and scores $s^i \in [0..n]$. We call $\mathcal{H}$ a guessing history. A secret $(z,\pi)$ is consistent with $\mathcal{H}$ if $f_{z,\pi}(x^i) = s^i$ for all $i \in [t]$. $\mathcal{H}$ is feasible if there exists a secret consistent with it.
An observation crucial in our proofs is the fact that a vector $(V_1, \ldots, V_n)$ of subsets of $[n]$, together with a top score query $(x^*, s^*)$, captures the total knowledge provided by a guessing history $\mathcal{H}$ about the set of secrets consistent with $\mathcal{H}$. We will call $V_j$ the candidate set for position $j$; $V_j$ will contain all indices $i \in [n]$ for which the following simple rules (1) to (3) do not rule out that $\pi(j)$ equals $i$. Put differently, $V_j$ contains the set of values that are still possible images for $\pi(j)$, given the previous queries.
Theorem 2.1.
Let , and let be a guessing history. Construct the candidate sets according to the following rules:

If there are and with and , then .

If there are and with and , then .

If there are and with and , then .

If is not excluded by one of the rules above, then .
Furthermore, let and let for some with .
Then a pair is consistent with if and only if

and

for all .
Proof.
Let satisfy conditions (a) and (b). We show that is consistent with . To this end, let , let , , and . We need to show .
Assume . Then . Since , this together with (a) implies . Rule (1) yields ; a contradiction to (b).
Similarly, if we assume , then . We distinguish two cases. If , then by condition (a) we have . By rule (3) this implies ; a contradiction to (b).
On the other hand, if , then by (a). Rule (2) implies , again contradicting (b).
Necessity is trivial. ∎
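We may also construct the sets $V_j$ incrementally. The following update rules are direct consequences of Theorem 2.1. In the beginning, let $V_j := [n]$, $1 \le j \le n$. After the first query, record the first query as $x^*$ and its score as $s^*$. For all subsequent queries, do the following: Let $I$ be the set of indices in which the current query $x$ and the current best query $x^*$ agree. Let $s$ be the score of $x$ and let $s^*$ be the score of $x^*$.

If $s < s^*$, then $V_i \leftarrow V_i \cap I$ for $1 \le i \le s$ and $V_{s+1} \leftarrow V_{s+1} \setminus I$.

If $s = s^*$, then $V_i \leftarrow V_i \cap I$ for $1 \le i \le s+1$.

If $s > s^*$, then $V_i \leftarrow V_i \cap I$ for $1 \le i \le s^*$ and $V_{s^*+1} \leftarrow V_{s^*+1} \setminus I$. We further replace $s^* \leftarrow s$ and $x^* \leftarrow x$.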
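The three update rules can be sketched in code. The version below is our reading of the rules, derived from the definition of the score, and it assumes 0-indexed positions: `V[i]` holds the candidates for the position read at step `i`, and the names `update_candidates`, `best`, and `agree` are hypothetical. Each rule only discards values that cannot be the image, so the true image always survives.

```python
def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def update_candidates(V, best, query):
    """Apply one update. best = (x_star, s_star), query = (x, s). Returns the new best pair."""
    x_star, s_star = best
    x, s = query
    n = len(x)
    # Positions in which the current query and the best query agree.
    agree = {p for p in range(n) if x[p] == x_star[p]}
    low = min(s, s_star)
    # Both strings agree with the secret on the first `low` positions read.
    for i in range(low):
        V[i] &= agree
    if s == s_star:
        if s < n:
            V[s] &= agree   # both strings are wrong at the next position read
    else:
        V[low] -= agree     # exactly one of the two strings is wrong there
    return (x, s) if s > s_star else best


# Hypothetical secret and a few arbitrary queries.
z, pi = [1, 0, 1, 1], [2, 0, 3, 1]
V = [set(range(4)) for _ in range(4)]
first = [0, 0, 0, 0]
best = (first, score(z, pi, first))
for x in ([0, 0, 1, 0], [1, 1, 1, 1], [1, 0, 1, 1]):
    best = update_candidates(V, best, (x, score(z, pi, x)))
print(V, best)
```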
It is immediate from the update rules that the sets $V_j$ form a laminar family; i.e., for $i < j$ either $V_i \cap V_j = \emptyset$ or $V_i \subseteq V_j$. As a consequence of Theorem 2.1 we obtain a polynomial time test for the feasibility of histories. It gives additional insight into the meaning of the candidate sets $V_j$.
Theorem 2.2.
It is decidable in polynomial time whether a guessing history is feasible. Furthermore, we can efficiently compute the number of pairs consistent with it.
Proof.
We first show that feasibility can be checked in polynomial time. Let $\mathcal{H}$ be given. Construct the sets $V_1, \ldots, V_n$ as described in Theorem 2.1. Now construct a bipartite graph with node set $[n]$ on both sides. Connect $j$ to all nodes in $V_j$ on the other side. Permutations $\pi$ with $\pi(j) \in V_j$ for all $j$ are in one-to-one correspondence to perfect matchings in this graph. We recall that perfect matchings can be computed efficiently in bipartite graphs, e.g., using the Ford-Fulkerson algorithm, which essentially treats the matching problem as a network flow problem. If there is no perfect matching, the history is infeasible. Otherwise, let $\pi$ be any permutation with $\pi(j) \in V_j$ for all $j$. We next construct $z$. We use the obvious rules:

If and for some then set .

If and for some then set .

If is not defined by one of the rules above, set it to an arbitrary value.
We need to show that these rules do not lead to a contradiction. Assume otherwise. There are three ways, in which we could get into a contradiction. There is some and some

setting to opposite values by rule (a)

setting to opposite values by rule (b)

setting to opposite values by rules (b) applied to and rule (a) applied to .
In each case, we readily derive a contradiction. In the first case, we have , and . Thus by rule (1). In the second case, we have and . Thus by (2). In the third case, we have , , and . Thus by (3).
Finally, the pair defined in this way is clearly consistent with the history.
Next we show how to efficiently compute the number of consistent pairs. We recall Hall’s condition for the existence of a perfect matching in a bipartite graph. A perfect matching exists if and only if for every . According to Theorem 2.1, the guessing history can be equivalently described by a state . How many pairs are compatible with this state?
Once we have chosen , there are exactly different choices for if and exactly one choice if . The permutations can be chosen in a greedy fashion. We fix in this order. The number of choices for equals minus the number of , , lying in . If is disjoint from , never lies in and if is contained in , is always contained in . Thus the number of permutations is equal to
It is easy to see that the greedy strategy does not violate Hall’s condition. ∎
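The matching step of the feasibility test can be sketched as follows. Instead of a general network flow routine, this illustration uses the simple augmenting-path method for bipartite matching (often attributed to Kuhn), which is the unit-capacity special case of the Ford-Fulkerson idea. The function name `find_consistent_permutation` and the 0-indexed representation of the candidate sets are assumptions made for the example.

```python
def find_consistent_permutation(V):
    """Return a list pi with pi[i] in V[i] and all pi[i] distinct, or None if none exists.

    V is a list of n sets over range(n); V[i] plays the role of the candidate set
    for the i-th position (0-indexed).
    """
    n = len(V)
    owner = {}  # value -> index i currently matched to that value

    def try_assign(i, seen):
        # Search for an augmenting path that starts at index i.
        for v in V[i]:
            if v in seen:
                continue
            seen.add(v)
            if v not in owner or try_assign(owner[v], seen):
                owner[v] = i
                return True
        return False

    for i in range(n):
        if not try_assign(i, set()):
            return None  # no perfect matching: the candidate sets are infeasible
    pi = [None] * n
    for v, i in owner.items():
        pi[i] = v
    return pi


print(find_consistent_permutation([{0, 1}, {0}, {2}]))  # prints [1, 0, 2]
print(find_consistent_permutation([{0}, {0}, {1, 2}]))  # prints None
```

The recursion depth is at most $n$, which is fine for an illustration; an iterative variant would be preferable for large instances.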
We note that it is important in the proof of Theorem 2.2 that the sets form a laminar family; counting the number of perfect matchings in a general bipartite graph is #Pcomplete [Val79].
From the proof we can also derive which values in are actually possible as values for . This will be described in the next paragraph, which a reader only interested in the main results can skip without loss. A value is feasible if there is a perfect matching in the graph containing the edge . The existence of such a matching can be decided in polynomial time; we only need to test for a perfect matching in the graph . Hall’s condition says that there is no such perfect matching if there is a set such that . Since contains a perfect matching (assuming a consistent history), this implies ; i.e., is tight for Hall’s condition. We have now shown: Let . Then is infeasible for if and only if there is a tight set with and . Since the form a laminar family, minimal tight sets have a special form; they consist of an and all such that is contained in . In the counting formula for the number of permutations such are characterized by . In this situation, the elements of are infeasible for all with . We may subtract from each with .
If Hall’s condition is tight for some , i.e., , we can easily learn how operates on . We have for and hence the largest index in is at most . Perturb by flipping each bit in exactly once. The scores determine the permutation.
3 Deterministic Complexity
We settle the deterministic query complexity of the HiddenPermutation problem. The upper and lower bounds match up to a small constant factor. Specifically, we prove
Theorem 3.1.
The deterministic query complexity of the HiddenPermutation problem with $n$ positions is $\Theta(n \log n)$.
Proof.
The upper bound is achieved by an algorithm that resembles binary search and iteratively identifies $\pi(1), \ldots, \pi(n)$ and the corresponding bit values $z_{\pi(1)}, \ldots, z_{\pi(n)}$: We start by setting the set $V$ of candidates for $\pi(1)$ to $[n]$ and by determining a string $x$ with score 0; either the all-zero string or the all-one string will work. We iteratively reduce the size of $V$ keeping the invariant that $\pi(1) \in V$. We select an arbitrary subset $F$ of $V$ of size $|V|/2$ and create $y$ from $x$ by flipping the bits in $F$. If $f(y) = 0$, then $\pi(1) \notin F$ and we set $V \leftarrow V \setminus F$; if $f(y) > 0$, then $\pi(1) \in F$ and we set $V \leftarrow F$. In either case, we essentially halve the size of the candidate set. Continuing in this way, we determine $\pi(1)$ in $O(\log n)$ queries. Once $\pi(1)$ and $z_{\pi(1)}$ are known, we iterate this strategy on the remaining bit positions to determine $\pi(2)$ and $z_{\pi(2)}$, and so on, yielding an $O(n \log n)$ query strategy for identifying the secret. The details are given in Algorithm LABEL:alg:nlogn.
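The binary search strategy described above can be sketched as follows. This is an illustration and not a transcription of the paper's pseudocode: positions are 0-indexed, the names `binary_search_strategy`, `oracle`, and `known` are hypothetical, and the way a string with score exactly `s + 1` is restored after each position is found (one extra query, then complementing all still unknown positions if needed) is one simple choice among several.

```python
def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def binary_search_strategy(n, oracle):
    """Identify (z, pi) using O(n log n) calls to oracle(x) -> score."""
    x = [0] * n
    if oracle(x) > 0:
        x = [1] * n  # the complement then has score 0
    pi, known = [], set()
    while len(pi) < n:
        s = len(pi)
        # Invariant: x has score exactly s and agrees with z on the known positions.
        cand = [p for p in range(n) if p not in known]
        while len(cand) > 1:
            half = cand[: len(cand) // 2]
            y = list(x)
            for p in half:
                y[p] ^= 1
            if oracle(y) > s:
                cand = half                    # the next position was flipped
            else:
                cand = cand[len(cand) // 2:]   # it was not flipped
        p = cand[0]
        pi.append(p)
        known.add(p)
        x[p] ^= 1  # x was wrong at p, so now it agrees with z there
        if len(pi) < n and oracle(x) > s + 1:
            # x happens to be right at the following position as well;
            # complementing all unknown positions restores score exactly s + 1.
            for q in range(n):
                if q not in known:
                    x[q] ^= 1
    return x, pi  # once all positions are known, x equals z


secret_z = [1, 0, 0, 1, 1, 0, 1, 0]
secret_pi = [3, 7, 0, 5, 1, 6, 2, 4]
calls = [0]


def oracle(x):
    calls[0] += 1
    return score(secret_z, secret_pi, x)


print(binary_search_strategy(8, oracle), calls[0])
```

Each position costs at most $\lceil \log_2 n \rceil$ halving queries plus one check, so the sketch stays within $1 + n(\lceil \log_2 n \rceil + 1)$ queries.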
The lower bound is proved by examining the decision tree of the deterministic query scheme and exhibiting an input for which the number of queries asked is high. More precisely, we show that for every deterministic strategy, there exists an input such that after queries the maximal score ever returned is at most . This is done by a simple adversarial argument: First consider the root node of the decision tree. Let be the first query. We proceed to the child corresponding to score 1. According to the rules from the preceding section to are initialized to . Let be the next query asked by the algorithm and let be the set of indices in which and agree.

If we would proceed to the child corresponding to score , then would become and would not change according to Rule 1.

If we would proceed to the child corresponding to score , then would become and would become according to Rule 2.
We proceed to the child where the size of at most halves. Observe that always, and the maximum score is . Moreover, to are not affected.
We continue in this way until . Let be the vertex of the decision tree reached. Query is still the query with maximum score. We choose and arbitrarily and consider the subset of all inputs for which , , , and . For all such inputs, the query path followed in the decision tree descends from the root to the node . For this collection of inputs, observe that there is one input for every assignment of values to different from and , and for every assignment of values to . Hence we can recurse on this subset of inputs starting at ignoring , and . The setup is identical to what we started out with at the root, with the problem size decreased by 2. We proceed this way, forcing queries for every two positions revealed, until we have returned a score of for the first time. At this point, we have forced at least queries. ∎
4 The Randomized Strategy
We now show that the randomized query complexity is only $O(n \log\log n)$. The randomized strategy overcomes the sequential learning process of the binary search strategy (typically revealing a constant amount of information per query) and instead has a typical information gain of $\Theta(\log n / \log\log n)$ bits per query. In the language of the candidate sets $V_i$, we manage to reduce the sizes of many of these sets in parallel, that is, we gain information on several $\pi(i)$ values despite the seemingly sequential way the function $f_{z,\pi}$ offers information. The key to this is using partial information given by the $V_i$ (that is, information that does not determine $\pi(i)$, but only restricts it) to guess with good probability an $x$ with $f_{z,\pi}(x) > s^*$.
Theorem 4.1.
The randomized query complexity of the HiddenPermutation problem with $n$ positions is $O(n \log\log n)$.
The strategy has two parts. In the first part, we identify the positions and the corresponding bit values for some with queries. In the second part, we find the remaining positions and entries using the binary search algorithm with queries per position.
4.1 A High Level View of the First Part
We give a high level view of the first part of our randomized strategy. Here and in the following we denote by the current best score, and by we denote a corresponding query; i.e., . For brevity, we write for .
The goal of any strategy must be to increase and to gain more information about by reducing the sets . Our strategy carefully balances the two subgoals. If is “large”, it concentrates on reducing sets, if is “small”, it concentrates on increasing . The latter will simultaneously reduce .
We arrange the candidate sets to into levels 0 to , where . Initially, all candidate sets are on level , and we have for all . The sets in level have larger index than the sets in level . Level contains an initial segment of candidate sets, and all candidate sets on level are singletons, i.e., we have identified the corresponding value. On level , , we can have up to sets. We also say that the capacity of level is . The size of any set on level is at most , where is any constant greater than or equal to . We choose , for and maximal such that . Depending on the status (i.e., the fill rate) of these levels, either we try to increase , or we aim at reducing the sizes of the candidate sets.
The algorithm maintains a counter and strings with . The following invariants hold for the candidate sets to :

for all .

The sets , , are pairwise disjoint.

for .

is random. More precisely, there is a set such that and is a uniformly random subset of of size .
Our first goal is to increase to and to move the sets to the first level, i.e., to decrease their size to . This is done sequentially. We start by querying and , where is arbitrary and is the bitwise complement of . By swapping and if needed, we may assume . We now run a randomized binary search for finding . We choose uniformly at random a subset ( in the beginning) of size . We query where is obtained from by flipping the bits in . If , we set ; we set otherwise. This ensures and invariant (4). We stop this binary search once is sufficiently likely; the analysis will show that (and hence ) for some large enough constant is a good choice.
We next try to increase to a value larger than one and to simultaneously decrease the size of . Let . If , one of and is one and the other is larger than one. Swapping and if necessary, we may assume . We use randomized binary search to reduce the size of to . The randomized binary search is similar to before. Initially, is equal to . At each step we choose a subset of size and we create from by flipping the bits in positions . If we update to and we update to otherwise. We stop once .
At this point we have and . We hope that , in which case we can increase to three and move set from level to level by random binary search (the case is called a failure and will be treated separately at the end of this overview).
At some point the probability that drops below a certain threshold and we cannot ensure to make progress anymore by simply querying . This situation is reached when and hence we abandon the previously described strategy once . At this point, we move our focus from increasing to reducing the size of the candidate sets , thus adding them to the second level. More precisely, we reduce their sizes to at most . This reduction is carried out by SizeReduction, which we describe in Section 4.3. It reduces the sizes of the up to candidate sets from some value to the target size of level with an expected number of queries.
Once the sizes have been reduced to at most , we move our focus back to increasing . The probability that will now be small enough (details below), and we proceed as before by flipping in the entries in the positions and reducing the size of to . Again we iterate this process until the first level is filled; i.e., until we have . As we did with , we reduce the sizes of to , thus adding them to the second level. We iterate this process of moving sets from level to level and then moving them to the second level until sets have been added to the second level. At this point the second level has reached its capacity and we proceed by reducing the sizes of to at most , thus adding them to the third level.
In total we have levels. For , the th level has a capacity of sets, each of which is required to be of size at most . Once level has reached its capacity, we reduce the size of the sets on the th level to at most , thus moving them from level to level . When sets have been added to the last level, level , we finally reduce their sizes to one. This corresponds to determining for each .
Failures:
We say that a failure happens if we want to move some set from level 0 to level 1, but . In case of a failure, we immediately stop our attempt of increasing . Rather, we abort the first level and move all sets on the first level to the second one. As before, this is done by calls to SizeReduction which reduce the size of the sets from at most to at most . We test whether we now have . Should we still have , we continue by moving all level sets to level , and so on, until we finally have . At this point, we proceed again by moving sets from level to level , starting of course with set . The condition will certainly be fulfilled once we have moved to to level , i.e., have reduced them to singletons.
Part 2:
In the second part of Algorithm LABEL:alg:upper we determine the last entries of and . This can be done as follows. When we leave the first phase of Algorithm LABEL:alg:upper, we have and . We can now proceed as in the deterministic algorithm (Algorithm LABEL:alg:nlogn) and identify each of the remaining entries with queries. Thus the total number of queries in Part 2 is linear.
4.2 Random Binary Search
RandBinSearch is called by the function . It reduces the size of a candidate set from some value to some value in queries.
Lemma 4.2.
Let with and let be any set with and for . Let and with . Algorithm LABEL:alg:subroutine1 reduces the size of to using at most queries.
Proof.
Since , we have for all and . Also and for . Therefore, either we have in line LABEL:lineyprime1 or we have . In the former case, the bit was flipped, and hence must hold. In the latter case the bit in position was not flipped and we infer .
The runtime bound follows from the fact that the size of the set halves in each iteration. ∎
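A randomized halving step of this kind can be sketched as follows. The sketch is not the paper's pseudocode; it assumes 0-indexed positions, that the string `x` has score exactly `s`, and that the candidate set `cand` contains the position read at step `s` but none of the positions read earlier. The names `rand_bin_search`, `target`, and `rng` are hypothetical.

```python
import random


def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def rand_bin_search(x, s, cand, target, oracle, rng=random):
    """Shrink cand to at most `target` (>= 1) elements, one query per halving.

    Assumptions: oracle(x) == s, the position read at step s lies in cand,
    and cand contains no position read before step s.
    """
    cand = set(cand)
    while len(cand) > target:
        # Uniformly random half of the current candidate set.
        half = set(rng.sample(sorted(cand), len(cand) // 2))
        y = list(x)
        for p in half:
            y[p] ^= 1
        if oracle(y) > s:
            cand = half    # the sought position was flipped
        else:
            cand -= half   # the sought position was not flipped
    return cand


rng = random.Random(3)
n, s = 32, 4
z = [rng.randint(0, 1) for _ in range(n)]
pi = list(range(n))
rng.shuffle(pi)
x = list(z)
for p in pi[s:]:
    x[p] ^= 1  # x agrees with z exactly on the first s positions read
calls = [0]


def oracle(q):
    calls[0] += 1
    return score(z, pi, q)


result = rand_bin_search(x, s, set(pi[s:]), 3, oracle, rng)
print(sorted(result), pi[s], calls[0])
```

Because the retained half is chosen uniformly at random, the surviving candidate set is a uniformly random subset containing the sought position, which is the property the randomized strategy relies on.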
We call RandBinSearch in Algorithm LABEL:alg:upper (line LABEL:querysub1) to reduce the size of to , or, put differently, to reduce the number of candidates for to . As the initial size of is at most , this requires at most queries by Lemma 4.2.
Lemma 4.3.
A call of in Algorithm LABEL:alg:upper requires at most queries.
Proof.
The two occasions where queries are made in Algorithm LABEL:alg:upper are in line LABEL:query1 and in line LABEL:querysub1. Line LABEL:query1 is executed at most times, each time causing exactly one query. Each call to RandBinSearch in line LABEL:querysub1 causes at most queries, and RandBinSearch is called at most times. ∎
4.3 Size Reduction
We describe the second subroutine of Algorithm LABEL:alg:upper, SizeReduction. This routine is used to reduce the sizes of the up to candidate sets returned by a recursive call from some value to at most the target size of level , which is . As we shall see below, this requires an expected number of queries. The pseudocode of SizeReduction is given in Algorithm LABEL:alg:SizeReduction. It repeatedly calls a subroutine ReductionStep that reduces the sizes of at most candidate sets to a th fraction of their original size using at most queries, where is a parameter. We use ReductionStep with parameter repeatedly to achieve the full reduction of the sizes to at most .
ReductionStep is given a set of at most indices and a string with . The goal is to reduce the size of each candidate set , , below a target size where for all . The routine works in phases of several iterations each. Let be the set of indices of the candidate sets that are still above the target size at the beginning of an iteration. For each , we randomly choose a subset of size . We create a new bitstring from by flipping the entries in positions . Since the sets , , are pairwise disjoint, we have either or for some . In the first case, i.e., if , none of the sets was hit, and for all we can remove the subset from the candidate set . We call such queries “offtrials”. An offtrial reduces the size of all sets , , to a th fraction of their original size. If, on the other hand, we have for some , we can replace by set as must hold. Since by assumption, this set has now been reduced to its target size and we can remove it from .
We continue in this way until at least half of the indices are removed from and at least offtrials occurred, for some constant satisfying . We then proceed to the next phase. Consider any that is still in . The size of was reduced by a factor at least times. Thus its size was reduced to at most half its original size. We may thus halve without destroying the invariant for . The effect of halving is that the relative size of the sets will be doubled for the sets that still take part in the reduction process.
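A single trial of the reduction process can be sketched as follows. This is a simplified illustration rather than the paper's ReductionStep: it performs one query and the corresponding update, and omits the phase bookkeeping. It assumes 0-indexed steps, that `x` agrees with the secret on all positions read at steps up to `max(J)`, that the sets `V[j]` are pairwise disjoint with the position read at step `j` in `V[j]`, and that no position read at a step below `max(J)` outside `J` lies in any `V[j]`. The names `reduction_trial`, `J`, `k`, and `rng` are hypothetical.

```python
import random


def score(z, pi, x):
    """Longest common prefix of x and z in the order pi (0-indexed positions)."""
    s = 0
    for p in pi:
        if x[p] != z[p]:
            break
        s += 1
    return s


def reduction_trial(x, V, J, k, oracle, rng=random):
    """One trial: returns the index that was hit, or None for an off-trial."""
    # For every j in J pick a random subset of about a k-th of V[j].
    F = {j: set(rng.sample(sorted(V[j]), len(V[j]) // k)) for j in J}
    y = list(x)
    for j in J:
        for p in F[j]:
            y[p] ^= 1
    s = oracle(y)
    if s > max(J):
        # Off-trial: none of the sought positions was flipped.
        for j in J:
            V[j] -= F[j]
        return None
    # Hit: y first disagrees with the secret at step s, so that position lies in F[s].
    V[s] = F[s]
    return s


rng = random.Random(5)
n = 20
z = [rng.randint(0, 1) for _ in range(n)]
pi = list(range(n))
rng.shuffle(pi)
J = [0, 1, 2]
rest = pi[3:]
# Pairwise disjoint candidate sets of size 6, each containing the sought position.
V = {j: {pi[j]} | set(rest[5 * j: 5 * j + 5]) for j in J}
oracle = lambda q: score(z, pi, q)
for _ in range(10):
    print(reduction_trial(list(z), V, J, 2, oracle, rng), {j: len(V[j]) for j in J})
```

An off-trial shrinks every set by the chosen fraction, while a hit shrinks one set to a $k$th of its size at once; the sketch keeps a hit index in `J`, whereas the procedure described above removes it from the active set.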
Lemma 4.4.
Proof.
Let be some constant. We show below that—for a suitable choice of —after an expected number of at most queries both conditions in line LABEL:line:condition2 are satisfied. Assuming this to hold, we can bound the total expected number of queries until the size of each of the sets has been reduced to by
as desired.
In each iteration of the repeatloop we either hit an index in and hence remove it from or we have an offtrial. The probability of an offtrial is at least since always. Thus the probability of an offtrial is at least and hence the condition holds after an expected number of iterations.
As long as , the probability of an offtrial is at most and hence the probability that a set is hit is at least . Since we have and hence . Thus the expected number of iterations to achieve hits is .
If a candidate set is hit in the repeatloop, its size is reduced to . By assumption, this is bounded by . If is never hit, its size is reduced at least times by a factor . By choice of , its size at the end of the phase is therefore at most half of its original size. Thus after replacing by we still have for . ∎
It is now easy to determine the complexity of SizeReduction.