1 Problem description and preliminaries
We consider a square finite integer grid parameterized by $n$ with
\[ G = [n] \times [n], \]
where $[n] = \{1, \ldots, n\}$. There is a classification function $f : G \to \{0, 1\}$ that respects Pareto domination [mas1995microeconomic]:
Pareto domination. We say a tuple $(a, b)$ Pareto-dominates a tuple $(c, d)$ if $a \ge c$ and $b \ge d$.
Pareto frontier. Given a set $S$ with tuple elements of the form $(a, b)$, we say $P \subseteq S$ is the Pareto frontier of $S$ if every element in $S$ is Pareto-dominated by an element of $P$ and no element of $P$ is Pareto-dominated by a different element of $P$.
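The two definitions above can be made concrete with a minimal sketch. It assumes weak, componentwise domination on integer pairs; the names `dominates` and `pareto_frontier` are illustrative and do not come from the text.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[int, int]


def dominates(p: Point, q: Point) -> bool:
    """Weak Pareto domination (assumed): p is at least q in every coordinate."""
    return p[0] >= q[0] and p[1] >= q[1]


def pareto_frontier(points: Iterable[Point]) -> Set[Point]:
    """Elements of the set that are not dominated by a different element."""
    pts = set(points)
    return {p for p in pts
            if not any(q != p and dominates(q, p) for q in pts)}


if __name__ == "__main__":
    sample = {(1, 1), (2, 1), (1, 3), (2, 2)}
    print(sorted(pareto_frontier(sample)))
```

The quadratic scan is deliberately naive; it is meant only to mirror the definition, not to be efficient.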
Since $f$ respects Pareto domination, it is uniquely defined by the Pareto frontier of some set $S \subseteq G$.
There is a 1-1 mapping between classification functions $f$ and Pareto frontiers in $G$.
Every function $f$ has a 1-1 mapping to the set $S_f = \{x \in G : f(x) = 1\}$. The set $S_f$ has a 1-1 mapping to its Pareto frontier $P_f$. Composing the two we get the result. ∎
We wish to learn the function $f$. Our query model assumes that we can at each step choose some element $x$ of $G$ and observe $f(x)$. We show two simple upper bounds over learning $f$:
Learning $f$ is $O(n^2)$.
We query each element in $G$, and $|G| = n^2$. ∎
Learning $f$ is $O(n \log n)$.
For this lemma we consider the notion of a clash.
For a row $(i, \cdot)$ for some fixed $i \in [n]$, a clash is some integer value $c$ so that $f(i, c) = 1$ and $f(i, c + 1) = 0$.
We can find a clash in $O(\log n)$.
We implement a binary search over the query model: If we query some tuple $(i, c)$ and get $1$, we go “up”, and if we get $0$ we go “down”. We stop when we get $1$ and we know $(i, c + 1)$ returned $0$ (or $c = n$), or we get $0$ and know $(i, c - 1)$ returned $1$ (or $c = 1$). ∎
Given a clash point $c$ for a row $i$ we know values $f(i, j)$ for any $j \in [n]$.
Immediate from Pareto domination. ∎
We can now prove Lemma 3.
For every row $i \in [n]$, we find a clash point. We thus know values for all elements in the row. Since we do it for all rows, we know $f$. The run time for each row is $O(\log n)$ and over $n$ rows this is $O(n \log n)$. ∎
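The row-by-row procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: rows and columns are indexed $1, \ldots, n$, the oracle `f(i, j)` returns $1$ on a prefix of each row and $0$ afterwards, and the clash of a row is the length of that prefix (a value in $0, \ldots, n$). The names `find_clash` and `learn_row_by_row` are hypothetical.

```python
from typing import Callable, List

Oracle = Callable[[int, int], int]


def find_clash(f: Oracle, row: int, lo: int, hi: int) -> int:
    """Binary search for the clash of `row`, assumed to lie in [lo, hi].

    Assumption: f(row, j) == 1 for j <= clash and 0 for j > clash.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # mid is always in lo+1..hi
        if f(row, mid) == 1:
            lo = mid              # go "up": the clash is at least mid
        else:
            hi = mid - 1          # go "down": the clash is below mid
    return lo


def learn_row_by_row(f: Oracle, n: int) -> List[int]:
    """One independent binary search per row, O(n log n) queries in total."""
    return [find_clash(f, i, 0, n) for i in range(1, n + 1)]
```

Each row is searched over $n + 1$ candidate clash values, so it uses at most $\lceil \log_2 (n + 1) \rceil$ queries.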
2 Tight bounds
The above simple algorithms do not establish a tight upper bound for the problem.
Learning $f$ is $\Omega(n)$.
We use the 1-1 mapping of Lemma 1. Since we learn at most $1$ bit of information with each query, $\log_2$ of the number of Pareto frontiers for $G$ is a lower bound for any algorithm. We show an injection from permutations over $n$ values of $0$ and $n$ values of $1$ (overall $2n$ values) into the Pareto frontiers of $G$. For example, $0011, 0101, 0110, 1001, 1010, 1100$ are all the permutations for $n = 2$.
There is an injection from permutations over $n$ values of $0$ and $n$ values of $1$ (which is of size $\binom{2n}{n}$) into the Pareto frontiers in $G$.
Let be some permutation as described. We inject it into a series of clash points of length (one for each row of ). Let be a running index over the permutation, which we initialize with . Let , i.e., for any element in the first row value is . Let be the number of values in the permutation starting from . Let . Since there are values in , and each is immediately after the such value in the permutation, we have that , i.e., it marks the end of the permutation. Thus , as there are values in the permutation and all of them are covered by .
The series of clash points is weakly monotone decreasing and valid in the sense that each clash point is between $0$ and $n$. It can be shown that every such series of clash points defines a unique Pareto frontier in $G$. ∎
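To illustrate the counting argument, the sketch below uses one possible encoding of a string with $n$ zeros and $n$ ones as a weakly decreasing series of $n$ clash points. The encoding is an assumption chosen for illustration and may differ in detail from the construction in the proof: a running value starts at $n$, each $1$ lowers it by one, and each $0$ emits it as the clash of the next row.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple


def clashes_from_string(bits: Sequence[int]) -> List[int]:
    """Map a string with n zeros and n ones to n clash points (assumed encoding)."""
    n = len(bits) // 2
    current = n
    clashes = []
    for b in bits:
        if b == 1:
            current -= 1             # a 1 lowers the running clash value
        else:
            clashes.append(current)  # a 0 closes the current row
    return clashes


def balanced_strings(n: int) -> Iterator[Tuple[int, ...]]:
    """All strings of length 2n with exactly n ones."""
    for ones in combinations(range(2 * n), n):
        s = [0] * (2 * n)
        for k in ones:
            s[k] = 1
        yield tuple(s)
```

Distinct strings give distinct series because the series determines how many ones precede each zero, and the number of trailing ones is then forced by the total of $n$.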
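Using this injection, we are now able to prove the lower bound. Asymptotically, the central binomial coefficient satisfies $\binom{2n}{n} \sim \frac{4^n}{\sqrt{\pi n}}$ [oeis] using Stirling’s approximation [feller1967direct]. Taking $\log_2$, we get $2n - \frac{1}{2}\log_2(\pi n) = \Omega(n)$, as required. ∎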
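The asymptotic estimate can be checked numerically. The sketch below compares $\log_2 \binom{2n}{n}$ with the Stirling-based estimate $2n - \frac{1}{2}\log_2(\pi n)$; the function names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
import math


def exact_bits(n: int) -> float:
    """log2 of the central binomial coefficient C(2n, n)."""
    return math.log2(math.comb(2 * n, n))


def stirling_bits(n: int) -> float:
    """log2 of the estimate 4**n / sqrt(pi * n)."""
    return 2 * n - 0.5 * math.log2(math.pi * n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, exact_bits(n), stirling_bits(n))
```

Both quantities grow linearly in $n$, which is the content of the lower bound.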
We now present an asymptotically optimal algorithm that finds a series of clashes in $O(n)$, establishing a tight bound. As we know, the series of clashes uniquely defines a Pareto frontier in $G$, which in turn defines a function $f$.
There is an $O(n)$ algorithm to find a Pareto frontier in $G$.
Assume for simplicity that $\log_2 n$ is an integer number.
The algorithm runs over all rows, and thus finds all clashes and the Pareto frontier.
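The algorithm listing itself is not reproduced in this text, so the following is only a sketch of one divide-and-conquer scheme that is consistent with the analysis. It finds the clash of the middle row by binary search and recurses on the rows below and above it, with the column interval restricted by that clash. The conventions are assumptions: the oracle `f(i, j)` returns $1$ on a prefix of each row, clashes are weakly decreasing in the row index, and the names `find_clash_in` and `learn_clashes` are hypothetical.

```python
from typing import Callable, List

Oracle = Callable[[int, int], int]


def find_clash_in(f: Oracle, row: int, lo: int, hi: int) -> int:
    """Binary search for the clash of `row` inside the interval [lo, hi]."""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if f(row, mid) == 1:
            lo = mid
        else:
            hi = mid - 1
    return lo


def learn_clashes(f: Oracle, n: int) -> List[int]:
    """Clashes of rows 1..n, assuming they are weakly decreasing in the row."""
    clashes = [0] * (n + 1)  # index 0 is unused

    def solve(row_lo: int, row_hi: int, c_lo: int, c_hi: int) -> None:
        if row_lo > row_hi:
            return
        mid = (row_lo + row_hi) // 2
        c = find_clash_in(f, mid, c_lo, c_hi)
        clashes[mid] = c
        solve(row_lo, mid - 1, c, c_hi)  # smaller rows: clash at least c
        solve(mid + 1, row_hi, c_lo, c)  # larger rows: clash at most c

    solve(1, n, 0, n)
    return clashes[1:]
```

Calls at the same recursion depth search consecutive column intervals that overlap only at their endpoints, which is the property the run-time analysis relies on.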
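We examine the aggregate run time for all calls of the algorithm with the same depth $d$. There are $2^d$ such calls. Notice that the calls are bounded to subsequent intervals that together cover $[n]$. Each call finds a clash in its interval, and so is $O(\log m_j)$, where $m_j$ is the length of the interval of the $j$-th call. By Jensen's inequality [durrett2019probability], $\sum_{j=1}^{2^d} \log m_j \le 2^d \log \frac{n}{2^d}$.
Summing over all possible depths, we get
\[ \sum_{d=0}^{\log n} 2^d \log \frac{n}{2^d}. \]
We show by induction over $\log n$ that this sum is exactly $2n - \log n - 2$, and thus establish an $O(n)$ run time overall for the algorithm.
For $n = 1$ both sides are $0$. Now,
\[ \sum_{d=0}^{\log 2n} 2^d \log \frac{2n}{2^d} = \sum_{d=0}^{\log n} 2^d \left( \log \frac{n}{2^d} + 1 \right) = (2n - \log n - 2) + (2n - 1) = 4n - \log n - 3, \]
which is exactly the expression we expect for $2n$ (i.e., $2(2n) - \log(2n) - 2$). ∎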
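The closed form can be sanity-checked numerically. The sketch below assumes the per-depth bound $2^d \log_2 \frac{n}{2^d}$ with $n = 2^k$, so that the sum over depths is $\sum_{d=0}^{k} 2^d (k - d)$ and the claimed value is $2n - \log_2 n - 2 = 2^{k+1} - k - 2$. The function names are illustrative.

```python
def depth_sum(k: int) -> int:
    """Sum over depths d = 0..k of 2**d * log2(n / 2**d), with n = 2**k."""
    return sum(2 ** d * (k - d) for d in range(k + 1))


def closed_form(k: int) -> int:
    """2n - log2(n) - 2 for n = 2**k."""
    return 2 ** (k + 1) - k - 2


if __name__ == "__main__":
    for k in range(6):
        print(k, depth_sum(k), closed_form(k))
```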
The motivation of the problem is for dynamic programming. Say that there is some recursive rule over tuples that defines, given a current valid set , the next valid set . We also know that the recursive rule respects Pareto domination. In particular, the problem of finding an optimal deterministic -realizable online learning algorithm [shalev2014understanding] for label set
takes this form, but we expect the result to be useful in dynamic programming and game theory more generally.
An obvious generalization would be to $k$-tuples, where much of the discussion extends naturally. The interesting question would be the run time of the algorithm. Since for $k = 1$ we get the binary search in $O(\log n)$, and for $k = 2$ we get the algorithm we presented in $O(n)$, it is not clear what the formula should be for general $k$. We are however guaranteed an upper bound of $O(n^{k-1})$ by simply iterating over all possible values for the first $k - 2$ coordinates and applying the two-dimensional result.