
# Learning Optimal Search Algorithms from Data

Classical algorithm design is geared towards worst-case instances and fails to exploit structure that may be present in typical instances. Is it possible to learn this structure from examples and exploit it algorithmically? We study this question in the simplest algorithmic context – search for a cheap solution within an unstructured space. This setting captures, for example, search for a short path to drive to work when only some routes may ever be relevant to consider, or shopping online when there may only be a handful of stores that offer the best prices. We propose a framework for learning optimal search algorithms from data that captures the tradeoff between the cost of the solution and the time to find it. We consider a setting with n alternatives each having an unknown cost that can be revealed one at a time. Given sample access to the distribution of the costs, our goal is to learn an algorithm that minimizes the expected sum of the cost of the chosen alternative and the total time to find it. Algorithms for this problem fall into three different classes: non-adaptive, which always query a fixed set of alternatives; partially-adaptive, which query alternatives in a fixed order until they decide to stop; and fully-adaptive, which choose the next query based on the costs they have seen. While approximately optimal fully-adaptive strategies cannot be learned efficiently, our main result is that it is possible to learn a partially-adaptive strategy that approximates the best non-adaptive and partially-adaptive strategies efficiently both in terms of samples and computation. We extend our results to settings where multiple alternatives must be chosen and study the case where any k alternatives are feasible and the case where the alternatives must form a matroid base, e.g., picking a minimum-cost spanning tree.


## 1 Introduction

Data-driven algorithms have been established as a powerful alternative to traditional algorithm design with worst-case guarantees. It has been repeatedly observed that using data to tune an algorithm’s parameters leads to significant performance gains. A large line of recent research [ACCL06, CMS10, GR17, BNVW17, BDSV18, BDV18, KLL17, WGS18, AKL19] studies how to identify a near-optimal algorithm with sample access to the distribution of instances. These works consider parameterized classes of algorithms aiming to learn parameters that either optimize the expected runtime or performance of the algorithm with respect to the underlying distribution.

Our work advances this recent line of research by considering a more flexible design space. Instead of considering a parameterized class of algorithms, we aim to optimize over larger, non-parametric classes of algorithms, ideally any polynomial time algorithm. To this end, we consider the most basic algorithm design problem: searching for a good solution over an unstructured space. In the worst case the underlying problem we study is trivial – the best search strategy visits all possible alternatives and takes linear time. With access to data, however, interesting complexity emerges depending on how the algorithm adapts to the information it observes.

We illustrate this with a motivating example. Suppose you want to buy an item online and look for a website that offers a cheap price. How would you go about searching for such a website? Can you design an algorithm to solve this problem? A linear time algorithm that explores all options is not very satisfying as it fails to exploit the structure that may be available in the problem:

• If you knew that no matter what you are looking for there are only a handful of websites offering the best prices, you wouldn’t need to search every option available.

• If all the websites had comparable prices, say $499.99, but the cheapest option was $499.98, it might not even be worth spending the effort looking for that option.

With access to data on how the prices vary across sites one could design a customized algorithm that exploits these structural properties. This is the core objective of data-driven algorithm design and it is widely applicable to many other problems. For another example, consider a route planning service that computes the shortest drive between locations requested by its clients. The best routes might change over time but there may only be very few relevant options to consider. Moreover, the service might want to settle for approximate answers if that means saving a lot of resources when handling the client’s request.

We propose a general framework for finding near-optimal search algorithms from data that captures the tradeoff of solution quality and time to find it. Our framework is inspired by the Pandora's box problem, a prevalent model introduced by Weitzman [Wei79] in the Economics literature for describing how humans search among different alternatives. In our setting, there are $n$ alternatives with unknown costs that are drawn from some joint distribution. A search algorithm examines these alternatives one at a time, learning their costs, and after a few steps stops and selects one of the alternatives. Given sample access to the cost distribution, our goal is to learn a search algorithm that minimizes the sum of the expected cost of the chosen alternative and the time it takes to find it. Henceforth we will refer to the alternatives as boxes and different samples (instantiations of costs in boxes) as scenarios.

A simple class of algorithms for the problem restricts the search space to a fixed subset of boxes and chooses the cheapest option among them. We refer to these strategies as non-adaptive (NA). Despite their simplicity, learning good NA strategies from samples might not be possible. Indeed, if there is a tiny-probability scenario that has infinite cost on all boxes but one, the expected cost of the algorithm would be infinite unless the algorithm samples that scenario or queries all boxes. Even though boundedness assumptions on the costs would alleviate this problem, there is a more serious computational bottleneck: when costs are either high or zero, our goal is to find a subset of boxes that contains a zero-cost option for all possible scenarios. This is equivalent to the hitting set formulation of set cover and it is hard to approximate better than a logarithmic factor in the number of scenarios, which could be large.

In contrast to non-adaptive strategies, fully-adaptive (FA) strategies choose which box to query each time based on all the costs that have been observed so far. While these are the best strategies one could hope for, they are almost impossible to learn. For example, it could be the case that the first box completely determines the location of a box of cost 0 while every other box has infinite cost. Learning such an arbitrary mapping cannot be done with a finite number of samples. While the best option can be identified with just one query, without knowing the mapping it is not possible to solve the problem with sublinearly many queries.

Even though both NA and FA strategies are hard to learn and approximate, we show that one can compute an FA strategy with few samples that is competitive against the optimal NA strategy. To give some intuition, consider the case where the costs in the boxes are either $0$ or $\infty$. As mentioned, the optimal NA strategy is to find a subset of boxes that contains at least one $0$ in every scenario, a hitting set problem. Its linear programming relaxation asks to minimize the total mass $\sum_i x_i$ of a (fractional) subset $x$ such that for every scenario $s$, the set $Z_s$ of its zero-cost boxes satisfies $\sum_{i\in Z_s} x_i \ge 1$. Now consider a strategy that at every step opens box $i$ with probability proportional to $x_i$ until it finds a box of $0$ cost. No matter what the underlying scenario $s$ is, this strategy has probability at least $\sum_{i\in Z_s} x_i / \sum_i x_i$, i.e. at least $1/\sum_i x_i$, of stopping with every new box, and thus in expectation requires at most $\sum_i x_i$ steps to stop. As the expected time is always bounded between $1$ and $n$, the strategy is efficiently learnable from samples even though costs might be infinite. Our more general result allows handling general costs, where it is not clear when a low cost option has been identified.

It turns out that one can compete against more sophisticated strategies than non-adaptive. We consider partially-adaptive (PA) strategies that have a fixed order in which they query the boxes but may have arbitrarily complex rules on when to stop. Our main positive result is that it is possible to learn, in polynomial time and using polynomially many samples, a strategy that achieves a constant-approximation to the optimal PA strategy. The learned strategy is also a PA strategy. While one might hope that FA strategies could avoid any such loss entirely, we show that a constant factor loss is necessary for computational reasons even when competing against NA strategies.

#### Scenario-aware strategies.

A key technique in obtaining our upper bounds is using ideas from online algorithms to significantly reduce the algorithm design space without significant loss in the approximation. While it is simpler to design algorithms when costs are either $0$ or $\infty$, the case of general costs is trickier as it is unclear when a low cost option has been identified. One of our main technical contributions is to show that once we have determined an order in which to query boxes, it becomes easy to find an approximately optimal stopping rule at the loss of a small constant factor. This technical lemma is based on an extension of the ski-rental online algorithm [KMMO90] and is presented in Section 3. This allows us to focus on finding good scenario-aware strategies – that is, an ordering of the boxes that performs well assuming that we know when to stop. An additional important implication of the lemma is that the space of “interesting” PA strategies is small, characterized by the different orderings over boxes, and therefore approximately optimal PA strategies can be identified from few samples.

### 1.1 Results and Techniques

We now describe our results and techniques in more detail.

Our first result shows that it is possible to efficiently compute a scenario-aware PA strategy that beats any NA strategy entirely (Corollary 3.2). Combining this with our approximately-optimal stopping rule gives a PA strategy that achieves a constant factor approximation ($\approx 1.58$) to the optimal NA strategy. While a better constant factor approximation might be possible through a more direct argument, we show that it is NP-hard to approximate the optimal NA strategy beyond some constant ($1.278$) even if one is allowed to use FA strategies. Our lower bound is based on the logarithmic lower bound for set cover [DS14], which restricts the number of scenarios that can be covered within a given number of time steps (Lemma 3.1).

Our main result extends the above constant factor approximation guarantees to hold even against PA strategies. We again restrict our attention to scenario-aware strategies and seek to find an ordering that approximates the optimal PA strategy. We solve the resulting problem by formulating a linear programming relaxation to identify for each scenario a set of “good” boxes with suitably low values. This allows us to reduce the problem, at a cost of a constant factor, to finding an ordering of boxes so that the expected time until a scenario visits one of its “good” boxes is minimized. This problem is known as the min-sum set cover problem and is known to be approximable within a factor of $4$ [FLT02]. The resulting approximation factor we obtain is 9.22.

Beyond the problem of identifying a single option with low cost, we also consider several extensions. One extension is the case where $k$ options must be identified so that the sum of their costs is minimized. A further generalization is the case where the set of options must form a base of a given underlying matroid of rank $k$. This allows expressing many combinatorial problems in this framework, such as the minimum spanning tree problem. For the first extension where any $k$ options are feasible (corresponding to a uniform matroid) we obtain a constant factor approximation. For general matroids, however, the approximation factor degrades to logarithmic. We show that this is necessary even for the much weaker objective of approximating NA strategies with arbitrary FA strategies, and even for very simple matroids such as the partition matroid. We obtain the upper bounds by modifying the techniques developed for extensions of min-sum set cover, the generalized min-sum set cover and the submodular ranking problem. The following table shows a summary of the results obtained.

We finally consider a modification of the framework to maximization instead of minimization problems where the goal is to maximize the value of the chosen alternative minus the time it takes to find it. In contrast to the minimization version, we show that the optimal NA strategy for this problem cannot be efficiently approximated within any constant factor.

### 1.2 Related Work

We already mentioned a lot of related work in the area of data-driven algorithms. Beyond this line of research, there has also been a lot of work on improving algorithms using data, combining machine learning predictions with traditional worst-case guarantees of online algorithms [LV18, PSK18, HIKV19, GP19].

Our framework for designing and analyzing algorithms is inspired by the Pandora’s box model which has its roots in the Economics literature. This model explains how rational humans would search across different options to maximize their utility when there is a cost of discovering new alternatives. Weitzman [Wei79] gave a characterization of the optimal policy when values/costs in boxes are drawn i.i.d. from known distributions. The policy is given by a simple threshold rule and is related to the notion of the Gittins index [GJ74]. In contrast, in our setting, the distributions can be arbitrarily correlated and are only accessible through samples. Since Weitzman’s seminal work, there has been a large line of research studying the price of information [CFG00, GK01, CJK15, CHKK15] and the structure of approximately optimal rules for several combinatorial problems [Sin18, GGM06, GN13, ASW16, GNS16, GNS17, GJSS19].

Finally, our work can also be seen as a generalization of the min-sum set cover problem (MSSC). Indeed, MSSC corresponds to the special case where costs are either $0$ or $\infty$. We exploit these connections to MSSC [FLT02] and its generalizations [AGY09, BGK10, SW11, ISvdZ14, AG11] to obtain competitive algorithms for our setting.

## 2 Model

In the optimal search problem, we are given a set $B$ of $n$ boxes with unknown costs and a distribution $\mathcal{D}$ over a set of possible scenarios that determine these costs. Nature chooses a scenario from the distribution, which then instantiates the cost of each box. We use $c_{is}$ to denote the cost of box $i$ when scenario $s$ is instantiated.

The goal of the online algorithm is to choose a box of small cost while spending as little time as possible gathering information. The algorithm cannot directly observe the scenario that is instantiated; however, it is allowed to “probe” boxes one at a time. Upon probing a box, the algorithm gets to observe the cost of the box. Formally, let $P_s$ be the random variable denoting the set of probed boxes when scenario $s$ is instantiated, and let $i_s$ be the (random) index of the box chosen by the algorithm. We require $i_s \in P_s$, that is, the algorithm must probe a box to choose it. Note that the randomness in the choice of $P_s$ and $i_s$ arises both from the random instantiation of scenarios as well as from any coins the algorithm itself may flip. Our goal then is to minimize the total probing time plus the cost of the chosen box:

\[
\mathbb{E}_s\left[\min_{i\in P_s} c_{is}+|P_s|\right].
\]

Any online algorithm can be described by the pair $(\pi,\tau)$, where $\pi$ is a permutation of the boxes representing the order in which they get probed, and $\tau$ is a stopping rule – the time at which the algorithm stops probing and returns the minimum cost it has seen so far. Observe that in its full generality, an algorithm may choose the $t$-th box to probe, $\pi_t$, as a function of the identities and costs of the first $t-1$ probed boxes, $\pi_1,\dots,\pi_{t-1}$ and $c_{\pi_1 s},\dots,c_{\pi_{t-1} s}$. Likewise, the decision of setting $\tau=t$ may depend on $\pi_1,\dots,\pi_t$ and $c_{\pi_1 s},\dots,c_{\pi_t s}$. Optimizing over the class of all such algorithms is intractable. So we will consider simpler classes of strategies, as formalized in the following definition.

###### Definition 2.1 (Adaptivity of Strategies).

In a Fully-Adaptive (FA) strategy, both $\pi$ and $\tau$ can depend on any costs seen in previous time steps, as described above.

In a Partially-Adaptive (PA) strategy, the sequence $\pi$ is independent of the costs observed in probed boxes and is determined before any boxes are probed. However, the stopping rule $\tau$ can depend on the identities and costs of boxes probed previously.

In a Non-Adaptive (NA) strategy, both $\pi$ and $\tau$ are fixed before any costs are revealed to the algorithm. In particular, the algorithm probes a fixed subset $P$ of the boxes and returns the minimum cost $\min_{i\in P} c_{is}$. The algorithm's expected total cost is then $|P| + \mathbb{E}_s\left[\min_{i\in P} c_{is}\right]$.
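For instance, the expected total cost of an NA strategy can be estimated directly from sampled scenarios (a minimal sketch; the function name and toy costs are our own illustration):

```python
def na_cost(P, scenarios):
    """Empirical total cost of the NA strategy probing the fixed set P:
    probing time |P| plus the average minimum cost found in P.

    scenarios: sampled cost vectors, scenarios[s][i] = c_is.
    """
    avg_min = sum(min(c[i] for i in P) for c in scenarios) / len(scenarios)
    return len(P) + avg_min

costs = [[0, 5, 9], [7, 0, 9], [7, 5, 0]]   # three equiprobable scenarios
assert na_cost({0, 1, 2}, costs) == 3.0     # probe everything: time 3, cost 0
assert abs(na_cost({0, 1}, costs) - (2 + 5 / 3)) < 1e-9  # cheaper probing, costlier pick
```

The example shows the tradeoff the framework captures: probing fewer boxes saves time but raises the expected cost of the chosen box.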

#### General feasibility constraints.

In Section 6 we study extensions of the search problem where our goal is to pick multiple boxes satisfying a given feasibility constraint. Our goal is to probe boxes in some order and select a subset of the probed boxes that is feasible. Once again we can describe an algorithm using the pair $(\pi,\tau)$, where $\pi$ denotes the probing order, and $\tau$ denotes the stopping time at which the algorithm stops and returns the cheapest feasible set found so far. The total cost of the algorithm then is the cost of the feasible set returned plus the stopping time. We emphasize that the algorithm faces the same feasibility constraint in every scenario. We consider two different kinds of feasibility constraints. In the first, the algorithm is required to select exactly $k$ boxes for some $k$. In the second, the algorithm is required to select a basis of a given matroid.

## 3 A reduction to scenario-aware strategies and its implications to learning

Recall that designing a PA strategy involves determining a non-adaptive probing order, and a good stopping rule for that probing order. We do not place any bounds on the number of different scenarios, $|S|$, or the support size and range of the boxes' costs. These numbers can be exponential or even unbounded. As a result, the optimal stopping rule can be very complicated and it appears to be challenging to characterize the set of all possible PA strategies. We simplify the optimization problem by providing extra power to the algorithm and then removing this power at a small loss in the approximation factor.

In particular, we define a Scenario-Aware Partially-Adaptive (SPA) strategy as one where the probing order is independent of the costs observed in probed boxes; however, the stopping time is a function of the instantiated scenario $s$. In other words, the algorithm fixes a probing order, then learns of the scenario instantiated, and then determines a stopping rule for the chosen probing order based on the revealed scenario.

Observe that once a probing order and instantiated scenario are fixed, it is trivial to determine an optimal stopping time in a scenario-aware manner. The problem therefore boils down to determining a good probing order. The space of all possible SPA strategies is likewise much smaller and simpler than the space of all possible PA strategies. We can therefore argue that in order to learn a good SPA strategy, it suffices to optimize over a small sample of scenarios drawn randomly from the underlying distribution. We denote the cost of an SPA strategy with probing order $\pi$ by $\mathrm{cost}(\pi)$.
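Concretely, given a probing order and the realized scenario's costs, the optimal scenario-aware stopping time is found by a single linear pass (a minimal sketch; the function name is ours):

```python
def spa_cost(order, c):
    """Cost of the SPA strategy probing in `order` on one scenario: stop
    at the time t minimizing (probing time t) + (cheapest cost seen so far).

    c: cost vector of the realized scenario, c[i] = cost of box i.
    """
    best_seen = float("inf")
    best_total = float("inf")
    for t, i in enumerate(order, start=1):
        best_seen = min(best_seen, c[i])
        best_total = min(best_total, t + best_seen)
    return best_total

assert spa_cost([0, 1, 2], [10, 0, 3]) == 2   # probe boxes 0 then 1, stop: 2 + 0
assert spa_cost([2, 0, 1], [10, 0, 3]) == 3   # here probing all the way to box 1 is best
```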

On the other hand, we argue that scenario-awareness does not buy much power for the algorithm. In particular, given any fixed probing order, we can construct a stopping time that depends only on the observed costs, but that achieves a constant factor approximation to the optimal scenario-aware stopping time for that probing order.

The rest of this section is organized as follows. In Section 3.1 we exhibit a connection between our problem and a generalized version of the ski rental problem to show that PA strategies are competitive against SPA strategies. In Section 3.2 we show that optimizing for SPA strategies over a small sample of scenarios suffices to obtain a good approximation. In Section 3.3 we develop LP relaxations for the optimal NA and SPA strategies. Then in the remainder of the paper we focus on finding approximately-optimal SPA strategies over a small set of scenarios.

### 3.1 Ski Rental with varying costs

We now define a generalized version of the ski rental problem which is closely related to SPA strategies. The input to the generalized version is a sequence of non-increasing buy costs, $a_1 \ge a_2 \ge \dots \ge a_n$. These costs are presented one at a time to the algorithm. At each step $t$, the algorithm decides to either rent at a cost of $1$, or buy at a cost of $a_t$. If the algorithm decides to buy, then it incurs no further costs for the remainder of the process. Observe that an offline algorithm that knows the entire cost sequence can pay $\min_j\{j-1+a_j\}$. We call this problem “ski rental with time-varying buy costs”. The original ski rental problem is the special case where the buy cost is a fixed constant $B$, or $0$ from the time we stop skiing onwards.

We first provide a simple randomized algorithm for ski rental with time-varying buy costs that achieves a competitive ratio of $\frac{e}{e-1}$. Our algorithm uses the randomized algorithm of [KMMO90] for ski rental as a building block, essentially by starting a new instance of ski rental every time the cost of the offline optimum changes. The full proof of this result is included in Section A of the appendix.
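The [KMMO90] building block can be sketched as follows: for a fixed buy cost $B$, buy on a random day $j \le B$ chosen with probability proportional to $(1-1/B)^{B-j}$, which approaches the $\frac{e}{e-1}$ guarantee as $B$ grows; for time-varying costs one would restart such an instance whenever the offline optimum changes, as described above. A sketch under these assumptions (function names are ours):

```python
import random

def buy_day_distribution(B):
    """Buy-day distribution for classic randomized ski rental with integer
    buy cost B (renting costs 1 per day): buy on day j with probability
    proportional to (1 - 1/B)^(B - j), for j = 1..B [KMMO90].
    """
    w = [(1 - 1 / B) ** (B - j) for j in range(1, B + 1)]
    total = sum(w)
    return [wj / total for wj in w]

def sample_buy_day(B, rng=random):
    """Sample the (random) day on which the algorithm buys."""
    return rng.choices(range(1, B + 1), weights=buy_day_distribution(B))[0]

probs = buy_day_distribution(10)
assert abs(sum(probs) - 1) < 1e-9
assert all(probs[j] < probs[j + 1] for j in range(9))  # buying late is likelier
```

The exponentially increasing buy probability is what balances the risk of renting too long against buying too early.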

###### Lemma 3.1 (Ski Rental with time-varying buy costs).

Consider any non-increasing sequence $a_1 \ge a_2 \ge \dots \ge a_n$. There exists an online algorithm that chooses a stopping time $t$ so that:

\[
\mathbb{E}\left[t-1+a_t\right]\le \frac{e}{e-1}\,\min_j\{j-1+a_j\}.
\]

The next lemma connects scenario-aware partially-adaptive strategies with partially-adaptive strategies through our competitive algorithm for ski rental with time-varying costs. Specifically, given an SPA strategy, we construct an instance of the ski rental problem, where the buy cost at any step is equal to the cost of the best feasible solution seen so far by the SPA strategy. The rent costs of the ski rental instance reflect the probing time of the search algorithm, whereas the buy cost reflects the cost of the boxes chosen by the algorithm. Our algorithm for ski rental chooses a stopping time as a function of the costs observed in the past and without knowing the (scenario-dependent) costs to be revealed in the future, and therefore gives us a PA strategy for the search problem. This result is formalized below; the proof can be found in Section A of the appendix.

###### Corollary 3.2.

Given any scenario-aware partially-adaptive strategy $(\pi,\tau)$, we can efficiently construct a stopping time $\tau'$, such that the cost of the partially-adaptive strategy $(\pi,\tau')$ is no more than a factor of $\frac{e}{e-1}$ times the cost of $(\pi,\tau)$.

### 3.2 Learning a good probing order

Henceforth, we focus on designing good scenario-aware partially adaptive strategies for the search problem. As noted previously, once we fix a probing order, determining the optimal scenario-aware stopping time is easy. We will now show that in order to optimize over all possible probing orders, it suffices to optimize with respect to a small set of scenarios drawn randomly from the underlying distribution.

Formally, let $\mathcal{D}$ denote the distribution over scenarios and let $S$ be a collection of $m$ scenarios drawn independently from $\mathcal{D}$, with $m$ being a large enough polynomial in $n$. Then, we claim that with high probability, for every probing order $\pi$, $\mathrm{cost}_S(\pi)$ is close to $\mathrm{cost}_{\mathcal{D}}(\pi)$, where $\mathrm{cost}_{\mathcal{D}}(\pi)$ denotes the total expected cost of the SPA strategy $\pi$ over the scenario distribution $\mathcal{D}$, and $\mathrm{cost}_S(\pi)$ denotes its cost over the uniform distribution over the sample $S$. The implication is that it suffices for us to optimize for SPA strategies over scenario distributions with finite small support.

###### Lemma 3.3.

Let $\epsilon,\delta\in(0,1)$ be given parameters. Let $S$ be a set of $m$ scenarios chosen independently at random from $\mathcal{D}$, with $m$ a large enough polynomial in $n$, $1/\epsilon$ and $\log(1/\delta)$. Then, with probability at least $1-\delta$, for all permutations $\pi$, we have

\[
\mathrm{cost}_S(\pi)\in(1\pm\epsilon)\,\mathrm{cost}_{\mathcal{D}}(\pi).
\]
###### Proof.

Fix a permutation $\pi$. For scenario $s$, let $\mathrm{cost}_s(\pi)$ denote the total cost incurred by the SPA strategy $\pi$ in scenario $s$; this quantity lies in a bounded range for every $s$ and $\pi$. Furthermore, $\mathrm{cost}_S(\pi)$ is the average of $\mathrm{cost}_s(\pi)$ over the sample $S$, and $\mathbb{E}_{s\sim\mathcal{D}}[\mathrm{cost}_s(\pi)]=\mathrm{cost}_{\mathcal{D}}(\pi)$. The lemma now follows by using the Hoeffding inequality and applying the union bound over all $n!$ possible permutations $\pi$. ∎

Combining Corollary 3.2 and Lemma 3.3 yields the following theorem.

###### Theorem 3.4.

Suppose there exists an algorithm for the optimal search problem that runs in time polynomial in the number of boxes $n$ and the number of scenarios $m$, and returns an SPA strategy achieving an $\alpha$-approximation. Then, for any $\epsilon>0$, there exists an algorithm that runs in time polynomial in $n$ and $1/\epsilon$ and returns a PA strategy with competitive ratio $(1+\epsilon)\,\gamma\,\alpha$, where $\gamma=\frac{e}{e-1}$.

### 3.3 LP formulations

We will now construct an LP relaxation for the optimal scenario-aware partially adaptive strategy. Following Theorem 3.4 we focus on the setting where the scenario distribution is uniform over a small support set .

The program (LP-SPA) is given below and is similar to the one used for the generalized min-sum set cover problem in [BGK10] and [SW11]. Let $x_{it}$ be an indicator variable for whether box $i$ is opened at time $t$. Constraints (1) and (2) model matching constraints between boxes and time slots. $z_{ist}$ indicates whether box $i$ is selected for scenario $s$ at time $t$. Constraints (3) ensure that we only select open boxes. Constraints (4) ensure that for every scenario we select exactly one box. The cost of the box assigned to scenario $s$ is given by $\sum_{i\in B,\,t\in[n]} c_{is}\,z_{ist}$. Furthermore, for any scenario $s$ and time $t$, the sum $\sum_{i\in B}\sum_{t'\le t} z_{ist'}$ indicates whether the scenario is covered at time $t$, and therefore the probing time for the scenario is given by $\sum_{i\in B,\,t\in[n]} t\,z_{ist}$.

\[
\begin{aligned}
\text{minimize}\quad & \frac{1}{|S|}\sum_{i\in B,\,s\in S,\,t\in[n]} t\,z_{ist}\;+\;\frac{1}{|S|}\sum_{i\in B,\,s\in S,\,t\in[n]} c_{is}\,z_{ist} && \text{(LP-SPA)}\\
\text{subject to}\quad & \textstyle\sum_{i\in B} x_{it}=1, && \forall t\in[n] && (1)\\
& \textstyle\sum_{t\in[n]} x_{it}\le 1, && \forall i\in B && (2)\\
& z_{ist}\le x_{it}, && \forall s\in S,\ i\in B,\ t\in[n] && (3)\\
& \textstyle\sum_{t\in[n],\,i\in B} z_{ist}=1, && \forall s\in S && (4)\\
& x_{it},\,z_{ist}\in[0,1], && \forall s\in S,\ i\in B,\ t\in[n]
\end{aligned}
\]

As a warm-up for our main result, we approximate the optimal NA strategy by a PA strategy. The relaxation (LP-NA) for the optimal NA strategy is simpler. Here $x_i$ is an indicator variable for whether box $i$ is opened and $z_{is}$ indicates whether box $i$ is assigned to scenario $s$.

\[
\begin{aligned}
\text{minimize}\quad & \sum_{i\in B} x_i \;+\; \frac{1}{|S|}\sum_{i\in B,\,s\in S} c_{is}\,z_{is} && \text{(LP-NA)}\\
\text{subject to}\quad & \textstyle\sum_{i\in B} z_{is}=1, && \forall s\in S && (5)\\
& z_{is}\le x_i, && \forall i\in B,\ s\in S\\
& x_i,\,z_{is}\in[0,1], && \forall i\in B,\ s\in S
\end{aligned}
\]
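To make (LP-NA) concrete, here is one way to assemble and solve it with `scipy.optimize.linprog` (the variable layout and the toy instance are our own illustration, not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_na(c):
    """Solve (LP-NA).  c is an m x n matrix with c[s][i] = c_is.
    Variables are ordered [x_0..x_{n-1}, z_00, ..., z_{(m-1)(n-1)}],
    with z_is stored at position n + s*n + i.
    """
    c = np.asarray(c, dtype=float)
    m, n = c.shape
    obj = np.concatenate([np.ones(n), c.ravel() / m])
    # Constraint (5): for each scenario s, sum_i z_is = 1.
    A_eq = np.zeros((m, n + m * n))
    for s in range(m):
        A_eq[s, n + s * n : n + (s + 1) * n] = 1
    # z_is <= x_i, rewritten as z_is - x_i <= 0.
    A_ub = np.zeros((m * n, n + m * n))
    for s in range(m):
        for i in range(n):
            A_ub[s * n + i, i] = -1
            A_ub[s * n + i, n + s * n + i] = 1
    res = linprog(obj, A_ub=A_ub, b_ub=np.zeros(m * n),
                  A_eq=A_eq, b_eq=np.ones(m), bounds=(0, 1))
    return res.fun, res.x[:n]

# Two scenarios whose zero-cost boxes differ: the LP must open both boxes.
val, x = solve_lp_na([[0, 9], [9, 0]])
assert abs(val - 2) < 1e-6 and min(x) > 0.99
```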

## 4 Competing with the non-adaptive benchmark

As a warm-up to our main result, in this section we consider competing against the optimal non-adaptive strategy. Recall that a non-adaptive strategy probes a fixed subset of boxes, and then picks a probed box of minimum cost. Is it possible to efficiently find an adaptive strategy that performs just as well? We show two results. On the one hand, in Section 4.1 we show that we can efficiently find an SPA strategy that beats the performance of the optimal NA strategy. This along with Theorem 3.4 implies that we can efficiently find a $(1+\epsilon)\frac{e}{e-1}$-competitive PA strategy. On the other hand, in Section 4.2 we show that it is NP-hard to obtain a competitive ratio better than $1.278$ against the optimal NA strategy even using the full power of FA strategies.

### 4.1 An upper bound via PA strategies

Our main result of this section is as follows.

###### Lemma 4.1.

There exists a scenario-aware partially-adaptive strategy with competitive ratio $1$ against the optimal non-adaptive strategy.

Putting this together with Theorem 3.4 we get the following theorem.

###### Theorem 4.2.

We can efficiently find a partially-adaptive strategy with total expected cost at most $(1+\epsilon)\frac{e}{e-1}\approx 1.58$ times the total cost of the optimal non-adaptive strategy, for any constant $\epsilon>0$.

###### Proof of Lemma 4.1.

We use the LP relaxation (LP-NA) from Section 3.3. Given an optimal fractional solution $(x,z)$, denote by $\mathrm{OPT}_{c,s}=\sum_{i\in B} c_{is}z_{is}$ the cost for scenario $s$ in this solution, and by $\mathrm{OPT}_c=\frac{1}{|S|}\sum_{s\in S}\mathrm{OPT}_{c,s}$ the cost for all scenarios. Let $\mathrm{OPT}_t=\sum_{i\in B}x_i$ denote the probing time for the fractional solution. Similarly, we define $\mathrm{ALG}_t$, $\mathrm{ALG}_c$ and $\mathrm{ALG}_{c,s}$ to be the algorithm's query time, cost for all scenarios and cost for scenario $s$ respectively.

Algorithm 1 rounds the fractional solution $(x,z)$ to an SPA strategy. Note that the probing order in the rounded solution is independent of the instantiated scenario, but the stopping time depends on the scenario-specific variables $z_{is}$. This stopping time is not necessarily the optimal one for the constructed probing order, but its definition allows us to relate the cost of our solution to the fractional cost.

Notice that for each step $t$, the probability of stopping is

\[
\Pr[\text{stop at step }t]=\sum_{i\in B}\frac{x_i}{\sum_{j\in B}x_j}\cdot\frac{z_{is}}{x_i}=\frac{\sum_{i\in B}z_{is}}{\sum_{i\in B}x_i}=\frac{1}{\mathrm{OPT}_t},
\]

where we used the first set of LP constraints (5) and the definition of $\mathrm{OPT}_t$. Observe that this probability is independent of the step, so the stopping time is geometrically distributed and $\mathbb{E}[\mathrm{ALG}_t]=\mathrm{OPT}_t$. The expected cost of the algorithm is

\[
\begin{aligned}
\mathbb{E}[\mathrm{ALG}_{c,s}] &=\sum_{i\in B,\,t\in[n]}\Pr[\text{select }i\text{ at step }t\mid\text{stop at step }t]\,\Pr[\text{stop at step }t]\,c_{is}\\
&\le\sum_{i\in B,\,t\in[n]}\frac{z_{is}}{\sum_{j\in B}z_{js}}\,\Pr[\text{stop at step }t]\,c_{is}=\sum_{i\in B}z_{is}c_{is}=\mathrm{OPT}_{c,s}.
\end{aligned}
\]

Taking the expectation over all scenarios, we get $\mathbb{E}[\mathrm{ALG}_c]\le\mathrm{OPT}_c$. Thus the lemma follows. ∎
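The rounding argument can be simulated directly (a sketch under our own naming, not the paper's Algorithm 1 verbatim): at every step open box $i$ with probability $x_i/\sum_j x_j$ and, knowing the scenario $s$, stop with probability $z_{is}/x_i$:

```python
import random

def sample_spa_run(x, z_s, rng):
    """One run of the rounding in Lemma 4.1: at every step open box i with
    probability x[i]/sum(x); knowing the scenario s, stop and select the
    opened box with probability z_s[i]/x[i] (LP feasibility gives z <= x).
    Returns (number of probes, chosen box)."""
    boxes = [i for i in range(len(x)) if x[i] > 0]
    weights = [x[i] for i in boxes]
    t = 0
    while True:
        t += 1
        i = rng.choices(boxes, weights=weights)[0]
        if rng.random() < z_s[i] / x[i]:
            return t, i

# Per-step stop probability is sum_i z_s[i] / sum_i x[i] = 1/sum(x) here,
# so the expected number of probes is sum(x) = 2, matching OPT_t.
rng = random.Random(0)
runs = [sample_spa_run([1.0, 0.5, 0.5], [0.5, 0.25, 0.25], rng)[0]
        for _ in range(20000)]
assert abs(sum(runs) / len(runs) - 2.0) < 0.1
```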

### 4.2 A lower bound for FA strategies

We now show that we cannot achieve a competitive ratio better than $1.278$ against the optimal NA strategy, even if we use the full power of fully adaptive strategies.

###### Theorem 4.3.

Assuming P $\neq$ NP, no computationally efficient fully-adaptive algorithm can approximate the optimal non-adaptive strategy within a factor smaller than $1.278$.

Our lower bound is based on the hardness of approximating Set Cover. We use the following lemma which rules out bicriteria results for Set Cover; a proof can be found in Appendix B.

###### Lemma 4.4.

Unless P=NP, for any constant $\epsilon>0$, there is no polynomial-time algorithm that for every instance of Set Cover with optimal value $\mathrm{OPT}_{SC}$ finds, for some integer $k$, a collection of $k$ sets that cover at least a $1-\left(1-\frac{1+\epsilon}{\mathrm{OPT}_{SC}}\right)^{k}$ fraction of the elements.

###### Proof of Theorem 4.3.

Let $\epsilon>0$ and $p\in(0,1)$ be appropriate constants, to be determined later. We will define a family of instances of the optimal search problem based on set cover. Consider a set cover instance with $m$ elements and $n$ sets, and denote its optimal value by $\mathrm{OPT}_{SC}$. To transform this into an instance of the search problem, every element corresponds to a scenario $s$, and every set to a box $i$. We set $c_{is}=0$ iff the set corresponding to box $i$ covers the element corresponding to scenario $s$, and $c_{is}=H$ otherwise, for a large cost $H$. We also add a new scenario $X$ with $c_{iX}=H$ for every box $i$. Scenario $X$ occurs with probability $p$ and all the other scenarios happen with probability $(1-p)/m$ each.

In this instance, the total cost of the optimal non-adaptive strategy is at most $\mathrm{OPT}_{SC}+pH$: we may probe the boxes of an optimal set cover, paying probing time $\mathrm{OPT}_{SC}$ to cover all scenarios other than $X$, and pay an additional cost $H$ with probability $p$ to cover $X$.

Consider any computationally efficient algorithm that returns a fully adaptive strategy for such an instance. Since the costs of the boxes are $0$ or $H$, we may assume without loss of generality that any FA strategy stops probing as soon as it observes a box with cost $0$ and chooses that box. We say that the strategy covers a scenario when it finds a box of cost $0$ in that scenario. Furthermore, prior to finding a box with cost $0$, the FA strategy learns no information about which scenario it is in other than that the scenario is as yet uncovered. Consequently, the strategy follows a fixed probing order that is independent of the scenario that is instantiated. We can now convert such a strategy into a bicriteria approximation for the underlying set cover instance. In particular, for $k\in[n]$, let $N_k$ denote the number of scenarios that are covered by the first $k$ probed boxes. Then, we obtain a solution to the set cover instance with $k$ sets covering $N_k$ elements. By Lemma 4.4 then, for every $\epsilon>0$, there must exist an instance of set cover, and by extension an instance of optimal search, on which the strategy satisfies $N_k\le\left(1-\left(1-\frac{1+\epsilon}{\mathrm{OPT}_{SC}}\right)^{k}\right)m$ for all $k$.

For the rest of the argument, we focus on that hard instance for $\mathcal{A}$. Let $N$ denote the maximum number of boxes the strategy probes before stopping to return a box of cost $H$; we may safely assume that $N\le n$. Then the expected query time of the strategy is at least

$$\Pr[s=X]\cdot N+\Pr[s\neq X]\sum_{k=1}^{N}\Pr[\text{FA reaches step }k\mid s\neq X]\;\ge\; pN+(1-p)\sum_{k=1}^{N}\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{k-1}\;=\;pN+(1-p)\left(1-\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}\right)\frac{\mathrm{OPT}_{SC}}{1+\varepsilon}.\tag{6}$$
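The last equality in (6) is simply the closed form of a finite geometric series; spelled out with $q=1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}$:

```latex
\sum_{k=1}^{N} q^{k-1} \;=\; \frac{1-q^{N}}{1-q}
  \;=\; \left(1-\left(1-\tfrac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}\right)
        \frac{\mathrm{OPT}_{SC}}{1+\varepsilon},
\qquad \text{since } 1-q=\tfrac{1+\varepsilon}{\mathrm{OPT}_{SC}}.
```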

On the other hand, the expected cost of the FA strategy is at least

$$H\left(\Pr[s=X]+\Pr\left[s\neq X\,\wedge\,\text{FA didn't find cost }0\text{ in first }N\text{ steps}\right]\right)\;\ge\; pH+(1-p)H\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}.$$

Thus the total cost of such a fully-adaptive strategy is lower bounded by

$$\mathrm{ALG}_{FA}\;\ge\; pH+(1-p)H\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}+pN+(1-p)\left(1-\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}\right)\frac{\mathrm{OPT}_{SC}}{1+\varepsilon}.$$

Let $x$ be defined so that $\left(1-\frac{1+\varepsilon}{\mathrm{OPT}_{SC}}\right)^{N}=e^{-x}$. Then, $N\ge x\left(\frac{\mathrm{OPT}_{SC}}{1+\varepsilon}-1\right)$. Substituting these expressions in the above equation we get

$$\mathrm{ALG}_{FA}\;\ge\; pH+(1-p)He^{-x}+p\,x\left(\frac{\mathrm{OPT}_{SC}}{1+\varepsilon}-1\right)+(1-p)\left(1-e^{-x}\right)\frac{\mathrm{OPT}_{SC}}{1+\varepsilon}.$$

The RHS is minimized at the value of $x$ where its derivative vanishes. Choosing the constants $p$ and $H$ appropriately, and letting $\varepsilon\to 0$, the competitive ratio becomes

$$\frac{\mathrm{ALG}_{FA}}{\mathrm{OPT}_{NA}}\;\ge\;1.278$$

when $\mathrm{OPT}_{SC}$ is sufficiently large. ∎

## 5 Competing with the partially-adaptive benchmark

Moving on to our main result, in this section we compete against the optimal partially-adaptive strategy. Recall that the program (LP-SPA) is a relaxation for the optimal SPA strategy, and therefore, also bounds from below the cost of the optimal PA strategy. We round the optimal solution to this LP to obtain a constant-competitive SPA strategy.

Given a solution to (LP-SPA), we identify for each scenario a subset of low cost boxes. Our goal is then to find a probing order, so that for each scenario we quickly find one of the low cost boxes. This problem of “covering” every scenario with a low cost box is identical to the min-sum set cover (MSSC) problem introduced by [FLT02]. Employing this connection allows us to convert an approximation for MSSC into an SPA strategy at a slight loss in approximation factor. Our main result is as follows.

###### Lemma 5.1.

There exists a scenario-aware partially-adaptive strategy with a constant competitive ratio against the optimal partially-adaptive strategy.

Combining this with Theorem 3.4 we get the following theorem.

###### Theorem 5.2.

We can efficiently find a partially-adaptive strategy that is constant-competitive against the optimal partially-adaptive strategy.

###### Proof of Lemma 5.1.

We use the LP formulation (LP-SPA) from Section 2. Recall that $x_{it}$ denotes the extent to which box $i$ is opened at time $t$, and $z_{ist}$ denotes the extent to which box $i$ is chosen for scenario $s$ at time $t$.

As mentioned previously, we will employ the $4$-approximation to MSSC by [FLT02] in our algorithm. The input to MSSC is an instance of set cover. In our context, the elements are the boxes and each scenario $s$ has a set $L_s$ corresponding to it. The goal is to find an ordering $\pi$ over the items so as to minimize the sum of the cover times of the sets, where the cover time of a set is the index of the first element in $\pi$ that is contained in the set. The following is an LP relaxation for MSSC; observe its similarity to (LP-SPA). [FLT02] provide a greedy algorithm that $4$-approximates the optimal solution to this LP.

$$\begin{aligned}\text{minimize}\quad & \frac{1}{|S|}\sum_{i\in B,\,s\in S,\,t\in[n]} t\,z_{ist} && \text{(LP-MSSC)}\\ \text{subject to}\quad & \sum_{t'\le n,\;i\in L_s} z_{ist'}\ \ge\ 1, && \forall s\in S\\ & x_{it},\,z_{ist}\in[0,1] && \forall s\in S,\,i\in B,\,t\in[n]\end{aligned}$$
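The [FLT02] greedy rule itself is simple: repeatedly probe the element that is contained in the largest number of still-uncovered sets. A minimal sketch follows; the function name and input layout are our own, and in this section's setting the elements are boxes while the sets are the $L_s$.

```python
def greedy_mssc(elements, sets):
    """Greedy for min-sum set cover [FLT02]: repeatedly pick the element
    hitting the most still-uncovered sets.  Returns a probing order whose
    total cover time 4-approximates the optimum."""
    uncovered = [set(S) for S in sets]  # work on copies
    remaining = set(elements)
    order = []
    while any(uncovered) and remaining:
        # element contained in the largest number of uncovered sets
        best = max(remaining, key=lambda e: sum(1 for S in uncovered if e in S))
        order.append(best)
        remaining.remove(best)
        uncovered = [S for S in uncovered if best not in S]
    return order
```

For instance, on sets $\{0\}$, $\{0,1\}$, $\{2\}$ the greedy probes element $0$ first (it covers two sets), then $2$.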

Define $\mathrm{OPT}_{c,s}=\sum_{i\in B,\,t\in[n]}c_{is}z_{ist}$ to be the fractional cost that the solution pays for scenario $s$, and let $\alpha>1$ be a constant to be fixed later. Given an optimal solution $(x,z)$ to (LP-SPA), we will now construct an instance $I'$ of MSSC (by specifying the sets $L_s$) with the following properties:

1. There exists an integral solution for $I'$ with cover time at most $4\left(\frac{\alpha}{\alpha-1}\right)^{2}$ times the query time for $(x,z)$.

2. Any integral solution for $I'$ can be paired with an appropriate stopping time, so that the query time of the resulting strategy is at most the MSSC cover time of the solution, and its cost is at most $\alpha$ times the fractional cost for $(x,z)$.

#### Constructing a “good” I′:

For each scenario $s$, we define a set of “low” cost boxes as

$$L_s=\left\{\,i:\ c_{is}\le \alpha\,\mathrm{OPT}_{c,s}\,\right\}.$$

The second property above is immediate from this definition. In particular, we define the stopping time as the first time we encounter a box $i\in L_s$.

For property (1), we first show that instance $I'$ admits a good fractional solution. While (LP-SPA) allows assigning arbitrary boxes to a scenario, (LP-MSSC) requires assigning only the boxes in $L_s$ to scenario $s$. Observe that by Markov’s inequality, for all $s$, $\sum_{i\in L_s,\,t}z_{ist}\ge 1-\frac{1}{\alpha}$. In order to convert this into a feasible solution to (LP-MSSC), we first scale up all of the variables by a factor of $\frac{\alpha}{\alpha-1}$. Specifically, set $\hat{x}_{it}=\frac{\alpha}{\alpha-1}x_{it}$; $\hat{z}_{ist}=\frac{\alpha}{\alpha-1}z_{ist}$ for all $i\in L_s$; and $\hat{z}_{ist}=0$ for all $i\notin L_s$. Then, $\sum_{i\in L_s,\,t}\hat{z}_{ist}\ge 1$ for all $s$. However, this may violate constraints (1)–(3).
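The Markov step can be spelled out as follows; it uses only the coverage constraint $\sum_{i,t}z_{ist}\ge 1$ of (LP-SPA), the definition of $L_s$, and the definition of $\mathrm{OPT}_{c,s}$:

```latex
\sum_{i\notin L_s,\,t} z_{ist}
  \;\le\; \sum_{i\notin L_s,\,t} \frac{c_{is}}{\alpha\,\mathrm{OPT}_{c,s}}\, z_{ist}
  \;\le\; \frac{1}{\alpha\,\mathrm{OPT}_{c,s}} \sum_{i,t} c_{is}\, z_{ist}
  \;=\; \frac{1}{\alpha},
\qquad\text{hence}\qquad
\sum_{i\in L_s,\,t} z_{ist} \;\ge\; 1-\frac{1}{\alpha}.
```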

To fix (2), if for some $i$ we have $\sum_{t}\hat{x}_{it}>1$, let $t_i$ be the smallest time at which $\sum_{t'\le t_i}\hat{x}_{it'}\ge 1$. We set $\hat{x}_{it}=0$ for all $t>t_i$ and reduce $\hat{x}_{it_i}$ so that $\sum_{t}\hat{x}_{it}=1$. Likewise, modify $\hat{z}$ so as to achieve (3) as well as ensure that every variable lies in $[0,1]$.

It remains to argue that constraints (1) can be fixed at a small extra cost. Observe that for any $t$, $\sum_{i}\hat{x}_{it}\le\frac{\alpha}{\alpha-1}$. We therefore “dilate” time by a factor of $\frac{\alpha}{\alpha-1}$ in order to accommodate the higher load. Formally, interpret $\hat{x}_{it}$ and $\hat{z}_{ist}$ as continuous step functions of $t$. Then the objective function of (LP-MSSC) can be written as $\frac{1}{|S|}\sum_{i,s}\int_{0}^{n}t\,\hat{z}_{ist}\,dt$. Dilating time by a factor of $\frac{\alpha}{\alpha-1}$ gives us the objective $\frac{1}{|S|}\sum_{i,s}\int_{0}^{n}\left\lceil t\,\frac{\alpha}{\alpha-1}\right\rceil\hat{z}_{ist}\,dt$.

Since $\hat{z}_{ist}\le\frac{\alpha}{\alpha-1}z_{ist}$ for any $i,s,t$, the expected query time is upper bounded by

$$\frac{1}{|S|}\sum_{i,s}\int_{0}^{n}\left\lceil t\,\frac{\alpha}{\alpha-1}\right\rceil\frac{\alpha}{\alpha-1}z_{ist}\,dt\;\le\;\frac{1}{|S|}\sum_{i,s,t}\left(\frac{\alpha}{\alpha-1}\right)^{2}t\,z_{ist}\;=\;\left(\frac{\alpha}{\alpha-1}\right)^{2}\cdot\text{ Query time of }(x,z)$$

where for the inequality we used the following Lemma 5.3 with $\beta=\frac{\alpha}{\alpha-1}$. The proof of the lemma is deferred to the appendix.

###### Lemma 5.3.

For any $\beta\ge 2$ and integer $t\ge 1$,

$$\int_{t-1}^{t}\lceil\beta t'\rceil\,dt'\;\le\;\beta t.$$
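For intuition, here is a one-line derivation of the lemma under the assumption $\beta\ge 2$ (the appendix proof handles the exact parameter range): since $\lceil y\rceil\le y+1$,

```latex
\int_{t-1}^{t}\lceil \beta t'\rceil\,dt'
  \;\le\; \int_{t-1}^{t}\left(\beta t'+1\right)dt'
  \;=\; \beta\left(t-\tfrac{1}{2}\right)+1
  \;=\; \beta t-\left(\tfrac{\beta}{2}-1\right)
  \;\le\; \beta t .
```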

#### Applying the greedy algorithm for min-sum set cover.

We have so far constructed a new instance $I'$ of MSSC along with a feasible fractional solution with cover time at most $\left(\frac{\alpha}{\alpha-1}\right)^{2}$ times the query time for $(x,z)$. The greedy algorithm of [FLT02] finds a probing order $\pi$ over the boxes with cover time at most $4$ times the cover time of this fractional solution, that is, at most $4\left(\frac{\alpha}{\alpha-1}\right)^{2}$ times the query time for $(x,z)$. Property (1) therefore holds and the lemma follows. ∎

## 6 Extension to other feasibility constraints

In this section we extend the problem to settings where there is a feasibility constraint $\mathcal{F}$ that limits which or how many boxes we can choose. We consider the case where we are required to select $k$ distinct boxes, and the case where the chosen boxes must be independent in a matroid. In both cases we design SPA strategies that can be converted to PA. These two variants are described in more detail in Subsections 6.1 and 6.2 that follow.

### 6.1 Selecting k items

In this section the constraint $\mathcal{F}$ requires that we pick $k$ boxes to minimize the total cost plus query time. As in Section 5, we aim to compete against the optimal partially-adaptive strategy. We design a PA strategy which achieves a constant competitive ratio. When the box costs are all $0$ or $\infty$, the problem is the generalized min-sum set cover problem first introduced in [AGY09]. [AGY09] gave a logarithmic approximation, which was then improved to a constant in [BGK10] via an LP-rounding based algorithm. Our proof will follow the latter proof in spirit, and generalize to the case where boxes have arbitrary values. Our main result is the following.

###### Lemma 6.1.

There exists a scenario-aware partially-adaptive algorithm that is $123.25$-competitive against the optimal partially-adaptive algorithm for picking $k$ boxes.

Combining this with Theorem 3.4 we get the following theorem.

###### Theorem 6.2.

We can efficiently find a partially-adaptive strategy for optimal search with $k$ options that is constant-competitive against the optimal partially-adaptive strategy.

###### Proof of Lemma 6.1.

The LP formulation we use for this problem is a variant of (LP-SPA) from Section 2, with the following changes: we introduce a variable $y_{st}$, which denotes the extent to which scenario $s$ is covered by time $t$, and constraint (4) is replaced by constraint (7). The program (LP-$k$-cover) is presented below. Denote by $\mathrm{OPT}_{t,s}$ and $\mathrm{OPT}_{c,s}$ the contribution of the query time and the cost of scenario $s$ to the optimal fractional solution; $\mathrm{ALG}_{t,s}$ and $\mathrm{ALG}_{c,s}$ denote the corresponding quantities for the algorithm.

$$\begin{aligned}\text{minimize}\quad & \frac{1}{|S|}\sum_{s\in S,\,t\in[n]}(1-y_{st})\;+\;\frac{1}{|S|}\sum_{i\in B,\,s\in S,\,t\in[n]}c_{is}z_{ist} && \text{(LP-}k\text{-cover)}\\ \text{subject to}\quad & \sum_{i\in B}x_{it}\ =\ 1, && \forall t\in[n]\\ & \sum_{t\in[n]}x_{it}\ \le\ 1, && \forall i\in B\\ & z_{ist}\ \le\ x_{it}, && \forall s\in S,\,i\in B,\,t\in[n]\\ & \sum_{t'\le t,\;i\notin A}z_{ist'}\ \ge\ (k-|A|)\,y_{st}, && \forall A\subseteq B,\,s\in S,\,t\in[n]\qquad (7)\\ & x_{it},\,z_{ist},\,y_{st}\in[0,1] && \forall s\in S,\,i\in B,\,t\in[n]\end{aligned}$$

The LP formulation we use is exponential in size but we can efficiently find a separation oracle, as observed in Section 3.1 in [BGK10].

We claim that Algorithm 2 satisfies the lemma. The algorithm first finds an opening sequence by probing boxes with some probability at every step, and then selects every opened box with some probability until $k$ boxes are selected. Note that the number of boxes probed at each “step” may be more than one. In the algorithm, we set the constant $c$ appropriately.
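To make the phase structure concrete, here is a simplified sketch (not Algorithm 2 itself): in phase $\ell$ the first $2^{\ell}$ LP time steps are scanned, and box $i$ is probed with probability $\min(1,\,c\,x_{it})$. The function name and input layout are ours, and the per-scenario stopping rule is omitted.

```python
import random

def phase_probe(x, c, num_phases):
    """Sketch of the phase structure of the rounding algorithm.

    x[i][t] is the LP opening variable for box i at time t; in phase l we
    scan the first 2**l time steps and probe box i with probability
    min(1, c * x[i][t]).  Returns the order in which boxes were probed.
    """
    order = []
    probed = set()
    for l in range(num_phases):
        for t in range(2 ** l):
            for i in range(len(x)):
                if i in probed:
                    continue  # each box is opened at most once
                if t < len(x[i]) and random.random() < min(1.0, c * x[i][t]):
                    probed.add(i)
                    order.append(i)
    return order
```

Because $\sum_i x_{it}=1$ for every $t$, the expected number of boxes probed in phase $\ell$ is at most $2^{\ell}c$, which is exactly the bound used in the query-time calculation below.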

Let $t^{*}_{s}$ be the latest time at which $y_{st}\le\frac{1}{2}$, as in the description of the algorithm. As observed in [BGK10], for scenario $s$ we pay at least $\frac{1}{2}$ for each time $t\le t^{*}_{s}$, thus

$$\mathrm{OPT}_{t,s}\;\ge\;\frac{t^{*}_{s}}{2}.\tag{8}$$

Fix a scenario $s$. We first analyze the expected probing time of the algorithm for this scenario. Denote by $\ell_0$ the first phase during which we have a non-zero probability of selecting a box for scenario $s$. Notice that for each box $i$, the probability that it is selected in phase $\ell$ is proportional to its fractional mass in the LP solution up to time $2^{\ell}$. The following lemma from [BGK10] bounds the probability that in each phase $\ell$ with $\ell\ge\ell_0$, at least $k$ boxes are selected.

###### Lemma 6.3 (Lemma 5.1 in [Bgk10]).

If each box is selected with probability at least its LP mass scaled up by $c$, for a sufficiently large constant $c$, then with probability at least $1-\gamma$, at least $k$ different boxes are selected.

Let $\gamma$ be the constant from Lemma 6.3. Observe that the number of boxes probed in a phase is independent of the event that the algorithm reaches that phase prior to covering scenario $s$. Therefore we get:

$$\begin{aligned}\mathbb{E}[\text{query time following phase }\ell_0]&=\sum_{\ell=\ell_0}^{\infty}\mathbb{E}[\#\text{boxes probed in phase }\ell]\cdot\Pr[\text{ALG reaches phase }\ell]\\&\le\sum_{\ell=\ell_0}^{\infty}\sum_{i\in B}c\sum_{t\le 2^{\ell}}x_{it}\cdot\prod_{j=\ell_0}^{\ell-1}\Pr[\le k\text{ boxes selected in phase }j]\\&\le\sum_{\ell=\ell_0}^{\infty}\sum_{i\in B}c\sum_{t\le 2^{\ell}}x_{it}\cdot\gamma^{\ell-\ell_0}\\&\le\sum_{\ell=\ell_0}^{\infty}2^{\ell}c\cdot\gamma^{\ell-\ell_0}\;=\;\frac{2^{\ell_0}c}{1-2\gamma}\;<\;\frac{2t^{*}_{s}c}{1-2\gamma}\;\le\;\frac{4c\,\mathrm{OPT}_{t,s}}{1-2\gamma}\end{aligned}$$

Here the second line follows by noting that the algorithm can reach phase $\ell$ only if in each previous phase fewer than $k$ boxes were selected. The third line is by Lemma 6.3. The fourth line uses $\sum_{i\in B}x_{it}=1$, and the last line is by $2^{\ell_0}<2t^{*}_{s}$ and inequality (8). Moreover, the expected query time before phase $\ell_0$ is at most $2^{\ell_0}c\le 4c\,\mathrm{OPT}_{t,s}$. Therefore the total query time of the algorithm for scenario $s$ is

$$\mathrm{ALG}_{t,s}\;\le\;4c\,\mathrm{OPT}_{t,s}+\frac{4c\,\mathrm{OPT}_{t,s}}{1-2\gamma}\;<\;123.25\,\mathrm{OPT}_{t,s}.$$

To bound the cost of our algorithm, we bound the expected total cost of any phase $\ell$, conditioned on selecting at least $k$ distinct boxes in this phase.

$$\begin{aligned}\mathbb{E}[\text{cost in phase }\ell\mid\text{at least }k\text{ boxes are selected in phase }\ell]&\le\frac{\mathbb{E}[\text{cost in phase }\ell]}{\Pr[\text{at least }k\text{ boxes are selected in phase }\ell]}\\&\le\frac{1}{1-\gamma}\,\mathbb{E}[\text{cost in phase }\ell]\\&\le\frac{1}{1-\gamma}\sum_{i\in B}c\sum_{t\le 2^{\ell}}z_{ist}c_{is}\;\le\;\frac{c}{1-\gamma}\,\mathrm{OPT}_{c,s}\;<\;11.85\,\mathrm{OPT}_{c,s}.\end{aligned}$$

Here the second inequality is by Lemma 6.3 and the last step is by the definition of $\mathrm{OPT}_{c,s}$. Notice that the upper bound does not depend on the phase $\ell$, so the same upper bound holds for $\mathrm{ALG}_{c,s}$. Thus the total cost contributed by scenario $s$ in our algorithm is

$$\mathrm{ALG}_{s}=\mathrm{ALG}_{t,s}+\mathrm{ALG}_{c,s}\;<\;123.25\,\mathrm{OPT}_{t,s}+11.85\,\mathrm{OPT}_{c,s}\;\le\;123.25\,\mathrm{OPT}_{s}.$$

Taking the expectation over all scenarios $s$, we conclude that the scenario-aware strategy gives a constant competitive ratio against the optimal partially-adaptive strategy. ∎

### 6.2 Picking a matroid basis of rank k

In this section the constraint $\mathcal{F}$ requires us to select a basis of a given matroid. More specifically, assuming that the boxes have an underlying matroid structure, we seek a basis of rank $k$ with minimum cost. We first design a scenario-aware partially-adaptive strategy in Lemma 6.4 that is $O(\log k)$-competitive against the optimal partially-adaptive strategy. Then, in Theorem 6.7 we argue that this competitive ratio is asymptotically tight.

###### Lemma 6.4.

There exists a scenario-aware partially-adaptive $O(\log k)$-approximate algorithm to the optimal partially-adaptive algorithm for picking a matroid basis of rank $k$.

Combining this lemma with Theorem 3.4 we get the following theorem.

###### Theorem 6.5.

We can efficiently find a partially-adaptive strategy for optimal search over a matroid of rank $k$ that is $O(\log k)$-competitive against the optimal partially-adaptive strategy.

The LP formulation is similar to the one for the $k$-coverage constraint presented in the previous section. For any set $A\subseteq B$, let $r(A)$ denote the rank of this set. The constraints are the same except for constraints (9) and (10), which ensure, respectively, that we select no more boxes from a set than its rank, and that the elements that remain unselected are adequate for us to cover the remaining rank.

$$\begin{aligned}\text{minimize}\quad & \frac{1}{|S|}\sum_{s\in S,\,t\in[n]}(1-y_{st})+\frac{1}{|S|}\sum_{i\in B,\,s\in S,\,t\in[n]}v_{si}z_{ist} && \text{(LP-matroid)}\\ \text{subject to}\quad & \sum_{i\in B}x_{it}\ =\ 1, && \forall t\in[n]\\ & \sum_{t\in[n]}x_{it}\ \le\ 1, && \forall i\in B\\ & \sum_{t\in[n],\,i\in A}z_{ist}\ \le\ r(A), && \forall s\in S,\,A\subseteq B\qquad (9)\\ & z_{ist}\ \le\ x_{it}, && \forall s\in S,\,i\in B,\,t\in[n]\\ & \sum_{i\notin A}\sum_{t'\le t}z_{ist'}\ \ge\ \big(r([n])-r(A)\big)\,y_{st}, && \forall A\subseteq B,\,s\in S,\,t\in[n]\qquad (10)\\ & x_{it},\,z_{ist},\,y_{st}\in[0,1] && \forall s\in S,\,i\in B,\,t\in[n]\end{aligned}$$

#### Solving the LP efficiently

The LP formulation we use is exponential in size but we can efficiently find a separation oracle. Every set of constraints can be verified in polynomial time except for constraints (10). Rewriting these last constraints we get

$$\sum_{i}\sum_{t'\le t}z_{ist'}-\sum_{i\in A}\sum_{t'\le t}z_{ist'}\;\ge\;r([n])-r(A),\qquad\forall A\subseteq B,\,t\in[n].$$

Then the problem is equivalent to minimizing the function $f(A)=r(A)-\sum_{i\in A}\sum_{t'\le t}z_{ist'}$ over all subsets of items $A\subseteq B$. The function $f$ is submodular since the rank function is submodular, therefore we can minimize it in polynomial time [GLS81]. We now proceed to the proof of Lemma 6.4.
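For intuition, the following sketch checks constraints (10) by brute force on a small graphic matroid (edges of a graph, rank = size of a spanning forest), for a fixed scenario $s$ and time $t$ with $y_{st}=1$. It minimizes $f$ by enumeration, which is exponential; the point of [GLS81] is that submodularity makes this minimization polynomial. All names and the input layout are ours.

```python
from itertools import combinations

def graphic_rank(edges, subset):
    """Rank of an edge subset in the graphic matroid: the number of edges
    in a spanning forest of that subset, found via union-find."""
    parent = {}
    def find(v):
        while parent.setdefault(v, v) != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    rank = 0
    for idx in subset:
        u, w = edges[idx]
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw
            rank += 1
    return rank

def violated_rank_constraint(edges, z_prefix):
    """Brute-force separation for constraints (10) with y_st = 1:
    find A minimizing sum_{i not in A} z_prefix[i] - (r(ground) - r(A)).
    z_prefix[i] stands for sum_{t' <= t} z_{i s t'} for the fixed s, t.
    Returns a violated set A, or None if all constraints hold."""
    ground = list(range(len(edges)))
    r_full = graphic_rank(edges, ground)
    best_A, best_val = None, 0.0
    for size in range(len(ground) + 1):
        for A in combinations(ground, size):
            val = sum(z_prefix[i] for i in ground if i not in A) \
                  - (r_full - graphic_rank(edges, A))
            if val < best_val:
                best_A, best_val = set(A), val
    return best_A
```

For a path graph on edges $(0,1),(1,2)$ and zero fractional mass, the empty set $A=\emptyset$ already witnesses a violation, since the full rank $2$ cannot be covered.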

###### Proof of Lemma 6.4.

We claim that Algorithm 3 satisfies the lemma. The algorithm first finds an opening sequence by probing boxes with some probability at every step, and then, knowing the scenario, selects every opened box with some probability until a basis of rank $k$ is found. In the algorithm we set the constant $c$ appropriately.

In scenario $s$, let phase $\ell$ consist of the time steps $t\in(2^{\ell-1}t^{*}_{s},\,2^{\ell}t^{*}_{s}]$. The proof has an almost identical flow to the proof for the $k$-coverage case. We again divide the time after $t^{*}_{s}$ into exponentially increasing phases, and in each phase we prove that our success probability is a constant. The following lemma gives an upper bound on the query time needed in each phase to get a full-rank base of the matroid. The proof is deferred to the appendix.

###### Lemma 6.6.

In phase $\ell$, the expected query time needed to select a set of full rank is at most $\frac{2^{\ell+1}t^{*}_{s}}{c}$.

Define $X$ to be the random variable indicating the number of steps needed to build a full-rank subset. The probability that we build a full-rank basis within some phase $\ell$ is

$$\Pr\left[X\le 2^{\ell-1}t^{*}_{s}\right]\;\ge\;1-\frac{\mathbb{E}[X]}{2^{\ell-1}t^{*}_{s}}\;=\;1-\frac{2^{\ell+1}t^{*}_{s}}{c\cdot 2^{\ell-1}t^{*}_{s}}\;=\;1-\frac{4}{c},\tag{11}$$

where we used Markov’s inequality for the inequality and Lemma 6.6 for the first equality. To calculate the total query time, we sum up the contribution of all phases.

$$\begin{aligned}\mathbb{E}[\text{query time following phase }1]&=\sum_{\ell=1}^{\infty}\mathbb{E}[\text{query time at phase }\ell]\cdot\Pr[\text{ALG reaches phase }\ell]\\&\le\sum_{\ell=1}^{\infty}\;\sum_{t=2^{\ell-1}t^{*}_{s}+1}^{2^{\ell}t^{*}_{s}}c\ln k\cdot\left(\frac{4}{c}\right)^{\ell-1}\\&=\sum_{\ell=1}^{\infty}2^{\ell-1}t^{*}_{s}\,c\ln k\cdot\left(\frac{4}{c}\right)^{\ell-1}\\&=\frac{c^{2}\ln k\;t^{*}_{s}}{c-8}\;\le\;\frac{2c^{2}\ln k\,\mathrm{OPT}_{t,s}}{c-8}.\end{aligned}$$

Here the second line uses that the expected number of boxes probed at each time step is at most $c\ln k$. The last line uses inequality (8).