# Stochastic Submodular Cover with Limited Adaptivity

In the submodular cover problem, we are given a non-negative monotone submodular function f over a ground set E of items, and the goal is to choose a smallest subset S ⊆ E such that f(S) = Q where Q = f(E). In the stochastic version of the problem, we are given m stochastic items which are different random variables that independently realize to some element of E, and the goal is to find a smallest set of stochastic items whose realization R satisfies f(R) = Q. The problem captures as a special case the stochastic set cover problem and, more generally, stochastic covering integer programs. We define an r-round adaptive algorithm to be an algorithm that chooses a permutation of all available items in each round k ∈ [r], together with a threshold τ_k, and realizes items in the order specified by the permutation until the function value is at least τ_k. The permutation for each round k is chosen adaptively based on the realizations in the previous rounds, but the ordering inside each round remains fixed regardless of the realizations seen inside the round. Our main result is that for any integer r, there exists a poly-time r-round adaptive algorithm for stochastic submodular cover whose expected cost is Õ(Q^(1/r)) times the expected cost of a fully adaptive algorithm. Prior to our work, such a result was not known even for the case of r = 1 and when f is the coverage function. On the other hand, we show that for any r, there exist instances of the stochastic submodular cover problem where no r-round adaptive algorithm can achieve better than an Ω(Q^(1/r)) approximation to the expected cost of a fully adaptive algorithm. Our lower bound result holds even for the coverage function and for algorithms with unbounded computational power.


## 1 Introduction

Submodular functions naturally arise in many application domains including algorithmic game theory, machine learning, and social choice theory, and have been extensively studied in combinatorial optimization. Many computational problems can be modeled as the submodular cover problem: we are given a non-negative monotone submodular function f over a ground set E, and the goal is to choose a smallest subset S ⊆ E such that f(S) = Q where Q = f(E). A well-studied special case is the set cover problem where the function f is the coverage function and the items correspond to subsets of an underlying universe. Even this special case is known to be NP-hard to approximate to a factor better than (1 − o(1)) · ln n [22, 25, 35, 36], and on the other hand, the classic paper of Wolsey [44] shows that the problem admits a poly-time O(log Q)-approximation for any integer-valued monotone submodular function.
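To make the greedy approach behind Wolsey's result concrete, here is a minimal sketch for the special case of a coverage function; the instance and function names are illustrative, not taken from the paper.

```python
# A minimal sketch of the classical greedy algorithm for (non-stochastic)
# submodular cover, instantiated with a coverage function. For integer-valued
# monotone submodular f, this greedy rule gives a logarithmic approximation.

def greedy_submodular_cover(items, f):
    """Pick items greedily by marginal gain until f(S) reaches Q = f(E)."""
    Q = f(items)                     # target value Q = f(E)
    S = []
    while f(S) < Q:
        # choose the item with the largest marginal contribution f_S(e)
        best = max(items, key=lambda e: f(S + [e]) - f(S))
        S.append(best)
    return S

# Coverage function: each "item" is a subset of an underlying universe.
def coverage(sets_chosen):
    return len(set().union(*sets_chosen)) if sets_chosen else 0

items = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
S = greedy_submodular_cover(items, coverage)
```

On this toy instance the greedy rule picks {1, 2, 3} and then {4, 5, 6}, covering the whole universe with two sets.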

In this work we consider the stochastic version of the problem that naturally arises when there is uncertainty about items. For instance, in stochastic influence spread in networks, the set of nodes that can be influenced by any particular node is a random variable whose value depends on the realized state of the influencing node (e.g. being successfully activated). In sensor placement problems, each sensor can fail partially or entirely with certain probability, and the coverage of a sensor depends on whether the sensor failed or not. In data acquisition for machine learning (ML) tasks, each data point is a priori a random variable that can take different values, and one may wish to build a dataset representing a diverse set of values. For example, if one wants to build an ML model for identifying a new disease from gene patterns, one would start by building a database of gene patterns associated with that disease. In this case, each person's gene pattern is a random variable that can realize to different values depending on race, gender, etc. For other examples, we refer the reader to [34] (application in databases) and [2] (application in document retrieval).

In the stochastic submodular cover problem, we are given m stochastic items which are different random variables that independently realize to an element of E, and the goal is to find a lowest cost set of stochastic items whose realization R satisfies f(R) = Q. In network influence spread problems each item corresponds to a node in the network, and its realization corresponds to the set of nodes it can influence. In sensor placement problems an item corresponds to a sensor and its realization corresponds to the area that it covers upon being deployed. In the case of data acquisition, an item corresponds to a data point and its realization corresponds to the value it takes upon being queried. The problem captures as a special case the stochastic set cover problem and, more generally, stochastic covering integer programs.

Motivated by the striking separation between the power of adaptive and non-adaptive algorithms in this setting, we consider the following question in this work: does one need the full power of adaptivity to obtain a near-optimal solution to stochastic submodular cover? In particular, how do the performance guarantees change when an algorithm interpolates between these two extremes using a few rounds of adaptivity?

Towards this end, we define an r-round adaptive algorithm to be an algorithm that chooses a permutation of all available items in each round k ∈ [r], together with a threshold τ_k, and realizes items in the order specified by the permutation until the function value is at least τ_k. A non-adaptive algorithm then corresponds to the case r = 1 (with τ_1 = Q), and a fully adaptive algorithm corresponds to the case r = m (where the thresholds are irrelevant and can be taken to be zero). The permutation for each round k is chosen adaptively based on the realizations in the previous rounds, but the ordering inside each round remains fixed regardless of the realizations seen inside the round. We will call this the "permutation framework" for an r-round algorithm.

Our main result is that for any integer r, there exists a poly-time r-round adaptive algorithm for stochastic submodular cover whose expected cost is Õ(Q^(1/r)) times the expected cost of a fully adaptive algorithm, where the Õ notation hides a logarithmic dependence on the number of items and the maximum cost of any item. Prior to our work, such a result was not known even for the case of r = 1 and when f is the coverage function. Indeed, achieving such a result was cast as an open problem by Goemans and Vondrak [26], who gave a polynomial bound (corresponding to r = 1) on the adaptivity gap of stochastic set cover. Furthermore, we show that for any r, there exist instances of the stochastic submodular cover problem where no r-round adaptive algorithm can achieve better than an Ω(Q^(1/r)) approximation to the expected cost of a fully adaptive algorithm. Our lower bound result holds even for the coverage function and for algorithms with unbounded computational power. Thus our work shows that logarithmic rounds of adaptivity are necessary and sufficient to obtain near-optimal solutions to the stochastic submodular cover problem, and even a few rounds of adaptivity are sufficient to sharply reduce the adaptivity gap.

###### Remark 1.1.

One may consider an alternate notion of r-round adaptive algorithm: In each round k ∈ [r], the algorithm chooses a fixed set of items to realize in parallel, where the choice of the set depends on the realizations in the previous rounds (instead of a permutation over items). Let us call this framework the "set framework". One benefit of this variation is that items in each round can be realized in parallel. Unfortunately, in this framework, any algorithm that always outputs a valid cover (as is our requirement) must in general include all remaining items in the last round, because for any proper subset of the remaining items there would be a positive probability that this subset is not able to achieve the full coverage. Hence, the r-round adaptivity gap in this framework would be unbounded.

Hence, one would have to consider a relaxed version of the problem and require that the algorithm achieves the desired coverage guarantee only with probability 1 − δ. Our algorithmic results directly carry over to this variant of the problem. In particular, for any fixed δ > 0, we obtain a poly-time r-round adaptive algorithm in the set framework whose cost is Õ(Q^(1/r)) times the expected cost of a fully adaptive algorithm, and that succeeds with probability at least 1 − δ. At the same time, our lower bound of Ω(Q^(1/r)) continues to hold in this relaxed setting. In the following we will provide results only for the permutation framework, with the understanding that all our results carry over to the set framework with the relaxed version of the problem.

### 1.1 Problem Statement

Let X = {X_1, …, X_m} be a collection of independent random variables, each supported on the same ground set E, and let f : 2^E → ℕ be an integer-valued111We present our results for integer-valued functions for simplicity of exposition. All our results can easily be generalized to positive real-valued functions. non-negative monotone submodular function. We will refer to the random variables X_i as items and to any set S ⊆ X as a set of items. For any i ∈ [m], we use x_i to refer to a realization of item (random variable) X_i and define x = (x_1, …, x_m) as the realization of X. We slightly abuse notation222Note that here f is being extended to a function over subsets of X, but we chose to still refer to it as f. and extend f to the ground set of items X such that for any set S ⊆ X, f(S) := f({x_i : X_i ∈ S}); this definition means that for any realization x of X, the value of f(S) is determined by the realized elements of the items in S. Finally, there is an integer-valued cost c_i associated with each item X_i.

Let Q = f(E). For any set of items S ⊆ X, we say that a realization of S is feasible iff its f-value equals Q. We will assume that any realization of X is always feasible, i.e., f(X) = Q with probability one333One can ensure this by adding an item to the ground set that attains value Q under all its realizations, but whose cost is higher than the combined cost of all other items.. We will say that a realization of X is covered by a realization of S ⊆ X iff the latter realization is feasible. The goal in the stochastic submodular cover problem is to find a set S of items with the minimum cost which gets realized to a feasible set. In order to do so, if we include any item X_i in S we pay a cost c_i, and once included, X_i is realized to some x_i which is fixed from then on. Once a decision is made regarding the inclusion of an item in S, this item cannot be removed from S.

For any set of items S ⊆ X, we define c(S) to be the total cost of all items in S, i.e., c(S) := ∑_{i ∈ [m]} c_i · 1[X_i ∈ S], where 1[·] is an indicator function. For any algorithm A, we refer to the total cost of the solution returned by A on an instantiation x of X as the cost of A on x, denoted by cost(A, x). We are interested in minimizing the expected cost of the algorithm A, i.e., E_x[cost(A, x)].


###### Example 1.1 (Stochastic Set Cover).

A canonical example of the stochastic submodular cover problem is the stochastic set cover problem. Let U be a universe of n "elements" (not to be mistaken with "items") and X = {X_1, …, X_m} be a collection of random variables where each random variable X_i is supported on subsets of U, i.e., X_i realizes to some subset of U. We refer to each random variable X_i as a stochastic set. In the stochastic set cover problem, the goal is to pick a smallest (or minimum weight) collection S of items (or equivalently sets) in X such that the realized sets in this collection cover the universe U.
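A toy simulation of this example may help fix the model; the instance below is an arbitrary illustration (each stochastic set realizes uniformly over its listed outcomes, and the instance is padded so that realizing all items always covers the universe, matching the feasibility assumption above).

```python
import random

# A toy instance of stochastic set cover: items are random variables over
# subsets of the universe; a non-adaptive run realizes them in a fixed
# order until the realized sets cover U.

universe = {1, 2, 3, 4}
items = [
    [{1, 2}, {3, 4}],   # X1: two equally likely realizations
    [{1, 3}],           # X2: deterministic
    [{2, 4}],           # X3: deterministic
]

def run_non_adaptive(order, rng):
    """Realize items in a fixed order until the realized sets cover U."""
    covered, cost = set(), 0
    for item in order:
        covered |= rng.choice(item)   # pay for the item, observe its set
        cost += 1
        if covered == universe:       # reached Q = |universe|
            break
    return cost, covered

rng = random.Random(0)
cost, covered = run_non_adaptive(items, rng)
```

On this instance every realization forces the non-adaptive run to pay for all three items before the universe is covered.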

We consider the following types of algorithms (sometimes referred to as policies in the literature) for the stochastic submodular cover problem:


• Non-adaptive: A non-adaptive algorithm simply picks a fixed ordering of items in X and inserts the items one by one into S until the realization of S becomes feasible.

• Adaptive: An adaptive algorithm, on the other hand, picks the next item to be included in S adaptively based on the realizations of previously chosen items. In other words, the choice of each item to be included in S is a function of the realizations of the items already in S.

• r-round adaptive: We define r-round adaptive algorithms as an "interpolation" between the above two extremes. For any integer r ≥ 1, an r-round adaptive algorithm chooses the items to be included in S in r rounds of adaptivity: In each round k ∈ [r], the algorithm chooses a threshold τ_k and an ordering over items, and then inserts the items one by one according to this ordering into S until the value of f on the realized set reaches τ_k. Once this round finishes, the algorithm decides on an ordering over the remaining items adaptively based on the current realization.

In the above definitions, a non-adaptive algorithm corresponds to the case of a 1-round adaptive algorithm (with τ_1 = Q), and a (fully) adaptive algorithm corresponds to the case of r = m (here the thresholds are irrelevant and can be thought of as being zero).
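As a rough illustration of the permutation framework, the following sketch runs r rounds with per-round thresholds; `plan_round` is a hypothetical stand-in for whatever adaptive rule orders the remaining items between rounds, and a coverage function plays the role of f. This is a schematic, not the paper's algorithm.

```python
import random

def r_round_adaptive(items, f, Q, thresholds, plan_round, rng):
    """Permutation framework: per round, fix a threshold and an ordering
    (chosen adaptively from past realizations), then realize items in that
    order until the function value reaches the round's threshold."""
    realizations = []                 # realized sets observed so far
    remaining = list(range(len(items)))
    for tau in thresholds:            # tau_1, ..., tau_r (with tau_r = Q)
        order = plan_round(remaining, realizations)
        for idx in order:             # ordering is fixed inside the round
            realizations.append(rng.choice(items[idx]))
            remaining.remove(idx)
            if f(realizations) >= tau:
                break
        if f(realizations) >= Q:
            break
    return realizations

def f(sets_realized):                 # coverage function as f
    return len(set().union(*sets_realized)) if sets_realized else 0

items = [[{1, 2}, {2, 3}], [{3, 4}], [{1, 4}], [{2}]]
rng = random.Random(0)
# r = 2 rounds: cover half the universe first, then everything (Q = 4)
out = r_round_adaptive(items, f, 4, [2, 4],
                       lambda rem, real: list(rem), rng)
```

Setting `thresholds = [Q]` with a single fixed ordering recovers the non-adaptive case.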

We use OPT to refer to the optimal adaptive algorithm for the stochastic submodular cover problem, i.e., an adaptive algorithm with minimum expected cost. We use the expected cost of OPT as the main benchmark against which we compare the cost of other algorithms. In particular, we define the adaptivity gap as the ratio between the expected cost of the best non-adaptive algorithm for the submodular cover problem and the expected cost of OPT. Similarly, for any integer r ≥ 1, we define the r-round adaptivity gap for r-round adaptive algorithms in analogy with the above definition.

###### Remark 1.2.

The notion of the "best" non-adaptive or r-round adaptive algorithm defined above allows unbounded computational power to the algorithm. Hence, the only limiting factor of the algorithm is the information-theoretic barrier caused by the uncertainty about the underlying realization.

### 1.2 Our Contributions

In this paper, we establish tight bounds (up to logarithmic factors) on the r-round adaptivity gap of the stochastic submodular cover problem for any integer r. Our main result is an r-round adaptive algorithm (for any integer r ≥ 1) for the stochastic submodular cover problem.


###### Result 1 (Main Result).

For any integer r ≥ 1 and any monotone submodular function f with Q = f(E), there exists an r-round adaptive algorithm for the stochastic submodular cover problem for function f and a set of m items, with the cost of each item bounded by C, that incurs expected cost Õ(Q^(1/r)) times the expected cost of the optimal adaptive algorithm (the Õ notation hides logarithmic factors in m and C).

A corollary of Result 1 is that the r-round adaptivity gap of the submodular cover problem is Õ(Q^(1/r)). This implies that using only O(log Q) rounds of adaptivity, one can reduce the cost of the algorithm to within a poly-logarithmic factor of the optimal adaptive algorithm. In other words, one can "harness" the (essentially) full power of adaptivity in only a logarithmic number of rounds.

Various stochastic covering problems can be cast as the submodular cover problem, including the stochastic set cover problem and the stochastic covering integer programs studied previously in the literature [26, 27, 21]. As such, Result 1 directly extends to these problems as well. In particular, as a (very) special case of Result 1, we obtain that the adaptivity gap of the stochastic set cover problem is Õ(n) (here n is the size of the universe), improving upon the bound of Goemans and Vondrak [26] and settling an open question in their work regarding the adaptivity gap of this problem (a lower bound was already shown in [26]).

We further prove that the r-round adaptivity gap in Result 1 is almost tight for any r.


###### Result 2.

For any integer r ≥ 1, there exists a monotone submodular function f, in particular a coverage function, with Q = f(E), such that the expected cost of any r-round adaptive algorithm for the submodular cover problem for function f, i.e., the stochastic set cover problem, is Ω(Q^(1/r)) times the expected cost of the optimal adaptive algorithm.

Result 2 implies that the r-round adaptivity gap of the submodular cover problem is Ω(Q^(1/r)), i.e., within a poly-logarithmic factor of the upper bound in Result 1. An immediate corollary of this result is that Ω(log Q / log log Q) rounds of adaptivity are necessary for reducing the cost of the algorithms to within logarithmic factors of the optimal adaptive algorithm. We further point out that, interestingly, the optimal adaptive algorithm in the instances created in Result 2 only requires r + 1 rounds; as such, Result 2 is in fact proving a lower bound on the gap between the cost of r-round and (r+1)-round adaptive algorithms.

We remark that our algorithm in Result 1 runs in polynomial time (for polynomially-bounded item costs), while the lower bound in Result 2 holds against algorithms with unbounded computational power (see Remark 1.2).

### 1.3 Related Work

The problem of submodular cover was perhaps first studied by [44], who showed that a greedy algorithm achieves a logarithmic approximation ratio. Subsequent to this there has been a lot of work on this problem in various settings [27, 8, 9, 32, 21, 29, 33]. To our knowledge, the question of adaptivity in stochastic covering problems was first studied in [26] for the special case of stochastic set cover and covering integer programs, where polynomial (in n, the size of the universe to be covered) upper and lower bounds on the adaptivity gap were shown, along with a non-adaptive algorithm achieving the upper bound.

Subsequently there has been a lot of work on stochastic set cover and the more general stochastic submodular cover problem in the fully adaptive setting. A special case of stochastic set cover was studied by [34] in the adaptive setting, where an adaptive greedy algorithm was analyzed.444The paper originally claimed an approximation guarantee for this algorithm; however, the claim was later retracted by the authors due to an error in the original analysis [38]. In [27] the notion of "adaptive submodularity" was defined for adaptive optimization, which demands that given any partial realization of items, the marginal function with respect to this realization remains monotone submodular. This paper also presented an adaptive greedy algorithm for the problem of stochastic submodular cover, and for stochastic submodular maximization subject to cardinality constraints.555It was originally claimed that this algorithm achieves a logarithmic approximation ratio in terms of the desired coverage Q; however, the claim was later retracted due to an error in the analysis [37]. The authors have claimed a weaker approximation ratio since then. In [32] a more general version of the stochastic submodular cover problem was studied in the fully adaptive setting, and their results imply the best-possible logarithmic approximation ratio for stochastic submodular cover. In [21] an adaptive dual greedy algorithm was presented for this problem. It was also shown there that the adaptive greedy algorithm of [27] achieves an approximation ratio that depends on the maximum function value any single item can contribute and on the maximum support size of the distribution of any item. There has also been work on this problem when the realizations of items can be correlated, unlike our setting where the realization of each item is independent. In this setting, [33] gives an adaptive algorithm whose approximation ratio is logarithmic in the desired coverage Q and in the support size of the joint distribution of these correlated items; in the case of independent realizations this support size will typically be exponential in the number of items. In [29] a similar result was shown for a slightly different algorithm.

The question of adaptivity has also been studied for the related problem of stochastic submodular maximization subject to cardinality constraints [4]. The goal in this problem is to find a set of items of cardinality at most k so as to maximize the expected value of a stochastic submodular function. This paper showed that a non-adaptive greedy algorithm for this problem achieves an approximation ratio of (1 − 1/e) with respect to an optimal adaptive algorithm. This result was later generalized to stochastic submodular maximization subject to matroid constraints [3]. In [30], the adaptivity gap of stochastic submodular maximization subject to a variety of prefix-closed constraints was studied under the setting where the distribution of each item is Bernoulli. This class of prefix-closed constraints includes matroid and knapsack constraints among others. It was shown that there is a non-adaptive algorithm that achieves a constant-factor approximation with respect to an optimal adaptive algorithm. In [31], the problem of stochastic submodular maximization was also studied under various types of constraints, including knapsack constraints. An approximation ratio for this problem under a knapsack constraint was given that depends on the smallest probability of any element in the ground set being realized by any item. The question of adaptivity has also been studied for other stochastic problems such as stochastic packing, knapsack, and matching (see, e.g. [19, 20, 45, 12, 7, 6] and references therein).

There has also been a lot of work under the framework of two-stage or multi-stage stochastic programming [40, 42, 16, 41]. In this framework, one has to make sequential decisions in a stochastic environment, and there is a parameter λ such that the cost of making the same decision increases by a factor λ after each stage. The stochastic program in each stage is defined in terms of the expected cost in the later stages. The central question in these problems is: when can we find good solutions to this complex stochastic program, either by directly solving it or by finding approximations to it? This largely depends on the complexity of the stochastic program at hand. For example, if the distribution of the environment is explicitly given, then one might be able to solve the stochastic program exactly by using integer programming, and the question becomes largely computational in nature. This is fundamentally different from the information-theoretic question we consider in this paper.

Aside from the stochastic setting, algorithms with limited adaptivity have been studied across a wide spectrum of areas in computer science, including sorting and selection (e.g. [43, 17, 13]), multi-armed bandits (e.g. [39, 1]), and algorithm design (e.g. [11, 23, 24, 10]), among others; we refer the interested reader to these papers and references therein for more details.

###### Remark 1.3.

Our study of r-round adaptive algorithms for submodular cover is reminiscent of a recent work of Chakrabarti and Wirth [15] on multi-pass streaming algorithms for the set cover problem. They showed that allowing additional passes over the input in the streaming setting (similar in spirit to more rounds of adaptivity) can significantly improve the performance of the algorithms, and established tight pass-approximation tradeoffs that are similar (but not identical) to the r-round adaptivity gap bounds in Results 1 and 2. In terms of techniques, our upper bound result, which is our main contribution, is almost entirely disjoint from the techniques in [15] (and works for the more general problem of submodular cover, whereas the results in [15] are specific to set cover), while our lower bound uses similar instances to [15] but is based on an entirely different analysis.

### 1.4 Organization

In Section 2 we present some preliminaries for our problem. In Section 3 we present a technical overview of our main results. In Section 4 we present a non-adaptive selection algorithm that will be used to prove our upper bound result in Section 5. We present the lower bound result in Section 6.

## 2 Preliminaries

#### Notation.

Throughout this paper we will use symbols such as S and T to denote subsets of the ground set E, use A and B to denote subsets of [m], i.e., indices of items, and use boldface S and T to denote subsets of X, which realize to subsets of the ground set E.

Submodular Functions: Let E be a finite ground set and ℕ be the set of non-negative integers. For any set function f : 2^E → ℕ and any set S ⊆ E, we define the marginal contribution to S as the set function f_S : 2^E → ℕ such that for all T ⊆ E,

 f_S(T) = f(S ∪ T) − f(S).

When clear from the context, we abuse notation and, for an element e ∈ E, use f(e) and f_S(e) instead of f({e}) and f_S({e}).

A set function f : 2^E → ℕ is submodular iff for all S ⊆ T ⊆ E and e ∈ E ∖ T: f_S(e) ≥ f_T(e). The function f is additionally monotone iff f_S(e) ≥ 0 for all S ⊆ E and e ∈ E. Throughout the paper, we solely focus on monotone submodular functions unless stated explicitly otherwise.

We use the following two well-known facts about submodular functions throughout the paper.

###### Fact 2.1.

Let f : 2^E → ℕ be a monotone submodular function. Then:

 ∀ S, T ⊆ E:  f(S) ≤ f(T) + ∑_{e ∈ S ∖ T} f_T(e).
###### Fact 2.2.

Let f : 2^E → ℕ be a monotone submodular function. Then for any S ⊆ E, f_S is also monotone submodular.
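Both facts are easy to sanity-check by brute force on a tiny coverage function; the ground set and covering sets below are arbitrary examples, not from the paper.

```python
from itertools import chain, combinations

E = ['a', 'b', 'c']
cover = {'a': {1, 2}, 'b': {2, 3}, 'c': {3, 4}}

def f(S):                            # coverage: a monotone submodular f
    return len(set().union(*(cover[e] for e in S))) if S else 0

def fS(S, T):                        # marginal contribution f_S(T)
    return f(set(S) | set(T)) - f(set(S))

def subsets(xs):
    return list(chain.from_iterable(combinations(xs, k)
                                    for k in range(len(xs) + 1)))

for S in subsets(E):
    for T in subsets(E):
        # Fact 2.1: f(S) <= f(T) + sum of f_T(e) over e in S \ T
        assert f(S) <= f(T) + sum(fS(T, [e]) for e in S if e not in T)
        # Fact 2.2 consequence: marginals of f_S are non-negative
        assert all(fS(set(S) | set(T), [e]) >= 0 for e in E)
```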

## 3 Technical Overview

We give here an overview of the techniques used in our upper and lower bound results.

### 3.1 Upper Bound on r-round Adaptivity Gap

In this discussion we focus mainly on our non-adaptive (r = 1) algorithm, which already deviates significantly from the previous work of Goemans and Vondrak [26]. A non-adaptive algorithm simply picks a permutation of items and realizes them one by one in a set S until f(S) = Q. Hence, the "only" task in designing a non-adaptive algorithm is to find a "good" ordering of items, that is, an ordering whose prefix that covers Q has a low expected cost.

Consider the following algorithmic task: In the setting of the stochastic submodular cover problem, suppose we are given an (ordered) set S of stochastic items. Can we pick a low-cost (ordered) set T of stochastic items non-adaptively (without looking at a realization of S or T) so that the coverage of S ∪ T is sufficiently larger than that of S, i.e., the expected marginal contribution of T to S is large? Assuming we can do this, we can use this primitive to find sets with large coverage non-adaptively and iteratively, by starting from the empty set and using this primitive to repeatedly increase the coverage.

Recall that in the non-stochastic setting, the greedy algorithm precisely solves this problem: it finds a low-cost set T whose marginal contribution to S covers a large fraction of the remaining deficit, where, with a slight abuse of notation, OPT here denotes the optimal non-stochastic cover of f. This suggests that one can always find a "low" cost set with a large marginal contribution to S. For the stochastic problem, however, it is not at all clear whether there always exists a "low cost" (compared to adaptive OPT) set T whose expected marginal contribution to S is large. This is because there are many different realizations possible for S, and each realization, in principle, may require a dedicated set of items to achieve a large marginal value. As such, while adaptive OPT can first discover the realization of S and based on that choose a set to increase the expected coverage, a non-adaptive algorithm needs to instead pick the union of these dedicated sets, which can have a much larger cost (but the same marginal contribution). This suggests that the cost of a non-adaptive algorithm can potentially grow with the number of possible realizations of S. We point out that this task remains challenging even if all remaining inputs other than S are non-stochastic, i.e., always realize to a particular item.

Nevertheless, it turns out that no matter the size of the set of all realizations of S, one can always find a set T of stochastic items whose expected marginal contribution to S is proportional to the expected remaining deficit, while its cost is only Õ(Q) times the expected cost of OPT (here OPT corresponds to an optimal adaptive algorithm for the residual problem of covering f_S). Compared to the non-stochastic setting, this cost is thus roughly a factor Q larger than the analogous cost in the non-stochastic setting (see Example 4.1). This part is one of the main technical ingredients of our paper (see Theorem 1). We briefly describe the main ideas behind this proof.

The idea behind our algorithm is to sample several realizations x^1, …, x^k of S and pick a low-cost dedicated set T_j for each sampled realization x^j such that the expected marginal contribution of T_j to x^j is large (here, the randomness is only over the realizations of T_j). This step is quite similar to solving a non-adaptive submodular maximization problem with a knapsack constraint, for which we design a new algorithm based on an adaptation of Wolsey's LP [44] (see Theorem 2 and the discussion before it for more details and a comparison with existing results). This allows us to bound the cost of each set T_j by roughly the expected cost of OPT. The final (ordered) set returned by this algorithm is then T = T_1 ∪ ⋯ ∪ T_k. The ordering within the items of T does not matter.
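A highly simplified sketch of this sampling step is given below, with deterministic candidate items and a plain greedy routine standing in for the Wolsey-LP-based subroutine (both simplifications are ours, purely for illustration):

```python
import random

def sampled_union(S_items, candidates, universe, k, rng):
    """Sample k realizations of S; build a greedy dedicated set for each
    sampled deficit; return the union T = T_1 ∪ ... ∪ T_k."""
    T = []
    for _ in range(k):
        # one sampled realization of the already-chosen items S
        sample = set().union(*(rng.choice(x) for x in S_items)) if S_items else set()
        deficit = universe - sample
        while deficit:
            best = max(candidates, key=lambda c: len(c & deficit))
            if not best & deficit:
                break                 # nothing left helps this sample
            if best not in T:
                T.append(best)        # dedup across dedicated sets
            deficit -= best
    return T

universe = set(range(6))
S_items = [[{0, 1, 2}, {3, 4, 5}]]          # one stochastic item, 2 outcomes
candidates = [{0, 1}, {2}, {3, 4}, {5}]     # deterministic remaining items
rng = random.Random(0)
T = sampled_union(S_items, candidates, universe, 4, rng)
```

Each sampled realization of S contributes a dedicated set covering that sample's deficit, and the returned ordering is simply their union.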

The main step of this argument is however to bound the value of k, i.e., the number of samples, by roughly Q. This step is done by bounding the total contribution of the sets T_j on their own, i.e., independent of the set S. The intuition is that if we choose, say, T_1 with respect to some realization of S, but T_1 does not have a marginal contribution to most realizations of S, then by picking another set T_2, the set T_1 ∪ T_2 needs to have a coverage larger than that of both T_1 and T_2 alone. As a result, if we repeat this process sufficiently many times, we should eventually be able to increase the expected coverage of S ∪ T, simply because otherwise the value of f on T_1 ∪ ⋯ ∪ T_k alone would exceed Q, a contradiction.

We now use this primitive to design our non-adaptive algorithm as follows: we keep adding sets of items to the ordering using the primitive above in iterative phases. In each phase j, we run the above primitive multiple times to find a set T_j whose expected marginal contribution, conditioned on the event E_j that the realization of the items picked in the previous phases did not cover f entirely, is large. We further bound the cost of the set T_j by the expected cost of OPT conditioned on the event E_j. Notice that this quantity can potentially be much larger than the (unconditional) expected cost of OPT. However, since the probability that, in the permutation returned by the non-adaptive algorithm, we ever need to realize the sets in T_j is bounded by the probability of E_j, we can pay for the cost of these sets in expectation. By repeating these phases, we reduce the probability of not covering f exponentially fast and finalize the proof.

We then extend this algorithm to an r-round adaptive algorithm for any r > 1. For simplicity, let us only mention the extension to two rounds (extending to r rounds is then straightforward). We spend the first round to find an (ordered) set S whose realized value is at least Q − √Q with high probability. We extend our main primitive above to ensure that as long as the remaining deficit is at least √Q, we can find a set T with the same guarantee on its marginal contribution while using only roughly √Q sets (as opposed to roughly Q in the original statement). This is achieved by the fact that when the deficit is sufficiently large, the rate of coverage per cost is higher, as opposed to when the deficit is very small. Precisely, we exploit the fact that a gap of √Q is sufficiently large to reach the contradiction in the original argument with only √Q sets T_j. We then run the previous algorithm using this primitive by setting the threshold τ_1 = Q − √Q. In the next round, we simply run our previous algorithm on the function f_{x_S}, where x_S is the realization in the first round. As f_{x_S} has maximum value at most √Q, by the previous argument we only need to pay roughly √Q times the expected cost of OPT in this round, hence our total cost is Õ(√Q) times the expected cost of OPT. Extending this approach to r-round algorithms is now straightforward using similar ideas as the thresholding greedy algorithm for set cover (see, e.g. [18]).
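For intuition only, the two-round scheme can be mocked up for a coverage function with naive round planners; the fixed first-round order and the greedy second-round order below are placeholders, not the paper's primitive, and the instance is kept deterministic for reproducibility (stochastic items plug into `rng.choice` the same way).

```python
import math
import random

def two_round(items, universe, rng):
    """Round 1: fixed order until coverage reaches Q - sqrt(Q).
       Round 2: re-plan the remaining items against the realized deficit."""
    Q = len(universe)
    tau1 = Q - math.isqrt(Q)                  # threshold tau_1 = Q - sqrt(Q)
    covered, cost = set(), 0
    remaining = list(range(len(items)))
    for idx in list(remaining):               # round 1: non-adaptive order
        covered |= rng.choice(items[idx])
        remaining.remove(idx)
        cost += 1
        if len(covered) >= tau1:
            break
    # round 2: order the rest by marginal gain against the realization
    remaining.sort(key=lambda i: -len(items[i][0] - covered))
    for idx in remaining:
        if covered == universe:
            break
        covered |= rng.choice(items[idx])
        cost += 1
    return cost, covered

items = [[{0, 1, 2}], [{3, 4}], [{5}], [{6, 7, 8}]]
cost, covered = two_round(items, set(range(9)), random.Random(0))
```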

### 3.2 Lower Bound on Adaptivity Gap

We prove our lower bound for the stochastic set cover problem, a special case of the stochastic submodular cover problem (see Example 1.1). Let us first sketch our lower bound for two-round algorithms. Let 𝒮 = {S_1, …, S_k} be a collection of subsets of the universe U to be determined later (recall that n is the size of the universe we aim to cover). Consider the following instance of stochastic set cover: there exists a single stochastic set T̂ which realizes to one set chosen uniformly at random from S̄_1, …, S̄_k, i.e., complements of the sets in 𝒮. We further have additional stochastic sets T_1, …, T_k, where T_i realizes to S_i∖S_j for j chosen uniformly at random from [k]. Finally, for any element e ∈ U, we have a set with only one realization, which is the singleton set {e} (i.e., it always covers e).

Consider first the following adaptive strategy: pick T̂ in the first round and see its realization, say, S̄_i. Pick T_i in the second round and see its realization, say S_i∖S_j. Pick the singleton sets covering the elements of S_i∩S_j in the third round. The union of this collection of sets is (U∖S_i) ∪ (S_i∖S_j) ∪ (S_i∩S_j) = U, hence it is a feasible cover. As such, in only three rounds of adaptivity, we were able to find a solution with cost only 2 + |S_i∩S_j|, which is small since the sets in 𝒮 are chosen to have small pairwise intersections.

A two-round algorithm, however, is one round short of following the above strategy. One approach to remedy this would be to make a "shortcut" by picking more than one set in each round of this process, e.g., picking the set T_i also in the first round. However, it is easy to see that as long as we do not pick a large number of the sets T_1, …, T_k in the first round, or a large number of singleton sets in the second round, we have only a small chance of making such a shortcut. We are not done yet, as it is possible that the algorithm covers the universe using entirely different sets (i.e., does not follow this strategy). To ensure that this cannot help either, we need the sets in 𝒮 to have "minimal" intersection; this in turn limits the size of each set and hence the eventual lower bound we obtain using this argument.

We design a family of instances that allows us to extend the above argument to r-round adaptive algorithms. We construct these instances using the edifice set-system of Chakrabarti and Wirth [15], which possesses a "near-laminar" property: any two sets are either in a subset-superset relation or have "minimal" intersection. We remark that this set-system was originally introduced in [15] for designing multi-pass streaming lower bounds for the set cover problem. While the instances we create in this work are similar to those of [15], the proof of our lower bound is entirely different (the lower bound of [15] is proven via a reduction from communication complexity).

## 4 The Non-Adaptive Selection Algorithm

We introduce a key primitive of our approach in this section for solving the following task: suppose we have already chosen a subset S of items but are not aware of the realization of these items; our goal is to non-adaptively add another set of items to S to increase its expected coverage. Formally, given any monotone submodular function f, let Q̃ be the required coverage on f. For any realization S̄ of S, we use Q̃ − f(S̄) to refer to the deficit in covering f, and denote by Δ(S) := E_{S̄}[Q̃ − f(S̄)] the expected deficit of the set S. Our goal is now to add (still non-adaptively) a "low-cost" (compared to adaptive OPT) set T to S to decrease the expected deficit. It is easy to see that such a primitive is helpful for finding sets with "large" coverage non-adaptively and iteratively: start from the empty set, use the primitive to pick a set that reduces the deficit, and then repeat the process starting from this set.
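To make these quantities concrete, the expected deficit Δ(S) can be estimated by Monte Carlo sampling of the realizations of S. The sketch below is our own illustration (not the paper's algorithm); the coverage function and the caller-supplied `item_samplers` are assumptions for the example.

```python
def expected_deficit(f, Q_tilde, item_samplers, trials=2000):
    """Monte Carlo estimate of Delta(S) = E[Q_tilde - f(S_bar)].

    item_samplers: one callable per stochastic item in S; each call
    returns one realization of that item as a set of ground elements.
    f: a monotone set function on the ground set (here, coverage).
    """
    total = 0.0
    for _ in range(trials):
        realization = set()
        for sample in item_samplers:
            realization |= sample()  # realize each item independently
        total += Q_tilde - f(realization)
    return total / trials
```

With deterministic items the estimate is exact; e.g., two items that always cover {0, 1} and {2} of a 4-element universe leave an expected deficit of 1.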

Let us start by giving an example which shows some of the difficulty of this task.


###### Example 4.1.

Consider an instance of stochastic set cover: there exists a single stochastic set, say T_0, which realizes to U∖{e} for an element e chosen uniformly at random from the universe U of n elements, together with n singleton sets T_1, …, T_n, each covering a unique element in U. If we have already chosen S = {T_0}, and want to choose more sets in order to decrease the expected deficit, then it is easy to see that even though the cost of OPT is only 2, no collection of o(n) sets can decrease the expected deficit by one. This should be contrasted with the non-stochastic setting, in which there always exists a single set that reduces a deficit of 1 by 1.
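The example can be verified by direct computation. The sketch below is our own (the name T_0 follows the example, and we add singletons for elements 1..k as a concrete choice): it enumerates the n equally likely realizations of T_0 and returns the exact expected deficit, 1 − k/n, so any o(n) singletons barely move it.

```python
from fractions import Fraction

def expected_deficit_after_singletons(n, k):
    """Exact expected deficit in the example: S = {T_0} plus the
    singleton sets for elements 1..k, where T_0 realizes to
    U \\ {e} for e uniform over U = {1, ..., n}."""
    universe = set(range(1, n + 1))
    singletons = set(range(1, k + 1))
    total_deficit = 0
    for e in universe:  # n equally likely realizations of T_0
        covered = (universe - {e}) | singletons
        total_deficit += n - len(covered)  # deficit is 1 iff e > k
    return Fraction(total_deficit, n)
```

Even n/2 singletons leave expected deficit 1/2, while the fully adaptive OPT pays cost 2.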

We are now ready to state our main result in this section.

###### Theorem 1.

Let X be a collection of items, and let f be any monotone submodular function such that f(X̄) = Q̃ for every realization X̄ of X. Let S be any subset of items and define Δ(S) := E_{S̄}[Q̃ − f(S̄)]. Given any parameter Ψ ≥ 1, there is a randomized non-adaptive algorithm that outputs a set T ⊆ X∖S such that the cost of T is at most 3Ψ·E[cost(OPT)] in expectation over the randomness of the algorithm, and the expected deficit Δ(S∪T) decreases geometrically with Ψ, in expectation over the randomness of the algorithm and the realizations of S and T. Here OPT is an optimal fully-adaptive algorithm for the stochastic submodular cover problem with the function f. (Throughout this paper, we abuse notation by referring to an optimal fully-adaptive algorithm for different problem instances using the same notation OPT; the specific problem instance will be clear from context.)

The goal in Theorem 1 is to select a set of items that can decrease the deficit of a typical realization of S (i.e., the expected deficit). In order to do so, we first design a non-adaptive algorithm that finds a low-cost set that can decrease the deficit of a particular realization of S. This step is closely related to solving a stochastic submodular maximization problem subject to a knapsack constraint. Indeed, when the costs of all items are the same, i.e., when we want to minimize the number of items in the solution, one can use the algorithm of [4] (with some small modifications) for stochastic submodular maximization subject to a cardinality constraint for this purpose. Also, when the random variables X_i have binary realizations, i.e., take only two possible values, one can use the algorithm of [30] for this purpose. However, we are not aware of a solution for the knapsack version of the problem in its general form with the bounds required in our algorithms, and hence we present an algorithm for this task as well. The main step of our argument, however, is how to use this algorithm to prove Theorem 1, i.e., to move from a per-realization guarantee to an expectation guarantee.

### 4.1 A Non-Adaptive Algorithm for Increasing Expected Coverage

We start by presenting a non-adaptive algorithm that deterministically picks a low-cost (compared to the expected cost of OPT) set of items, while achieving a constant factor of the coverage of OPT. For any set A ⊆ [m] of indices of stochastic items, and any realization X̄ of X, we define X̄_A := {X̄_i : i ∈ A}, i.e., the realization of all items corresponding to indices in A.

###### Theorem 2.

There exists a non-adaptive algorithm non-adapt-greedy that takes as input a set of m stochastic items X, a monotone submodular function f, and a parameter Q̃ such that f(X̄) = Q̃ for any realization X̄ of X, and outputs a set T ⊆ [m] such that F(T) ≥ Q̃/3 and cost(T) ≤ 3·E[cost(OPT)]. Here, OPT is the optimum adaptive algorithm for submodular cover on X with function f and parameter Q̃.

As argued before, Theorem 2 can be interpreted as an algorithm for submodular maximization subject to a knapsack constraint.

To prove Theorem 2, we design a simple greedy algorithm (similar to the greedy algorithm for submodular maximization) and analyze it using a linear programming (LP) relaxation in the spirit of Wolsey's LP [44], defined in the following section.

### Extension of Wolsey’s LP for Stochastic Submodular Cover

Let us define the function F: 2^[m] → ℝ_{≥0} as follows: for any A ⊆ [m],

 F(A) := E_{X̄}[f(X̄_A)]. (1)

As we assume in the theorem statement that f(X̄) = Q̃ always, we have F([m]) = Q̃ as well. For any A ⊆ [m], we further define the marginal contribution function F_A: [m] → ℝ_{≥0}, where F_A(i) := F(A∪{i}) − F(A) for all i ∈ [m]. The following proposition is straightforward.
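When the items have small finite supports, F and its marginals F_A(i) can be computed exactly by enumerating realizations, mirroring Eq (1). This is our own illustrative sketch (a coverage function in the usage below), not an efficient oracle for large supports.

```python
from itertools import product

def F(f, dists, A):
    """Exact F(A) = E[f(X_A)] for independent stochastic items.

    dists[i]: support of item i as (realization, probability) pairs,
    where each realization is a set of ground elements.
    A: set of item indices; f: set function on ground elements.
    """
    expectation = 0.0
    for combo in product(*(dists[i] for i in sorted(A))):
        prob, realized = 1.0, set()
        for outcome, p in combo:
            prob *= p
            realized |= outcome
        expectation += prob * f(realized)
    return expectation

def marginal(f, dists, A, i):
    """Marginal contribution F_A(i) = F(A + {i}) - F(A)."""
    return F(f, dists, A | {i}) - F(f, dists, A)
```

For instance, with item 0 covering {0} or {1} with probability 1/2 each and item 1 always covering {0}, coverage gives F({1}) = 1 and F({0,1}) = 1.5, so F_{{1}}(0) = 0.5.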

###### Proposition 4.1.

Function F is a monotone submodular function.

###### Proof.

F is a convex combination of monotone submodular functions, one for each realization X̄ of X.

We will use a linear programming (LP) relaxation in the spirit of Wolsey's LP [44] for the submodular cover problem, applied to the function F. Consider the following linear programming relaxation:

 P = min_{y∈[0,1]^m} ∑_{i=1}^m c_i·y_i
 s.t. ∑_{i∈[m]∖A} F_A(i)·y_i ≥ Q̃ − 2·F(A),  ∀A ⊆ [m] (2)

The difference between LP (2) and Wolsey's LP is in the RHS of the constraint, which is Q̃ − F(A) in the case of Wolsey's LP. In the non-stochastic setting, one can prove that Wolsey's LP lower bounds the value of the optimum submodular cover for the function F. To extend this result to the stochastic case (for the function F), however, it suffices to modify the constraint as in LP (2), as we prove in the following lemma.
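For reference, Wolsey's original LP for (non-stochastic) submodular cover, written here for the function F, is identical except for its right-hand side; LP (2) replaces Q̃ − F(A) with Q̃ − 2F(A):

```latex
\min_{y \in [0,1]^m} \ \sum_{i=1}^{m} c_i \, y_i
\qquad \text{s.t.} \qquad
\sum_{i \in [m] \setminus A} F_A(i)\, y_i \;\ge\; \widetilde{Q} - F(A),
\qquad \forall A \subseteq [m].
```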

###### Lemma 4.2.

The cost of an optimal adaptive algorithm OPT for submodular cover on X with function f is lower bounded by the optimal cost of LP (2), i.e., E[cost(OPT)] ≥ P.

###### Proof.

For a realization X̄ of X and any i ∈ [m], define an indicator random variable w_i(X̄) that takes value 1 iff OPT chooses item X_i on the realization X̄, i.e.,

 w_i(X̄) = 1[X̄_i ∈ OPT(X̄)].

Let w_i be the probability that OPT chooses X_i, i.e.,

 w_i = Pr_{X̄}(w_i(X̄) = 1) = E[w_i(X̄)].

We have that,

 E[cost(OPT)] = E_{X̄}[∑_{i=1}^m 1[X̄_i ∈ OPT(X̄)]·c_i] = ∑_{i=1}^m w_i·c_i.

In the following, we prove that w = (w_1, …, w_m) is a feasible solution to LP (2), which by the above equation immediately implies that E[cost(OPT)] ≥ P.

Clearly w ∈ [0,1]^m, so it suffices to prove that the constraint holds for any set A ⊆ [m]. The main step in doing so is the following claim.

###### Claim 4.3.

For any set A ⊆ [m], and any two realizations X̄ and X̄′ of X:

 f(X̄_A) + f(X̄′_A) + ∑_{i∈[m]∖A} f_{X̄′_A}(x̄_i)·w_i(X̄) ≥ Q̃.
###### Proof.

Recall that we assume f(X̄) = Q̃ always, and hence f(OPT(X̄)) = Q̃ as well. Moreover, w_i(X̄) = 1 for any i with X̄_i ∈ OPT(X̄), and w_i(X̄) = 0 otherwise. We further define the index sets:

 B := {i ∈ [m] : w_i(X̄) = 1} ∩ A and C := {i ∈ [m] : w_i(X̄) = 1} ∖ B.

We have,

 f(X̄_A) + f(X̄′_A) + ∑_{i∈[m]∖A} f_{X̄′_A}(x̄_i)·w_i(X̄)
  = f(X̄_A) + f(X̄′_A) + ∑_{i∈C} f_{X̄′_A}(x̄_i)
  ≥ f(X̄_A) + f(X̄′_A ∪ X̄_C) (by submodularity)
  ≥ f(X̄_B) + f(X̄_C) (by monotonicity, as X̄_B ⊆ X̄_A and X̄_C ⊆ X̄′_A ∪ X̄_C)
  ≥ f(X̄_B ∪ X̄_C) = Q̃, (by submodularity and since X̄_B ∪ X̄_C = OPT(X̄))

which finalizes the proof.

Fix any set A ⊆ [m]. We first take an expectation over all realizations X̄ of X in the LHS of Claim 4.3:

 Q̃ ≤ E_{X̄}[f(X̄_A) + f(X̄′_A) + ∑_{i∈[m]∖A} f_{X̄′_A}(x̄_i)·w_i(X̄)]
  = E_{X̄}[f(X̄_A)] + f(X̄′_A) + ∑_{i∈[m]∖A} E_{X̄}[f_{X̄′_A}(x̄_i)·w_i(X̄)]
  = E_{X̄}[f(X̄_A)] + f(X̄′_A) + ∑_{i∈[m]∖A} E_{X̄}[f_{X̄′_A}(x̄_i)]·E_{X̄}[w_i(X̄)],

as the random variables x̄_i and w_i(X̄) are independent, since the choice of X_i by OPT is independent of what X_i realizes to. We further point out that E_{X̄}[f(X̄_A)] in the RHS of the last equation above is equal to F(A) by the definition in Eq (1), and E_{X̄}[w_i(X̄)] = w_i.

We further take an expectation over all realizations X̄′ of X in the RHS above:

 Q̃ ≤ E_{X̄′}[F(A) + f(X̄′_A) + ∑_{i∈[m]∖A} E_{X̄}[f_{X̄′_A}(x̄_i)]·w_i]
  = F(A) + F(A) + ∑_{i∈[m]∖A} E_{X̄′}E_{X̄}[f_{X̄′_A}(x̄_i)]·w_i (by Eq (1))
  = 2·F(A) + ∑_{i∈[m]∖A} F_A(i)·w_i,

as E_{X̄′}E_{X̄}[f_{X̄′_A}(x̄_i)] = F(A∪{i}) − F(A) = F_A(i). Rewriting the above equation, we obtain that the constraint of LP (2) associated with the set A is satisfied by w. This concludes the proof that w is a feasible solution.

We now design an algorithm, namely non-adapt-greedy, based on "the greedy algorithm" (for submodular optimization) applied to the function F of the last section, and then use LP (2) to analyze it. We emphasize that the LP is used only in the analysis and not in the algorithm.

non-adapt-greedy(X, f, Q̃). Given a monotone submodular function f, the set of stochastic items X, and a parameter Q̃ with f(X̄) = Q̃ for all realizations X̄ of X, outputs a set of (indices of) stochastic items.

1. Initialize: Set A = ∅ and let F be the function associated to f in Eq (1).

2. While F(A) < Q̃/3 do:

1. Let j = argmax_{i∈[m]∖A} F_A(i)/c_i.

2. Update A ← A ∪ {j}.

3. Output: T = A.
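The steps above can be sketched in Python as follows. This is our own illustration: `F_oracle` is an assumed black box returning F(A) (computed exactly for small supports or estimated by sampling), and the loop greedily maximizes marginal coverage per unit cost until F(A) ≥ Q̃/3.

```python
def non_adapt_greedy(m, costs, F_oracle, Q_tilde):
    """Greedily add item indices until F(A) >= Q_tilde / 3.

    m: number of stochastic items; costs[i] > 0: cost of item i.
    F_oracle(A): expected coverage F(A) = E[f(X_A)] of index set A.
    """
    A = set()
    while F_oracle(A) < Q_tilde / 3:
        base = F_oracle(A)
        # pick the item maximizing the marginal-coverage-to-cost ratio
        best, best_ratio = None, -1.0
        for i in range(m):
            if i in A:
                continue
            ratio = (F_oracle(A | {i}) - base) / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        A.add(best)
    return A
```

For deterministic items, F(A) is simply the realized coverage, and the routine reduces to the classical cost-effectiveness greedy for set cover stopped at a third of the target.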

It is clear that the set T output by non-adapt-greedy achieves F(T) ≥ Q̃/3 (as F([m]) = Q̃, the termination condition will always be satisfied eventually). We will now bound the cost paid by the greedy algorithm in terms of the optimal value of LP (2).

###### Lemma 4.4.

cost(T) ≤ 3·P, where P is the optimal value of LP (2).

To prove Lemma 4.4, we need some definitions. Let the sequence of items picked by the greedy algorithm be j_1, j_2, …, j_k, where j_ℓ is the index of the item picked in iteration ℓ. Moreover, for any ℓ ∈ [k], define A_{<ℓ} := {j_1, …, j_{ℓ−1}}, i.e., the set of items chosen before iteration ℓ. We first prove the following bound on the ratio of coverage rate to cost in each iteration.

###### Lemma 4.5.

In each iteration ℓ of the non-adaptive greedy algorithm, we have

 F_{A_{<ℓ}}(j_ℓ)/c_{j_ℓ} ≥ (Q̃ − 2·F(A_{<ℓ}))/P,

where P is the optimal value of LP (2).

###### Proof.

Fix any iteration ℓ. Recall that in each iteration, we pick the item j_ℓ = argmax_{j∈[m]∖A_{<ℓ}} F_{A_{<ℓ}}(j)/c_j. Suppose towards a contradiction that in some iteration ℓ:

 ∀j ∈ [m]∖A_{<ℓ}: F_{A_{<ℓ}}(j)/c_j < (Q̃ − 2·F(A_{<ℓ}))/P.

Let y* be an optimal solution to LP (2). Then, by the constraint of the LP for the set A_{<ℓ}, we have

 Q̃ − 2·F(A_{<ℓ}) ≤ ∑_{i∈[m]∖A_{<ℓ}} F_{A_{<ℓ}}(i)·y*_i < ((Q̃ − 2·F(A_{<ℓ}))/P)·∑_{i∈[m]∖A_{<ℓ}} c_i·y*_i ≤ Q̃ − 2·F(A_{<ℓ}),

where the last inequality is because by definition ∑_{i=1}^m c_i·y*_i = P. By the above equation, Q̃ − 2·F(A_{<ℓ}) < Q̃ − 2·F(A_{<ℓ}), a contradiction.

###### Proof of Lemma 4.4.

Fix any iteration ℓ in the algorithm; by the termination condition, F(A_{<ℓ}) < Q̃/3 and hence Q̃ − 2·F(A_{<ℓ}) > Q̃/3. By Lemma 4.5,

 c_{j_ℓ} ≤ P·F_{A_{<ℓ}}(j_ℓ)/(Q̃ − 2·F(A_{<ℓ})) ≤ 3·P·F_{A_{<ℓ}}(j_ℓ)/Q̃.

Let k be the first index where F(A_{<k}) < Q̃/3 but F(A_{<k} ∪ {j_k}) ≥ Q̃/3 (i.e., the iteration in which the algorithm terminates). Note that cost(T) = ∑_{ℓ=1}^k c_{j_ℓ}. We start by bounding the first k−1 terms in cost(T):

 ∑_{ℓ=1}^{k−1} c_{j_ℓ} ≤ (3·P/Q̃)·∑_{ℓ=1}^{k−1} F_{A_{<ℓ}}(j_ℓ) = (3·P/Q̃)·F(A_{<k}) < (3·P/Q̃)·(Q̃/3) = P,

using the fact that Q̃/3 > F(A_{<k}) = ∑_{ℓ=1}^{k−1} F_{A_{<ℓ}}(j_ℓ). Now consider the last term in cost(T), i.e., c_{j_k}. Again, by Lemma 4.5, we have,

 c_{j_k} ≤ P·F_{A_{<k}}(j_k)/(Q̃ − 2·F(A_{<k})) ≤ 2·P,

using the fact that F_{A_{<k}}(j_k) ≤ Q̃ − F(A_{<k}) ≤ 2·(Q̃ − 2·F(A_{<k})), as F(A_{<k}) < Q̃/3. As such, cost(T) < 3·P, finalizing the proof.

Theorem 2 now follows immediately from Lemma 4.4 and Lemma 4.2, as cost(T) ≤ 3·P ≤ 3·E[cost(OPT)].

### 4.2 Proof of Theorem 1

We use the algorithm in Theorem 2 to present the following algorithm for reducing the expected deficit of any given set S, as needed in Theorem 1.

Select(S, f, Q̃, Ψ). Given a collection of indices S ⊆ [m], a monotone submodular function f with f(X̄) = Q̃ for every realization X̄, and a collection of items S with expected deficit Δ(S), picks a set of items to decrease the expected deficit.

1. Let T = ∅.

2. For i = 1 to Ψ do:

1. Sample a realization S̄_i ∼ S.

2. Let T_i ← non-adapt-greedy(X∖S, f_{S̄_i}, Q̃ − f(S̄_i)) (recall that f_{S̄_i}(X̄∖S̄_i) = Q̃ − f(S̄_i)).

3. Return all items in the sets T := T_1 ∪ ⋯ ∪ T_Ψ.

The Select algorithm repeatedly calls the non-adapt-greedy algorithm for Ψ samples drawn from the realizations of the set S. By Fact 2.2, for any realization S̄ of S, f_{S̄} is also a monotone submodular function. Moreover, by the assumption that f(X̄) = Q̃ always, we have that f_{S̄}(X̄∖S̄) = Q̃ − f(S̄) always as well. Hence, the parameters given to the function non-adapt-greedy in Select are valid.
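Structurally, Select is a thin loop over Ψ sampled realizations. The sketch below is our own, with `sample_S` and `greedy_on_residual` as assumed stand-ins for sampling S̄_i and for running non-adapt-greedy on the residual function f_{S̄_i}.

```python
def select(sample_S, greedy_on_residual, psi):
    """Union of non-adaptive greedy picks over psi sampled realizations.

    sample_S(): draws one realization S_bar of the chosen items S.
    greedy_on_residual(S_bar): returns the index set picked by the
    non-adaptive greedy for the residual function f_{S_bar}.
    """
    T = set()
    for _ in range(psi):
        S_bar = sample_S()              # fix one realization of S
        T |= greedy_on_residual(S_bar)  # cover its residual deficit
    return T
```

The returned set T is non-adaptive: no realization of the items in T themselves is ever inspected during its construction.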

We first bound the expected cost of Select.

###### Lemma 4.6.

The set T output by Select satisfies E[cost(T)] ≤ 3Ψ·E[cost(OPT)].

###### Proof.

The cost of T is the total cost of the sets T_i chosen by non-adapt-greedy on f_{S̄_i} for each of the Ψ sampled realizations S̄_i of S. By Theorem 2, we can bound the cost of each T_i using OPT conditioned on the realization S̄_i for S (as we consider f_{S̄_i}). As such,

 E[cost(T)] = ∑_{i=1}^Ψ E_{S̄_i∼S}[cost(T_i)]
  ≤ ∑_{i=1}^Ψ E_{S̄_i∼S}[3·E_{X̄}[cost(OPT(X̄)) | S̄ = S̄_i]] (a)
  = ∑_{i=1}^Ψ 3·E_{S̄_i∼S} E_{X̄∼X|S̄_i}[cost(OPT(X̄))]
  = 3Ψ·E_{X̄}[cost(OPT(X̄))],

where inequality (a) follows from Theorem 2: even though the OPT used in Theorem 2 is an optimal algorithm on the residual problem instance with function f_{S̄_i}, the conditional cost of OPT on the original instance can only be larger than the cost of an optimal algorithm on that residual instance. The bound now follows by summing over the Ψ iterations.

We now prove that the expected deficit of S∪T drops by at least a constant factor. The following lemma is at the heart of the proof.

.

###### Proof.

We start by introducing the notation needed in the proof. It is useful to note that the randomness in T is due to two sources: (1) the samples S̄_1, …, S̄_Ψ, which determine which sets are indexed by T; and (2) the randomness in the realization of the sets indexed by T. For any realization S̄_i of S, we use T_i(S̄_i) to denote the set chosen (deterministically now by non-adapt-greedy) conditioned on S̄_i (this corresponds to "fixing" the first source of randomness above). We use the notation T_{≤i} to denote the collection of sets selected in iterations 1 through i, and S̄_{≤i} to denote the tuple of realizations (we define and