# A Parallel Algorithm for Minimum Cost Submodular Cover

In the minimum cost submodular cover problem (MinSMC), given a monotone nondecreasing submodular function f: 2^V → ℤ^+, a cost function c: V → ℝ^+, and an integer k ≤ f(V), the goal is to find a subset A ⊆ V of minimum cost such that f(A) ≥ k. MinSMC has many applications in machine learning and data mining. In this paper, we design a parallel algorithm for MinSMC that obtains a solution with approximation ratio at most H(min{Δ,k})/(1−5ε) with probability 1−3ε in O(log m log n log²(mn)/ε⁴) rounds, where Δ = max_{v∈V} f(v), H(·) is the Harmonic number, n = f(V), m = |V|, and ε is a constant in (0, 1/5). This is the first paper to obtain a parallel algorithm for the weighted version of the MinSMC problem with an approximation ratio arbitrarily close to H(min{Δ,k}).


## 1 Introduction

Recently, submodular optimization has attracted a lot of interest in machine learning and data mining, where it has been applied to a variety of problems including viral marketing [13] and information gathering [12], among others [8].

In this paper, we study a parallel algorithm for the minimum cost submodular cover problem (MinSMC). Given a monotone nondecreasing submodular function f: 2^V → ℤ^+, a cost function c: V → ℝ^+, and an integer k ≤ f(V), the goal of MinSMC is to find a subset A ⊆ V with the minimum cost such that f(A) ≥ k, where the cost of A is c(A) = ∑_{v∈A} c(v). MinSMC has numerous applications, including data summarization [15], recommender systems [6], etc. For example, given a set of data, it is desirable to select a cheapest subset of the data whose utility meets a lower bound of requirement. Many commonly used utility functions exhibit submodularity, a natural diminishing-returns property, leading to MinSMC problems [10]. For MinSMC, a centralized greedy algorithm [16] is known to have approximation ratio H(Δ), where H(t) is the t-th Harmonic number and Δ = max_{v∈V} f(v).
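As a small illustration, the centralized greedy baseline for MinSMC can be sketched as follows; the coverage function, the costs, and the value of k below are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of the centralized greedy for MinSMC: repeatedly pick
# the element with the best truncated marginal-profit-to-cost ratio until
# f(A) >= k.  Illustrated on a toy weighted coverage instance.

def greedy_min_smc(V, f, c, k):
    A = set()
    while f(A) < k:
        # truncated marginal profit per unit cost:
        # (min(f(A ∪ {v}), k) - min(f(A), k)) / c(v)
        best = max(
            (v for v in V if v not in A),
            key=lambda v: (min(f(A | {v}), k) - min(f(A), k)) / c[v],
        )
        if min(f(A | {best}), k) == min(f(A), k):
            raise ValueError("instance infeasible: f(V) < k")
        A.add(best)
    return A

# Illustrative coverage instance (assumed, not from the paper).
sets = {1: {"a", "b"}, 2: {"b", "c", "d"}, 3: {"e"}, 4: {"a", "c", "e"}}
cost = {1: 1.0, 2: 2.0, 3: 1.5, 4: 2.5}
f = lambda A: len(set().union(*(sets[v] for v in A))) if A else 0

A = greedy_min_smc(set(sets), f, cost, k=4)
print(sorted(A), f(A))  # → [1, 2] 4
```

In the centralized (sequential) setting this greedy achieves the H(min{Δ,k}) ratio mentioned above; the point of the paper is to match it with low adaptivity.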

However, in the face of massive data, a sequential and centralized greedy method is impractical, and parallel methods have been proposed recently. The best known parallel algorithm for the unweighted MinSMC problem was presented by Fahrbach et al. [7]; it produces a solution whose size is within a logarithmic factor of the size of an optimal solution, in a polylogarithmic number of rounds. Note that the algorithm in [7] only deals with the unweighted MinSMC problem. Furthermore, its approximation ratio is logarithmic in the size of the ground set, which might be much larger than H(min{Δ,k}). This observation motivates us to study an NC parallel algorithm for the weighted MinSMC problem, aiming at an approximation ratio arbitrarily close to H(min{Δ,k}).

### 1.1 Related Works

For the MinSMC problem, Wolsey [16] presented a greedy algorithm with approximation ratio H(Δ), where Δ = max_{v∈V} f(v).

Mirzasoleiman et al. [10] proposed a distributed algorithm for the unweighted MinSMC problem called DisCOVER, which reduces the problem to a sequence of cardinality-constrained submodular maximization problems. Employing a greedy algorithm for the cardinality-constrained submodular maximization problem, DisCOVER finds an approximate solution in a number of rounds of messages that grows with the number of machines. As noted in [11], it is strange that in this result, when the number of machines is increased, the number of rounds increases (rather than decreases). The authors of [11] then improved this result to a distributed approximation algorithm using fewer rounds of messages. These algorithms have suboptimal adaptivity complexity because the summarization algorithm on the central machine is sequential, so the number of rounds on the central machine can be large. A parallel algorithm with a low adaptivity complexity was presented in [7], achieving a logarithmic approximation ratio in a polylogarithmic number of rounds.

Parallel algorithms have also been studied for some special cases of the submodular cover problem. In particular, for the set cover problem (i.e., find a smallest subcollection of sets that covers all elements), Berger et al. [3] provided the first parallel algorithm with an approximation guarantee similar to that of the centralized greedy algorithm: they used a bucketing technique to obtain a greedy-like approximation ratio in a number of rounds polylogarithmic in N, the total sum of the sets' sizes. Rajagopalan and Vazirani [14] improved the number of rounds at the cost of a larger (but still logarithmic) approximation ratio. Blelloch et al. [4] further improved these results, obtaining a nearly greedy approximation ratio in a smaller number of rounds.

### 1.2 Our contributions and technical overview

In this paper, we design a parallel algorithm for MinSMC, achieving approximation ratio at most H(min{Δ,k})/(1−5ε) with probability at least 1−3ε, which runs in O(log m log n log²(mn)/ε⁴) rounds, where Δ = max_{v∈V} f(v), n = f(V), m = |V|, and ε is a constant in (0, 1/5). This is the first paper studying a parallel algorithm for the weighted version of MinSMC. Furthermore, the approximation ratio in this paper is arbitrarily close to H(min{Δ,k}), while the logarithmic approximation in [7] only works for the cardinality version, and H(min{Δ,k}) might be much smaller than a logarithmic ratio.

We have tried the following method for MinSMC: iteratively call the parallel algorithm for the submodular maximization problem with a knapsack constraint in [5] until a feasible solution to MinSMC is found. Note that this method runs in a logarithmic number of rounds, but its approximation ratio depends on k. Improving the dependence of the ratio from k to min{Δ,k} needs more effort.

This paper combines the ideas of the multi-layer bucket in [3], the maximal nearly independent set in [4], and random sampling in [7]. Note that [3] and [4] deal with the set cover problem. Since the submodular cover structure is much more complicated than the set cover structure, the methods in [3] and [4] cannot be directly used on MinSMC; when applied separately, both of them encounter structural difficulties. The paper [7] deals with a cardinality-budgeted version of the submodular maximization problem. To develop its idea to suit the weighted version of MinSMC, new ideas have to be explored, especially on how to deal with weights.

The remaining part of this paper is organized as follows. The parallel algorithm for MinSMC and its analysis are presented in Section 2. Section 3 concludes the paper with some discussions on future work.

## 2 Parallel Algorithm and Analysis for MinSMC

### 2.1 Preliminaries

###### Definition 2.1 (submodular and monotone nondecreasing).

Given an element set V and a function f: 2^V → ℝ, f is submodular if f(A) + f(B) ≥ f(A∪B) + f(A∩B) for any A, B ⊆ V; f is monotone nondecreasing if f(A) ≤ f(B) for any A ⊆ B ⊆ V.

For any set S ⊆ V, denote f_S(A) = f(S∪A) − f(S) to be the marginal profit of A over S. Assume f(∅) = 0. In this paper, f is always assumed to be an integer-valued, monotone nondecreasing, submodular function. It can be verified that for any S ⊆ V, the marginal profit function f_S is also a monotone nondecreasing, submodular function.
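These two closure properties can be verified by brute force on a small instance; the coverage function below is an assumed toy example, not from the paper.

```python
# Brute-force check that a coverage function is monotone nondecreasing
# and submodular, and that the marginal profit f_S(A) = f(S ∪ A) - f(S)
# inherits both properties.
from itertools import chain, combinations

sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
V = set(sets)

def f(A):
    return len(set().union(*(sets[v] for v in A))) if A else 0

def marginal(S):
    # the marginal profit function f_S
    return lambda A: f(S | A) - f(S)

def powerset(U):
    return [set(c) for c in chain.from_iterable(
        combinations(sorted(U), r) for r in range(len(U) + 1))]

def is_monotone(h, U):
    return all(h(A) <= h(A | {v}) for A in powerset(U) for v in U)

def is_submodular(h, U):
    # h(A) + h(B) >= h(A ∪ B) + h(A ∩ B) for all A, B ⊆ U
    return all(h(A) + h(B) >= h(A | B) + h(A & B)
               for A in powerset(U) for B in powerset(U))

print(is_monotone(f, V), is_submodular(f, V))                          # True True
print(is_monotone(marginal({1}), V), is_submodular(marginal({1}), V))  # True True
```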

###### Definition 2.2 (Minimum Submodular Cover Problem (MinSMC)).

Given a monotone nondecreasing submodular function f: 2^V → ℤ^+, a cost function c: V → ℝ^+, and an integer k ≤ f(V), the goal of MinSMC is to find A ⊆ V satisfying

 min{c(A): A ⊆ V, f(A) ≥ k}, (1)

where c(A) = ∑_{v∈A} c(v).

Define a function g as g(A) = min{f(A), k} for any subset A ⊆ V. When f is a monotone nondecreasing submodular function, it can be verified that g is also a monotone nondecreasing submodular function. Note that g(V) = k, and for the modified MinSMC problem

 min{c(A):g(A)=g(V)}, (2)

a set A is feasible to (2) if and only if A is feasible to (1). Hence problems (1) and (2) are equivalent in terms of approximability, that is, A is an α-approximate solution to problem (2) if and only if A is an α-approximate solution to problem (1). In the following, we concentrate on the modified MinSMC problem (2).
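This equivalence can be checked exhaustively on a small instance; the coverage instance and the value of k below are assumed for illustration.

```python
# Exhaustive check of the truncation g(A) = min{f(A), k}: g(V) = k, and
# A satisfies g(A) = g(V) iff f(A) >= k, so problems (1) and (2) have
# exactly the same feasible sets.
from itertools import chain, combinations

sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
V, k = set(sets), 3

def f(A):
    return len(set().union(*(sets[v] for v in A))) if A else 0

def g(A):
    return min(f(A), k)

subsets = [set(c) for c in chain.from_iterable(
    combinations(sorted(V), r) for r in range(len(V) + 1))]

print(g(V) == k)                                            # True
print(all((g(A) == g(V)) == (f(A) >= k) for A in subsets))  # True
```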

The concept of an ε-maximal nearly independent set (ε-MaxNIS) plays a crucial role in the analysis of the parallel algorithm proposed in [4] for the minimum set cover problem. This paper uses a slightly different concept which only needs the nearly independent property.

###### Definition 2.3 (ε-nearly independent set (ε-Nis)).

For a real number ε ∈ (0, 1) and a set S ⊆ V, we say that a set J is an ε-NIS with respect to g and S if J satisfies the following nearly independent property:

 g_S(J) ≥ (1−ε)² ∑_{v∈J} g_S(v). (3)
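Property (3) says the elements of J are almost non-overlapping: their joint marginal profit nearly equals the sum of their individual marginal profits. A direct check on a toy coverage instance (the sets and the choices of J are illustrative assumptions):

```python
# Direct check of the nearly independent property (3).

sets = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"a", "c"}}

def g_S(A):  # marginal profit of A over S = ∅ for the coverage function
    return len(set().union(*(sets[v] for v in A))) if A else 0

def is_eps_nis(J, eps):
    # J is an ε-NIS w.r.t. g and S if g_S(J) >= (1-ε)^2 * Σ_{v∈J} g_S({v})
    return g_S(set(J)) >= (1 - eps) ** 2 * sum(g_S({v}) for v in J)

print(is_eps_nis({1, 2}, 0.1))     # disjoint sets, no overlap → True
print(is_eps_nis({1, 2, 3}, 0.1))  # heavy overlap: 4 < 0.81 * 6 → False
```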

### 2.2 Algorithm

The main algorithm is described in Algorithm 1. In line 1 to line 7, the instance is preprocessed; the purpose of the preprocessing is to ensure that the cost ratio c_max/c_min of the modified instance is bounded, so that the number of rounds can be bounded in terms of the input size, where c_max and c_min are the maximum and the minimum cost of elements, respectively. Sub-procedure MinSMC-Par (described in Algorithm 2) is called in line 8 of Algorithm 1.

Algorithm 2 (MinSMC-Par) deals with the modified instance. It divides the elements into buckets, first by marginal profit-to-cost ratio, then by marginal profit (see line 10 of Algorithm 2). Priority is given to the buckets with higher profit-to-cost ratio; among buckets with the same profit-to-cost ratio, priority is given to those with higher marginal profit. Algorithm 2 processes the buckets in decreasing priority. Note that after some sets are chosen, an element in a bucket of higher priority may drop into a bucket of lower priority. For each bucket, Algorithm 2 tries to find an ε-NIS using procedure NIS (described in Algorithm 3).
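The two-level bucketing idea can be sketched as follows. The geometric scales of (1−ε), the scale tops beta_max and tau_max, and the boundaries are assumptions made for illustration; the exact bucket boundaries are defined in Algorithm 2's pseudocode, which is not reproduced here.

```python
# A sketch of two-level bucketing: elements are binned by marginal
# profit-to-cost ratio, then by marginal profit, on geometric scales.
import math

def bucket_index(value, top, eps):
    # index i with top*(1-eps)^(i+1) < value <= top*(1-eps)^i
    return math.floor(math.log(value / top, 1 - eps))

eps, beta_max, tau_max = 0.2, 8.0, 8.0

def bucket(ratio, profit):
    # smaller index = higher priority: first higher ratio, then higher profit
    return (bucket_index(ratio, beta_max, eps),
            bucket_index(profit, tau_max, eps))

print(bucket(8.0, 8.0))  # top-priority bucket
print(bucket(5.0, 8.0))  # lower ratio → later primary bucket
print(bucket(8.0, 3.0))  # same ratio, lower profit → later subordinate bucket
```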

In the pth while-loop of Algorithm 3, an ε-NIS with respect to the current set B_p is found. After r while-loops, an ε-NIS with respect to the input set is obtained. In each while-loop of Algorithm 3, a for-loop is used to guess the size of the ε-NIS with respect to B_p. In the for-loop, a mean operation described in Algorithm 4 is called. As will be shown, if the size is correctly guessed, then the random set T_p sampled in line 22 satisfies the property required by a nearly independent set. A set consisting of t elements is abbreviated as a t-set. When we say "select a t-set from A uniformly and randomly", it means that elements are selected sequentially from A until we have t elements at hand. So, any specific ordered t-set appears with probability 1/(a(a−1)⋯(a−t+1)), where a = |A|. Note that viewing the sampled set as an ordered set facilitates its selection as well as the probabilistic computations.
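The sequential sampling scheme and the resulting uniform distribution over ordered t-sets can be checked empirically; the 4-element ground set below is an assumed toy example.

```python
# "Select a t-set from A uniformly and randomly": draw elements
# sequentially without replacement, so every specific ordered t-set
# appears with probability 1/(a(a-1)...(a-t+1)).
import random
from collections import Counter
from math import prod

def sample_ordered_tset(A, t, rng):
    pool = list(A)
    return tuple(pool.pop(rng.randrange(len(pool))) for _ in range(t))

rng = random.Random(0)
A, t, trials = ["u", "v", "w", "x"], 2, 60_000
counts = Counter(sample_ordered_tset(A, t, rng) for _ in range(trials))
a = len(A)
expected = trials / prod(range(a - t + 1, a + 1))  # 60000 / (4*3) = 5000
print(len(counts), round(expected))  # → 12 5000
```

All 12 ordered 2-sets appear, each close to the 5000 times predicted by the formula.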

Algorithm 4 uses the mean value of the function I_{t,B,A,τ,ε} to measure the expected quality of a sampled set, where I_{t,B,A,τ,ε} is a random indicator function defined as follows. Given two sets B, A ⊆ V, a parameter t, and a real number τ, for a random t-set X which is selected from A uniformly and randomly, and an element x which is drawn uniformly at random from A∖X,

 I_{t,B,A,τ,ε}(X,x) = I[g_{B∪X}(x) ≥ (1−ε)τ],

that is, I_{t,B,A,τ,ε}(X,x) = 1 if g_{B∪X}(x) ≥ (1−ε)τ, and 0 otherwise. As a convention,

 if A∖X = ∅, define I_{t,B,A,τ,ε}(X,x) = 0. (4)
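Written out in code, the indicator tests whether a random remaining element still clears the threshold (1−ε)τ after B∪X is taken; the coverage function g, the thresholds, and the arguments below are illustrative assumptions.

```python
# The random indicator used by Algorithm 4, for a toy coverage function g.

sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d", "e"}}

def g(A):
    return len(set().union(*(sets[v] for v in A))) if A else 0

def indicator(B, A, X, x, tau, eps):
    if not (A - X):                        # convention (4): A\X = ∅ → 0
        return 0
    marginal = g(B | X | {x}) - g(B | X)   # g_{B∪X}(x)
    return 1 if marginal >= (1 - eps) * tau else 0

B, A, X = set(), {1, 2, 3}, {1}
print(indicator(B, A, X, x=2, tau=2, eps=0.1))  # g_{B∪X}(2) = 1 < 1.8 → 0
print(indicator(B, A, X, x=3, tau=2, eps=0.1))  # g_{B∪X}(3) = 2 ≥ 1.8 → 1
```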

The next lemma shows that the expectation of I_{t,B,A,τ,ε}(X,x) is monotone non-increasing with respect to the sample size t. Since what matters in this lemma is the sample size t, we use I_t as an abbreviation of I_{t,B,A,τ,ε}.

###### Lemma 2.4.

Given B, A, τ and ε, suppose t and t′ are two integers with t ≤ t′. Then

 E[I_t(X,x)] ≥ E[I_{t′}(X′,x′)].
###### Proof.

Assume a = |A|. It can be calculated that

 E[I_{t′}(X′,x′)] = ∑_{X′={x_1,…,x_{t′}}, x′} I[g_{B∪{x_1,…,x_{t′}}}(x′) ≥ (1−ε)τ] · P[x_1,…,x_{t′}, x′ is picked]
 = (1/(a(a−1)⋯(a−t′))) ∑_{X′={x_1,…,x_{t′}}, x′} I[g_{B∪{x_1,…,x_{t′}}}(x′) ≥ (1−ε)τ]
 ≤ (1/(a(a−1)⋯(a−t′))) ∑_{X′={x_1,…,x_{t′}}, x′} I[g_{B∪{x_1,…,x_t}}(x′) ≥ (1−ε)τ]
 = (([a−(t+1)][a−(t+2)]⋯(a−t′))/(a(a−1)⋯(a−t′))) ∑_{X={x_1,…,x_t}, x′} I[g_{B∪{x_1,…,x_t}}(x′) ≥ (1−ε)τ]
 = (1/(a(a−1)⋯(a−t))) ∑_{X={x_1,…,x_t}, x′} I[g_{B∪{x_1,…,x_t}}(x′) ≥ (1−ε)τ]
 ≤ E[I_t(X,x)],

where the first inequality comes from the submodularity of the function g, and the last inequality holds because x′ is sampled from A∖X′ ⊆ A∖X and the indicator function is nonnegative. ∎

The following lemma shows that the mean value computed in line 16 of Algorithm 3 can be used to bound E[I_t(X,x)].

###### Lemma 2.5.

Let Ȳ be the mean value computed in line 16 of Algorithm 3. With probability at least 1−δ, E[I_t(X,x)] ≥ 1−2ε̄ if Ȳ ≥ 1−(3/2)ε̄, and E[I_t(X,x)] ≤ 1−ε̄ if Ȳ < 1−(3/2)ε̄.

###### Proof.

Let Y_m = Y_1 + ⋯ + Y_m, where Y_1, …, Y_m are independent samples of I_t(X,x), and let μ = E[I_t(X,x)]. By the Chernoff bound (see [9]), for any a > 0,

 P[|Y_m − mμ| ≥ a] ≤ 2e^{−a²/(2mμ)}. (5)

For a = mε̄/2, using μ ≤ 1 and m ≥ 8 log(2/δ)/ε̄², we have

 a²/(2mμ) ≥ log(2/δ). (6)

Combining inequalities (5) and (6), we have P[|Y_m − mμ| ≥ mε̄/2] ≤ δ. That is, with probability at least 1−δ, |Ȳ − μ| < ε̄/2, where Ȳ = Y_m/m. If Ȳ ≥ 1−(3/2)ε̄, then μ > Ȳ − ε̄/2 ≥ 1−2ε̄, that is, with probability at least 1−δ, we have E[I_t(X,x)] ≥ 1−2ε̄. The second half of the lemma can be proved similarly. ∎
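The sample-size bound in the proof can be illustrated numerically; all parameters in the sketch below (the true mean, accuracy ε̄, and failure probability δ) are assumed for illustration.

```python
# Numerical sketch of the sampling step: with m >= 8*log(2/delta)/eps_bar^2
# Bernoulli samples of the 0/1 indicator, the empirical mean lies within
# eps_bar/2 of its expectation with probability at least 1 - delta.
import math
import random

mu, eps_bar, delta = 0.7, 0.2, 0.01   # true mean E[I_t], accuracy, failure prob.
m = math.ceil(8 * math.log(2 / delta) / eps_bar ** 2)

rng = random.Random(1)
def empirical_mean():
    return sum(rng.random() < mu for _ in range(m)) / m

failures = sum(abs(empirical_mean() - mu) >= eps_bar / 2 for _ in range(500))
print(m, failures)  # failure count stays far below delta * 500 = 5
```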

### 2.3 Performance analysis

The next lemma shows that the expected size of A_p decreases exponentially, which implies that in Algorithm 3, a bucket will become empty within a logarithmic number of rounds.

###### Lemma 2.6.

If the for loop of Algorithm 3 is broken because of line 17, then E[|A_{p+1}|] ≤ (1−ε̄)·E[|A_p|] with probability at least 1−δ.

###### Proof.

The inequality is obvious if A_p = ∅. In the following, assume A_p ≠ ∅.

By the assumption of this lemma, the mean value computed in line 16 satisfies Ȳ < 1−(3/2)ε̄. Then by Lemma 2.5, with probability at least 1−δ,

 E[I_{t_p,B_p,A_p,τ,ε}(T,x)] ≤ 1−ε̄. (7)

Note that after T_p is picked, an element x is included into A_{p+1} only when g_{B_p∪T_p}(x) ≥ (1−ε)τ and g_{B_p∪T_p}(x)/c(x) ≥ (1−ε)β; also note that the corresponding term is zero if x ∈ T_p, so

 |A_{p+1}| = ∑_{x∈A_p∖T_p} I[g_{B_p∪T_p}(x) ≥ (1−ε)τ, g_{B_p∪T_p}(x)/c(x) ≥ (1−ε)β]
 ≤ ∑_{x∈A_p∖T_p} I[g_{B_p∪T_p}(x) ≥ (1−ε)τ].

It follows that

 E[|A_{p+1}|/|A_p∖T_p|] = ∑_{T_p} P[T_p is picked] · E[|A_{p+1}|/|A_p∖T_p| ∣ T_p]
 ≤ ∑_{T_p} P[T_p is picked] ∑_{x∈A_p∖T_p} P[x is picked ∣ T_p] · I[g_{B_p∪T_p}(x) ≥ (1−ε)τ]
 = ∑_{T_p, x∈A_p∖T_p} P[T_p, x are picked] · I[g_{B_p∪T_p}(x) ≥ (1−ε)τ]
 = E[I_{t_p,B_p,A_p,τ,ε}(T,x)],

where the inequality uses the above bound on |A_{p+1}| together with the observation P[x is picked ∣ T_p] = 1/|A_p∖T_p|. Combining this with inequality (7), we have E[|A_{p+1}|/|A_p∖T_p|] ≤ 1−ε̄. Thus E[|A_{p+1}|] ≤ (1−ε̄)·E[|A_p∖T_p|] ≤ (1−ε̄)·E[|A_p|]. The lemma is proved. ∎

For clarity of statement, we call the bucket processed in line 15 of Algorithm 2 a subordinate bucket and the bucket in line 10 a primary bucket. The following lemma says that when line 14 of Algorithm 2 outputs the set J, the subordinate bucket becomes empty with a certain probability.

###### Lemma 2.7.

When Algorithm 3 reaches line 29, the set A in line 10 is empty with probability at least 1−ε/(T²ℓ).

###### Proof.

For a p, if the for loop is executed |A_p| rounds, then T_p = A_p and A_{p+1} becomes empty. Next, consider the case when the for loop breaks because of line 17. Denote by C_i the event that E[|A_{i+1}|] ≤ (1−ε̄)·E[|A_i|]. By Lemma 2.6 (applied with δ = ε/(2rnT²ℓ)), P(C̄_i) ≤ ε/(2rnT²ℓ). By the union bound,

 P[C_1∩C_2∩…∩C_r] = 1 − P[C̄_1∪C̄_2∪…∪C̄_r] ≥ 1 − ∑_{i=1}^r P(C̄_i) ≥ 1 − ε/(2nT²ℓ).

So, with probability at least 1−ε/(2nT²ℓ), we have

 E[|A_r|] ≤ (1−ε̄)^r · E[|A_1|] ≤ ε/(2T²ℓ).

Denote by D the event that A_r = ∅. Using Markov's inequality, P[|A_r| ≥ 1] ≤ E[|A_r|] ≤ ε/(2T²ℓ). So, P(D) ≥ 1 − ε/(2nT²ℓ) − ε/(2T²ℓ) ≥ 1 − ε/(T²ℓ). The lemma is proved. ∎

The following corollary shows that when the inner while loop of Algorithm 2 halts, the primary bucket is empty with a certain probability.

###### Corollary 2.8.

After J is computed in line 14 of Algorithm 2, the primary bucket is empty with probability at least 1−ε/T².

###### Proof.

For i = 1, …, ℓ, let C_{it} be the event that the ith subordinate bucket is empty. Lemma 2.7 says that P(C_{it}) ≥ 1−ε/(T²ℓ) after J is computed in line 14 of Algorithm 2. Let D be the event that the primary bucket is empty. Then D = C_{1t}∩C_{2t}∩…∩C_{ℓt}. So, after J is computed,

 P(D) = 1 − P[C̄_{1t}∪C̄_{2t}∪…∪C̄_{ℓt}] ≥ 1 − ∑_{i=1}^ℓ P(C̄_{it}) ≥ 1 − ε/T².

The corollary is proved. ∎

The next lemma shows that with a certain probability, the set J computed in line 14 of Algorithm 2 satisfies the nearly independent property defined in (3).

###### Lemma 2.9.

The set J computed in line 14 of Algorithm 2 satisfies the nearly independent property (3) with probability at least 1−ε/T².

###### Proof.

Consider the call of Algorithm 3 in which the set J is computed. We first prove that with a certain probability, the set T_p sampled in line 22 of Algorithm 3 satisfies

 E[g_{B_p}(T_p)] ≥ (1−ε)² ∑_{v∈T_p} g_{B_1}(v), (8)

where B_p is the current set in line 23. For clarity of statement, denote the size of T_p by t*. Inequality (8) is obviously true if t* = 0. Next, suppose t* ≥ 1. Note that line 22 of Algorithm 3 is executed after the for loop is jumped out of. Further note that the jump-out must be because of line 17. In fact, if the number of iterations of the for loop had reached its upper limit, then A_p∖X = ∅ would hold, and every indicator in Algorithm 4 would equal 0, resulting in a zero mean value (see (4)), at which time the condition of line 17 is satisfied. In the previous round of the for loop, that is, when t tried the value t̄ = t*/(1+ε̄), we must have had a mean value of at least 1−(3/2)ε̄, and thus

 E[I_{t̄,B_p,A_p,τ,ε}(X,x)] ≥ 1−2ε̄ (9)

by Lemma 2.5. Assume that T_p = {v_1, …, v_{t*}}, and for any i, denote T_p^i = {v_1, …, v_i}. By the monotonicity of g, we have

 E[g_{B_p}(T_p)] ≥ E[g_{B_p}(T_p^{t̄})] = ∑_{i=1}^{t̄} E[g_{B_p∪T_p^{i−1}}(v_i)]. (10)

By the definition of I_{i,B_p,A_p,τ,ε} and Markov's inequality,

 E[I_{i,B_p,A_p,τ,ε}(T_p^i, v_{i+1})] = P[g_{B_p∪T_p^i}(v_{i+1}) ≥ (1−ε)τ] ≤ E[g_{B_p∪T_p^i}(v_{i+1})]/((1−ε)τ). (11)

Combining inequalities (10) and (11), we have

 E[g_{B_p}(T_p)] ≥ (1−ε)τ · ∑_{i=1}^{t̄} E[I_{i,B_p,A_p,τ,ε}(T_p^i, v_{i+1})]. (12)

For any i ≤ t̄, by Lemma 2.4 and inequality (9), with probability at least 1−δ,

 E[I_{i,B_p,A_p,τ,ε}(T_p^i, v_{i+1})] ≥ 1−2ε̄. (13)

Combining inequalities (12), (13) and the union bound, with probability at least 1−t̄δ,

 E[g_{B_p}(T_p)] ≥ (1−2ε̄)·t̄·(1−ε)τ = (t*/(1+ε̄))·(1−2ε̄)·(1−ε)τ ≥ (1−ε)²·t*·τ, (14)

where the last inequality comes from the choice of ε̄. By line 10 and line 14 of Algorithm 2, when the computation enters Algorithm 3, we have g_{B_1}(v) ≤ τ for any v in the input set, with respect to the input parameter τ. It follows that ∑_{v∈T_p} g_{B_1}(v) ≤ t*·τ. Combining this with (14), inequality (8) is proved.

Then, by the union bound, and similar to the proof of Corollary 2.8, with probability at least 1−ε/T²,

 ∑_{p=1}^r E[g_{B_p}(T_p)] ≥ (1−ε)² ∑_{p=1}^r ∑_{v∈T_p} g_{B_1}(v). (15)

Combining this with J_{t′t} = T_1∪⋯∪T_r, with probability at least 1−ε/T²,

 E[g_{B_{t′t}}(J_{t′t})] = ∑_{p=1}^r E[g_{B_p}(T_p)] ≥ (1−ε)² ∑_{p=1}^r ∑_{v∈T_p} g_{B_1}(v) ≥ (1−ε)² ∑_{v∈J_{t′t}} g_{B_{t′t}}(v),

where the last inequality comes from B_1 ⊆ B_{t′t} and the submodularity of g. ∎

For simplicity of statement, we assume that every inner while loop is executed ℓ times. Denote J_t = J_{1t}∪⋯∪J_{ℓt} for any t. The following corollary shows that the expected cost effectiveness of J_t decreases geometrically.

###### Corollary 2.10.

For any t, the expected cost effectiveness of J_t decreases geometrically with probability at least 1−ℓε/T².

###### Proof.

By Lemma 2.9 and the union bound, with probability at least 1−ℓε/T²,

 ℓ∑t′=1E[gBt′t(Jt′