On Approximating Partial Set Cover and Generalizations

Partial Set Cover (PSC) is a generalization of the well-studied Set Cover problem (SC). In PSC the input consists of an integer k and a set system (U,S) where U is a finite set, and S ⊆ 2^U is a collection of subsets of U. The goal is to find a subcollection S' ⊆ S of smallest cardinality such that sets in S' cover at least k elements of U; that is, |∪_{A ∈ S'} A| ≥ k. SC is the special case of PSC in which k = |U|. In the weighted version each set X ∈ S has a non-negative weight w(X) and the goal is to find a minimum weight subcollection to cover k elements. Approximation algorithms for SC have been adapted to obtain comparable algorithms for PSC in various interesting cases. In recent work Inamdar and Varadarajan, motivated by geometric set systems, obtained a simple and elegant approach to reduce PSC to SC via the natural LP relaxation. They showed that if a deletion-closed family of SC admits a β-approximation via the natural LP relaxation, then one can obtain a 2(β + 1)-approximation for PSC on the same family. In a subsequent paper, they also considered a generalization of PSC that has multiple partial covering constraints, which is partly inspired by and generalizes previous work of Bera et al. on the Vertex Cover problem. Our main goal in this paper is to demonstrate some useful connections between the results in previous work and submodularity. This allows us to simplify, and in some cases improve, their results. We improve the approximation for PSC to (e/(e-1))(β + 1). We extend the previous work to the sparse setting.




1 Introduction

Set Cover is a well-studied problem in combinatorial optimization. The input is a set system $(\mathcal{U}, \mathcal{S})$ consisting of a finite set $\mathcal{U}$ and a collection $\mathcal{S} = \{S_1, \ldots, S_m\}$ of subsets of $\mathcal{U}$. The goal is to find a minimum cardinality subcollection $\mathcal{S}' \subseteq \mathcal{S}$ such that $\mathcal{U}$ is covered by sets in $\mathcal{S}'$. In the weighted version each $S_i$ has a weight $w_i \ge 0$ and the goal is to find a minimum weight subcollection of sets whose union is $\mathcal{U}$. Set Cover is NP-Hard and approximation algorithms have been extensively studied. A very simple greedy algorithm yields an $H_d \le (1 + \ln d)$ approximation where $d = \max_i |S_i|$ and this holds even in the weighted case. Moreover this bound is essentially tight unless $P = NP$ [Fei98]. Various special cases of Set Cover have been studied in the literature. A well-known example is the Vertex Cover problem in graphs (VC) which can be viewed as a special case of Set Cover where the frequency of each element is at most $2$ (the frequency of an element is the number of sets it is contained in). When the maximum frequency is $f$, an $f$-approximation can be obtained. An interesting class of Set Cover instances comes from various geometric range spaces in low dimensions. A canonical example here is the problem of covering points in the plane by a given collection of disks. This problem admits a constant factor approximation in the weighted case [CGKS12] via a natural LP, and a PTAS in the unweighted case [MR10] via local search; there is also a QPTAS for the weighted case [MRR15]. Closely related to Set Cover are maximization variants, namely, Max $k$-Cover and Max-Budgeted-Cover. In Max $k$-Cover we are given a set system $(\mathcal{U}, \mathcal{S})$ and an integer $k$; the goal is to pick $k$ sets from $\mathcal{S}$ to maximize the size of their union. In Max-Budgeted-Cover the sets have weights and the goal is to pick a collection of sets with total weight at most a given budget $B$ so as to maximize the size of their union. $(1-1/e)$-approximations are known for both these problems [NWF78, KMN99, Svi04] and these are tight unless $P = NP$ [Fei98].

Partial Set Cover (Partial-SC):

In Partial-SC the input, in addition to the set system as in Set Cover, also has an integer parameter $k$, and now the goal is to find a minimum (weight) subcollection of the given sets whose union is of size at least $k$. Note that Set Cover is a special case when $k = |\mathcal{U}|$. It is natural to ask if Partial-SC can be approximated (almost) as well as Set Cover. In several settings this is indeed the case. For instance the greedy algorithm gives the same guarantee for Partial-SC as it does for Set Cover; one can see this transparently by viewing Set Cover and Partial-SC as special cases of the Submodular Set Cover problem for which greedy has been analyzed by Wolsey [Wol82]. However, for special cases of Set Cover such as VC one needs more careful analysis to obtain comparable bounds for Partial-SC; we refer the reader to [KPS11] and references therein. Of particular interest to us is the recent result of Inamdar and Varadarajan [IV18a] which gave a simple and intuitive reduction from Partial-SC to Set Cover via the natural LP relaxation. Their black box reduction to Set Cover is particularly useful in geometric settings. Inamdar and Varadarajan show that if there is a $\beta$-approximation for a deletion-closed class of Set Cover instances¹ via the standard LP, then there is a $2(\beta+1)$-approximation for Partial-SC on the same family, via a standard LP relaxation.

¹ We say that a family of set systems is deletion-closed if removing an element or removing a set from a set system in the family yields another set system in the same family.

In a subsequent paper Inamdar and Varadarajan [IV18b] considered a generalization of Partial-SC when there are multiple partial covering constraints. They call their problem the Partition Set Cover problem (Partition-SC) and were motivated by previous work of Bera et al. [BGKR14] who considered the same problem in the special setting of VC. In this problem the input is a set system $(\mathcal{U}, \mathcal{S})$, and $r$ subsets $\mathcal{U}_1, \ldots, \mathcal{U}_r$ of $\mathcal{U}$, and $r$ integers $k_1, \ldots, k_r$. The goal is to find a minimum cardinality subcollection $\mathcal{S}' \subseteq \mathcal{S}$ (or a minimum weight subcollection in the weighted case) such that, for $1 \le i \le r$, the number of elements covered by $\mathcal{S}'$ from $\mathcal{U}_i$ is at least $k_i$. For deletion-closed set families that admit a $\beta$-approximation for Set Cover, [IV18b] obtained an $O(\beta + \log r)$ approximation for Partition-SC and this generalizes the results of [BGKR14, IV18a]. In [HK18] the authors describe a primal-dual algorithm that yields an $(f H_r + H_r)$-approximation for Partition-SC where $f$ is the maximum frequency; note that [IV18b] implies a ratio of $O(f + \log r)$ for the same problem where the asymptotic notation hides a constant factor.

Submodular Set Cover and Related Problems:

As we remarked, Set Cover and Partial-SC are special cases of Submodular Set Cover. Given a finite ground set $N$, a real-valued set function $f: 2^N \to \mathbb{R}$ is submodular iff $f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$ for all $A, B \subseteq N$. A set function is monotone if $f(A) \le f(B)$ for all $A \subseteq B$. We will be mainly interested here in monotone submodular functions that are normalized, that is $f(\emptyset) = 0$, and hence are also non-negative. A polymatroid is an integer valued normalized monotone submodular function. In Submodular Set Cover we are given $N$, a non-negative weight function $w: N \to \mathbb{R}_+$, and a polymatroid $f: 2^N \to \mathbb{Z}_+$ via a value oracle. The goal is to solve $\min_{S \subseteq N} w(S)$ such that $f(S) = f(N)$. Set Cover and Partial-SC can be seen as special cases of Submodular Set Cover as follows. Given a set system $(\mathcal{U}, \mathcal{S})$ let $N = \{1, \ldots, m\}$ where $m = |\mathcal{S}|$. Define the coverage function $f$ as: $f(A) = |\cup_{i \in A} S_i|$. It is well-known and easy to show that $f$ is a polymatroid. Thus Set Cover can be reduced to Submodular Set Cover via the coverage function. To reduce Partial-SC to Submodular Set Cover we let $f(A) = \min\{k, |\cup_{i \in A} S_i|\}$ which we refer to as the truncated coverage function. Wolsey showed that a simple greedy algorithm yields a $(1 + \ln d)$ approximation for Submodular Set Cover when $f$ is a polymatroid where $d = \max_{i \in N} f(\{i\})$.
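The coverage and truncated coverage functions can be made concrete with a short sketch. The helper names (`coverage`, `truncated_coverage`, `is_submodular`) and the toy instance `SETS` are illustrative assumptions, not taken from the paper; the brute-force check enumerates all pairs of subsets and is only meant for tiny ground sets.

```python
from itertools import combinations

def coverage(sets, chosen):
    """Coverage function: number of elements in the union of the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def truncated_coverage(sets, chosen, k):
    """Truncated coverage function min{k, coverage}, as used for Partial-SC."""
    return min(k, coverage(sets, chosen))

def is_submodular(f, n):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B) over subsets of range(n)."""
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]
    return all(f(a) + f(b) >= f(a | b) + f(a & b) for a in subsets for b in subsets)

# Hypothetical toy instance: four sets over the universe {1, ..., 6}.
SETS = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
```

On this instance both `is_submodular(lambda A: coverage(SETS, A), 4)` and the same check with `truncated_coverage(SETS, A, 4)` should hold, in line with the claim that truncation preserves the polymatroid property.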

Har-Peled and Jones [HJ18], motivated by an application from computational geometry, implicitly considered the following generalization of Submodular Set Cover. We have a ground set $N$ and a weight function $w: N \to \mathbb{R}_+$ as before. Instead of one polymatroid we are given $r$ polymatroids $f_1, \ldots, f_r$ over $N$ and integers $k_1, \ldots, k_r$. The goal is to find $S \subseteq N$ of minimum weight such that $f_i(S) \ge k_i$ for $1 \le i \le r$. We refer to this as MP-Submod-SC. As noted in [HJ18], it is not hard to reduce MP-Submod-SC to Submodular Set Cover. We simply define a new function $g: 2^N \to \mathbb{Z}_+$ where $g(A) = \sum_{i=1}^{r} \min\{f_i(A), k_i\}$. Via Wolsey’s result for Submodular Set Cover this implies an $O(\log R)$ approximation via the greedy algorithm where $R = \sum_{i=1}^{r} k_i$. Although MP-Submod-SC can be reduced to Submodular Set Cover it is useful to treat it separately when the functions $f_i$ are not general submodular functions, as is the case in Partition-SC.
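The reduction to a single function can be sketched as follows, assuming the combined function sums the truncated constraint functions $\min\{f_i(A), k_i\}$. The instance is a hypothetical Partition-SC example in which $f_i$ counts the covered elements of class $i$; all names are illustrative.

```python
def make_class_coverage(sets, classes):
    """One function per class: f_i(A) = number of elements of classes[i] covered by sets indexed by A."""
    def make(cls):
        def f(A):
            covered = set()
            for j in A:
                covered |= sets[j]
            return len(covered & cls)
        return f
    return [make(frozenset(c)) for c in classes]

def combined(fs, ks):
    """Single function g(A) = sum_i min(f_i(A), k_i)."""
    def g(A):
        return sum(min(f(A), k) for f, k in zip(fs, ks))
    return g

# Hypothetical instance: three sets, two element classes, demands k_1 = 2 and k_2 = 1.
SETS = [{1, 2}, {3, 4}, {2, 5}]
CLASSES = [{1, 2, 3}, {4, 5}]
KS = [2, 1]
FS = make_class_coverage(SETS, CLASSES)
G = combined(FS, KS)
```

A set `A` meets every constraint exactly when `G(A) == sum(KS)`, which is the single covering condition handed to the Submodular Set Cover greedy algorithm.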

We mention that prior work has considered multiple submodular objectives from a maximization perspective [CVZ10, CJV15] rather than from a minimum cost perspective. There are useful connections between these two perspectives. Consider Submodular Set Cover. We could recast the exact version of this problem as $\max f(S)$ subject to the constraint $w(S) \le B$ where $B$ is a budget. This is submodular function maximization subject to a knapsack constraint and admits a $(1-1/e)$-approximation [Svi04].

Covering Integer Programs (CIPs):

A CIP is an integer program of the form where is a non-negative matrix and . CIPs generalize Set Cover and can be seen as a special case of Submodular Set Cover. However, a direct reduction of CIP to Submodular Set Cover requires one to scale the numbers and consequently the greedy algorithm does not yield a good approximation ratio as a function of and . This is rectified via LP relaxations that employ knapsack-cover (KC) inequalities first used in this context by Carr et al. [CFLP00]. Via KC inequalities one obtains refined results for CIPs that are similar to those for Set Cover modulo lower order terms. In particular an approximation can be achieved where is the maximum number of non-zeroes in any column of the constraint matrix. We refer the reader to [KY05, CHS16, CQ19] for further results on CIPs.

1.1 Motivation and contributions

Our initial motivation was to simplify and explain certain technical aspects of the algorithm and analysis in [IV18a, IV18b]. We view Partial-SC and Partition-SC as special cases of MP-Submod-SC and use the lens of submodularity and bring in some known tools from this area. This view point sheds light on the properties of the coverage function that lead to stronger bounds than those possible for general submodular functions. A second perspective we bring is from the recent work on CIPs [CQ19] that shows the utility of a simple randomized-rounding plus alteration approach to obtain approximation ratios that depend on the sparsity. Using these two perspectives we obtain some improvements and generalizations of the results in [IV18a, IV18b].

  • For deletion-closed set systems that have a $\beta$-approximation to Set Cover via the natural LP we obtain a $\frac{e}{e-1}(\beta+1)$-approximation for Partial-SC. This slightly improves the bound of [IV18a] from $2(\beta+1)$ while also simplifying the algorithm and analysis.

  • For MP-Submod-SC we obtain a bicriteria approximation. We obtain a random solution such that for and the expected weight of is . We obtain the same bound even in a more general setting where the system of constraints is -sparse. We describe an application of the bicriteria approximation to splitting point sets that was considered in [HJ18].

  • We consider a simultaneous generalization of Partition-SC and CIPs and obtain a randomized approximation where is the sparsity of the system. This generalizes the result of [IV18b] to the sparse setting.

We hope that some of the ideas here are useful in extending work on Partial-SC and generalizations to other special cases of submodular functions.

2 Background

Set Cover and Partial-SC have natural LP relaxations and they are closely related to those for Max $k$-Cover and Max-Budgeted-Cover. The LP relaxation for Set Cover (SC-LP) is shown in Fig 0(a). It has a variable $x_i$ for each set $S_i$, which, in the integer programming formulation, indicates whether $S_i$ is picked in the solution. The goal is to minimize the weight of the chosen sets which is captured by the objective $\sum_i w_i x_i$ subject to the constraint that each element is covered. The LP relaxation for Partial-SC (PSC-LP) is shown in Fig 0(b). Now we need additional variables to indicate which of the elements are going to be covered; for each element $e_j$ we thus have a variable $z_j$ for this purpose. In PSC-LP it is important to constrain $z_j$ to be at most $1$. The constraint $\sum_j z_j \ge k$ forces at least $k$ elements to be covered fractionally.

(a) LP relaxation for Set Cover (SC-LP).

(b) LP relaxation for Partial-SC (PSC-LP).
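The two relaxations can be written out from the prose description. This is a reconstruction rather than a copy of the original figures, and it assumes the notation $x_i$ for the variable of set $S_i$, $z_j$ for the variable of element $e_j$, and $w_i$ for the weight of $S_i$.

```latex
% SC-LP: every element must be fractionally covered.
\begin{align*}
\min \quad & \sum_{i} w_i x_i \\
\text{s.t.} \quad & \sum_{i \,:\, e_j \in S_i} x_i \ge 1 && \text{for each } e_j \in \mathcal{U} \\
& x_i \ge 0 && \text{for each } S_i \in \mathcal{S}
\end{align*}

% PSC-LP: z_j measures how much e_j is covered; at least k units of coverage in total.
\begin{align*}
\min \quad & \sum_{i} w_i x_i \\
\text{s.t.} \quad & \sum_{i \,:\, e_j \in S_i} x_i \ge z_j && \text{for each } e_j \in \mathcal{U} \\
& \sum_{j} z_j \ge k \\
& 0 \le z_j \le 1 && \text{for each } e_j \in \mathcal{U} \\
& x_i \ge 0 && \text{for each } S_i \in \mathcal{S}
\end{align*}
```

Without the upper bound $z_j \le 1$, a single heavily covered element could satisfy the constraint $\sum_j z_j \ge k$ on its own, which is why the text stresses that bound.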

As noted in prior work the integrality gap of PSC-LP can be made arbitrarily large but it is easy to fix by guessing the largest cost set in an optimum solution and doing some preprocessing. We discuss this issue in later sections.

Figs 0(c) and 0(d) show LP relaxations for Max $k$-Cover and Max-Budgeted-Cover respectively. In these problems we maximize the number of elements covered subject to an upper bound on the number of sets or on the total weight of the chosen sets.

(c) LP relaxation for Max $k$-Cover (MC-LP).

(d) LP relaxation for Max-Budgeted-Cover (MBC-LP).

Greedy algorithm:

The greedy algorithm is a well-known and standard algorithm for the problems studied here. The algorithm iteratively picks the set with the current maximum bang-per-buck ratio and adds it to the current solution until some stopping condition is met. The bang-per-buck of a set $S_i$ is defined as $|S_i \cap \mathcal{U}'| / w_i$ where $\mathcal{U}'$ is the set of uncovered elements at that point in the algorithm. For minimization problems such as Set Cover and Partial-SC the algorithm is stopped when the required number of elements are covered. For Max $k$-Cover and Max-Budgeted-Cover the algorithm is stopped if adding the current set would exceed the budget. Since this is a standard algorithm that is extremely well-studied we do not describe all the formal details and the known results. Typically the approximation guarantee of Greedy is analyzed with respect to an optimum integer solution. We need to compare it to the value of the fractional solution. For the setting of the cardinality constraint this was already done in [NWF78]. We need a slight generalization to the budgeted setting and we give a proof for the sake of completeness.
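A minimal sketch of Greedy for weighted Partial-SC follows, assuming the bang-per-buck of a set is the number of newly covered elements divided by its weight. The function name, the tie-breaking rule (lowest index wins) and the treatment of zero-weight sets are illustrative choices, not details from the paper.

```python
def greedy_partial_cover(sets, weights, k):
    """Pick sets by maximum (new elements covered) / weight until k elements are covered."""
    covered = set()
    chosen = []
    remaining = set(range(len(sets)))
    while len(covered) < k:
        best, best_ratio = None, -1.0
        for i in sorted(remaining):          # sorted for deterministic tie-breaking
            gain = len(sets[i] - covered)    # number of uncovered elements in S_i
            if gain == 0:
                continue
            ratio = gain / weights[i] if weights[i] > 0 else float("inf")
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            raise ValueError("fewer than k elements can be covered")
        chosen.append(best)
        covered |= sets[best]
        remaining.remove(best)
    return chosen, covered
```

For example, with sets `[{1, 2, 3, 4}, {1, 2}, {3, 4}, {5}]`, weights `[3.0, 1.0, 1.0, 1.0]` and `k = 4`, the two unit-weight sets `{1, 2}` and `{3, 4}` have ratio 2 and are picked ahead of the large set, whose ratio is 4/3.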

Lemma 2.1.

Let $Z$ be the optimum value of (MBC-LP) for a given instance of Max-Budgeted-Cover with budget $B$.

  • Suppose the Greedy algorithm is run until the total weight of the chosen sets is equal to or exceeds $B$. Then the number of elements covered by Greedy is at least $(1-1/e)Z$.

  • Suppose no set covers more than elements for some then the weight of sets chosen by Greedy to cover elements is at most .

These conclusions hold even for the weighted coverage problem.

Proof: We give a short sketch. Greedy’s analysis for Max-Budgeted-Cover is based on the following key observation. Consider the first set picked by Greedy. Then where OPT is the value of an optimum integer solution. And this follows from submodularity of the coverage function. This observation is applied iteratively with the residual solution as sets are picked and a standard analysis shows that when Greedy first meets or exceeds the budget then the total number of elements covered is at least . We claim that we can replace OPT in the analysis by . Given a fractional solution we see that . Moreover . Via simple algebra, we can obtain a contradiction if holds for all sets . Once we have this property the rest of the analysis is very similar to the standard one where OPT is replaced by .

Now consider the case when no set covers more than elements. If Greedy covers elements before the weight of sets chosen exceeds then there is nothing to prove. Otherwise let be the set added by Greedy when its weight exceeds for the first time. Let be the number of new elements covered by the inclusion of . Since Greedy had covered fewer than elements the value of the residual fractional solution is at least . From the same argument as in the preceding paragraph, since Greedy chose at that point, . This implies that . Since Greedy covers at least elements after choosing (this follows from the first claim of the lemma), the total weight of the sets chosen by Greedy is at most .

2.1 Submodular set functions and continuous extensions

Continuous extensions of submodular set functions have played an important role in algorithmic and structural aspects. The idea is to extend a discrete set function $f: 2^N \to \mathbb{R}$ to the continuous space $[0,1]^N$. Here we are mainly concerned with extensions motivated by maximization problems, and confine our attention to two extensions and refer the interested reader to [CCPV07, Von07] for a more detailed discussion.

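The multilinear extension of a real-valued set function $f: 2^N \to \mathbb{R}$, denoted by $F$, is defined as follows: for $x \in [0,1]^N$,

$$F(x) = \sum_{S \subseteq N} f(S) \prod_{i \in S} x_i \prod_{j \notin S} (1 - x_j).$$

Equivalently $F(x) = \mathbb{E}[f(R)]$ where $R$ is a random set obtained by picking each $i \in N$ independently with probability $x_i$.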
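The two equivalent views of the multilinear extension, the explicit sum over subsets and the expectation over a random set, can be compared numerically. The sketch below is illustrative only: the function names and the two-set coverage example are assumptions, and the exact version enumerates all $2^{|N|}$ subsets, so it is usable only for very small ground sets.

```python
import random
from itertools import combinations

def multilinear_exact(f, x):
    """F(x) = sum over S of f(S) * prod_{i in S} x_i * prod_{j not in S} (1 - x_j)."""
    n = len(x)
    total = 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            s = set(subset)
            prob = 1.0
            for i in range(n):
                prob *= x[i] if i in s else 1.0 - x[i]
            total += prob * f(s)
    return total

def multilinear_sampled(f, x, samples=20000, seed=0):
    """Monte Carlo estimate of E[f(R)], with i included in R independently with probability x_i."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        r = {i for i, xi in enumerate(x) if rng.random() < xi}
        acc += f(r)
    return acc / samples

# Hypothetical example: coverage function of two overlapping sets.
SETS = [{1, 2}, {2, 3}]

def cover(A):
    return len(set().union(*(SETS[i] for i in A)))
```

At `x = [0.5, 0.5]` the four subsets are equally likely with values 0, 2, 2 and 3, so the exact value is 1.75, and the sampled estimate should be close to it.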


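The concave closure of a real-valued set function $f: 2^N \to \mathbb{R}$, denoted by $f^+$, is defined as the optimum of an exponential sized linear program:

$$f^+(x) = \max\Big\{ \sum_{S \subseteq N} \alpha_S f(S) \;:\; \sum_{S \ni i} \alpha_S = x_i \ \text{for all } i \in N,\ \sum_{S \subseteq N} \alpha_S = 1,\ \alpha_S \ge 0 \Big\}.$$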

A special case of submodular functions are non-negative weighted sums of rank functions of matroids. More formally suppose $N$ is a finite ground set and $\mathcal{M}_1, \ldots, \mathcal{M}_\ell$ are matroids on the same ground set $N$. Let $g_1, \ldots, g_\ell$ be the rank functions of the matroids and these are monotone submodular. Suppose $f = \sum_{h=1}^{\ell} w_h g_h$ where $w_h \ge 0$ for all $h$, then $f$ is monotone submodular. We note that (weighted) coverage functions belong to this class. For such a submodular function $f$ we can consider an extension $\tilde{f}$ where $\tilde{f}(x) = \sum_{h=1}^{\ell} w_h g_h^+(x)$. We capture two useful facts which are shown in [CCPV07].

Lemma 2.2 ([CCPV07]).

Suppose $f$ is the weighted sum of rank functions of matroids. Then $F(x) \ge (1-1/e)\tilde{f}(x)$. Assuming oracle access to the rank functions $g_1, \ldots, g_\ell$, for any $x \in [0,1]^N$, there is a polynomial-time solvable LP whose optimum value is $\tilde{f}(x)$.

Remark 2.3.

Let be the coverage function associated with a set system . Then where and is the rank function of a simple uniform matroid. One can see PSC-LP in a more compact fashion:

Concentration under randomized rounding:

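Recall the multilinear extension $F$ of a submodular function $f$. If $x \in [0,1]^N$ then $F(x) = \mathbb{E}[f(R)]$ where $R$ is a random set obtained by independently including each $i \in N$ in $R$ with probability $x_i$. We can ask whether $f(R)$ is concentrated around $\mathbb{E}[f(R)] = F(x)$. And indeed this is the case when $f$ is Lipschitz. For a parameter $c \ge 0$, $f$ is $c$-Lipschitz if $|f(A \cup \{i\}) - f(A)| \le c$ for all $i \in N$ and $A \subseteq N$; for monotone submodular functions this is equivalent to the condition that $f(\{i\}) \le c$ for all $i \in N$.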

Lemma 2.4 ([Von10]).

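Let $f$ be a $c$-Lipschitz monotone submodular function. For $x \in [0,1]^N$ let $R$ be a random set drawn from the product distribution induced by $x$. Then for $\delta \ge 0$,

  • $\Pr[f(R) \ge (1+\delta) F(x)] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{F(x)/c}$.

  • $\Pr[f(R) \le (1-\delta) F(x)] \le e^{-\delta^2 F(x)/(2c)}$.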

Greedy algorithm under a knapsack constraint:

Consider the problem of maximizing a monotone submodular function subject to a knapsack constraint; formally $\max f(S)$ subject to $w(S) \le B$ where $w: N \to \mathbb{R}_+$ is a non-negative weight function on the elements of the ground set $N$. Note that when all $w(i) = 1$ and $B = k$ this is the problem of maximizing a monotone submodular function subject to a cardinality constraint. For the cardinality constraint case, the simple Greedy algorithm that iteratively picks the element with the largest marginal value yields a $(1-1/e)$-approximation [NWF78]. Greedy extends in a natural fashion to the knapsack constraint setting; in each iteration the element $\arg\max_{i} \frac{f(S \cup \{i\}) - f(S)}{w(i)}$ is chosen where $S$ is the set of already chosen elements. Sviridenko [Svi04], building on earlier work on the coverage function [KMN99], showed that Greedy with some partial enumeration yields a $(1-1/e)$-approximation for the knapsack constraint. The following lemma quantifies the performance of the basic Greedy when it is stopped after meeting or exceeding the budget $B$.

Lemma 2.5.

Consider an instance of monotone submodular function maximization subject to a knapsack constraint. Let $Z$ be the optimum value for the given knapsack budget $B$. Suppose the greedy algorithm is run until the total weight of the chosen sets is equal to or exceeds $B$. Letting $S$ be the greedy solution we have $f(S) \ge (1-1/e)Z$.
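The stopping rule in the lemma, where Greedy runs until the budget is met or exceeded, can be sketched with a value oracle. The code assumes strictly positive weights and breaks ties by lowest index; the function name and the small coverage instance are hypothetical.

```python
def greedy_until_budget(f, weights, budget):
    """Add the element of best marginal gain per unit weight until total weight >= budget."""
    chosen = []
    current = set()
    total = 0.0
    while total < budget:
        base = f(current)
        best, best_ratio = None, 0.0
        for i in range(len(weights)):
            if i in current:
                continue
            gain = f(current | {i}) - base
            if gain > 0 and gain / weights[i] > best_ratio:  # assumes weights[i] > 0
                best, best_ratio = i, gain / weights[i]
        if best is None:          # no element adds value; stop early
            break
        current.add(best)
        chosen.append(best)
        total += weights[best]
    return chosen, total

# Hypothetical instance: coverage function of three sets with a knapsack budget of 2.
SETS = [{1, 2, 3}, {3, 4}, {5}]
WEIGHTS = [2.0, 1.0, 1.0]

def cover(A):
    covered = set()
    for i in A:
        covered |= SETS[i]
    return len(covered)
```

Here the best solution within budget 2 has value 3, while the greedy run overshoots the budget (total weight 3) and covers 4 elements, consistent with a guarantee that is stated for a solution allowed to exceed the budget by its last element.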

3 Approximating Partial-SC

In this section we consider the algorithm for Partial-SC from [IV18a] and suggest a small variation that simplifies the algorithm and analysis. The approach of [IV18a] is as follows. Given an instance of Partial-SC with a set system their algorithm has the following high level steps.

  1. Guess the largest weight set in an optimum solution. Remove all elements covered by it, and remove all sets with weight larger than the guessed set. Adjust $k$ to account for covered elements. We now work with the residual instance of Partial-SC.

  2. Solve PSC-LP. Let $(x^*, z^*)$ be an optimum solution. For some threshold $\tau$ let $\mathcal{U}_h = \{e_j : z^*_j \ge \tau\}$ be the highly covered elements and let $\mathcal{U}_\ell = \mathcal{U} \setminus \mathcal{U}_h$ be the shallow elements.

  3. Solve a Set Cover instance via the LP to cover all elements in $\mathcal{U}_h$. The cost of this solution is at most $\frac{\beta}{\tau} \sum_i w_i x^*_i$ since one can argue that the fractional solution $x'$ where $x'_i = \min\{1, x^*_i/\tau\}$ for each $i$ is a feasible fractional solution for SC-LP to cover $\mathcal{U}_h$.

  4. Let $k' = k - |\mathcal{U}_h|$ be the residual number of elements that need to be covered from $\mathcal{U}_\ell$. Round $(x^*, z^*)$ to cover $k'$ elements from $\mathcal{U}_\ell$.

The last step of the algorithm is the main technical one, and also determines $\tau$. In [IV18a] $\tau$ is chosen to be $1/2$ and this leads to their $2(\beta+1)$-approximation. The rounding algorithm in [IV18a] can be seen as an adaptation of pipage rounding [AS04] for Max-Budgeted-Cover. The details are somewhat technical and perhaps obscure the high-level intuition that scaling up the LP solution allows one to use a bicriteria approximation for Max-Budgeted-Cover. Our contribution is to simplify the fourth step in the preceding algorithm. Here is the last step in our algorithm; the other steps are the same modulo the specific choice of $\tau$.

  4. Run Greedy to cover $k'$ elements from $\mathcal{U}_\ell$.
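The thresholding in steps 2 to 4 can be sketched in a few lines, given an already computed fractional solution to PSC-LP. The sketch does not solve the LP. The scaling $\min\{1, x_i/\tau\}$ is an assumed form of the scaled solution, and the names `split_by_threshold` and `fractionally_covers`, as well as the toy solution, are hypothetical.

```python
import math

def split_by_threshold(x, z, tau, k):
    """Split elements by LP coverage z_e >= tau and scale the set variables by 1/tau."""
    heavy = {e for e, ze in z.items() if ze >= tau}      # highly covered elements
    shallow = set(z) - heavy                             # shallow elements
    x_scaled = [min(1.0, xi / tau) for xi in x]          # candidate SC-LP solution for heavy
    k_residual = max(0, k - len(heavy))                  # elements still needed from shallow
    return heavy, shallow, x_scaled, k_residual

def fractionally_covers(sets, x, elements):
    """Check that every element receives total fractional coverage at least 1."""
    return all(sum(xi for s, xi in zip(sets, x) if e in s) >= 1.0 - 1e-9 for e in elements)

# Hypothetical feasible PSC-LP solution for k = 3 (not claimed to be optimal).
SETS = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
X = [1.0, 0.0, 0.5]
Z = {"a": 1.0, "b": 1.0, "c": 0.5, "d": 0.5}
TAU = 1.0 - 1.0 / math.e
```

With these values the elements `a` and `b` are highly covered, the scaled solution covers them fractionally, and one more element must be covered from the shallow elements by the greedy step.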

We now analyze the performance of our modified algorithm.

Lemma 3.1.

Suppose . Then running Greedy in the final step outputs a solution of total weight at most to cover elements from .

Proof: It is easy to see that since and for each . Let be the set system obtained by restricting to , and let be the restriction of to the set system . We have (i) and (ii) and (iii) for all .

Consider obtained from as follows. For each set and note that . For each set set . It is easy to see that is a feasible solution to PSC-LP. Note that . Let . The fractional solution is also a feasible solution to the LP formulation MBC-LP. We apply Lemma 2.1 to this fractional solution. Suppose we stop Greedy when it covers elements or when it first crosses the budget , whichever comes first. Clearly the total weight is at most . We argue that at least elements are covered when we stop Greedy. The only case to argue is when Greedy is stopped when the weight of sets picked by it exceeds for the first time. From Lemma 2.1 it follows that Greedy covers at least elements but since it implies that Greedy covers at least elements when it is stopped.

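We formally state a lemma to bound the cost of covering the highly covered elements $\mathcal{U}_h$. We sketch the simple proof for the sake of completeness; it is identical to that from [IV18a].

Lemma 3.2.

The cost of covering $\mathcal{U}_h$ is at most $\frac{\beta}{\tau} \sum_i w_i x^*_i$.

Proof: Recall that $z^*_j \ge \tau$ for each $e_j \in \mathcal{U}_h$. Consider $x'$ where $x'_i = \min\{1, x^*_i/\tau\}$. It is easy to see that $x'$ is a feasible fractional solution for SC-LP to cover $\mathcal{U}_h$ using sets in $\mathcal{S}$. Since the set family is deletion-closed, and the integrality gap of the SC-LP is at most $\beta$ for all instances in the family, there is an integral solution covering $\mathcal{U}_h$ of cost at most $\beta \sum_i w_i x'_i \le \frac{\beta}{\tau} \sum_i w_i x^*_i$.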

Theorem 3.3.

Setting $\tau = 1 - 1/e$, the algorithm outputs a feasible solution of total cost at most $\frac{e}{e-1}(\beta+1)\,\mathrm{OPT}$ where OPT is the value of an optimum integral solution.

Proof: Fix an optimum solution. Let be the weight of a maximum weight set in the optimum solution. In the first step of the algorithm we can assume that the algorithm has correctly guessed a maximum weight set from the fixed optimum solution. Let . In the residual instance the weight of every set is at most . The optimum solution value for PSC-LP, after guessing the largest weight set and removing it, is at most . From Lemma 3.2, the cost of covering is at most . From Lemma 3.1, the cost of covering elements from is at most . Hence the total cost, including the weight of the guessed set, is at most

since .

4 A bicriteria approximation for MP-Submod-SC

In this section we consider MP-Submod-SC. Let be a finite ground set. For each we are given a submodular function . We are also given a non-negative weight function . The goal is to solve the following covering problem:


We say that is active in constraint if , otherwise it is inactive. We say that the given instance is -sparse if each element is active in at most constraints.

Theorem 4.1.

There is a randomized polynomial-time approximation algorithm that given an -sparse instance of MP-Submod-SC outputs a set such that (i) for , and (ii) .

The rest of the section is devoted to the proof of the preceding theorem. We will assume without loss of generality that for each , ; otherwise we can work with the truncated function which is also submodular. This technical assumption plays a role in the analysis later.

We consider a continuous relaxation of the problem based on the multilinear extension. Instead of finding a set we consider finding a fractional point . For any value where OPT is the optimum value of the original problem, the following continuous optimization problem has a feasible solution.

One cannot hope to solve the preceding continuous optimization problem since it is NP-Hard. However the following approximation result is known and is based on extending the continuous greedy algorithm of Vondrak [Von08, CCPV11].

Theorem 4.2 ([CVZ10, CJV15]).

There is a randomized polynomial-time algorithm that given an instance of MP-Submod-Relax and value oracle access to the submodular functions , with high probability, either correctly outputs that the instance is not feasible or outputs an such that (i) and (ii) for .

Using the preceding theorem and binary search one can obtain an such that and for . It remains to round this solution. We use the following algorithm based on the high-level framework of randomized rounding plus alteration.

  1. Let be random sets obtained by picking elements independently and randomly times according to the fractional solution . Let .

  2. For each if , fix the constraint. That is, find a set using the greedy algorithm (via Lemma 2.5) such that . We implicitly set if .

  3. Output where .
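The three steps above follow the randomized rounding plus alteration pattern, which can be sketched as below. The specific number of rounds and the per-constraint thresholds did not survive in the text here, so they are left as the parameters `rounds` and `targets`; those names, the helper `greedy_fix`, and the toy instance are assumptions made for illustration, and weights are assumed strictly positive.

```python
import random

def greedy_fix(f, weights, target):
    """Greedily build a set T with f(T) >= target, by marginal gain per unit weight."""
    chosen = set()
    while f(chosen) < target:
        base = f(chosen)
        best, best_ratio = None, 0.0
        for j in range(len(weights)):
            if j in chosen:
                continue
            gain = f(chosen | {j}) - base
            if gain > 0 and gain / weights[j] > best_ratio:
                best, best_ratio = j, gain / weights[j]
        if best is None:
            raise ValueError("constraint cannot be satisfied")
        chosen.add(best)
    return chosen

def round_and_alter(fs, weights, x, targets, rounds, seed=0):
    """Stage 1: union of `rounds` independent roundings of x. Stage 2: fix violated constraints."""
    rng = random.Random(seed)
    rounded = set()
    for _ in range(rounds):
        rounded |= {j for j, xj in enumerate(x) if rng.random() < xj}
    output = set(rounded)
    for f, target in zip(fs, targets):
        if f(rounded) < target:              # constraint not met after rounding
            output |= greedy_fix(f, weights, target)
    return output

# Hypothetical instance: two normalized coverage constraints over four sets.
ELEMENT_SETS = [{1, 2}, {2, 3}, {4}, {3, 4}]

def make_f(universe):
    def f(S):
        covered = set()
        for j in S:
            covered |= ELEMENT_SETS[j]
        return len(covered & universe) / len(universe)
    return f

FS = [make_f({1, 2, 3}), make_f({3, 4})]
WEIGHTS = [1.0, 1.0, 1.0, 1.0]
```

By construction the output meets every target regardless of how the rounding stage turns out; the analysis in the text is about bounding the expected cost of the repair sets.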

It is easy to see that satisfies the property that for . It remains to choose and bound the expected cost of .

The following is easy from randomized rounding stage of the algorithm.

Lemma 4.3.


We now bound the probability that any fixed constraint is not satisfied after the randomized rounding stage of the algorithm. Let be the indicator for the event that .

Lemma 4.4.

For any , , where for sufficiently small .

Proof: Let be indicator for the event that . From the definition of the multilinear extension, for any , . Hence, . Let . We upper bound as follows. Recall that and hence by monotonicity we have for all . Since we can upper bound by the following:

Rearranging we have . Using the fact that for for sufficiently small , we simplify and see that for sufficiently small . Since the sets are chosen independently,

Remark 4.5.

The simplicity of the previous proof is based on the use of the multilinear extension which is well-suited for randomized rounding. The assumption that is technically important and it is easy to ensure in the general submodular case but is not straightforward when working with specific classes of functions.

Lemma 4.6.

Let be the value of an optimum solution to the problem . Then, .

Proof: Let be an optimum solution to the problem of covering all constraints. Let be the set of active elements for constraint . It follows that is a feasible solution for the problem of covering just . Thus . Hence

We now bound the expected cost of

Lemma 4.7.


Proof: We claim that . Assuming the claim, from the description of the algorithm, we have

Now we prove the claim. Consider the problem . is the optimum solution value to this problem. Now consider the following submodular function maximization problem subject to a knapsack constraint: . Clearly the optimum value of this maximization problem is at least . From Lemma 2.5, the greedy algorithm when run on the maximization problem, outputs a solution such that and . By guessing the maximum weight element in an optimum solution to the maximization problem we can ensure that . Thus, and .

From the preceding lemmas it follows that

We set one can see that

4.1 An application to splitting point sets

Har-Peled and Jones [HJ18], as we remarked, were motivated to study MP-Submod-SC due to a geometric application. Their problem is the following. Given point sets in they wish to find the smallest number of hyperplanes (or other geometric shapes) such that no point set has more than a constant factor of its points in any cell of the arrangement induced by the chosen hyperplanes; in particular when the constant is a half, the problem is related to the Ham-Sandwich theorem which implies that when just one hyperplane suffices!² From this one can infer that hyperplanes always suffice. Let and let . We will assume, for notational simplicity, that the sets are disjoint. The assumption can be dispensed with. We refer the reader to [HJ18] for connections to the Ham-Sandwich theorem and other problems.

² A polynomial time algorithm to find such a hyperplane is not known however.

In [HJ18] the authors reduce their problem to MP-Submod-SC as follows. Let be the set of all hyperplanes in ; we can confine attention to a finite subset by restricting to those half-spaces that are supported by points of . For each point set they consider a complete graph on the vertex set . For each they define a submodular function where is the number of edges incident to that are cut by ; an edge with is cut if and are separated by at least one of the hyperplanes in . Thus one can formulate the original problem as choosing the smallest number of hyperplanes such that for each the number of edges that are cut is at least where is the demand of . To ensure that is partitioned such that no cell has more than points we set for each ; more generally if we wish no cell to have more than points of we set for each . As a special case of MP-Submod-SC we have


Using Wolsey’s result for Submodular Set Cover, [HJ18] obtain an approximation where .

We now show that one can obtain an -approximation if we settle for a bicriteria approximation in which we compare the cost of the solution to that of an optimum solution, but guarantee only a slightly weaker bound on the partition quality. This could be useful since one can imagine several applications where , the number of different point sets, is much smaller than the total number of points. Consider the formulation from [HJ18]. Suppose we used our bicriteria approximation algorithm for MP-Submod-SC. The algorithm would cut edges for each and hence for we will only be guaranteed that each cell in the arrangement contains at most points from . This is acceptable in many applications. However, the approximation ratio still depends on since the number of constraints in the formulation is . We describe a related but slightly modified formulation to obtain an -approximation by using only constraints.

Given a collection let denote the number of pairs of points in that are separated by (equivalently the number of edges of cut by ). It is easy to see that is a monotone submodular function over . Suppose induces an arrangement such that no cell in the arrangement contains more than points for some . Then cuts at least edges from ; in particular if then cuts at least edges. Conversely if cuts at least edges for some then no cell in the arrangement induced by has more than points from . Given this we can consider the formulation below.
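The function counting separated pairs can be computed by grouping points into cells of the arrangement: two points are separated exactly when they lie in different cells. The sketch below assumes each hyperplane is given as a pair `(a, b)` representing $\{p : a \cdot p = b\}$ and that no point lies on a chosen hyperplane; the representation and function names are illustrative assumptions.

```python
from collections import Counter

def cell_sizes(points, hyperplanes):
    """Group points by their side of every hyperplane (a, b), i.e. by the sign of a.p - b."""
    return Counter(
        tuple(sum(ak * pk for ak, pk in zip(a, p)) > b for a, b in hyperplanes)
        for p in points
    )

def separated_pairs(points, hyperplanes):
    """Number of point pairs separated by at least one of the hyperplanes."""
    n = len(points)
    same_cell = sum(c * (c - 1) // 2 for c in cell_sizes(points, hyperplanes).values())
    return n * (n - 1) // 2 - same_cell

# Hypothetical example: corners of the unit square and two axis-parallel lines.
POINTS = [(0, 0), (1, 0), (0, 1), (1, 1)]
VERTICAL = ((1, 0), 0.5)     # the line x = 0.5
HORIZONTAL = ((0, 1), 0.5)   # the line y = 0.5
```

With no lines chosen no pair is separated, the vertical line alone separates 4 of the 6 pairs, and both lines together separate all 6, which illustrates the monotonicity of the function.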


We apply our bicriteria approximation for MP-Submod-SC with some fixed to obtain an -approximation to the objective but we are only guaranteed that the output satisfies the property that for each . This is sufficient to ensure that no has more than a constant factor in each cell of the arrangement.

The running time of the algorithm depends polynomially on and and can be upper bounded as . The running time in [HJ18] is . Finding a running time that depends polynomially on and is an interesting open problem.

5 Sparsity in Partition-SC

In this section we consider a problem that generalizes Partition-SC and CIPs while being a special case of MP-Submod-SC. We call this problem CCF (Covering Coverage Functions). Bera et al. [BGKR14] already considered this version in the restricted context of VC. Formally the input is a weighted set system and a set of inequalities of the form where matrix and is a positive vector. The goal is to optimize the integer program CCF-IP shown in Fig 0(e). Partition-SC is a special case of CCF when the matrix contains only entries. On the other hand CIP is a special case when the set system is very restricted and each set consists of a single element. We say that an instance is -sparse if each set “influences” at most rows of ; in other words the elements of have non-zero coefficients in at most rows of . This notion of sparsity coincides in the case of CIPs with column sparsity and in the case of MP-Submod-SC with the sparsity that we saw in Section 4. It is useful to explicitly see why CCF is a special case of MP-Submod-SC. The ground set corresponds to the sets in the given set system . Consider the row of the covering constraint matrix . We can model it as a constraint where the submodular set function is defined as follows: for a set we let which is simply a weighted coverage function with the weights coming from the coefficients of the matrix . Note that when formulating via these submodular functions, the auxiliary variables that correspond to the elements are unnecessary.
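The weighted coverage function and the sparsity notion described above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's notation: the names `weighted_coverage` and `influenced_rows` are hypothetical, and a row of the constraint matrix is assumed to be stored as a dictionary mapping elements to non-negative coefficients, with missing elements meaning coefficient zero.

```python
def weighted_coverage(row, chosen_sets):
    """Value of one row's coverage function on a collection of sets.

    row: dict mapping element -> non-negative coefficient (one matrix row).
    chosen_sets: iterable of sets of elements.
    Each covered element is counted once, however many chosen sets contain it.
    """
    covered = set().union(*chosen_sets)
    return sum(row.get(element, 0.0) for element in covered)


def influenced_rows(single_set, rows):
    """Number of rows in which some element of the set has a non-zero
    coefficient; an instance is sparse when this is small for every set."""
    return sum(
        1
        for row in rows
        if any(row.get(element, 0.0) != 0 for element in single_set)
    )


# Toy instance: two sets over elements 1..3 and two constraint rows.
A = frozenset({1, 2})
B = frozenset({2, 3})
row_1 = {1: 2.0, 2: 1.0, 3: 0.5}
row_2 = {3: 1.0}

gain_alone = weighted_coverage(row_1, [B]) - weighted_coverage(row_1, [])
gain_after_A = weighted_coverage(row_1, [A, B]) - weighted_coverage(row_1, [A])
```

Here the marginal value of `B` drops once `A` is chosen because element 2 is already covered, which is the submodularity that the reduction relies on. In this toy instance `A` influences only the first row while `B` influences both.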

We prove the following theorem.

Theorem 5.1.

Consider an instance of -sparse CCF induced by a set system from a deletion-closed family with a -approximation for Set Cover via the natural LP. There is a randomized polynomial-time algorithm that outputs a feasible solution of expected cost .

(e) Natural IP for Partition-SC (CCF-IP).

(f) Natural LP relaxation for CCF-IP (CCF-LP).

The natural LP relaxation for CCF is shown in Fig 0(f). It is well-known that this LP relaxation, even for CIPs and with one constraint, has an unbounded integrality gap [CFLP00]. For CIPs knapsack-cover inequalities are used to strengthen the LP. KC-inequalities in this context were first introduced in the influential work of Carr et al. [CFLP00] and have since become a standard tool in developing stronger LP relaxations. Bera et al. [BGKR14] and Inamdar and Varadarajan [IV18b] adapt KC-inequalities to the setting of Partition-SC, and it is straightforward to extend this to CCF (this is implicit in [BGKR14]).
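To make the unbounded gap with a single constraint concrete, a standard small example can be written out. The symbols $B$, $x_1$ and $x_2$ are introduced here purely for illustration and do not appear in the text above; $B > 1$ is an arbitrary integer, the first variable has cost 0 and the second has cost 1.

```latex
\begin{align*}
\min\ & x_2 \\
\text{s.t.}\ & (B-1)\,x_1 + B\,x_2 \ \ge\ B, \\
& x_1, x_2 \in \{0,1\}.
\end{align*}
```

Any integral solution must set $x_2 = 1$, since $x_1$ alone contributes only $B-1 < B$, so the integral optimum has cost 1. Relaxing to $0 \le x_1, x_2 \le 1$ allows $x_1 = 1$ and $x_2 = 1/B$, which is feasible and costs $1/B$. The ratio between the two is $B$, which can be made arbitrarily large.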

Remark 5.2.

Weighted coverage functions are a special case of sums of weighted rank functions of matroids. The natural LP for CCF can be viewed as using a different, and in fact tighter, extension than the multilinear relaxation [CCPV07]. The fact that one can use an LP relaxation here is crucial to the scaling idea that will play a role in the eventual algorithm. The main difficulty, however, is the large integrality gap which arises due to the partial covering constraints.

We set up and explain the notation needed to describe the use of KC-inequalities for CCF. It is convenient here to use the reduction of CCF to MP-Submod-SC. For row in we will use to denote the submodular function that we set up earlier. Recall that captures the coverage to constraint if set is chosen. The residual requirement after choosing is . The residual requirement must be covered by elements from sets outside . The maximum contribution that can provide to this is . Hence the following constraint is valid for any :