# Greed is Not Always Good: On Submodular Maximization over Independence Systems

In this work, we consider the maximization of submodular functions constrained by independence systems. Because of the wide applicability of submodular functions, this problem has been extensively studied in the literature. When the independence system is a $p$-system, prior literature has claimed that the greedy algorithm achieves a $1/(p+1)$-approximation if the submodular function is monotone. We show that, on the contrary, for any $\epsilon > 0$, the problem is hard to approximate within $(2/n)^{1-\epsilon}$, where $n$ is the size of the ground set, even when the independence system is a $1$-system. This result invalidates prior work on constant-factor algorithms for non-monotone submodular maximization over $p$-systems as well. On the positive side, we provide the first nearly linear-time algorithm for maximization of non-monotone submodular functions over $p$-extendible independence systems, which are a subclass of $p$-systems.


## 1 Introduction

Submodularity (a function $f: 2^U \to \mathbb{R}$ is submodular if for every $S \subseteq T \subseteq U$ and $x \in U \setminus T$, $f(S \cup \{x\}) - f(S) \ge f(T \cup \{x\}) - f(T)$) captures an important diminishing-returns property of discrete functions. Submodular set functions arise from e.g. viral marketing (Kempe et al., 2003), data summarization (Mirzasoleiman and Krause, 2015), and sensor placement (Krause et al., 2008). The optimization of these functions has been studied subject to various types of independence system constraints (an independence system on the set $U$ is a collection $\mathcal{I}$ of subsets of $U$ such that (i) $\mathcal{I}$ is nonempty, and (ii) if $A \in \mathcal{I}$ and $B \subseteq A$, then $B \in \mathcal{I}$), including cardinality (Nemhauser et al., 1978), matroid (Fisher et al., 1978), and the more general independence systems (Calinescu et al., 2011). Formally, the problem (MAXI) considered in this work is the following: given submodular function $f$ and independence system $\mathcal{I}$ on $U$, determine

$$\arg\max_{S \in \mathcal{I}} f(S).$$

Even on an independence system where maximal independent sets have the same size, the greedy algorithm may return arbitrarily bad solutions for MAXI. Our results indicate that some exchange property between independent sets must exist if the problem is to be tractable.

### Contributions

Our main contributions are summarized as follows.

• Consider the subclass of independence systems in which all maximal independent sets have the same size. We show that MAXI restricted to this subclass admits no polynomial-time algorithm with approximation ratio better than $(2/n)^{1-\epsilon}$ unless NP = ZPP, even when the submodular function is restricted to be monotone; here, $n$ is the size of the ground set, and $\epsilon > 0$ is arbitrary. On the other hand, under the condition that the system has two disjoint bases, the greedy algorithm does obtain a ratio of $2/n$. Intuitively, the difficulty of approximation on a $1$-system arises from the lack of any exchange property between the independent sets.

• We also provide a deterministic algorithm TripleGreedy (Alg. 2), which has the ratio on $p$-extendible systems in function evaluations, when the objective function is submodular but not necessarily monotone. This is the first approximation algorithm on $p$-extendible systems whose runtime is linear up to a logarithmic factor in the size of the ground set and is independent of both $p$ and the maximum size of any independent set. In prior literature, the fastest randomized algorithm is that of Feldman et al. (2017), which achieves expected ratio in evaluations, while the fastest deterministic algorithm is also by Feldman et al. (2017) and achieves ratio in evaluations.

### Related work

The maximization of monotone, submodular functions over independence systems has a long history of study; Fisher et al. (1978) proved the approximation ratio of $1/(p+1)$ for the greedy algorithm when the independence system is an intersection of $p$ matroid constraints, which is a special case of a $p$-extendible system. This ratio for the greedy algorithm was extended to $p$-extendible systems by Calinescu et al. (2011), as well as to the more general $p$-system constraint. A similar ratio for a faster, thresholded greedy algorithm and $p$-system constraint was also given by Badanidiyuru and Vondrák (2014).

For the special case when the independence system is a single matroid or cardinality constraint, better approximation guarantees have been obtained: in Calinescu et al. (2011), an optimal $(1 - 1/e)$-approximation is given when $f$ is monotone and the independence system is a matroid. For further information, the reader is referred to the survey of Buchbinder and Feldman (2018b) and references therein.

When $f$ is non-monotone and the independence system is a $p$-extendible system, Gupta et al. (2010) provided an -approximation in function evaluations; this was improved by Mirzasoleiman et al. (2016) to with the same time complexity, and Feldman et al. (2017) improved this to a ratio of in evaluations. Furthermore, Mirzasoleiman et al. (2018) extended these works to a streaming setting. All of these works rely upon an iterated greedy approach, which employs up to iterations of the standard greedy algorithm. In Section 5, we propose a simpler iterated greedy approach for $p$-extendible systems, which relies upon only two iterations of the greedy algorithm. We show how to speed up this algorithm to obtain ratio in evaluations.

### Organization

The rest of this paper is organized as follows: in Section 2 we define notions used throughout the paper. In Section 3 we prove the hardness result for independence systems whose bases all have equal size. Next, in Section 4, we show that the greedy algorithm is indeed the optimal approximation on such systems under a weak assumption. Finally, in Section 5 we provide our nearly linear-time algorithm for submodular maximization over a $p$-extendible system.

## 2 Preliminaries

Throughout the paper, $U$ denotes the ground set of size $n$. In this work, the objective function is a non-negative function $f: 2^U \to \mathbb{R}_{\ge 0}$; typically, the function is given as an oracle that returns, for given set $S \subseteq U$, the value $f(S)$. Our inapproximability result in Section 3 holds in this model, but it also holds when a description of $f$ as a polynomial-time computable function is given as input. When is a set and , we occasionally write for .

The members of an independence system are termed independent sets. An independent set $B$ is a basis of independence system $\mathcal{I}$ if for all $x \in U \setminus B$, $B \cup \{x\} \notin \mathcal{I}$.

###### Definition (Matroid).

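An independence system $\mathcal{I}$ is a matroid if the following property holds: if $A, B \in \mathcal{I}$ and $|A| < |B|$, then there exists $x \in B \setminus A$ such that $A \cup \{x\} \in \mathcal{I}$.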

###### Definition (p-Extendible System).

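An independence system $\mathcal{I}$ is $p$-extendible if the following property holds. If $A \subseteq B$, with $A, B \in \mathcal{I}$, and if $x \notin B$ such that $A \cup \{x\} \in \mathcal{I}$, then there exists a subset $Y \subseteq B \setminus A$ with $|Y| \le p$ such that $(B \setminus Y) \cup \{x\} \in \mathcal{I}$.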

###### Definition (p-System).

A $p$-system is an independence system $\mathcal{I}$ such that if $A, B \in \mathcal{I}$ are bases, then $|A| \le p|B|$.

We remark that every $p$-extendible system is also a $p$-system, but that the converse is not true, as the exchange property defining a $p$-extendible system may not hold. Furthermore, every matroid is a $1$-system, but the converse does not hold. As an example, let , , and . Then is clearly a $1$-system but not a matroid.
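The definitions above can be checked by brute force on small set families. The following is a minimal sketch, not part of the original text: the helper names (`powerset`, `bases`, `is_matroid`, `star_independent_sets`) are hypothetical, and the family used is the collection of edge-independent vertex sets of a star graph with center `s` and four leaves, chosen here only as an illustration of an independence system whose bases differ in size.

```python
from itertools import combinations

def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_independence_system(family):
    """Nonempty and closed under taking subsets."""
    family = set(family)
    if not family:
        return False
    return all(sub in family for s in family for sub in powerset(s))

def bases(family):
    """Members that are not strictly contained in another member."""
    family = set(family)
    return [s for s in family if not any(s < t for t in family)]

def is_matroid(family):
    """Exchange property: |A| < |B| implies some x in B - A with A + x independent."""
    family = set(family)
    return all(
        any(a | {x} in family for x in b - a)
        for a in family for b in family if len(a) < len(b)
    )

def star_independent_sets(leaves, center="s"):
    """Edge-independent vertex sets of a star: any set of leaves, or the center alone."""
    family = set(powerset(leaves))
    family.add(frozenset({center}))
    return family

family = star_independent_sets(["a", "b", "c", "d"])
basis_sizes = sorted(len(b) for b in bases(family))
print(is_independence_system(family), is_matroid(family), basis_sizes)
```

The two bases have sizes 1 and 4, so the ratio of basis sizes is 4, and the exchange property fails for the center vertex against any larger set of leaves.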

## 3 Hardness of Submodular Maximization over Independence Systems

In this section, the main inapproximability result is proven for the maximization of submodular functions over independence systems for which all bases have equal size.

Hardness is established via an approximation-preserving reduction from the independent set problem (ISG) in a graph, which is to find the maximum size of an edge-independent set of vertices. Once this reduction is defined, we show that any $\alpha$-approximation for the restricted problem yields an $\alpha$-approximation for ISG, and our hardness result follows from the hardness of ISG.

###### Definition (Isg).

The ISG problem is the following: given a finite graph $G = (V, E)$, where $|V| = n$, define a set $S \subseteq V$ to be edge-independent iff no pair of vertices in $S$ have an edge between them. Then the ISG problem is to determine the maximum size of an edge-independent set in $G$.

It is easily seen that the collection $\{S \subseteq V : S \text{ is edge-independent in } G\}$ is an independence system. In general, this collection may be an $(n-1)$-system, where $n = |V|$; consider a star graph where all vertices are connected to a center vertex and no other edges exist.

Intuitively, the reduction works by transforming a graph, which is an instance of ISG, into an instance of MAXI in which all bases have equal size, through the padding of edge-independent sets with dummy elements so that maximal independent sets have the same size. A submodular function is then defined that maps the padded independent sets to the size of the original, unpadded, edge-independent set in the graph. Formally, the reduction is defined as follows.

###### Definition (Reduction Φ).

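Let $G = (V, E)$ be a graph, which is an instance of ISG, with $|V| = n$. Let $U = V \cup D$, where $D$ is a set of $n$ dummy elements. An independence system $\mathcal{I}$ is defined on $U$ as follows: $S$ is in $\mathcal{I}$ iff $S \cap V$ is edge-independent in $G$ and $|S \cap D| \le n - |S \cap V|$. Define function $f: 2^U \to \mathbb{R}_{\ge 0}$ by $f(S) = |S \cap V|$.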

We remark that the function is defined on all subsets of , not only members of the independence system. To illustrate the reduction, we provide the following example.

###### Example 1.

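Let $G$ be a star graph with five vertices. That is, $V = \{s, a, b, c, d\}$ and $E = \{(s,a), (s,b), (s,c), (s,d)\}$. Then the maximal edge-independent sets are $\{a,b,c,d\}$ and $\{s\}$. The reduction $\Phi$ maps this graph to the following independence system. The ground set is $U = V \cup D$, where $D$ is a set of five dummy elements. Padding the two maximal edge-independent sets with dummy elements gives the bases

$$\mathcal{B} = \{\{a,b,c,d,e\} : e \in D\} \cup \{\{s,e_1,e_2,e_3,e_4\} : e_i \in D, 1 \le i \le 4\}.$$

Padding a smaller edge-independent set to five elements, for example $\{a,b\}$ with three dummy elements, also yields a basis; every basis has exactly five elements, and $\mathcal{I}$ consists of all subsets of bases.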
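The reduction can be exercised on the star graph of Example 1. The sketch below is illustrative only: the function name `reduce_isg` and the dummy labels `e1`, ..., `e5` are assumptions, and the independence rule is the padding condition $|S \cap D| \le n - |S \cap V|$ used in the proof of Lemma 1. It enumerates all independent sets by brute force, which is feasible only because the ground set has ten elements.

```python
from itertools import combinations

def reduce_isg(vertices, edges):
    """Return (ground set, independence oracle, objective f) of the padded instance."""
    n = len(vertices)
    dummies = [f"e{i}" for i in range(1, n + 1)]  # hypothetical names for D
    vset, dset = frozenset(vertices), frozenset(dummies)
    edge_set = {frozenset(e) for e in edges}

    def edge_independent(s):
        return not any(frozenset(pair) in edge_set for pair in combinations(s, 2))

    def independent(s):
        s = frozenset(s)
        return edge_independent(s & vset) and len(s & dset) <= n - len(s & vset)

    def f(s):
        # f counts only the real vertices; dummy elements contribute nothing.
        return len(frozenset(s) & vset)

    return vset | dset, independent, f

V = ["s", "a", "b", "c", "d"]
E = [("s", x) for x in "abcd"]
U, independent, f = reduce_isg(V, E)

ground = sorted(U)
indep = [frozenset(c) for r in range(len(ground) + 1)
         for c in combinations(ground, r) if independent(c)]
# A basis is an independent set to which no further element can be added.
bases = [s for s in indep if not any(independent(s | {x}) for x in U - s)]
basis_sizes = {len(b) for b in bases}
best = max(f(s) for s in indep)
print(basis_sizes, best)
```

Every basis found has five elements, and the best objective value equals the size of the largest edge-independent set of the star, namely four.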

By the following lemma, the reduction $\Phi$ takes an instance of ISG to an instance of MAXI in which all bases have equal size. Notice that the independence of any subset of $U$ may be checked in polynomial time; the same is true for computation of $f$.

###### Lemma 1.

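Let $G$ be an instance of ISG, and let $\mathcal{I}$ and $f$ be the independence system and function produced by $\Phi$ from $G$. Then

• $\mathcal{I}$ is an independence system; in particular, all maximal bases have equal size.

• $f$ is monotone and submodular.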

###### Proof.

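(i): Clearly, $\mathcal{I}$ is non-empty, since any singleton vertex $\{v\}$ is edge-independent in $G$, and $|\{v\} \cap D| = 0 \le n - 1$. Furthermore, it is closed under subsets: let $S = A \cup B \in \mathcal{I}$, where $A \subseteq V$, $B \subseteq D$, and let $T \subseteq S$. Then $T = \hat{A} \cup \hat{B}$, where $\hat{A} \subseteq A$, $\hat{B} \subseteq B$. Since any subset of an edge-independent set of $G$ is also edge-independent, we have that $\hat{A}$ is edge-independent in $G$, and

$$|T \cap D| = |\hat{B}| \le |B| \le n - |A| \le n - |\hat{A}| = n - |T \cap V|.$$

Hence $T \in \mathcal{I}$. Thus, $\mathcal{I}$ is an independence system on $U$.

Next, suppose $S \in \mathcal{I}$ is maximal. Then $|S| = n$, for otherwise another dummy element could be added to $S$ to produce a larger independent set. Hence $\mathcal{I}$ is a $1$-system.

(ii): Let $S \subseteq T \subseteq U$; notice that $S, T$ are not necessarily in the independence system $\mathcal{I}$. Then $f(S) = |S \cap V| \le |T \cap V| = f(T)$, so the function is monotone.

Next, let $x \in U \setminus T$. If $x \in V$, then

$$f(S \cup \{x\}) - f(S) = f(T \cup \{x\}) - f(T) = 1.$$

If $x \in D$,

$$f(S \cup \{x\}) - f(S) = f(T \cup \{x\}) - f(T) = 0.$$

Hence, in all cases, $f(S \cup \{x\}) - f(S) \ge f(T \cup \{x\}) - f(T)$, so the function is submodular. ∎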
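Part (ii) of the lemma can also be confirmed numerically on a small ground set. This is a sketch under assumptions: the checker names `is_monotone` and `is_submodular` are hypothetical, the ground set has three vertices and three dummy elements chosen arbitrarily, and the function `g` is included only as a contrasting example that fails the diminishing-returns test.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_monotone(f, ground):
    """f(S) <= f(T) whenever S is a subset of T."""
    all_sets = subsets(ground)
    return all(f(s) <= f(t) for s in all_sets for t in all_sets if s <= t)

def is_submodular(f, ground):
    """Diminishing returns: the gain of x at S is at least its gain at any superset T."""
    all_sets = subsets(ground)
    return all(
        f(s | {x}) - f(s) >= f(t | {x}) - f(t)
        for s in all_sets for t in all_sets if s <= t
        for x in ground - t
    )

V = frozenset({"u", "v", "w"})      # real vertices
D = frozenset({"d1", "d2", "d3"})   # dummy elements
U = V | D

f = lambda s: len(s & V)   # objective of the reduction: count real vertices
g = lambda s: len(s) ** 2  # contrasting example with increasing marginal gains

print(is_monotone(f, U), is_submodular(f, U), is_submodular(g, U))
```

The check runs over all pairs of nested subsets, so it is practical only for ground sets of a handful of elements.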

Next, we show that $\Phi$ is an approximation-preserving reduction.

###### Lemma 2.

By application of the reduction $\Phi$, any $\alpha$-approximation algorithm to MAXI on systems whose bases have equal size yields an $\alpha$-approximation to ISG.

###### Proof.

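Let $G$ be an instance of ISG, and let $\mathcal{I}$ and $f$ be the independence system and function produced by $\Phi$ from $G$. Let $\mathrm{OPT}_U = \max_{S \in \mathcal{I}} f(S)$. Since membership of a set $S$ in $\mathcal{I}$ requires that $S \cap V$ be edge-independent in $G$, we have that $\mathrm{OPT}_U = \mathrm{OPT}_G$, where $\mathrm{OPT}_G$ is the maximum size of an edge-independent set of $G$. Now suppose set $X \in \mathcal{I}$ satisfies $f(X) \ge \alpha \, \mathrm{OPT}_U$. Then

$$\alpha \, \mathrm{OPT}_G = \alpha \, \mathrm{OPT}_U \le f(X) = |X \cap V|,$$

and by definition of $\mathcal{I}$, $X \cap V$ is edge-independent in $G$. Therefore, any approximation algorithm for the restricted problem with ratio $\alpha$ yields an approximation algorithm for ISG with ratio $\alpha$ by the following method: given instance $G$ of ISG, transform $G$ to an instance of MAXI via $\Phi$. Apply the $\alpha$-approximation to get set $X \in \mathcal{I}$ such that $f(X) \ge \alpha \, \mathrm{OPT}_U$. Finally, project back to $V$ and return the edge-independent set $X \cap V$, which satisfies $|X \cap V| \ge \alpha \, \mathrm{OPT}_G$. ∎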

The next theorem follows from Lemma 2 and the results of Hastad (1999) on ISG: namely, for any $\epsilon > 0$, there is no polynomial-time algorithm to approximate ISG better than $(1/n)^{1-\epsilon}$ unless NP = ZPP.

###### Theorem 1.

For any $\epsilon > 0$, there is no polynomial-time algorithm that achieves ratio better than $(2/|U|)^{1-\epsilon}$ on MAXI restricted to systems whose bases have equal size, where $U$ is the ground set of the instance, unless NP = ZPP.

###### Proof.

For any graph $G = (V, E)$, the universe $U$ of the instance produced by $\Phi$ has $|U| = 2|V|$; by Lemma 2 and the result of Hastad (1999), the theorem follows. ∎
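The change of variables behind the bound can be written out explicitly. Assuming, as in the reduction, that $n = |V|$ and that $D$ contains $n$ dummy elements, the threshold for ISG translates to the ground set of the reduced instance as follows:

$$|U| = |V| + |D| = 2n \quad\Longrightarrow\quad \left(\frac{1}{n}\right)^{1-\epsilon} = \left(\frac{2}{|U|}\right)^{1-\epsilon}.$$

For instance, a graph on $n = 500$ vertices gives $|U| = 1000$, and the two expressions above coincide.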

## 4 The Greedy Ratio on MAXI, when $f$ is monotone

When the function is monotone, we further analyze the performance of the greedy algorithm (Alg. 1) on independence systems in this section. When all maximal bases have equal size, we show that the greedy algorithm obtains a ratio that matches our lower bound in the previous section.

We begin with a performance ratio for the greedy algorithm on an arbitrary independence system in terms of the size of the largest independent set.

###### Proposition 1.

Let $\mathcal{I}$ be an independence system, and let $k = \max_{T \in \mathcal{I}} |T|$. Let $S$ be the solution returned by the greedy algorithm, and let $O$ be the optimal solution to MAXI. Then $f(S) \ge f(O)/k$.

###### Proof.

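Let $U$ be the ground set of $\mathcal{I}$, and let $a = \arg\max_{x \in U : \{x\} \in \mathcal{I}} f(\{x\})$, and observe that the greedy solution $S$ satisfies $f(S) \ge f(\{a\})$. Now let $O = \{o_1, \ldots, o_\ell\}$; then by submodularity, $f(O) \le \sum_{i=1}^{\ell} f(\{o_i\}) \le k f(\{a\})$, where $k$ is the size of the largest independent set. It follows that $f(S) \ge f(O)/k$. ∎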
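The bound can be observed on the padded star instance from Section 3. The greedy algorithm (Alg. 1) is not reproduced in this text, so the sketch below assumes the standard form: repeatedly add the feasible element of largest marginal gain until no element can be added. The tie-breaking rule via `order` is an assumption made to show both a favorable and an unfavorable run.

```python
def greedy(order, independent, f):
    """Standard greedy; ties go to the element that appears first in `order`."""
    chosen = frozenset()
    while True:
        feasible = [x for x in order if x not in chosen and independent(chosen | {x})]
        if not feasible:
            return chosen
        # max() returns the first element attaining the largest marginal gain.
        chosen = chosen | {max(feasible, key=lambda x: f(chosen | {x}) - f(chosen))}

leaves, center = ["a", "b", "c", "d"], "s"
V = frozenset(leaves + [center])
D = frozenset(f"e{i}" for i in range(1, 6))  # hypothetical dummy labels
n = len(V)

def independent(s):
    verts = s & V
    # In a star, a vertex set is edge-free iff it is all leaves or the center alone.
    edge_free = verts <= frozenset(leaves) or verts == frozenset({center})
    return edge_free and len(s & D) <= n - len(verts)

def f(s):
    return len(s & V)

dummies = sorted(D)
lucky = greedy(leaves + [center] + dummies, independent, f)    # leaves win ties
unlucky = greedy([center] + leaves + dummies, independent, f)  # center wins ties
k, opt = 5, 4  # largest independent set size and optimal value on this instance
print(f(lucky), f(unlucky), opt / k)
```

With the center first in the tie-breaking order, greedy commits to `s` and can only pad with dummy elements, ending at value 1 against an optimum of 4; this still respects the guarantee $f(S) \ge f(O)/k = 4/5$.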

The next corollary, combined with the hardness result from the previous section, shows that if the independence system has two disjoint bases, the greedy algorithm is the optimal approximation on systems where bases have equal size.

###### Corollary 1.

Let $\mathcal{I}$ be a system where maximal bases have equal size, with at least two disjoint bases. Then the greedy algorithm is a $2/n$-approximation algorithm to MAXI on $\mathcal{I}$.

###### Proof.

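Let $A, B$ be bases of $\mathcal{I}$, such that $A \cap B = \emptyset$. Since $\mathcal{I}$ is a $1$-system, for some $k$, $|A| = |B| = k$; hence $2k \le n$. Hence, $1/k \ge 2/n$, so the result follows from Prop. 1. ∎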

## 5 The TripleGreedy Algorithm

In this section, the TripleGreedy algorithm (TG, Algorithm 2) is presented. TG is the first nearly linear-time algorithm to approximately maximize a submodular function with respect to a $p$-extendible system.

###### Definition (Max-Union).

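Given $f$ and independence system $\mathcal{I}$, determine $A \in \mathcal{I}$, such that for any $O \in \mathcal{I}$, $f(A) \ge f(O \cup A)$. Even if no such $A$ exists, by an $\alpha$-approximation to MAX-UNION, it is meant an algorithm that finds $A \in \mathcal{I}$, such that for any $O \in \mathcal{I}$, $f(A) \ge \alpha f(O \cup A)$.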

Notice that $O \cup A$ in the requirement of MAX-UNION may not be a member of the independence system.

The TG algorithm employs two subroutines, one to approximate the MAX-UNION problem and one for the unconstrained maximization problem (UNCONSTRAINED-MAX); the unconstrained maximization problem is to determine $\arg\max_{S \subseteq U} f(S)$. Since a total of three calls to these subroutines are required, and since variants of greedy algorithms may be used for each subroutine, Alg. 2 is termed TripleGreedy. First, TG determines a set $A$ approximating MAX-UNION with the function $f$; second, TG determines a set $B$ approximating MAX-UNION with the restriction of $f$ to $U \setminus A$. Third, a set $A'$ is found, approximating the maximum value of $f$ restricted to $A$. Finally, the set $C$ in $\{A, A', B\}$ maximizing $f$ is returned.
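The three-call structure described above can be sketched as follows. This is not the paper's Alg. 2: the two subroutines here are stand-ins chosen for brevity (a plain greedy for MAX-UNION and an exhaustive search for UNCONSTRAINED-MAX), and all function names are hypothetical. The example objective is the cut function of a 4-cycle, which is submodular and non-monotone, under a cardinality constraint of two.

```python
from itertools import combinations

def greedy_max_union(ground, independent, f):
    """Stand-in MAX-UNION subroutine: plain greedy, stopping at non-positive gain."""
    chosen = frozenset()
    while True:
        feasible = [x for x in sorted(ground - chosen) if independent(chosen | {x})]
        gains = {x: f(chosen | {x}) - f(chosen) for x in feasible}
        if not gains or max(gains.values()) <= 0:
            return chosen
        chosen = chosen | {max(feasible, key=gains.get)}

def brute_force_unconstrained(ground, f):
    """Stand-in UNCONSTRAINED-MAX subroutine: exhaustive search, tiny inputs only."""
    items = sorted(ground)
    candidates = (frozenset(c) for r in range(len(items) + 1)
                  for c in combinations(items, r))
    return max(candidates, key=f)

def triple_greedy(ground, independent, f):
    a = greedy_max_union(ground, independent, f)      # first call: MAX-UNION on U
    b = greedy_max_union(ground - a, independent, f)  # second call: MAX-UNION on U \ A
    a_prime = brute_force_unconstrained(a, f)         # third call: best subset of A
    return max([a, a_prime, b], key=f)                # return the best of the three

U = frozenset(range(4))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

def cut(s):
    """Number of edges with exactly one endpoint in s."""
    return sum(1 for u, v in edges if (u in s) != (v in s))

def independent(s):
    return len(s) <= 2

c = triple_greedy(U, independent, cut)
print(sorted(c), cut(c))
```

Because $A' \subseteq A$ and the system is closed under subsets, all three candidates are independent, so the returned set is always feasible regardless of which subroutines are plugged in.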

We remark that TG functions similarly to the algorithm for maximizing submodular functions with respect to cardinality constraint developed in Gupta et al. (2010); in place of MAX-UNION, Gupta et al. (2010) simply uses the greedy algorithm. By abstracting out this subproblem, we see that 1) a performance ratio may be proved in a much more general setting than cardinality constraint, namely for -extendible systems, and 2) the faster thresholding approach developed by Badanidiyuru and Vondrák (2014) (THRESHOLD) for monotone submodular maximization can be used for MAX-UNION, which results in nearly linear runtime.

If $f$ is submodular, then the approximation ratio of TG depends on the ratios of the algorithms used for MAX-UNION and UNCONSTRAINED-MAX.

###### Theorem 2.

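Let $f$ be submodular, let $\mathcal{I}$ be an independence system, let $O = \arg\max_{S \in \mathcal{I}} f(S)$, and let $C$ be the set returned by TG on $f$ and $\mathcal{I}$. Then

$$f(C) \ge \left( \frac{\alpha\beta}{\alpha + 2\beta} \right) f(O),$$

where $\alpha$ and $\beta$ are the ratios of the algorithms used for MAX-UNION and UNCONSTRAINED-MAX, respectively.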

###### Proof.

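Let $A, A', B, C$ have their values at termination of TG. Suppose a $\beta$-approximation algorithm is used for UNCONSTRAINED-MAX. Then any set $S \subseteq A$ satisfies $f(S) \le \beta^{-1} f(A')$. Suppose an $\alpha$-approximation algorithm is used for MAX-UNION; so $f(O \cup A) \le \alpha^{-1} f(A)$ and $f((O \setminus A) \cup B) \le \alpha^{-1} f(B)$. Then

$$\begin{aligned} f(O) &\le f(\emptyset) + f(O) \\ &\le f(O \cap A) + f(O \setminus A) \\ &\le \beta^{-1} f(A') + f(O \cup A) + f((O \setminus A) \cup B) \\ &\le \beta^{-1} f(A') + \alpha^{-1} f(A) + \alpha^{-1} f(B) \\ &\le (\beta^{-1} + 2\alpha^{-1}) f(C), \end{aligned}$$

where the second and third inequalities follow from the submodularity of $f$ and the fact that $f$ is non-negative and $A \cap B = \emptyset$. ∎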
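To see how the bound of Theorem 2 behaves, it helps to rewrite it and plug in sample values. The values below are assumptions chosen for illustration only, namely $\beta = 1/2$ for UNCONSTRAINED-MAX and $\alpha = 1/(p+1)$ for MAX-UNION; they are not the exact ratios stated in Lemma 3 or Corollary 2.

$$\frac{\alpha\beta}{\alpha + 2\beta} = \frac{1}{\beta^{-1} + 2\alpha^{-1}}, \qquad \beta = \tfrac{1}{2},\ \alpha = \tfrac{1}{p+1} \;\Longrightarrow\; \frac{1}{2 + 2(p+1)} = \frac{1}{2p + 4}.$$

Under these assumed values, $p = 1$ gives $1/6$, and the guarantee degrades linearly in $p$.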

Next, we establish that THRESHOLD approximates MAX-UNION on $p$-extendible systems; the proof is provided in Appendix A.

###### Lemma 3.

When $\mathcal{I}$ is a $p$-extendible system, the THRESHOLD algorithm (Alg. 3) of Badanidiyuru and Vondrák (2014) is a -approximation for MAX-UNION.
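Alg. 3 is not reproduced in this text, so the following is only a sketch of a decreasing-threshold greedy in the style of Badanidiyuru and Vondrák (2014), adapted to an independence oracle. The function name, the parameter `eps`, and the stopping threshold `eps * top / n` are assumptions; the stopping rule in the paper's Alg. 3 may differ in its exact form.

```python
def threshold_greedy(ground, independent, f, eps=0.1):
    """Sweep geometrically decreasing thresholds, adding any feasible element
    whose marginal gain meets the current threshold."""
    items = sorted(ground)
    singles = [x for x in items if independent(frozenset({x}))]
    if not singles:
        return frozenset()
    # Largest marginal gain of a feasible single element.
    top = max(f(frozenset({x})) - f(frozenset()) for x in singles)
    chosen = frozenset()
    tau = top
    while top > 0 and tau >= eps * top / len(items):
        for x in items:
            if x in chosen or not independent(chosen | {x}):
                continue
            if f(chosen | {x}) - f(chosen) >= tau:
                chosen = chosen | {x}
        tau *= 1 - eps  # lower the threshold by a (1 - eps) factor
    return chosen

weights = {"a": 5, "b": 3, "c": 2, "d": 1}

def f(s):
    """Modular (hence submodular) objective used only for this illustration."""
    return sum(weights[x] for x in s)

def independent(s):
    return len(s) <= 2

result = threshold_greedy(frozenset(weights), independent, f)
print(sorted(result), f(result))
```

Each pass costs one sweep over the ground set, and the number of passes depends only on `eps` and the ground set size, which is the source of the nearly linear query count of thresholding approaches.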

Finally, by Theorem 2 and Lemma 3 we have the ratio in nearly linear time on $p$-extendible systems.

###### Corollary 2.

Let . If the deterministic approximation of Buchbinder and Feldman (2018a) is used for UNCONSTRAINED-MAX, and THRESHOLD of Badanidiyuru and Vondrák (2014) is used for MAX-UNION with ratio , the ratio of TG is with queries to and to the independence system.

## References

• Badanidiyuru and Vondrák (2014) Ashwinkumar Badanidiyuru and J Vondrák. Fast algorithms for maximizing submodular functions. Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1497–1514, 2014.
• Buchbinder and Feldman (2018a) Niv Buchbinder and Moran Feldman. Deterministic Algorithms for Submodular Maximization. ACM Transactions on Algorithms, 14(3), 2018a.
• Buchbinder and Feldman (2018b) Niv Buchbinder and Moran Feldman. Submodular Functions Maximization Problems – A Survey. In Teofilo F. Gonzalez, editor, Handbook of Approximation Algorithms and Metaheuristics. Second edition, 2018b.
• Calinescu et al. (2011) Gruia Calinescu, Chandra Chekuri, Martin Pal, and Jan Vondrák. Maximizing a Monotone Submodular Function Subject to a Matroid Constraint. SIAM Journal on Computing, 40(6), 2011.
• Feldman et al. (2017) Moran Feldman, Christopher Harshaw, and Amin Karbasi. Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization. In COLT, pages 1–26, 2017.
• Fisher et al. (1978) M.L. Fisher, G.L. Nemhauser, and L.A. Wolsey. An analysis of approximations for maximizing submodular set functions-II. Mathematical Programming, 8:73–87, 1978.
• Gupta et al. (2010) Anupam Gupta, Aaron Roth, Grant Schoenebeck, and Kunal Talwar. Constrained non-monotone submodular maximization: Offline and secretary algorithms. In WINE, volume 6484 LNCS, pages 246–257, 2010.
• Hastad (1999) Johan Hastad. Clique is hard to approximate within $n^{1-\epsilon}$. Acta Mathematica, 182:105–142, 1999.
• Jenkyns (1976) T. A. Jenkyns. The efficacy of the "greedy" algorithm. In Proceedings of the 7th Southeastern Conference on Combinatorics, Graph Theory and Computing, 1976.
• Kempe et al. (2003) David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 137–146, 2003.
• Krause et al. (2008) Andreas Krause, Jure Leskovec, Carlos Guestrin, Jeanne M. VanBriesen, and Christos Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516–526, 2008.
• Mirzasoleiman and Krause (2015) Baharan Mirzasoleiman and Andreas Krause. Distributed Submodular Cover: Succinctly Summarizing Massive Data. In NeurIPS, 2015.
• Mirzasoleiman et al. (2016) Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, and Amin Karbasi. Fast Constrained Submodular Maximization: Personalized Data Summarization. In ICML, 2016.
• Mirzasoleiman et al. (2018) Baharan Mirzasoleiman, Stefanie Jegelka, and Andreas Krause. Streaming Non-Monotone Submodular Maximization: Personalized Video Summarization on the Fly. In AAAI, pages 1379–1386, 2018.
• Nemhauser et al. (1978) G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming, 14(1):265–294, 1978.

## Appendix A

###### Proof of Lemma 3.

Let be returned by THRESHOLD. Let , . The set will be partitioned into at most subsets , each of size at most , as follows. Let , . Suppose have been obtained, such that , which is initially satisfied at . By the definition of -extendible system, there exists , with , such that . Then let and let ; clearly . If , stop; otherwise, continue inductively until . Let be the index at which this procedure terminates. If , let and redefine for all .

###### Claim 1.

For each , , for all .

###### Proof.

Since , and , the claim follows by definition of independence system. ∎

###### Claim 2.
$$f(O \cup A) - f(O_0 \cup A) \le \varepsilon M.$$
###### Proof.
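$$\begin{aligned} f(O \cup A) - f(O_0 \cup A) &= f(O_0 \cup R_j \cup A) - f(O_0 \cup A) \\ &\le \sum_{r \in R_j} \left[ f(O_0 \cup A \cup \{r\}) - f(O_0 \cup A) \right] \\ &\le \sum_{r \in R_j} \left[ f(A \cup \{r\}) - f(A) \right] \le \varepsilon M, \end{aligned}$$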

where the last inequality is by the stopping condition of THRESHOLD and the fact that , so for all . The other inequalities follow from submodularity and the definition of . ∎

Then

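$$\begin{aligned} f(O \cup A) - f(A) &\le f(O_0 \cup A) - f(A) + \varepsilon M \\ &= \sum_{i=0}^{j-1} \left[ f(O_i \cup A) - f(O_{i+1} \cup A) \right] + \varepsilon M \\ &= \sum_{i=0}^{j-1} \left[ f(O_{i+1} \cup A \cup Y_i) - f(O_{i+1} \cup A) \right] + \varepsilon M \\ &\le \sum_{i=0}^{j-1} \sum_{y \in Y_i} \left[ f(O_{i+1} \cup A \cup \{y\}) - f(O_{i+1} \cup A) \right] + \varepsilon M \\ &\le \sum_{i=0}^{j-1} \sum_{y \in Y_i} \left[ f(A_i \cup \{y\}) - f(A_i) \right] + \varepsilon M \\ &\le \sum_{i=0}^{j-1} \frac{p}{1-\varepsilon} \left( f(A_i \cup \{a_i\}) - f(A_i) \right) + \varepsilon M \le \frac{p}{1-\varepsilon} f(A) + \varepsilon M, \end{aligned}$$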

where the first inequality is by Claim 2, the first two equalities are by telescoping and the definition of , the second and third inequalities are by submodularity. The fourth inequality holds by the following argument: when was added to , it holds that the threshold has its initial value , in which case for any , or all were not added during the previous threshold . Hence by submodularity. Since , the lemma follows. ∎