# Submodular Maximization Through Barrier Functions

In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state-of-the-art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a k-matchoid and ℓ knapsack constraints (for ℓ≤k), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an ϵ error, it is guaranteed that we have found a feasible set with a 2(k+1+ϵ)-approximation factor, which can in fact be further improved to (k+1+ϵ) by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.


## 1 Introduction

In constrained continuous optimization, barrier functions are usually used to impose an increasingly large cost on a feasible point as it approaches the boundary of the feasible region [32]. In effect, barrier functions replace constraints by a penalizing term in the primal objective function so that the solution stays away from the boundary of the feasible region. This is an attempt to approximate a constrained optimization problem with an unconstrained one and to later apply standard optimization techniques. While the benefits of barrier functions have been studied extensively in the continuous domain [32], their use in discrete optimization is not very well understood.

In this paper, we show how discrete barrier functions manifest themselves in constrained submodular maximization. Submodular functions formalize the intuitive diminishing-returns condition, a property that not only allows optimization tractability but also appears in many machine learning applications, including video, image, and text summarization [12, 35, 23, 29, 7], active set selection in non-parametric learning [26], sequential decision making [27, 28], sensor placement and information gathering [10], and privacy and fairness [17]. Formally, for a ground set N, a non-negative set function f: 2^N → R≥0 is submodular if for all sets A ⊆ B ⊆ N and every element e ∈ N∖B, we have

 f(A∪{e})−f(A) ≥ f(B∪{e})−f(B).

The submodular function f is monotone if for all A ⊆ B ⊆ N we have f(A) ≤ f(B).
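As a concrete sanity check of these two definitions, the sketch below (our own toy illustration, not from the paper) exhaustively verifies the diminishing-returns inequality and monotonicity for a small coverage function, a canonical monotone submodular function:

```python
from itertools import combinations

# Toy coverage instance (hypothetical): element -> set of items it covers.
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
ground = set(cover)

def f(S):
    """Coverage function f(S) = number of items covered by S."""
    return len(set().union(*(cover[e] for e in S))) if S else 0

def marginal(e, S):
    return f(S | {e}) - f(S)

def subsets(U):
    return (set(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r))

# Submodularity: f(A∪{e}) − f(A) ≥ f(B∪{e}) − f(B) for all A ⊆ B, e ∉ B.
submodular = all(
    marginal(e, A) >= marginal(e, B)
    for B in subsets(ground)
    for A in subsets(B)
    for e in ground - B
)
# Monotonicity: f(A) ≤ f(B) whenever A ⊆ B.
monotone = all(f(A) <= f(B) for B in subsets(ground) for A in subsets(B))
print(submodular, monotone)
```

Both checks pass for any coverage function; replacing `f` with a non-submodular function (e.g. a thresholded count) makes the first check fail.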

The celebrated results of Nemhauser et al. [30] and Fisher et al. [8] show that the vanilla greedy algorithm provides an optimal approximation guarantee for maximizing a monotone submodular function subject to a cardinality constraint. However, the performance of the greedy algorithm degrades as the feasibility constraint becomes more complex. For instance, the greedy algorithm does not provide any constant-factor approximation guarantee if we replace the cardinality constraint with a knapsack constraint. Even though there exist many works that achieve tight approximation guarantees for maximizing a monotone submodular function subject to multiple knapsack constraints, the running time of these algorithms is prohibitive, as they either rely on enumerating large sets or on running the continuous greedy algorithm. In contrast, we showcase a fundamentally new optimization technique, discrete barrier-function minimization, in order to efficiently handle knapsack constraints and develop fast algorithms. More formally, we consider the following constrained submodular maximization problem defined over the ground set N:

 S∗ = argmax { f(S) : S⊆N, S∈I, ci(S)≤1 ∀i∈[ℓ] }, (1)

where the constraint is the intersection of a k-matchoid constraint I (a general subclass of k-set systems) and ℓ knapsack constraints (for ℓ≤k).

##### Contributions.

We propose two algorithms for maximizing a monotone and submodular function subject to the intersection of a k-matchoid and ℓ knapsack constraints. Our approach uses a novel barrier-function technique and lies in between fast thresholding algorithms with suboptimal approximation ratios and slower algorithms that use continuous greedy and rounding methods. The first algorithm, Barrier-Greedy, obtains a 2(k+1+ϵ)-approximation ratio and runs in time polynomial in n and r, where r is the maximum cardinality of a feasible solution. The second algorithm, Barrier-Greedy++, obtains a better approximation ratio of k+1+ϵ, but at the cost of a running time larger by a factor of O(n²) due to its enumeration over pairs of elements. Our algorithms are theoretically fast and even exhibit better performance in practice while achieving a near-optimal approximation ratio. Indeed, the factor of k+1 matches the guarantee of the greedy algorithm for the intersection of k matroid constraints [8]. The only known improvement of this result requires a more sophisticated (and very slow) local-search algorithm [21]. Our results show that barrier-function minimization techniques provide a versatile algorithmic tool for constrained submodular optimization with strong theoretical guarantees that may scale to many previously intractable problem instances. Finally, we demonstrate the effectiveness of our proposed algorithms over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds of news agencies and Yelp business locations, and a set cover problem.

##### Paper Structure.

In Section 3, we formally define the notation and the constraints we use. In Section 4, we describe our proposed barrier function. We then present our algorithms for maximizing a monotone submodular function subject to a k-matchoid system and ℓ knapsack constraints. In Section 5, building upon our theoretical results, we present a heuristic algorithm with better performance in practice. In Section 6, we describe the experiments we conducted to study the empirical performance of our algorithms.

## 2 Related Work

The problem of maximizing a monotone submodular function subject to various constraints goes back to the seminal work of Nemhauser et al. [30] and Fisher et al. [8], which showed that the greedy algorithm gives a (1−1/e)-approximation subject to a cardinality constraint, and more generally a 1/(k+1)-approximation for any k-system (which subsumes the intersection of k matroids, and also the k-matchoid constraint considered here). Nemhauser and Wolsey [31] also showed that the factor of 1−1/e is best possible in this setting. After three decades, there was a resurgence of interest in this area due to new applications in economics, game theory, and machine learning. While we cannot do justice to all the work that has been done in submodular maximization, let us mention the works most relevant to ours, in particular those focusing on matroid/matchoid and knapsack constraints.

Sviridenko [34] gave the first algorithm to achieve a (1−1/e)-approximation for submodular maximization subject to a knapsack constraint. This algorithm, while relatively simple, requires enumeration over all triples of elements and hence its running time is rather slow (O(n⁵)). Vondrák [36] and Călinescu et al. [4] gave the first (1−1/e)-approximation for submodular maximization subject to a matroid constraint. This algorithm, continuous greedy with pipage rounding, is also relatively slow (at least quadratic in n, depending on the implementation). Using related techniques, Kulik et al. [19] gave a (1−1/e−ϵ)-approximation subject to any constant number of knapsack constraints, and Chekuri et al. [5] gave a (1−1/e−ϵ)-approximation subject to one matroid and any constant number of knapsack constraints; however, these algorithms are even slower and less practical.

Following these results (optimal in terms of approximation), applications in machine learning called for more attention being given to the running time and practicality of the algorithms (as well as other aspects, such as online/streaming inputs and distributed/parallel implementations, which we do not focus on here). In terms of improved running times, Gupta et al. [11] developed fast algorithms for submodular maximization (motivated by the online setting), however with suboptimal approximation factors. Badanidiyuru and Vondrák [2] provided a (1−1/e−ϵ)-approximation subject to a cardinality constraint using a nearly linear number of value queries, and a corresponding guarantee subject to a matroid constraint. Also, they gave a fast thresholding algorithm providing a 1/(k+2ℓ+1+ϵ)-approximation for a k-system combined with ℓ knapsack constraints using a nearly linear number of queries. This was further generalized to the non-monotone setting by Mirzasoleiman et al. [25]. However, note that in these works the approximation factor deteriorates not only with the k-system parameter k (which is unavoidable) but also with the number of knapsack constraints ℓ.

## 3 Preliminaries and Notation

Let f: 2^N → R≥0 be a non-negative and monotone submodular function defined over the ground set N. Given an element e and a set S, we use S+e as a shorthand for the union S∪{e}. We also denote the marginal gain of adding e to a set S by fS(e) = f(S+e)−f(S). Similarly, the marginal gain of adding a set A to another set S is denoted by fS(A) = f(S∪A)−f(S).

A set system M = (N, I), with I ⊆ 2^N, is an independence system if ∅ ∈ I and A ∈ I, B ⊆ A, implies that B ∈ I. In this regard, a set A ∈ I is called independent, and a set A ∉ I is called dependent. A matroid is an independence system with the following additional property: if A and B are two independent sets obeying |A| < |B|, then there exists an element e ∈ B∖A such that A+e is independent.
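The exchange property can be checked mechanically. The sketch below (a hypothetical example of ours) implements an independence oracle for a partition matroid, where a set is independent if it uses each color class at most once, and verifies the augmentation property exhaustively:

```python
from itertools import combinations

# Hypothetical partition matroid: at most one element per color class.
color = {1: "r", 2: "r", 3: "g", 4: "g", 5: "b"}
ground = set(color)

def independent(S):
    """Partition-matroid oracle: no color class used more than once."""
    used = [color[e] for e in S]
    return len(used) == len(set(used))

def subsets(U):
    return (set(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r))

# Exchange property: if A, B are independent and |A| < |B|,
# then some e in B \ A keeps A + e independent.
for A in subsets(ground):
    if not independent(A):
        continue
    for B in subsets(ground):
        if independent(B) and len(A) < len(B):
            assert any(independent(A | {e}) for e in B - A)
print("exchange property verified")
```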

In this paper, we consider two different constraints. The first constraint is an intersection of k matroids or, more generally, a k-matchoid (a generalization of the intersection of k matroids). The second constraint is a set of ℓ knapsacks ci for i ∈ [ℓ]. Next, we formally define these constraints.

###### Definition 1.

Let M1 = (N, I1), …, Mk = (N, Ik) be arbitrary matroids over the common ground set N. An intersection of k matroids is an independence system M = (N, I) such that I = ∩ᵢ₌₁ᵏ Ii.

###### Definition 2.

An independence system M = (N, I) is a k-matchoid if there exist m different matroids M1 = (N1, I1), …, Mm = (Nm, Im) such that N = ∪ᵢ₌₁ᵐ Ni, each element e ∈ N appears in no more than k ground sets among N1, …, Nm, and S ∈ I if and only if S∩Ni ∈ Ii for every i ∈ [m].

A knapsack constraint is defined by a cost vector ci ∈ Rⁿ≥0 for the ground set N, where for the cost of a set S ⊆ N we have ci(S) = ∑e∈S ci(e). Given a knapsack capacity (or budget) bi, a set S is said to satisfy the knapsack constraint if ci(S) ≤ bi. We assume, without loss of generality, that the capacities of all knapsacks are normalized to 1.

Assume there is a global ordering of elements e1 < e2 < ⋯ < en. For a set S and an element e ∈ S, the contribution of e to S (denoted by we) is the marginal gain of adding element e to all elements of S that are smaller than e, i.e., we = f(e ∣ {e′ ∈ S : e′ < e}). From the submodularity of f, it is straightforward to show that ∑e∈S we = f(S). The benefit of adding e ∉ S to set S (denoted by we) is the marginal gain of adding element e to set S, i.e., we = fS(e). Furthermore, for each element e, γe = ∑ᵢ₌₁ˡ ci(e) represents the aggregate cost of e over all knapsacks. It is easy to see that ∑e∈S γe = ∑ᵢ₌₁ˡ ci(S). We also denote the latter quantity, the aggregate cost of all elements of S over all knapsacks, by γ(S). Since we have ℓ knapsacks and the capacity of each knapsack is normalized to 1, for any feasible solution S, we always have γ(S) ≤ ℓ ≤ k.
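The telescoping identity ∑e∈S we = f(S) and the bound γ(S) ≤ ℓ are easy to verify in code; the sketch below uses a hypothetical coverage objective and two toy knapsack cost vectors of our own:

```python
# Hypothetical toy instance: coverage objective and two knapsack cost vectors.
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
costs = [{1: 0.4, 2: 0.3, 3: 0.2},   # knapsack 1 (capacity normalized to 1)
         {1: 0.1, 2: 0.5, 3: 0.3}]   # knapsack 2

def f(S):
    return len(set().union(*(cover[e] for e in S))) if S else 0

def contribution(e, S):
    """w_e for e in S: marginal gain of e over the smaller elements of S."""
    prefix = {x for x in S if x < e}
    return f(prefix | {e}) - f(prefix)

def gamma(e):
    """Aggregate cost gamma_e of e over all knapsacks."""
    return sum(c[e] for c in costs)

S = {1, 2, 3}
# Telescoping: the contributions of the elements of S sum to f(S).
assert sum(contribution(e, S) for e in S) == f(S)
# For a set feasible in every knapsack, gamma(S) <= number of knapsacks.
assert sum(gamma(e) for e in S) <= len(costs)
print(f(S), sum(gamma(e) for e in S))
```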

## 4 The Barrier Function and Our Algorithms

In this section, we first explain our proposed barrier function. We then present Barrier-Greedy and Barrier-Greedy++ and prove that these two algorithms, by approximately finding a local minimum of the barrier function, can efficiently maximize a monotone submodular function subject to the intersection of k matroids and ℓ knapsacks. At the end of this section, we demonstrate how our algorithms can be extended to the case of k-matchoid constraints.

### 4.1 The Barrier-Greedy Algorithm

Existing local-search algorithms under matroid constraints try to maximize the objective function over a space of feasible swaps [20, 21]; however, our proposed method, a new local-search algorithm called Barrier-Greedy, avoids an exponential dependence on k while it incorporates the additional knapsack constraints. Note that the knapsack constraints generally make the structure of feasible swaps even more complicated.

As a first technical contribution, instead of making the space of feasible swaps huge and more complicated, we incorporate the knapsack constraints into a potential function similar to barrier functions in the continuous optimization domain. For a set function f and the intersection of k matroids and ℓ knapsack constraints c1, …, cℓ, we propose the following potential function:

 ϕ(S) = (OPT−(k+1)⋅f(S)) / (1−∑ℓi=1ci(S)), (2)

where OPT is the optimum value for Problem (1). This potential function incorporates the knapsack constraints in a very conservative way: while ∑ℓi=1ci(S) for a feasible set S could be as large as ℓ, we consider only sets with ∑ℓi=1ci(S) < 1, whereas for sets with a larger weight the potential function becomes negative. (In Section 5, we propose a version of our algorithm that is more aggressive towards approaching the boundaries of the knapsack constraints.) We point out that the choice of our potential function works best for a combination of k matroids and k knapsacks. When the number of matroid and knapsack constraints is not equal, we can always add redundant constraints so that k is the maximum of the two numbers. For this reason, in the rest of this paper, we assume ℓ = k.
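In code, the potential (2) is a one-liner. The sketch below (with hypothetical values of Ω and k of our own choosing, Ω standing in for OPT) also illustrates the barrier behavior: the potential is amplified as the aggregate knapsack cost γ(S) approaches 1:

```python
def potential(f_S, gamma_S, omega, k):
    """Barrier potential: (omega − (k+1)·f(S)) / (1 − gamma(S)).

    f_S: objective value f(S); gamma_S: aggregate knapsack cost gamma(S);
    omega: estimate of OPT (hypothetical here); k: number of matroids.
    Only tracked for gamma(S) < 1: the barrier blows up at the boundary.
    """
    assert gamma_S < 1, "barrier only tracks sets with aggregate cost < 1"
    return (omega - (k + 1) * f_S) / (1 - gamma_S)

# Hypothetical numbers: omega = 10, k = 2. The potential decreases as f(S)
# grows, and is amplified as gamma(S) approaches 1.
p_far = potential(f_S=1.0, gamma_S=0.1, omega=10, k=2)
p_near = potential(f_S=1.0, gamma_S=0.9, omega=10, k=2)
print(p_far, p_near)  # the same deficit is penalized harder near the boundary
```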

In Barrier-Greedy, our main goal is to efficiently minimize the potential function in several consecutive sequential rounds. This potential function is designed in a way such that either the current solution respects all the knapsack constraints or, if the solution violates any of the knapsack constraints, we can guarantee that the objective value is already sufficiently large. Note that the potential function involves the knowledge of OPT; we replace it by an estimate that we can "guess" (enumerate over) efficiently by a standard technique.

As a second technical contribution, we optimize the local search procedure for matroids. More precisely, we improve upon the running time of the local-search method of Lee et al. [20] by a novel greedy approach that efficiently searches for the best existing swap, instead of a brute-force search among all possible swaps. With these two points in mind, we now proceed to explain our first proposed algorithm, Barrier-Greedy, in detail.

For the running of Barrier-Greedy, we require an accurate enough estimate of the optimum value OPT, which we denote by Ω. Indeed, a technique first proposed by Badanidiyuru et al. [1] can be used to guess such a value: from the submodularity of f, we can deduce that d ≤ OPT ≤ r⋅d, where d is the largest value of a singleton and r is the maximum cardinality of a feasible solution. Then, it suffices to try the O(log₁₊ϵ r) different guesses in the set {d, (1+ϵ)⋅d, (1+ϵ)²⋅d, …, r⋅d} to obtain a close enough estimate of OPT. In the rest of this section, we assume that we have access to a value Ω such that (1−ϵ)⋅OPT ≤ Ω ≤ OPT. Using Ω as an estimate of OPT, our potential function converts to

 ϕ(S) = (Ω−(k+1)⋅f(S)) / (1−γ(S)).
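The guessing step can be sketched as follows (our own illustration): a geometric grid over [d, r·d], where d is the largest singleton value, contains a suitable estimate Ω for any possible value of OPT:

```python
def opt_guesses(d, r, eps):
    """Geometric grid {d, (1+eps)d, (1+eps)^2 d, ...} covering [d, r*d]."""
    guesses, g = [], d
    while g <= r * d * (1 + eps):
        guesses.append(g)
        g *= 1 + eps
    return guesses

# Hypothetical instance: largest singleton value d = 3, at most r = 40 elements.
d, r, eps = 3.0, 40, 0.1
grid = opt_guesses(d, r, eps)
# Any OPT in [d, r*d] is approximated from below within a (1+eps) factor.
for opt in (3.0, 17.3, 120.0):
    omega = max(g for g in grid if g <= opt)
    assert opt / (1 + eps) <= omega <= opt
print(len(grid))  # O(log_{1+eps} r) guesses in total
```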

To quantify the effect of each element a on the potential function ϕ, as a notion of its individual energy, we define the following quantity:

 δa=(k+1)⋅(1−γ(S))⋅wa−(Ω−(k+1)⋅f(S))⋅γa. (3)

The quantity δa measures how desirable an element a is with respect to the current solution S, i.e., elements with larger values of δa have a larger effect on decreasing the potential function. Also, any element a with δa ≤ 0 can be removed from the solution without increasing the potential function (see Lemma 4).

The Barrier-Greedy algorithm starts with an empty set S = ∅ and performs the following steps for at most O(r⋅log(1/ϵ)) iterations, or until it reaches a solution S such that (k+1)⋅f(S) ≥ Ω. Firstly, it finds an element b ∉ S with the maximum value of δb−∑i∈Jbδai(b), where Jb = {i : S+b ∉ Ii} and ai(b) = argmin{δa : a ∈ S and S−a+b ∈ Ii} for i ∈ Jb; it then adds b to S and removes the elements {ai(b) : i ∈ Jb} from S. Barrier-Greedy computes the values of δa from Eq. 3. Note that, in this step, we need to compute δa for all elements only once and store them; then we can use these pre-computed values to find the best candidate b. The goal of this step is to find an element b such that its addition to the set S, together with the removal of a corresponding set of elements from S, decreases the potential function by a large margin while still keeping the solution feasible. In the second step, Barrier-Greedy removes all elements a with δa ≤ 0 from the set S. In Lemma 4, we prove that these removals can only decrease the potential function. The Barrier-Greedy algorithm produces a solution with a good objective value mainly for two reasons:

• if it continues for O(r⋅log(1/ϵ)) iterations, we can prove that the potential function would be very close to zero, which consequently enables us to guarantee the performance for this case. Note that, for our solution, we maintain the invariant γ(S) < 1 to make sure the knapsack constraints are also satisfied.

• if (k+1)⋅f(S) ≥ Ω, we prove that the objective value of one of the two feasible sets S∖{b} and {b} is at least Ω/(2(k+1)), where b is the last element added to S.

The details of Barrier-Greedy are described in Algorithm 1. Theorem 3 guarantees the performance of Barrier-Greedy.
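Since Algorithm 1 is not reproduced in this excerpt, the following self-contained sketch (our own simplification, not the paper's pseudocode) illustrates the main loop for the special case k = 1 with a uniform matroid, a single knapsack, and Ω supplied directly: compute the δ values from Eq. 3, perform the swap maximizing δb−∑δai(b), and drop elements with δa ≤ 0. All instance data is hypothetical, and the knapsack check is a conservative simplification:

```python
import math

# Our own simplified Barrier-Greedy sketch for k = 1: a uniform matroid
# (|S| <= cap) and one knapsack, with a coverage objective (all hypothetical).
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}, 4: {"a", "c", "d", "e"}}
cost = {1: 0.3, 2: 0.3, 3: 0.2, 4: 0.45}    # knapsack, capacity 1
cap, k = 2, 1                               # uniform-matroid rank; k = 1

def f(S):
    return len(set().union(*(cover[e] for e in S))) if S else 0

def gamma(S):
    return sum(cost[e] for e in S)

def delta(a, S):
    """Eq. (3): delta_a = (k+1)(1-gamma(S))*w_a - (omega-(k+1)f(S))*gamma_a."""
    w = f(S | {a}) - (f(S - {a}) if a in S else f(S))
    return (k + 1) * (1 - gamma(S)) * w - (omega - (k + 1) * f(S)) * cost[a]

omega = 5.0                 # a guess with omega <= OPT (OPT = f({2, 4}) = 5)
S = set()
for _ in range(math.ceil(cap * math.log(100))):     # ~ r*ln(1/eps) iterations
    if (k + 1) * f(S) >= omega:                     # objective already large
        break
    best, best_gain, best_out = None, 0.0, None
    for b in set(cover) - S:
        if gamma(S) + cost[b] >= 1:                 # keep invariant gamma(S) < 1
            continue                                # (conservative simplification)
        swap_out = min(S, key=lambda a: delta(a, S)) if len(S) >= cap else None
        gain = delta(b, S) - (delta(swap_out, S) if swap_out else 0.0)
        if gain > best_gain:
            best, best_gain, best_out = b, gain, swap_out
    if best is None:
        break
    S = (S - {best_out} if best_out else S) | {best}
    S = {a for a in S if delta(a, S) > 0}           # drop elements with delta <= 0
print(sorted(S), f(S))
```

On this toy instance the loop adds element 4 and then stops because (k+1)·f(S) already exceeds Ω, matching the second termination condition above.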

###### Theorem 3.

Barrier-Greedy (Algorithm 1) provides a 2(k+1+ϵ)-approximation for the problem of maximizing a monotone submodular function subject to the intersection of k matroids and ℓ knapsack constraints (for ℓ≤k). It also runs in time polynomial in n, r, and 1/ϵ, where r is the maximum cardinality of a feasible solution.

###### Proof.

We first prove that removing elements with δa ≤ 0 can only decrease the potential function ϕ.

###### Lemma 4.

Suppose that S is a current solution such that γ(S) < 1 and a ∈ S is such that δa ≤ 0. Then, if we define S′ = S∖{a}, we obtain a solution such that γ(S′) < 1 and ϕ(S′) ≤ ϕ(S).

###### Proof.

First note that by removing an element, the total cost of the knapsacks can only decrease, so we still have γ(S′) < 1, as the cost of each element is non-negative in all knapsacks. Consider the change in the potential function:

 ϕ(S′)−ϕ(S) = (Ω−(k+1)⋅f(S′))/(1−γ(S′)) − (Ω−(k+1)⋅f(S))/(1−γ(S)) (from Eq. 2)
 = ((Ω−(k+1)⋅f(S′))⋅(1−γ(S)) − (Ω−(k+1)⋅f(S))⋅(1−γ(S′))) / ((1−γ(S))⋅(1−γ(S′))). (4)

By submodularity of the function f, we have f(S′) ≥ f(S)−wa: for a ∈ S, we have f(a ∣ S∖{a}) ≤ wa, since the elements of S smaller than a form a subset of S∖{a}. Also, from the linearity of the knapsack costs, we have γ(S′) = γ(S)−γa. Therefore, by applying these two bounds to the right side of Eq. (4), we get:

 ϕ(S′)−ϕ(S) ≤ ((Ω−(k+1)⋅f(S)+(k+1)⋅wa)⋅(1−γ(S)) − (Ω−(k+1)⋅f(S))⋅(1−γ(S)+γa)) / ((1−γ(S))⋅(1−γ(S′)))
 = ((k+1)⋅wa⋅(1−γ(S)) − (Ω−(k+1)⋅f(S))⋅γa) / ((1−γ(S))⋅(1−γ(S′)))
 = δa / ((1−γ(S))⋅(1−γ(S′))) ≤ 0. (since δa ≤ 0 and γ(S′) < 1) ∎

After removing all elements with δa ≤ 0, we obtain a new solution such that δa > 0 for all a ∈ S. In the next step, we need to include a new element in order to decrease the potential function the most. The following lemma provides an algorithmic procedure to achieve this goal. Recall that we denote the i-th matroid constraint by Mi = (N, Ii).

###### Lemma 5.

Assume Ω ≤ OPT, and S ∈ I is the current solution such that γ(S) < 1, |S| ≤ r, and (k+1)⋅f(S) < Ω. Assume that for each a ∈ S, δa > 0. Given b ∉ S, let Jb = {i : S+b ∉ Ii}, and ai(b) = argmin{δa : a ∈ S and S−a+b ∈ Ii} for each i ∈ Jb. Then there is b ∈ S∗∖S such that

 δb−∑i∈Jbδai(b) ≥ (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)).
###### Proof.

To prove this lemma, we first state the following well-known result for exchange properties of matroids.

###### Lemma 6 ([33], Corollary 39.12a).

Let M = (N, I) be a matroid and let A, B ∈ I with |A| = |B|. Then there is a perfect matching π between A∖B and B∖A such that for every b ∈ B∖A, the set A−π(b)+b is an independent set.

Let S∗ be an optimal solution, so that f(S∗) = OPT ≥ Ω. Let us assume that Ai and Bi are bases of Mi containing S and S∗, respectively. By Lemma 6, there is a perfect matching πi between Ai∖Bi and Bi∖Ai such that for any b ∈ Bi∖Ai, Ai−πi(b)+b ∈ Ii. For each b ∈ S∗∖S and i ∈ Jb (where Jb, defined as above, denotes the matroids in which we cannot add b without removing something from S), let πi(b) denote the endpoint in S of the edge matching b in πi; note that πi(b) must indeed lie in S, as otherwise S+b ⊆ Ai−πi(b)+b would be independent in Mi. This means that S−πi(b)+b ∈ Ii.

Since for each i ∈ Jb we pick ai(b) to be an element of S minimizing δa subject to the condition S−a+b ∈ Ii, and πi(b) is a possible candidate for ai(b), we have δai(b) ≤ δπi(b). Consequently, it is sufficient to bound δb−∑i∈Jbδπi(b) from below to prove the lemma.

Since each a ∈ S is matched at most once in each matching πi, we obtain that each a appears as πi(b) at most k times over different b and i. Note that it could appear fewer than k times, due to the fact that it might be matched to elements outside of S∗∖S. Let us define a set Tb ⊆ S for each b ∈ S∗∖S that contains {πi(b) : i ∈ Jb} plus some arbitrary additional elements of S, so that each element of S appears in exactly k sets Tb. Since δa > 0 for all a ∈ S, we have

 δb−∑a∈Tbδa≤δb−∑i∈Jbδπi(b)≤δb−∑i∈Jbδai(b).

Hence it is sufficient to prove that δb−∑a∈Tbδa ≥ (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)) for some b ∈ S∗. Let us choose a random b ∈ S∗ uniformly and compute the expectation \bf E[δb−∑a∈Tbδa]. First, since each element of S∗ is chosen with probability 1/|S∗|, we obtain

 \bf E[wb] = ∑b∈S∗wb/|S∗| = ∑b∈S∗fS(b)/|S∗| ≥ fS(S∗)/|S∗| ≥ (Ω−f(S))/|S∗|,

by submodularity. Similarly, since S∗ is a feasible solution, we have

 \bf E[γb] = (1/|S∗|)⋅∑b∈S∗γb ≤ k/|S∗|.

Concerning the contribution of the items in Tb, we obtain

 \bf E[∑a∈Tbwa] = (1/|S∗|)⋅∑b∈S∗∑a∈Tbwa = (k/|S∗|)⋅∑a∈Swa = (k/|S∗|)⋅f(S),

using the fact that each a ∈ S appears in exactly k sets Tb. Similarly,

 \bf E[∑a∈Tbγa] = (1/|S∗|)⋅∑b∈S∗∑a∈Tbγa = (k/|S∗|)⋅γ(S).

All together, we obtain

 \bf E[δb−∑a∈Tbδa] = \bf E[(k+1)⋅(1−γ(S))⋅(wb−∑a∈Tbwa) − (Ω−(k+1)⋅f(S))⋅(γb−∑a∈Tbγa)]
 ≥ ((k+1)/|S∗|)⋅(1−γ(S))⋅(Ω−f(S)−k⋅f(S)) − (1/|S∗|)⋅(Ω−(k+1)⋅f(S))⋅(k−k⋅γ(S))
 = (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)).

Since the expectation is at least (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)), there must exist an element b ∈ S∗ for which the expression is at least the same amount, which proves the lemma. ∎

Now, we bound the maximum number of iterations required to converge to a solution whose value is sufficiently high. Let S∗ be the optimal solution and let r ≥ |S∗| be the maximum cardinality of a feasible solution. In Algorithm 1, we start from S = ∅ and repeat the following: As long as δa ≤ 0 for some a ∈ S, we remove a from S. If there is no such a, we find b ∉ S such that δb−∑i∈Jbδai(b) is maximized (see Lemma 5); we include the element b in S and remove the set {ai(b) : i ∈ Jb} from S.

###### Lemma 7.

Barrier-Greedy, after at most r⋅ln(1/ϵ) iterations, returns a set S such that (k+1)⋅f(S) ≥ (1−ϵ)⋅Ω. Furthermore, at least one of the two sets S∖{b} and {b} is feasible, where b is the last element added to S.

###### Proof.

At the beginning of the process, we have ϕ(∅) = Ω. Our goal is to show that ϕ(S) decreases sufficiently fast, while we keep the invariant γ(S) < 1.

We know, from the result of Lemma 4, that removing elements with δa ≤ 0 can only decrease the value of ϕ. We ignore the possible gain from these steps. When we include a new element b and remove {ai(b) : i ∈ Jb} from S, we get from Lemma 5:

 δb−∑i∈Jbδai(b) ≥ (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)).

Next, let us relate this to the change in ϕ. We denote the modified set by S′ = S∖{ai(b) : i ∈ Jb}+b. First, by submodularity and the definition of the w values, we know that

 f(S′)≥f(S)+wb−∑i∈Jbwai(b).

We also have

 γ(S′)=γ(S)+γb−∑i∈Jbγai(b).

First, let us consider what happens when γ(S′) ≥ 1. This means that γb−∑i∈Jbγai(b) ≥ 1−γ(S). Since we know that δb−∑i∈Jbδai(b) > 0, this means (by the definitions of the δ values and γ) that

 (k+1)⋅(wb−∑i∈Jbwai(b))≥Ω−(k+1)⋅f(S).

In other words, (k+1)⋅f(S′) ≥ Ω. Note that S′ might be infeasible, but S′∖{b} is feasible (since S was feasible), so in this case we are done.

In the following, we assume that γ(S′) < 1. Then the potential change is

 ϕ(S′)−ϕ(S) ≤ ((Ω−(k+1)⋅(f(S)+wb−∑i∈Jbwai(b)))⋅(1−γ(S)) − (Ω−(k+1)⋅f(S))⋅(1−γ(S)−γb+∑i∈Jbγai(b))) / ((1−γ(S))⋅(1−γ(S′)))
 = ((k+1)⋅(−wb+∑i∈Jbwai(b))⋅(1−γ(S)) − (Ω−(k+1)⋅f(S))⋅(−γb+∑i∈Jbγai(b))) / ((1−γ(S))⋅(1−γ(S′)))
 = −(δb−∑i∈Jbδai(b)) / ((1−γ(S))⋅(1−γ(S′)))
 ≤ −(1/|S∗|)⋅(Ω−(k+1)⋅f(S))/(1−γ(S′)) ≤ −(1/r)⋅((1−γ(S))/(1−γ(S′)))⋅ϕ(S),

using Lemma 5. We infer that

 ϕ(S′) ≤ (1−(1/r)⋅(1−γ(S))/(1−γ(S′)))⋅ϕ(S).

By induction, if we denote by St the solution after t iterations,

 ϕ(St) ≤ ∏ti=1(1−(1/r)⋅(1−γ(Si−1))/(1−γ(Si)))⋅ϕ(S0) ≤ e^(−(1/r)⋅∑ti=1(1−γ(Si−1))/(1−γ(Si)))⋅ϕ(S0).

Here, we use the arithmetic-geometric-mean inequality:

 (1/t)⋅∑ti=1(1−γ(Si−1))/(1−γ(Si)) ≥ (∏ti=1(1−γ(Si−1))/(1−γ(Si)))^(1/t) = ((1−γ(S0))/(1−γ(St)))^(1/t) ≥ 1,

where the last inequality holds since γ(S0) = 0 and γ(St) < 1.

Therefore, we can upper bound the potential function at iteration t:

 ϕ(St) ≤ e^(−(t/r)⋅(1/t)⋅∑ti=1(1−γ(Si−1))/(1−γ(Si)))⋅ϕ(S0) ≤ e^(−t/r)⋅ϕ(S0) = e^(−t/r)⋅Ω.

For t = r⋅ln(1/ϵ), we obtain ϕ(St) ≤ ϵ⋅Ω (and 1−γ(St) ≤ 1), which implies (k+1)⋅f(St) ≥ (1−ϵ)⋅Ω. ∎
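The telescoping bound can be sanity-checked numerically: for any nondecreasing sequence of γ values starting at 0 and staying below 1, the product of the per-iteration factors is at most e^(−t/r), by 1−x ≤ e^(−x) and the AM-GM step above. The sketch below (hypothetical r, t, and γ values of our own) performs this check:

```python
import math

# Sanity check of the decay phi(S_t) <= e^{-t/r} * phi(S_0).
r, t = 20, 40
gammas = [0.02 * i for i in range(t + 1)]   # gamma(S_0)=0, ..., gamma(S_t)=0.8
ratios = [(1 - gammas[i - 1]) / (1 - gammas[i]) for i in range(1, t + 1)]
product = math.prod(1 - rho / r for rho in ratios)

assert all(rho >= 1 for rho in ratios)      # gamma is nondecreasing
assert product <= math.exp(-t / r)          # the telescoped decay bound
print(product, math.exp(-t / r))
```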

Now, we have all the required material to prove Theorem 3.

##### Proof of Theorem 3

The for loop for estimating OPT is repeated O(log₁₊ϵ r) times. Consider the value of Ω such that (1−ϵ)⋅OPT ≤ Ω ≤ OPT. We perform the local search procedure: In each iteration, we check all possible candidates b ∉ S and find the best swap ai(b) for each matroid where a swap is needed (the set of indices Jb). This requires checking the membership oracles of the k matroids and the δa values for each potential swap, which takes O(n⋅r⋅k) steps per iteration. Note that we assume k to be a constant, but generally it contributes only to the multiplicative constant rather than the degree of the polynomial. Finally, we choose the elements b and ai(b) so that δb−∑i∈Jbδai(b) is maximized. Due to Lemma 5, the best swap satisfies δb−∑i∈Jbδai(b) ≥ (1/|S∗|)⋅(1−γ(S))⋅(Ω−(k+1)⋅f(S)). Following this swap, we need to recompute the values of δa for a ∈ S and remove all elements with δa ≤ 0. Considering Lemma 7, r⋅ln(1/ϵ) iterations of the local search procedure suffice before we terminate. Therefore, the algorithm terminates within a running time polynomial in n, r, and 1/ϵ. In the end, we have a set S such that (k+1)⋅f(S) ≥ (1−ϵ)⋅Ω (as the result of Lemma 7). It is possible that S is infeasible, but both S∖{b} and {b} are feasible (where b is the last-added element), and by submodularity one of them has an objective value of at least (1−ϵ)⋅Ω/(2(k+1)). ∎

### 4.2 The Barrier-Greedy++ Algorithm

In this section, we use an enumeration technique to improve the approximation factor of Barrier-Greedy to . For this reason, we propose the following modified algorithm: for each feasible pair of elements , define a reduced instance where the objective function is replaced by a monotone and submodular function , and the knapsack capacities are decreased by . In this reduced instance, we remove the two elements and all elements with from the ground set . Recall that the contraction of a matroid to a set is defined by a matroid such that . In the reduced instance, we consider contractions of all the matroids to set as the new set of matroid constraints. Note that elements with are also removed from the ground set of these contracted matroids. Then, to obtain a solution , we run Algorithm 1 on the reduced instance. Finally, we return the best solution of over all feasible pairs . Here, by construction, we are sure that all the solutions are feasible in the original set of constraints. Note that, for the final solution, if there is no feasible pair of elements, we just return the most valuable singleton. The details of our algorithm (called Barrier-Greedy++) are described in Algorithm 2. Theorem 8 guarantees the performance of Barrier-Greedy.