Linear-Time Algorithms for Adaptive Submodular Maximization

July 8, 2020 · Shaojie Tang

In this paper, we develop fast algorithms for two stochastic submodular maximization problems. We start with the well-studied adaptive submodular maximization problem subject to a cardinality constraint, and develop the first linear-time algorithm which achieves a $(1-1/e-\epsilon)$ approximation ratio. Notably, the time complexity of our algorithm is $O(n\log\frac{1}{\epsilon})$ (number of function evaluations), which is independent of the cardinality constraint, where $n$ is the size of the ground set. We then introduce the concept of fully adaptive submodularity, and develop a linear-time algorithm for maximizing a fully adaptive submodular function subject to a partition matroid constraint. We show that this algorithm achieves a $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ function evaluations.


1 Introduction

Submodular maximization is a well-studied topic due to its applications in a wide range of domains, including active learning (Golovin and Krause 2011b), viral marketing (Tang and Yuan 2020, Yuan and Tang 2017b), and sensor placement (Krause and Guestrin 2007). Most existing studies focus on the non-adaptive setting, where one must select a group of items all at once subject to some practical constraint such as a cardinality constraint. Nemhauser et al. (1978) show that the classic greedy algorithm, which iteratively selects the item with the largest marginal utility on top of the previously selected items, achieves a $(1-1/e)$ approximation ratio when maximizing a monotone submodular function subject to a cardinality constraint. However, this algorithm needs $O(nk)$ function evaluations in order to select $k$ items, where $n$ is the size of the ground set and $k$ is the cardinality constraint. Only recently, Mirzasoleiman et al. (2015) improved this time complexity (number of function evaluations) to $O(n\log\frac{1}{\epsilon})$ by proposing a random-sampling-based stochastic greedy algorithm, which achieves a $(1-1/e-\epsilon)$ approximation ratio. Their basic idea can be roughly described as follows: at each round, draw a small random sample of items, and select, from that sample, the item with the largest marginal utility. They show that if the size of the random sample is set to $\frac{n}{k}\log\frac{1}{\epsilon}$, their algorithm achieves a near-optimal approximation ratio in linear time. In general, developing fast algorithms for submodular maximization has received considerable attention during the past several years (Leskovec et al. 2007, Badanidiyuru and Vondrák 2014, Mirzasoleiman et al. 2016, Ene and Nguyen 2018).
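To make the sampling idea concrete, below is a minimal sketch of this non-adaptive stochastic greedy algorithm; the toy coverage objective and all names in it are our own illustrative choices, not part of (Mirzasoleiman et al. 2015).

```python
import math
import random

def stochastic_greedy(ground_set, f, k, eps=0.1):
    """Stochastic greedy: per round, evaluate only a random sample of
    size (n/k) * log(1/eps) instead of the whole ground set."""
    n = len(ground_set)
    sample_size = max(1, math.ceil((n / k) * math.log(1 / eps)))
    selected = set()
    for _ in range(k):
        remaining = [e for e in ground_set if e not in selected]
        sample = random.sample(remaining, min(sample_size, len(remaining)))
        # Select the sampled item with the largest marginal utility.
        best = max(sample, key=lambda e: f(selected | {e}) - f(selected))
        selected.add(best)
    return selected

# Toy monotone submodular objective: coverage of random subsets of a universe.
subsets = {i: set(random.sample(range(100), 20)) for i in range(50)}
def coverage(items):
    return len(set().union(*(subsets[i] for i in items))) if items else 0

print(stochastic_greedy(list(subsets), coverage, k=5))
```

Each round costs one function evaluation per sampled item, so the total number of evaluations is about $n\log\frac{1}{\epsilon}$, regardless of $k$.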

Recently, Golovin and Krause (2011b) extended the previous study to the adaptive setting. They propose the problem of adaptive submodular maximization, a natural stochastic variant of the classical non-adaptive submodular maximization problem. They assume that each item is in a particular state drawn from a known prior distribution, and the only way to reveal an item's state is to select that item. Note that the decision on selecting an item is irrevocable, that is, we can not discard any item that was previously selected. Their goal is to adaptively select a group of items so as to maximize the expected utility of an adaptive submodular and adaptive monotone function. There have been numerous research studies on adaptive submodular maximization under different settings (Chen and Krause 2013, Tang and Yuan 2020, Tang 2020, Yuan and Tang 2017a, Fujii and Sakaue 2019). Our first result focuses on adaptive submodular maximization subject to a cardinality constraint $k$. The state of the art for this setting is a simple adaptive greedy policy (Golovin and Krause 2011b) which achieves a $(1-1/e)$ approximation ratio. Similar to the classic non-adaptive greedy algorithm (Nemhauser et al. 1978), their adaptive greedy policy needs $O(nk)$ function evaluations in order to select $k$ items. One might ask whether it is possible to have a linear-time algorithm for the adaptive setting. In this paper, we answer this question affirmatively and propose the first adaptive policy that achieves a $(1-1/e-\epsilon)$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ function evaluations. We generalize the non-adaptive stochastic greedy algorithm proposed in (Mirzasoleiman et al. 2015) to the adaptive setting, and show that a similar random-sampling-based technique can be used to evaluate the marginal utility of an item in each round under the adaptive setting. In the second part of this paper, we study a more general optimization problem subject to a partition matroid constraint. To make our problem approachable, we introduce a class of stochastic functions, called fully adaptive submodular functions. We then develop a generalized random-sampling-based adaptive policy for maximizing a fully adaptive submodular function subject to a partition matroid constraint. We show that this policy achieves a $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ function evaluations.

Remark: Another line of work focuses on developing fast parallel algorithms for non-adaptive submodular maximization (Balkanski and Singer 2018, Chekuri and Quanrud 2019). These works develop approximation algorithms for submodular maximization whose parallel runtime is logarithmic. All algorithms with logarithmic runtime require a good estimate of the optimal value. However, the optimal value is unknown initially; one common approach to resolving this issue is to run multiple instances of the algorithm in parallel using different guesses of the optimal value, which ensures that one of those guesses is a good approximation of the optimal value, and then to return the solution with the highest value. Unfortunately, the adaptive setting can not afford this type of "parallelization" in general. Recall that under the adaptive setting, the decision on selecting an item is irrevocable; when running multiple instances of an algorithm under the adaptive setting, we must keep all solutions returned from those instances. As a result, the existing studies on parallelization are not applicable to the adaptive setting.

2 Preliminaries

We start by introducing some important notation. In the rest of this paper, we use $[m]$ to denote the set $\{1, 2, \ldots, m\}$, and we use $|S|$ to denote the cardinality of a set $S$.

2.1 Items and States

We are given a set $E$ of $n$ items, and each item $e \in E$ is in a particular state belonging to a set $O$ of possible states. Let $\phi: E \rightarrow O$ denote a realization of the states of all items. We use $\Phi$ to denote a random realization, where $\Phi(e)$ is the random state of item $e$. The only way to reveal the actual state $\Phi(e)$ of an item $e$ is to select that item. We assume there is a known prior probability distribution $p(\phi) := \Pr[\Phi = \phi]$ over realizations $\phi$. A typical adaptive policy works as follows: select the first item and observe its state, then continue to select the next item based on the current observations, and so on. After each selection, we observe a partial realization $\psi$ of the states of some subset of $E$; for example, we are able to observe the partial realization of the states of those items which have been selected so far. We define the domain $\mathrm{dom}(\psi)$ of $\psi$ as the subset of items involved in $\psi$. For any realization $\phi$ and any partial realization $\psi$, we say $\phi$ is consistent with $\psi$, denoted $\phi \sim \psi$, if they are equal everywhere in the domain of $\psi$. We say that $\psi$ is a subrealization of $\psi'$, denoted $\psi \subseteq \psi'$, if $\mathrm{dom}(\psi) \subseteq \mathrm{dom}(\psi')$ and they are equal everywhere in $\mathrm{dom}(\psi)$. Let $p(\phi \mid \psi)$ denote the conditional distribution over realizations conditioned on a partial realization $\psi$: $p(\phi \mid \psi) := \Pr[\Phi = \phi \mid \Phi \sim \psi]$. There is a utility function $f$ from a subset of items and their states to a non-negative real number: $f: 2^E \times O^E \rightarrow \mathbb{R}_{\geq 0}$.
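As a concrete illustration of this model, the following sketch builds a toy instance: two items with two possible states each, an explicit prior over the four realizations, and the conditional distribution $p(\phi \mid \psi)$ obtained by restricting the prior to realizations consistent with a partial realization. All names (`PRIOR`, `consistent`, `posterior`) and the numbers are our own illustrative choices, not from the paper.

```python
# Toy model: a realization maps each item to a state; the prior is explicit.
ITEMS = ["a", "b"]

# Prior p(phi): realizations are stored as hashable tuples of (item, state) pairs.
PRIOR = {
    (("a", 0), ("b", 0)): 0.1,
    (("a", 0), ("b", 1)): 0.4,
    (("a", 1), ("b", 0)): 0.3,
    (("a", 1), ("b", 1)): 0.2,
}

def consistent(phi, psi):
    """phi ~ psi: realization phi agrees with partial realization psi on dom(psi)."""
    states = dict(phi)
    return all(states[item] == state for item, state in psi.items())

def posterior(psi):
    """p(phi | psi): the prior restricted to realizations consistent with psi."""
    mass = {phi: p for phi, p in PRIOR.items() if consistent(phi, psi)}
    total = sum(mass.values())
    return {phi: p / total for phi, p in mass.items()}

# After selecting item "a" and observing state 1:
print(posterior({"a": 1}))  # the two consistent realizations, with weights 0.6 and 0.4
```

In real applications the prior is typically given implicitly and such brute-force enumeration is infeasible; it is used here only to make the definitions concrete.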

2.2 Policies and Problem Formulation

Formally, any adaptive policy can be represented using a function $\pi$ from a set of partial realizations to items, $\pi: 2^{E \times O} \rightarrow E$, specifying which item to select next based on the current observations.

[Policy Concatenation] Given two policies $\pi$ and $\pi'$, let $\pi @ \pi'$ denote a policy that runs $\pi$ first, and then runs $\pi'$, ignoring the observations obtained from running $\pi$.

Given any policy $\pi$ and realization $\phi$, let $E(\pi, \phi)$ denote the subset of items selected by $\pi$ under realization $\phi$. The expected utility $f_{avg}(\pi)$ of a policy $\pi$ is

$$f_{avg}(\pi) = \mathbb{E}_{\Phi \sim p(\phi)}\big[f(E(\pi, \Phi), \Phi)\big]. \qquad (1)$$

Let $\Omega$ denote the set of feasible policies which satisfy some given constraints, such as a cardinality constraint. Our goal is to find a feasible policy that maximizes the expected utility:

$$\pi^* \in \arg\max_{\pi \in \Omega} f_{avg}(\pi).$$
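On the toy model above (this reuses `PRIOR` from the sketch in Section 2.1), the expected utility of a policy can be computed by brute-force enumeration over realizations, which is feasible only at toy scale; the utility `f` and the simple adaptive policy below are illustrative choices.

```python
def f(items, phi):
    """Toy utility: total state value of the selected items (illustrative only)."""
    states = dict(phi)
    return sum(states[e] for e in items)

def run_policy(policy, phi):
    """E(pi, phi): the items selected when the policy runs against realization phi."""
    psi, states = {}, dict(phi)
    while True:
        e = policy(psi)
        if e is None:
            return set(psi)
        psi[e] = states[e]  # selecting an item reveals its state

def f_avg(policy):
    """f_avg(pi) = E_{Phi ~ p(phi)} [ f(E(pi, Phi), Phi) ]."""
    return sum(p * f(run_policy(policy, phi), phi) for phi, p in PRIOR.items())

# Adaptive toy policy: select "a"; select "b" only if "a" was observed in state 1.
def policy(psi):
    if "a" not in psi:
        return "a"
    if psi["a"] == 1 and "b" not in psi:
        return "b"
    return None

print(f_avg(policy))  # 0.7 under the toy prior
```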

2.3 Adaptive Submodularity, Monotonicity and Fully Adaptive Submodularity

We start by introducing two notions. [Conditional Expected Marginal Utility of an Item] For any partial realization $\psi$ and any item $e \in E$, the conditional expected marginal utility of $e$ conditioned on $\psi$ is

$$\Delta(e \mid \psi) = \mathbb{E}_{\Phi}\big[f(\mathrm{dom}(\psi) \cup \{e\}, \Phi) - f(\mathrm{dom}(\psi), \Phi) \mid \Phi \sim \psi\big],$$

where the expectation is taken over $\Phi$ with respect to $p(\phi \mid \psi)$.
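Continuing the toy example (this reuses `posterior` and `f` from the sketches above), $\Delta(e \mid \psi)$ can be computed by direct enumeration:

```python
def marginal(e, psi):
    """Delta(e | psi): expected gain of adding item e on top of dom(psi),
    with the expectation over Phi ~ p(phi | psi)."""
    dom = set(psi)
    return sum(p * (f(dom | {e}, phi) - f(dom, phi))
               for phi, p in posterior(psi).items())

# Expected marginal utility of "b" after observing item "a" in state 1:
print(marginal("b", {"a": 1}))  # 0.4 under the toy prior above
```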

[Conditional Expected Marginal Utility of a Policy] For any partial realization $\psi$ and any policy $\pi$, the conditional expected marginal utility of a policy $\pi$ conditioned on $\psi$ is

$$\Delta(\pi \mid \psi) = \mathbb{E}_{\Phi}\big[f(\mathrm{dom}(\psi) \cup E(\pi, \Phi), \Phi) - f(\mathrm{dom}(\psi), \Phi) \mid \Phi \sim \psi\big],$$

where the expectation is taken over $\Phi$ with respect to $p(\phi \mid \psi)$.

We next introduce two important concepts proposed in (Golovin and Krause 2011b): adaptive submodularity and adaptive monotonicity. (Golovin and Krause 2011b) [Adaptive Submodularity] A function $f$ is adaptive submodular with respect to a prior distribution $p(\phi)$ if for any two partial realizations $\psi$ and $\psi'$ such that $\psi \subseteq \psi'$, and any item $e \in E \setminus \mathrm{dom}(\psi')$, the following holds:

$$\Delta(e \mid \psi) \geq \Delta(e \mid \psi').$$

(Golovin and Krause 2011b) [Adaptive Monotonicity] A function $f$ is adaptive monotone with respect to a prior distribution $p(\phi)$ if for any partial realization $\psi$ with $\Pr[\Phi \sim \psi] > 0$ and any item $e \in E$, the following holds:

$$\Delta(e \mid \psi) \geq 0.$$

Lastly, we introduce the concept of fully adaptive submodularity, a stricter condition than adaptive submodularity.

[Fully Adaptive Submodularity] For any subset of items $V \subseteq E$ and any integer $k$, let $\Omega(V, k)$ denote the set of policies which are allowed to select at most $k$ items, only from $V$. A function $f$ is fully adaptive submodular with respect to a prior distribution $p(\phi)$ if for any two partial realizations $\psi$ and $\psi'$ such that $\psi \subseteq \psi'$, any subset of items $V \subseteq E$, and any integer $k$, the following holds:

$$\max_{\pi \in \Omega(V, k)} \Delta(\pi \mid \psi) \geq \max_{\pi \in \Omega(V, k)} \Delta(\pi \mid \psi').$$

In the above definition, $\pi$ can be restricted to policies that select any single item; thus any fully adaptive submodular function must be adaptive submodular according to Definition 2.3.
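To spell out this reduction: take $V = \{e\}$ and $k = 1$ for an arbitrary item $e \in E \setminus \mathrm{dom}(\psi')$. Assuming the maximum in $\Omega(\{e\}, 1)$ is attained by the policy that selects $e$ (which holds, e.g., under adaptive monotonicity), we get

$$\Delta(e \mid \psi) = \max_{\pi \in \Omega(\{e\}, 1)} \Delta(\pi \mid \psi) \geq \max_{\pi \in \Omega(\{e\}, 1)} \Delta(\pi \mid \psi') = \Delta(e \mid \psi'),$$

which is exactly the adaptive submodularity condition.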

3 Adaptive Stochastic Greedy Policy

We first study the adaptive submodular maximization problem subject to a cardinality constraint $k$. Under this setting, we say a policy is feasible if it selects at most $k$ items under any realization. Formally, let $\Omega$ denote the set of all feasible policies subject to the cardinality constraint $k$; our goal is to find a policy in $\Omega$ that maximizes the expected utility.

We next present our adaptive policy Adaptive Stochastic Greedy, denoted $\pi^{asg}$. The details of our algorithm are listed in Algorithm 1. Similar to the classic adaptive greedy algorithm, $\pi^{asg}$ runs round by round: starting with an empty set, at each round it selects one item that maximizes the conditional expected marginal utility based on the current observation. What is different from the adaptive greedy algorithm, however, is that at each round, $\pi^{asg}$ first samples a set $R$ of size $\frac{n}{k}\log\frac{1}{\epsilon}$ uniformly at random from the unselected items, and then selects the item with the largest conditional expected marginal utility from $R$. Our approach is a natural extension of the Stochastic Greedy algorithm (Mirzasoleiman et al. 2015), the first linear-time algorithm for maximizing a monotone submodular function under the non-adaptive setting; we generalize their results to the adaptive setting. We will show that $\pi^{asg}$ achieves a $(1-1/e-\epsilon)$ approximation ratio for maximizing an adaptive monotone and adaptive submodular function, and that its running time is linear in $n$ and independent of the cardinality constraint $k$. It is worth noting that the technique of lazy updates (Minoux 1978) can be used to further accelerate the computation of our algorithms in practice.

1:  $S \leftarrow \emptyset$; $t \leftarrow 1$.
2:  while $t \leq k$ do
3:     observe the current partial realization $\psi$;
4:     $R \leftarrow$ a random set of size $\frac{n}{k}\log\frac{1}{\epsilon}$ sampled uniformly at random from $E \setminus S$;
5:     $e' \leftarrow \arg\max_{e \in R} \Delta(e \mid \psi)$;
6:     $S \leftarrow S \cup \{e'\}$; $t \leftarrow t + 1$;
7:  return $S$
Algorithm 1 Adaptive Stochastic Greedy
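Below is a self-contained runnable sketch of Algorithm 1. The `marginal` and `observe` callables stand in for the conditional-expected-marginal-utility oracle $\Delta(\cdot \mid \psi)$ and for state revelation; the constant placeholder oracle in the usage example is arbitrary and only demonstrates the control flow.

```python
import math
import random

def adaptive_stochastic_greedy(items, k, eps, marginal, observe):
    """Sketch of Algorithm 1: per round, sample (n/k)*log(1/eps) unselected
    items, select the one with the largest conditional expected marginal
    utility, then observe its state."""
    n = len(items)
    sample_size = max(1, math.ceil((n / k) * math.log(1 / eps)))
    psi, selected = {}, []
    for _ in range(k):
        remaining = [e for e in items if e not in psi]
        sample = random.sample(remaining, min(sample_size, len(remaining)))
        best = max(sample, key=lambda e: marginal(e, psi))
        selected.append(best)
        psi[best] = observe(best)  # selecting an item reveals its state
    return selected

# Usage with a trivial simulated environment (illustrative only):
true_state = {e: random.choice([0, 1]) for e in range(20)}
print(adaptive_stochastic_greedy(
    items=list(true_state), k=4, eps=0.1,
    marginal=lambda e, psi: 0.5,        # placeholder oracle: uniform prior mean
    observe=lambda e: true_state[e]))
```

Each round evaluates the oracle once per sampled item, so the total number of oracle calls is about $n\log\frac{1}{\epsilon}$, matching the bound in Theorem 1 below.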

We next present the main theorem. Theorem 1. If $f$ is adaptive submodular and adaptive monotone, then the Adaptive Stochastic Greedy policy $\pi^{asg}$ achieves a $(1-1/e-\epsilon)$ approximation ratio in expectation with $O(n\log\frac{1}{\epsilon})$ function evaluations.

Proof: To prove this theorem, we follow an argument similar to the one used in (Mirzasoleiman et al. 2015) for the proof of their Theorem 1, extended to the adaptive setting. We first prove the time complexity of $\pi^{asg}$. Recall that we set the size of $R$ to $\frac{n}{k}\log\frac{1}{\epsilon}$; since the algorithm runs for $k$ rounds and evaluates each item of $R$ once per round, the total number of function evaluations is at most $k \cdot \frac{n}{k}\log\frac{1}{\epsilon} = n\log\frac{1}{\epsilon}$.

Before proving the approximation ratio of $\pi^{asg}$, we first provide a preparatory lemma. Lemma 1. Given any partial realization $\psi$, let $M^*(\psi)$ denote the $k$ items with the largest expected marginal utility conditioned on having observed $\psi$. Assume $R$ is sampled uniformly at random from $E \setminus \mathrm{dom}(\psi)$, and the size of $R$ is $\frac{n}{k}\log\frac{1}{\epsilon}$. We have $\Pr[R \cap M^*(\psi) \neq \emptyset] \geq 1 - \epsilon$.

The proof of Lemma 1 is deferred to the Appendix. With Lemma 1 in hand, we are now ready to bound the approximation ratio of $\pi^{asg}$. Let $S_t$ denote the first $t$ items selected by $\pi^{asg}$, and let $\psi_t$ denote the partial realization observed up to round $t$, i.e., $\mathrm{dom}(\psi_t) = S_t$. Our goal is to estimate the increased utility $f_{avg}(\pi^{asg}_{t+1}) - f_{avg}(\pi^{asg}_{t})$ during one round of our algorithm, where $\pi^{asg}_{t}$ denotes a policy that runs $\pi^{asg}$ until it selects $t$ items. Given a partial realization $\psi_t$, denote by $M^*(\psi_t)$ the $k$ items with the largest expected marginal utility conditioned on having observed $\psi_t$. Recall that $R$ is sampled uniformly at random from $E \setminus S_t$, so each item in $M^*(\psi_t)$ is equally likely to be contained in $R$. Moreover, the size of $R$ is set to $\frac{n}{k}\log\frac{1}{\epsilon}$. It follows that

$$\mathbb{E}\big[\Delta(e' \mid \psi_t)\big] \geq \frac{1-\epsilon}{k} \sum_{e \in M^*(\psi_t)} \Delta(e \mid \psi_t),$$

where $e'$ denotes the item selected in this round and the inequality is due to Lemma 1. Then we have

$$\sum_{e \in M^*(\psi_t)} \Delta(e \mid \psi_t) = \max_{S \subseteq E: |S| \leq k} \sum_{e \in S} \Delta(e \mid \psi_t) \geq \Delta(\pi^* \mid \psi_t),$$

where the inequality is due to Lemma 6 in (Golovin and Krause 2011b). Therefore

$$\mathbb{E}\big[\Delta(e' \mid \psi_t)\big] \geq \frac{1-\epsilon}{k}\, \Delta(\pi^* \mid \psi_t).$$

Then we have

$$f_{avg}(\pi^{asg}_{t+1}) - f_{avg}(\pi^{asg}_{t}) = \mathbb{E}\big[\Delta(e' \mid \psi_t)\big] \qquad (2)$$
$$\geq \frac{1-\epsilon}{k}\, \mathbb{E}\big[\Delta(\pi^* \mid \psi_t)\big] \qquad (3)$$
$$\geq \frac{1-\epsilon}{k}\, \big(f_{avg}(\pi^*) - f_{avg}(\pi^{asg}_{t})\big). \qquad (4)$$

Note that the first expectation is taken over two sources of randomness: one source is the randomness in choosing $R$, and the other source is the randomness in the partial realization $\psi_t$. The second inequality holds because $f$ is adaptive monotone, which gives $f_{avg}(\pi^{asg}_{t}) + \mathbb{E}[\Delta(\pi^* \mid \psi_t)] = f_{avg}(\pi^{asg}_{t} @ \pi^*) \geq f_{avg}(\pi^*)$. Based on (4), we have $f_{avg}(\pi^{asg}) \geq (1 - 1/e - \epsilon)\, f_{avg}(\pi^*)$ using induction.
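For completeness, here is the routine induction that turns the per-round recursion (4) into the claimed ratio; this is a reconstruction along the lines of the non-adaptive analysis in (Mirzasoleiman et al. 2015). Unrolling (4) for $t = 0, 1, \ldots, k-1$ gives

$$f_{avg}(\pi^{asg}) \geq \Big(1 - \big(1 - \tfrac{1-\epsilon}{k}\big)^{k}\Big) f_{avg}(\pi^*) \geq \big(1 - e^{-(1-\epsilon)}\big) f_{avg}(\pi^*) \geq (1 - 1/e - \epsilon)\, f_{avg}(\pi^*),$$

where the second inequality uses $1 - x \leq e^{-x}$ and the last uses $e^{\epsilon} \leq 1 + e\epsilon$ for $\epsilon \in [0, 1]$.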

4 Generalized Adaptive Stochastic Greedy Policy

We next study the fully adaptive submodular maximization problem subject to a partition matroid constraint. Let $\{E_1, E_2, \ldots, E_m\}$ be a collection of disjoint subsets of $E$. Given a set of integers $\{k_1, k_2, \ldots, k_m\}$, define the set of all feasible policies as those that select at most $k_i$ items from $E_i$ for each $i \in [m]$, under any realization. Our goal is to find a feasible policy that maximizes an adaptive monotone and fully adaptive submodular function.

4.1 Locally Greedy Policy

Before presenting our linear-time algorithm, we first introduce a Locally Greedy policy $\pi^l$. The basic idea of $\pi^l$ is as follows: starting with an empty set, $\pi^l$ first selects $k_1$ items from $E_1$ in a greedy manner, i.e., it iteratively adds the item that maximizes the conditional expected marginal utility conditioned on the realizations of the already selected items; then $\pi^l$ selects $k_2$ items from $E_2$ in the same greedy manner, and so on. Note that $\pi^l$ does not rely on any specific ordering of $E_1, \ldots, E_m$, and this motivates the term "locally greedy" that we use to name this policy. We can view $\pi^l$ as an adaptive version of the locally greedy algorithm proposed in (Fisher et al. 1978).

We first introduce some additional concepts. Given a policy $\pi$, for any $i \in [m]$ and $j \in [k_i]$, we define 1) its level-$(i,j)$-truncation as a policy that runs $\pi$ until it selects $j$ items from $E_i$, and 2) its strict level-$(i,j)$-truncation as a policy that runs $\pi$ until right before it selects the $j$-th item from $E_i$. It is clear that the level-$(i,j)$-truncation runs the strict level-$(i,j)$-truncation and then selects exactly one more item from $E_i$.

We next show that if $f$ is adaptive submodular and adaptive monotone, the expected utility of $\pi^l$ is at least half of the optimal. Lemma 2. If $f$ is adaptive submodular and adaptive monotone, then $f_{avg}(\pi^l) \geq \frac{1}{2} f_{avg}(\pi^*)$.

Proof: Consider the policy $\pi^l @ \pi^*$ that runs $\pi^l$ first, and then runs $\pi^*$. For each $i \in [m]$ and $j \in [k_i]$, let $\psi$ denote the partial realization obtained right before $\pi^*$ selects its $j$-th item from $E_i$ under $\pi^l @ \pi^*$. Based on this notation, we use $\psi_{i,j}$ to denote the subrealization of $\psi$ obtained after running the strict level-$(i,j)$-truncation of $\pi^l$. Conditioned on $\psi_{i,j}$, assume $\pi^l$ selects $e_{i,j}$ as the $j$-th item from $E_i$, and conditioned on $\psi$, assume $\pi^*$ selects $o_{i,j}$ as the $j$-th item from $E_i$. We first bound the expected marginal utility of $o_{i,j}$ conditioned on having observed $\psi$:

$$\Delta(o_{i,j} \mid \psi) \leq \Delta(o_{i,j} \mid \psi_{i,j}) \qquad (5)$$
$$\leq \max_{e \in E_i} \Delta(e \mid \psi_{i,j}) \qquad (6)$$
$$= \Delta(e_{i,j} \mid \psi_{i,j}), \qquad (7)$$

hence

$$\Delta(o_{i,j} \mid \psi) \leq \Delta(e_{i,j} \mid \psi_{i,j}). \qquad (8)$$

The equality (7) is due to the fact that $\pi^l$ selects the item that maximizes the conditional expected marginal utility, and the first inequality (5) is due to the assumption that $f$ is adaptive submodular, together with $\psi_{i,j} \subseteq \psi$.

Taking the expectation of both sides over $\psi$ and summing over all $i \in [m]$ and $j \in [k_i]$, we have

$$f_{avg}(\pi^l @ \pi^*) - f_{avg}(\pi^l) = \sum_{i \in [m]} \sum_{j \in [k_i]} \mathbb{E}\big[\Delta(o_{i,j} \mid \psi)\big] \qquad (9)$$
$$\leq \sum_{i \in [m]} \sum_{j \in [k_i]} \mathbb{E}\big[\Delta(e_{i,j} \mid \psi_{i,j})\big] \qquad (10)$$
$$= f_{avg}(\pi^l), \qquad (11)$$

where (10) is due to (8), and (11) holds because the expected per-round gains of $\pi^l$ telescope to its expected utility. Because for any policy $\pi$, adaptive monotonicity implies $f_{avg}(\pi @ \pi^*) \geq f_{avg}(\pi^*)$, it follows that

$$f_{avg}(\pi^*) \leq f_{avg}(\pi^l @ \pi^*) \qquad (12)$$
$$= f_{avg}(\pi^l) + \big(f_{avg}(\pi^l @ \pi^*) - f_{avg}(\pi^l)\big) \qquad (13)$$
$$\leq f_{avg}(\pi^l) + f_{avg}(\pi^l) \qquad (14)$$
$$= 2 f_{avg}(\pi^l). \qquad (15)$$

The first inequality (12) is due to the assumption that $f$ is adaptive monotone, and the second inequality (14) is due to (11). Then we have $f_{avg}(\pi^l) \geq \frac{1}{2} f_{avg}(\pi^*)$.

Remark: We believe that Lemma 2 is of independent interest in the field of adaptive submodular maximization under matroid constraints. In particular, Golovin and Krause (2011a) show that a classic adaptive greedy algorithm, which iteratively selects the item with the largest marginal contribution on top of the previously selected items, achieves an approximation ratio of $\frac{1}{2}$. This approximation factor is nearly optimal. When applied to our problem, their algorithm needs $O(n \sum_{i \in [m]} k_i)$ function evaluations. Since our locally greedy policy does not rely on any specific ordering of $E_1, \ldots, E_m$ and considers only one group at a time, it requires $O(\sum_{i \in [m]} |E_i| k_i)$ function evaluations.

4.2 Generalized Adaptive Stochastic Greedy Policy

Although $\pi^l$ requires fewer function evaluations than the classic adaptive greedy algorithm, its runtime still depends on the cardinality constraint $k_i$ of each group. We next present a linear-time adaptive policy, Generalized Adaptive Stochastic Greedy, denoted $\pi^g$, whose runtime is independent of the $k_i$. The basic idea of $\pi^g$ is similar to that of $\pi^l$, except that we leverage the adaptive stochastic greedy policy $\pi^{asg}$ to select the group of items from $E_i$ for each $i \in [m]$. A detailed description of $\pi^g$ can be found in Algorithm 2. We next sketch the idea of $\pi^g$: starting with an empty set, $\pi^g$ first selects $k_1$ items from $E_1$ using $\pi^{asg}$, i.e., it first samples a random set $R$ of size $\frac{|E_1|}{k_1}\log\frac{1}{\epsilon}$ uniformly at random from $E_1$, then selects the item with the largest conditional expected marginal utility from $R$, and selects each subsequent item from a freshly sampled set in the same greedy manner; then $\pi^g$ selects $k_2$ items from $E_2$ using $\pi^{asg}$, where the size of each random set is set to $\frac{|E_2|}{k_2}\log\frac{1}{\epsilon}$, conditioned on the current observation, and so on. Similar to $\pi^l$, $\pi^g$ does not rely on any specific ordering of $E_1, \ldots, E_m$.

1:  $S \leftarrow \emptyset$; $i \leftarrow 1$.
2:  while $i \leq m$ do
3:     $t \leftarrow 1$; while $t \leq k_i$ do
4:        observe the current partial realization $\psi$;
5:        $R \leftarrow$ a random set, with size $\frac{|E_i|}{k_i}\log\frac{1}{\epsilon}$, sampled uniformly at random from $E_i \setminus S$;
6:        $e' \leftarrow \arg\max_{e \in R} \Delta(e \mid \psi)$;
7:        $S \leftarrow S \cup \{e'\}$; $t \leftarrow t + 1$;
8:     $i \leftarrow i + 1$;
9:  return $S$
Algorithm 2 Generalized Adaptive Stochastic Greedy
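The following self-contained sketch mirrors Algorithm 2, running the per-group stochastic-greedy loop while carrying the partial realization across groups; as before, `marginal` and `observe` are stand-ins for the marginal-utility oracle and for state revelation, and the usage example is arbitrary.

```python
import math
import random

def generalized_adaptive_stochastic_greedy(groups, limits, eps, marginal, observe):
    """Sketch of Algorithm 2: run the stochastic-greedy selection inside each
    group E_i, carrying the partial realization psi across groups.
    `groups` is a list of item lists E_1..E_m, `limits` the budgets k_1..k_m."""
    psi, selected = {}, []
    for items, k in zip(groups, limits):
        sample_size = max(1, math.ceil((len(items) / k) * math.log(1 / eps)))
        for _ in range(k):
            remaining = [e for e in items if e not in psi]
            if not remaining:
                break
            sample = random.sample(remaining, min(sample_size, len(remaining)))
            best = max(sample, key=lambda e: marginal(e, psi))
            selected.append(best)
            psi[best] = observe(best)  # reveal the state of the chosen item
    return selected

# Illustrative usage: three disjoint groups with budgets (2, 1, 2).
true_state = {e: random.choice([0, 1]) for e in range(30)}
groups = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]
print(generalized_adaptive_stochastic_greedy(
    groups, limits=[2, 1, 2], eps=0.1,
    marginal=lambda e, psi: 0.5,      # placeholder oracle, as before
    observe=lambda e: true_state[e]))
```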

We next analyze the performance bound of $\pi^g$. We start by showing that the expected utility of $\pi^g$ is at least $\frac{1-1/e-\epsilon}{2-1/e-\epsilon}$ times the expected utility of $\pi^l$. Lemma 3. If $f$ is fully adaptive submodular and adaptive monotone, then $f_{avg}(\pi^g) \geq \frac{1-1/e-\epsilon}{2-1/e-\epsilon} f_{avg}(\pi^l)$.

Proof: For ease of presentation, we use $\pi^g_i$ (resp. $\pi^l_i$) to denote a policy that runs $\pi^g$ (resp. $\pi^l$) until it selects all items from $E_1 \cup \cdots \cup E_i$, and we write $\pi^g[i]$ (resp. $\pi^l[i]$) for the phase of $\pi^g$ (resp. $\pi^l$) that selects items from $E_i$; note that $\pi^g[i], \pi^l[i] \in \Omega(E_i, k_i)$, where $\Omega(E_i, k_i)$ denotes the set of policies which are allowed to select at most $k_i$ items, only from $E_i$. Consider the policy $\pi^g @ \pi^l$ that runs $\pi^g$ first, and then runs $\pi^l$. Let $\psi$ denote the partial realization obtained after running $\pi^g$, let $\psi_i$ denote the subrealization of $\psi$ obtained after running $\pi^g_i$, and let $\psi^{(i)}$ denote the partial realization obtained after running $\pi^g @ \pi^l$ up to right before $\pi^l$ starts selecting items from $E_i$; note that $\psi_{i-1} \subseteq \psi \subseteq \psi^{(i)}$. For a given $i$, we first bound the conditional expected marginal utility of the items selected from $E_i$ by $\pi^g$. Let $\beta = 1 - 1/e - \epsilon$. Then

$$\mathbb{E}\big[\Delta(\pi^g[i] \mid \psi_{i-1})\big] \geq \beta\, \mathbb{E}\Big[\max_{\pi \in \Omega(E_i, k_i)} \Delta(\pi \mid \psi_{i-1})\Big] \qquad (16)$$
$$\geq \beta\, \mathbb{E}\Big[\max_{\pi \in \Omega(E_i, k_i)} \Delta(\pi \mid \psi^{(i)})\Big] \qquad (17)$$
$$\geq \beta\, \mathbb{E}\big[\Delta(\pi^l[i] \mid \psi^{(i)})\big], \qquad (18)$$

hence

$$\mathbb{E}\big[\Delta(\pi^g[i] \mid \psi_{i-1})\big] \geq \beta\, \mathbb{E}\big[\Delta(\pi^l[i] \mid \psi^{(i)})\big]. \qquad (19)$$

The first inequality (16) is due to Theorem 1 and the fact that we use $\pi^{asg}$ to select the items from each group $E_i$; the second inequality (17) is due to the assumption that $f$ is fully adaptive submodular, together with $\psi_{i-1} \subseteq \psi^{(i)}$; and the last inequality (18) holds because $\pi^l[i] \in \Omega(E_i, k_i)$.

Summing (19) over all $i \in [m]$, we have

$$f_{avg}(\pi^g) = \sum_{i \in [m]} \mathbb{E}\big[\Delta(\pi^g[i] \mid \psi_{i-1})\big] \qquad (20)$$
$$\geq \beta \sum_{i \in [m]} \mathbb{E}\big[\Delta(\pi^l[i] \mid \psi^{(i)})\big] \qquad (21)$$
$$= \beta\, \big(f_{avg}(\pi^g @ \pi^l) - f_{avg}(\pi^g)\big). \qquad (22)$$

The inequality (21) is due to (19), and (20) and (22) hold because the expected per-phase gains of $\pi^g$ and of $\pi^l$ within $\pi^g @ \pi^l$ telescope to the corresponding expected utilities.

Because $f$ is adaptive monotone, we have $f_{avg}(\pi^g @ \pi^l) \geq f_{avg}(\pi^l)$, and therefore

$$f_{avg}(\pi^l) \leq f_{avg}(\pi^g @ \pi^l) \qquad (23)$$
$$= f_{avg}(\pi^g) + \big(f_{avg}(\pi^g @ \pi^l) - f_{avg}(\pi^g)\big) \qquad (24)$$
$$\leq f_{avg}(\pi^g) + \frac{1}{\beta}\, f_{avg}(\pi^g) \qquad (25)$$
$$= \frac{1+\beta}{\beta}\, f_{avg}(\pi^g). \qquad (26)$$

The first inequality (23) is due to the assumption that $f$ is adaptive monotone, and the second inequality (25) is due to (22). It follows that $f_{avg}(\pi^g) \geq \frac{\beta}{1+\beta}\, f_{avg}(\pi^l) = \frac{1-1/e-\epsilon}{2-1/e-\epsilon}\, f_{avg}(\pi^l)$.

We next present the main theorem and show that the approximation ratio of $\pi^g$ is at least $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ and its running time is $O(n\log\frac{1}{\epsilon})$ (number of function evaluations). Theorem 2. If $f$ is fully adaptive submodular and adaptive monotone, then $f_{avg}(\pi^g) \geq \frac{1-1/e-\epsilon}{4-2/e-2\epsilon} f_{avg}(\pi^*)$, and $\pi^g$ uses at most $O(n\log\frac{1}{\epsilon})$ function evaluations.

Proof: Lemma 2 and Lemma 3 together imply that $f_{avg}(\pi^g) \geq \frac{1-1/e-\epsilon}{2-1/e-\epsilon} \cdot \frac{1}{2}\, f_{avg}(\pi^*) = \frac{1-1/e-\epsilon}{4-2/e-2\epsilon}\, f_{avg}(\pi^*)$. Recall that in Algorithm 2, we set the size of the random set for group $E_i$ to $\frac{|E_i|}{k_i}\log\frac{1}{\epsilon}$; thus the total number of function evaluations is at most $\sum_{i \in [m]} k_i \cdot \frac{|E_i|}{k_i}\log\frac{1}{\epsilon} = \sum_{i \in [m]} |E_i| \log\frac{1}{\epsilon} \leq n\log\frac{1}{\epsilon}$.
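For a concrete sense of the guarantee, instantiating the ratio at $\epsilon = 0.1$ gives

$$\frac{1 - 1/e - \epsilon}{4 - 2/e - 2\epsilon} = \frac{1 - 0.368 - 0.1}{4 - 0.736 - 0.2} \approx \frac{0.532}{3.064} \approx 0.174,$$

i.e., the policy is guaranteed to recover roughly 17% of the optimal expected utility in this regime.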

5 Conclusion

In this paper, we develop the first linear-time algorithm for the adaptive submodular maximization problem subject to a cardinality constraint. We also study the fully adaptive submodular maximization problem subject to a partition matroid constraint, and develop a linear-time algorithm whose approximation ratio is a constant. In the future, we would like to develop fast algorithms subject to more general constraints such as knapsack constraints and general matroid constraints.

Appendix

Proof of Lemma 1: We first provide a lower bound on the probability that $R \cap M^*(\psi) \neq \emptyset$. Since $R$ consists of $\frac{n}{k}\log\frac{1}{\epsilon}$ items sampled uniformly at random from $E \setminus \mathrm{dom}(\psi)$, and $|E \setminus \mathrm{dom}(\psi)| \leq n$ while $|M^*(\psi)| = k$, each draw misses $M^*(\psi)$ with probability at most $1 - \frac{k}{n}$. Hence

$$\Pr[R \cap M^*(\psi) = \emptyset] \leq \Big(1 - \frac{k}{n}\Big)^{\frac{n}{k}\log\frac{1}{\epsilon}} \leq e^{-\log\frac{1}{\epsilon}}. \qquad (27)$$

It follows that $\Pr[R \cap M^*(\psi) \neq \emptyset] \geq 1 - e^{-\log\frac{1}{\epsilon}}$. Because $e^{-\log\frac{1}{\epsilon}} = \epsilon$, we have

$$\Pr[R \cap M^*(\psi) \neq \emptyset] \geq 1 - \epsilon. \qquad (28)$$
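This bound is easy to sanity-check numerically; the sketch below (with arbitrary illustrative parameters) estimates the hit probability by simulation.

```python
import math
import random

n, k, eps, trials = 1000, 10, 0.1, 20000
s = math.ceil((n / k) * math.log(1 / eps))   # sample size from Lemma 1
top = set(range(k))                          # stand-in for M*(psi): any fixed k items

hits = sum(bool(top & set(random.sample(range(n), s))) for _ in range(trials))
print(hits / trials)  # about 0.93 here, above the guaranteed 1 - eps = 0.9
```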

References

  • A. Badanidiyuru and J. Vondrák (2014) Fast algorithms for maximizing submodular functions. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pp. 1497–1514. Cited by: §1.
  • E. Balkanski and Y. Singer (2018) The adaptive complexity of maximizing a submodular function. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1138–1151. Cited by: §1.
  • C. Chekuri and K. Quanrud (2019) Parallelizing greedy for submodular set function maximization in matroids and beyond. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pp. 78–89. Cited by: §1.
  • Y. Chen and A. Krause (2013) Near-optimal batch mode active learning and adaptive submodular optimization. In ICML, pp. 160–168. Cited by: §1.
  • A. Ene and H. L. Nguyen (2018) Towards nearly-linear time algorithms for submodular maximization with a matroid constraint. arXiv preprint arXiv:1811.07464. Cited by: §1.
  • M. L. Fisher, G. L. Nemhauser, and L. A. Wolsey (1978) An analysis of approximations for maximizing submodular set functions II. In Polyhedral combinatorics, pp. 73–87. Cited by: §4.1.
  • K. Fujii and S. Sakaue (2019) Beyond adaptive submodularity: approximation guarantees of greedy policy with adaptive submodularity ratio. In International Conference on Machine Learning, pp. 2042–2051. Cited by: §1.
  • D. Golovin and A. Krause (2011a) Adaptive submodular optimization under matroid constraints. arXiv preprint arXiv:1101.4450. Cited by: §4.1.
  • D. Golovin and A. Krause (2011b) Adaptive submodularity: theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research 42, pp. 427–486. Cited by: §1, §1, §2.3, §2.3, §3.
  • A. Krause and C. Guestrin (2007) Near-optimal observation selection using submodular functions. In AAAI, Vol. 7, pp. 1650–1654. Cited by: §1.
  • J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance (2007) Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 420–429. Cited by: §1.
  • M. Minoux (1978) Accelerated greedy algorithms for maximizing submodular set functions. In Optimization techniques, pp. 234–243. Cited by: §3.
  • B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi, J. Vondrák, and A. Krause (2015) Lazier than lazy greedy. In Twenty-Ninth AAAI Conference on Artificial Intelligence, Cited by: §1, §1, §3, §3.
  • B. Mirzasoleiman, A. Badanidiyuru, and A. Karbasi (2016) Fast constrained submodular maximization: personalized data summarization. In ICML, pp. 1358–1367. Cited by: §1.
  • G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher (1978) An analysis of approximations for maximizing submodular set functions I. Mathematical Programming 14 (1), pp. 265–294. Cited by: §1, §1.
  • S. Tang and J. Yuan (2020) Influence maximization with partial feedback. Operations Research Letters 48 (1), pp. 24–28. Cited by: §1, §1.
  • S. Tang (2020) Price of dependence: stochastic submodular maximization with dependent items. Journal of Combinatorial Optimization 39 (2), pp. 305–314. Cited by: §1.
  • J. Yuan and S. Tang (2017a) Adaptive discount allocation in social networks. In Proceedings of the 18th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 1–10. Cited by: §1.
  • J. Yuan and S. Tang (2017b) No time to observe: adaptive influence maximization with partial feedback. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 3908–3914. Cited by: §1.