In this paper, we study a new stochastic submodular maximization problem. We introduce state-dependent item costs and rejections into the classic stochastic submodular maximization problem. The input of our problem is a budget constraint and a set of items whose states are drawn from a known probability distribution. The marginal contribution and the cost of an item depend on its actual state. We must probe an item in order to reveal its actual state. Our model allows rejections: after probing an item and learning its actual state, we must decide immediately and irrevocably whether to add that item to our solution. Our objective is to select a group of items that maximizes the objective function subject to a budget constraint on the total cost of the selected items. We present a constant-factor approximate solution to this problem. Perhaps surprisingly, we show that our algorithm can also be applied to the following online setting: items arrive in a sequence with different states; on the arrival of an item, we must decide immediately and irrevocably whether to select it, subject to a budget constraint, and the objective is to maximize the objective function. For this online decision problem, our algorithm achieves the same approximation ratio as in the offline setting.
Related works. Stochastic submodular maximization has been studied extensively in recent years (Golovin and Krause 2011, Chen and Krause 2013, Fujii and Kashima 2016). However, most existing work assumes that the cost of an item is deterministic and known in advance. We relax this assumption by introducing state-dependent item costs: the actual cost of an item is determined by its realized state, and we must probe an item in order to learn that state. When the objective function is linear, our problem reduces to the stochastic knapsack problem with rejections, for which Gupta et al. (2011) gave a constant-factor approximation algorithm. Recently, Fukunaga et al. (2019) studied the stochastic submodular maximization problem with performance-dependent costs; however, their model does not allow rejections, so our problem does not coincide with theirs. Moreover, it is not immediately clear how to extend their algorithm to the online setting. Our work is also closely related to the submodular probing problem (Adamczyk et al. 2016); however, they assume each item has only two states, active or inactive, whereas we allow each item to have multiple states with state-dependent costs. Furthermore, their model does not allow rejections, i.e., one cannot reject an active item after it has been probed.
2 Preliminaries and Problem Formulation
Let $[n] = \{1, 2, \ldots, n\}$ be a set of items and $\mathcal{O} = \{0, 1, \ldots, m\}$ be a set of states. Given two vectors $x, y \in \mathcal{O}^n$, $x \le y$ means that $x_i \le y_i$ for all $i \in [n]$. Define $x \vee y = (\max\{x_i, y_i\})_{i \in [n]}$ and $x \wedge y = (\min\{x_i, y_i\})_{i \in [n]}$. For $o \in \mathcal{O}$ and $i \in [n]$, define $o\,\mathbf{e}_i$ as the vector that has $o$ in the $i$-th coordinate and $0$ in all other coordinates. A function $f : \mathcal{O}^n \to \mathbb{R}_{\ge 0}$ is called monotone if $f(x) \le f(y)$ holds for any $x, y$ such that $x \le y$, and is called lattice submodular if $f(x \vee y) + f(x \wedge y) \le f(x) + f(y)$ holds for any $x, y \in \mathcal{O}^n$.
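As a toy illustration (not from the paper), the following sketch verifies by brute force that a made-up function over state vectors satisfies the monotonicity and lattice-submodularity conditions above, using the join/meet inequality $f(x \vee y) + f(x \wedge y) \le f(x) + f(y)$. The function `f` and the instance sizes are hypothetical.

```python
from itertools import product

N_ITEMS, STATES = 3, (0, 1, 2)  # hypothetical: 3 items, states {0, 1, 2}

def f(x):
    # Toy function: concave in the total state mass, hence monotone and
    # lattice submodular on this small domain.
    total = sum(x)
    return total - 0.05 * total ** 2

def vec_or(x, y):   # coordinate-wise max (join)
    return tuple(max(a, b) for a, b in zip(x, y))

def vec_and(x, y):  # coordinate-wise min (meet)
    return tuple(min(a, b) for a, b in zip(x, y))

def leq(x, y):      # the partial order x <= y
    return all(a <= b for a, b in zip(x, y))

vectors = list(product(STATES, repeat=N_ITEMS))
monotone = all(f(x) <= f(y) for x in vectors for y in vectors if leq(x, y))
lattice_submodular = all(
    f(vec_or(x, y)) + f(vec_and(x, y)) <= f(x) + f(y) + 1e-9
    for x in vectors for y in vectors
)
print(monotone, lattice_submodular)
```

Brute-force checks like this are only feasible for tiny instances, but they are a convenient way to test whether a candidate objective fits the model.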
Items and States
We let the vector $\phi \in \mathcal{O}^n$ denote the states of all items. For each item $i \in [n]$, let $\phi_i$ denote the state of $i$. We assume there is a known prior probability distribution $p_i$ over realizations for each item $i$, i.e., $p_i(o) = \Pr[\phi_i = o]$. The states of all items are decided independently at random, i.e., $\phi$ is drawn randomly from the product distribution $p = p_1 \times p_2 \times \cdots \times p_n$. We use $c_{i,o}$ to denote the cost of item $i$ when its state is $o$. We further assume that $c_{i,o} \le c_{i,o'}$ for any $i \in [n]$ and $o, o' \in \mathcal{O}$ such that $o \le o'$, i.e., the cost of an item is larger if it is in a “better” state.
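The probabilistic model above can be sketched in a few lines; the priors, costs, and instance below are made-up illustrations, not values from the paper.

```python
import random

# Hypothetical instance: 2 items, states {0, 1, 2}, per-item priors p_i(o),
# and state-dependent costs c_{i,o} that are nondecreasing in the state.
PRIORS = {0: [0.5, 0.3, 0.2], 1: [0.2, 0.5, 0.3]}
COSTS  = {0: [1.0, 2.0, 4.0], 1: [0.5, 1.5, 3.0]}

def sample_states(rng):
    # States are independent across items: phi is drawn from the product
    # distribution p_1 x p_2.
    return {i: rng.choices([0, 1, 2], weights=PRIORS[i])[0] for i in PRIORS}

rng = random.Random(7)
phi = sample_states(rng)
realized_costs = {i: COSTS[i][phi[i]] for i in phi}
# The costs respect the assumption c_{i,o} <= c_{i,o'} whenever o <= o'.
assert all(COSTS[i] == sorted(COSTS[i]) for i in COSTS)
```

Probing an item corresponds to revealing its entry of `phi`; its realized cost is only known after that point.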
For ease of analysis, we next introduce a set function $h$ over a new ground set $\mathcal{U} = [n] \times \mathcal{O}$: consider an arbitrary set of item-state pairs $S \subseteq \mathcal{U}$, and define $h(S) = f(x^S)$, where $x^S_i = o$ if $(i, o) \in S$ and $x^S_i = 0$ otherwise. It is easy to verify that if $f$ is monotone and lattice submodular, then $h$ is monotone and submodular. Given an $n \times |\mathcal{O}|$ matrix $y = [y_{i,o}]$, we define the multilinear extension of $h$ as
$$H(y) = \sum_{S \subseteq \mathcal{U}} h(S) \prod_{(i,o) \in S} y_{i,o} \prod_{(i,o) \notin S} (1 - y_{i,o}).$$
The value $H(y)$ is the expected value of $h(R(y))$, where $R(y)$ is a random set obtained by picking each element $(i, o) \in \mathcal{U}$ independently with probability $y_{i,o}$.
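The multilinear extension admits a simple Monte Carlo estimator: sample the random set many times and average. A minimal sketch, where the ground set, the coverage-style submodular function, and all names are hypothetical:

```python
import random

# Hypothetical ground set of item-state pairs; h(S) is a weighted-coverage
# style submodular function: the number of elements covered by pairs in S.
COVER = {('a', 1): {1, 2}, ('a', 2): {1, 2, 3}, ('b', 1): {3, 4}}

def h(S):
    covered = set()
    for pair in S:
        covered |= COVER[pair]
    return len(covered)

def H_estimate(y, samples=20000, seed=0):
    # Monte Carlo estimate of H(y): keep each pair (i, o) independently
    # with probability y[(i, o)], then average h over the samples.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        S = [p for p in COVER if rng.random() < y[p]]
        total += h(S)
    return total / samples

y = {('a', 1): 0.5, ('a', 2): 0.0, ('b', 1): 1.0}
est = H_estimate(y)
# Exact value here: {3, 4} is always covered, and {1, 2} is added with
# probability 0.5, so H(y) = 2 + 0.5 * 2 = 3.
```

With 20,000 samples the estimate concentrates tightly around the exact value 3; this sampling trick is also how the weights in the continuous greedy algorithm below are typically estimated.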
Adaptive Policy and Problem Formulation
The input of our problem is a budget constraint $B$ and a set of items. The only way to learn the state of an item is to probe it. We allow rejections, i.e., after probing an item and observing its state, we must decide immediately and irrevocably whether to pick that item. We model the adaptive strategy of probing/picking items through a policy $\pi$. Formally, a policy is a function that specifies which item to probe/pick next based on the observations made so far. Consider an arbitrary policy $\pi$, and assume that conditioned on $\phi$, $\pi$ picks a set of items (and corresponding states) $S(\pi, \phi) \subseteq \mathcal{U}$. (For simplicity, we only consider deterministic policies; all results can be easily extended to randomized policies.) The expected utility of $\pi$ is $\mathbb{E}_\phi[h(S(\pi, \phi))]$. We say a policy $\pi$ is feasible if for any $\phi$, $\sum_{(i,o) \in S(\pi,\phi)} c_{i,o} \le B$. Our goal is to identify the best feasible policy, i.e., the one that maximizes the expected utility.
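The probe/accept loop of a feasible policy might look like the minimal sketch below. This is not one of the paper's policies (those are Algorithms 2 and 3); the fixed probing order, the two-state items, and the acceptance threshold are hypothetical stand-ins.

```python
import random

BUDGET = 5.0
# Hypothetical costs c_{i,o} for 3 items, each with states {0, 1}.
COSTS = {0: [1.0, 3.0], 1: [0.5, 2.0], 2: [1.0, 4.0]}

def probe(i, rng):
    # Probing reveals the item's state; here each state is equally likely.
    return rng.randrange(2)

def greedy_threshold_policy(rng, threshold=2.0):
    # Hypothetical policy: probe items in a fixed order; after observing
    # the realized state, irrevocably accept the item iff its realized
    # cost is at most `threshold` and it fits the remaining budget.
    # Otherwise the item is rejected (rejections are allowed).
    picked, spent = [], 0.0
    for i in COSTS:
        state = probe(i, rng)
        cost = COSTS[i][state]
        if cost <= threshold and spent + cost <= BUDGET:
            picked.append((i, state))
            spent += cost
    return picked, spent

picked, spent = greedy_threshold_policy(random.Random(1))
assert spent <= BUDGET  # feasibility: total cost never exceeds the budget
```

The defining constraints are visible in the loop: the decision for each item is made immediately after its probe, is never revisited, and the running cost never exceeds $B$ on any realization.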
3 Algorithm Design
We next describe our algorithm and analyze its performance. Our algorithm is based on the contention resolution scheme (Chekuri et al. 2014), which was proposed in the context of submodular maximization with deterministic item costs. We extend their design to handle state-dependent item costs and rejections. Our algorithm, called StoCan, is composed of two phases.
The first phase is done offline: we use the continuous greedy algorithm (Algorithm 1) to compute a fractional solution over a down-monotone polytope. The continuous greedy framework was first proposed by Calinescu et al. (2011) in the context of submodular maximization subject to a matroid constraint. In particular, Algorithm 1 maintains an $n \times |\mathcal{O}|$ matrix $y^t$, starting with $y^0 = \mathbf{0}$. Let $R(y^t)$ contain each $(i, o) \in \mathcal{U}$ independently with probability $y^t_{i,o}$. For each $(i, o) \in \mathcal{U}$, estimate its weight as $w_{i,o} = \mathbb{E}[h(R(y^t) \cup \{(i, o)\}) - h(R(y^t))]$. Solve the linear programming problem LP with these weights and obtain the optimal solution $z^t$, then update the fractional solution at round $t$ as $y^{t+1} = y^t + \frac{1}{T} z^t$. After $T$ rounds, $y^T$ is returned as the final solution. In the rest of this paper, we write $y$ for $y^T$ for short.
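The first-phase loop can be sketched as below. This is only an illustration under stated assumptions: the instance is a made-up coverage function, the weight estimation uses the Monte Carlo trick, and the paper's LP is replaced by a toy fractional-knapsack oracle over the budget constraint.

```python
import random

# Toy instance: item-state pairs with coverage utility h, costs, a budget,
# and T continuous-greedy rounds. All values here are hypothetical.
COVER = {('a', 1): {1, 2}, ('b', 1): {2, 3}, ('c', 1): {4}}
COST  = {('a', 1): 2.0, ('b', 1): 2.0, ('c', 1): 1.0}
BUDGET, T = 3.0, 50

def h(S):
    covered = set()
    for p in S:
        covered |= COVER[p]
    return len(covered)

def marginal_weight(y, p, rng, samples=200):
    # w_p = E[h(R(y) + p) - h(R(y))], where R(y) keeps each pair with
    # probability y[pair]; estimated by sampling.
    total = 0.0
    for _ in range(samples):
        R = [q for q in COVER if rng.random() < y[q]]
        total += h(R + [p]) - h(R)
    return total / samples

def lp_oracle(w):
    # Stand-in for the paper's LP: fractional knapsack on the estimated
    # weights, filling the budget greedily by weight/cost ratio.
    z = {p: 0.0 for p in w}
    budget = BUDGET
    for p in sorted(w, key=lambda q: w[q] / COST[q], reverse=True):
        take = min(1.0, budget / COST[p])
        z[p], budget = take, budget - take * COST[p]
        if budget <= 0:
            break
    return z

rng = random.Random(0)
y = {p: 0.0 for p in COVER}
for _ in range(T):
    w = {p: marginal_weight(y, p, rng) for p in COVER}
    z = lp_oracle(w)
    for p in y:
        y[p] = min(1.0, y[p] + z[p] / T)  # y^{t+1} = y^t + z^t / T

assert all(0.0 <= v <= 1.0 for v in y.values())
```

Because each round's direction respects the budget, the returned fractional solution does too, which is what the second phase relies on.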
In the second phase, we implement a simple randomized policy based on $y$. Our policy randomly picks one of two policies, $\pi_s$ (Algorithm 2) and $\pi_l$ (Algorithm 3), with equal probability and executes it. If $\pi_s$ is picked, we discard all large items, i.e., those whose cost is larger than $B/2$ (Line 5 in Algorithm 2), and add the remaining items according to the corresponding distribution in the (scaled) $y$ (Line 8 in Algorithm 2). If $\pi_l$ is picked, we discard all small items, i.e., those whose cost is no larger than $B/2$ (Line 5 in Algorithm 3), and add the remaining items according to the corresponding distribution in the (scaled) $y$ (Line 8 in Algorithm 3).
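The second phase can be sketched as follows. The fractional solution, the costs, the scaling factor, and the small/large cost threshold of $B/2$ are all illustrative assumptions, not values taken from the paper.

```python
import random

BUDGET = 4.0
# Hypothetical fractional solution y[(i, o)] from the first phase, with
# state-dependent costs; the small/large split threshold is BUDGET / 2.
Y    = {('a', 1): 0.6, ('b', 1): 0.4, ('c', 1): 0.5}
COST = {('a', 1): 1.0, ('b', 1): 1.5, ('c', 1): 3.0}
THRESHOLD = BUDGET / 2

def pi_small(rng, scale=0.5):
    # Keep only small pairs (cost <= threshold); add each probed pair with
    # its scaled fractional probability while the budget still allows it.
    picked, spent = [], 0.0
    for p, y_p in Y.items():
        if COST[p] > THRESHOLD:
            continue  # discard large items
        if spent + COST[p] <= BUDGET and rng.random() < scale * y_p:
            picked.append(p)
            spent += COST[p]
    return picked, spent

def pi_large(rng, scale=0.5):
    # Keep only large pairs (cost > threshold); the budget admits at most
    # one such pair, so stop after the first acceptance.
    for p, y_p in Y.items():
        if COST[p] <= THRESHOLD:
            continue  # discard small items
        if rng.random() < scale * y_p:
            return [p], COST[p]
    return [], 0.0

rng = random.Random(3)
policy = rng.choice([pi_small, pi_large])  # fair coin between the two
picked, spent = policy(rng)
assert spent <= BUDGET
```

The coin flip is what lets the analysis charge each part of the optimum, small items or large items, to the policy that handles it.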
We next provide the main theorem of this paper.
Theorem 1. Let $\pi^*$ denote the optimal policy. The expected utility achieved by StoCan is at least a constant fraction of the expected utility of $\pi^*$.
Theorem 1 follows from Lemmas 1, 2, 3, and 4, which will be proved later. Lemmas 3 and 4 lower bound the expected utilities of the two policies (Algorithms 2 and 3) in terms of the multilinear extension; combining these bounds with Lemma 2 and with the guarantee on the fractional solution proved in Lemma 1 yields a constant-factor bound relative to the optimal policy. Since StoCan randomly picks one of the two policies to execute, its expected utility is at least the average of the two bounds, and hence at least half the larger of them.
Next we focus on proving the above four lemmas.
Lemma 1. Let $y$ denote the fractional solution returned from Algorithm 1; then $H(y)$ is at least a $(1 - 1/e)$ fraction of the expected utility of the optimal policy.
Given the optimal policy $\pi^*$, for each item-state pair $(i, o)$, let $z^*_{i,o}$ denote the probability that $\phi_i = o$ and $i$ is picked by $\pi^*$. Clearly, $z^*_{i,o} \le p_i(o)$. Moreover, since every realization of $\pi^*$ respects the budget, we have $\mathbb{E}_\phi\big[\sum_{(i,o)} c_{i,o}\,\mathbf{1}\{\phi_i = o \text{ and } i \text{ is picked by } \pi^*\}\big] \le B$, where the expectation is taken over $\phi$. It follows that $z^*$ is a feasible solution to LP. Define $\mathbf{e}_{i,o}$ as the matrix that has a $1$ in the $(i, o)$-th entry and $0$ in all other entries; the marginal utility of $(i, o)$ with respect to a fractional solution $y$ is $H(y \vee \mathbf{e}_{i,o}) - H(y)$. We next bound the increment of $H$ during one step of Algorithm 1.
The first inequality is proved in (Calinescu et al. 2011). The third inequality holds because $z^t$ is an optimal solution to LP. The lemma then follows from the standard analysis of the continuous greedy algorithm for submodular maximization. ∎
Given the fractional solution $y$ returned from Algorithm 1, we next introduce two new fractional solutions $y^s$ and $y^l$. Define $y^s_{i,o} = y_{i,o}$ if $c_{i,o} \le B/2$, and $y^s_{i,o} = 0$ otherwise. Define $y^l_{i,o} = y_{i,o}$ if $c_{i,o} > B/2$, and $y^l_{i,o} = 0$ otherwise. Due to the submodularity of $h$, we have the following lemma.
We next bound the expected utility achieved by the small-items policy (Algorithm 2).
Consider a modified version of the small-items policy obtained by removing Line 8, i.e., after probing an item $i$ and observing its state $o$: if $c_{i,o} \le B/2$, select $i$ with the corresponding scaled probability regardless of the remaining budget; otherwise, discard $i$. Denote by $S'$ the solution returned by this modified policy. It is easy to verify that for each $(i, o)$ with $c_{i,o} \le B/2$, the probability that $(i, o)$ is included in $S'$ equals its scaled value in $y^s$. Notice that since each item can only be in one state, the event that $(i, o)$ is included in $S'$ is not independent of the event that $(i, o')$ is included in $S'$ for a different state $o' \ne o$. However, as shown in Lemma 3.7 in (Calinescu et al. 2011), this dependency does not degrade the expected utility. Because $H$ is concave along any nonnegative direction (Calinescu et al. 2011), the expected utility of the modified policy is lower bounded in terms of $H(y^s)$. It follows that
Recall that if the remaining budget is no less than $c_{i,o}$, the small-items policy adds $(i, o)$ to the solution. Because $y$ is a feasible solution to LP, $y^s$ is also a feasible solution to LP, which bounds the expected total cost of the selected items. According to Markov's inequality, the probability that the remaining budget is less than $B/2$ is at most the expected spend divided by $B/2$; under our scaling, this is at most a constant. Thus, the probability that $(i, o)$ is included in the final solution is at least a constant fraction of its probability under the modified policy.
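The Markov step in this argument is easy to sanity-check numerically. In the sketch below, the spend distribution is made up with mean at most $B/4$, so Markov's inequality caps the probability of exceeding $B/2$ at $1/2$; the simulation confirms the bound.

```python
import random

B = 4.0
rng = random.Random(0)

def total_spend():
    # Made-up spend distribution with mean at most 1.0, i.e. E[spend] <= B/4.
    return min(B, rng.expovariate(1.0))

trials = 100000
exceed = sum(total_spend() > B / 2 for _ in range(trials)) / trials
# Markov's inequality: Pr[spend > B/2] <= E[spend] / (B/2) <= 0.5.
markov_bound = 1.0 / (B / 2)
assert exceed <= markov_bound
```

For this particular distribution the empirical frequency is far below the Markov bound, which is typical: Markov's inequality is loose but suffices for the constant-factor analysis.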
Let $Q_1$ (resp. $Q_2$) denote all item-state pairs in the two solutions under comparison that involve the relevant items. We next prove the following bound for any such pair:
We first give a lower bound. Let $A_{i,o}$ denote the event that $(i, o)$ is included in the first solution, and $A'_{i,o}$ the event that $(i, o)$ is included in the second.
The first inequality is due to the submodularity of $h$. The second inequality is due to the assumption that $c_{i,o} \le c_{i,o'}$ for any $i \in [n]$ and $o \le o'$, together with the FKG inequality.
We next give an upper bound.
It follows that the claimed bound holds. ∎
Now consider the second option, the large-items policy (Algorithm 3). In the following lemma, we lower bound its expected utility in terms of $H(y^l)$.
Because $y$ is a feasible solution to LP, $y^l$ is also a feasible solution to LP, which bounds the total fractional mass on large items. Since we only consider pairs whose cost is larger than $B/2$, the probability that an item is probed before the budget is exhausted is bounded below by a constant. Consider any such pair $(i, o)$: conditioned on $\phi_i = o$, the probability that $(i, o)$ is included in the solution is at least its scaled value in $y^l$. Recall that the large-items policy only picks large items, so the solution contains at most one item (and its state) due to the budget constraint: any two items of cost more than $B/2$ would together exceed $B$. Thus, the expected utility of the policy can be lower bounded accordingly. Due to the submodularity of $h$ and Lemma 3.7 in (Calinescu et al. 2011), and since $H$ is concave along any nonnegative direction (Calinescu et al. 2011), we obtain the claimed bound in terms of $H(y^l)$. ∎
4 Discussion on Online Setting
One nice feature of StoCan is that the implementation of the two policies (Algorithms 2 and 3) does not require any specific order of items. Therefore, StoCan can also be applied to the following online setting: items arrive in a sequence with different states; on the arrival of an item, we observe its state and must decide immediately and irrevocably whether to select it, subject to a budget constraint. As in the offline setting, StoCan first computes the fractional solution using Algorithm 1 in advance, then randomly picks one of the two policies to execute. Notice that the online versions of these policies probe the items in order of their arrival. It is easy to verify that this does not affect the performance analysis of StoCan: our analysis does not rely on any specific order of items, so StoCan achieves the same approximation ratio as in the offline setting.
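An online run of the second phase might be sketched as follows; the precomputed fractional solution, the costs, and the scaling factor are hypothetical, and the arrival order is shuffled to emphasize that the analysis is order-independent.

```python
import random

BUDGET = 4.0
# Hypothetical fractional solution computed offline by the first phase,
# together with state-dependent costs for the arriving item-state pairs.
Y    = {('a', 1): 0.6, ('b', 0): 0.2, ('c', 1): 0.5}
COST = {('a', 1): 1.0, ('b', 0): 0.5, ('c', 1): 3.0}

def online_run(arrivals, rng, scale=0.5):
    # Items arrive in an arbitrary order with realized states; on arrival
    # we decide immediately and irrevocably, exactly as in the offline
    # second phase: accept with the scaled fractional probability if the
    # remaining budget still allows it.
    picked, spent = [], 0.0
    for pair in arrivals:
        if spent + COST[pair] <= BUDGET and rng.random() < scale * Y[pair]:
            picked.append(pair)
            spent += COST[pair]
    return picked, spent

rng = random.Random(5)
arrivals = list(Y)
rng.shuffle(arrivals)  # any arrival order yields the same guarantee
picked, spent = online_run(arrivals, rng)
assert spent <= BUDGET
```

Nothing in the loop depends on which item comes first, which is exactly why the offline approximation ratio carries over to the online setting.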
In this paper, we study the stochastic submodular maximization problem with state-dependent costs and rejections. We present a constant-factor approximate solution to this problem and show that our solution also applies to an online setting.
- Adamczyk et al. (2016) Adamczyk, Marek, Maxim Sviridenko, Justin Ward. 2016. Submodular stochastic probing on matroids. Mathematics of Operations Research 41(3) 1022–1038.
- Calinescu et al. (2011) Calinescu, Gruia, Chandra Chekuri, Martin Pál, Jan Vondrák. 2011. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing 40(6) 1740–1766.
- Chekuri et al. (2014) Chekuri, Chandra, Jan Vondrák, Rico Zenklusen. 2014. Submodular function maximization via the multilinear relaxation and contention resolution schemes. SIAM Journal on Computing 43(6) 1831–1879.
- Chen and Krause (2013) Chen, Yuxin, Andreas Krause. 2013. Near-optimal batch mode active learning and adaptive submodular optimization. Proceedings of the 30th International Conference on Machine Learning (ICML). 160–168.
- Fujii and Kashima (2016) Fujii, Kaito, Hisashi Kashima. 2016. Budgeted stream-based active learning via adaptive submodular maximization. Advances in Neural Information Processing Systems. 514–522.
- Fukunaga et al. (2019) Fukunaga, Takuro, Takuya Konishi, Sumio Fujita, Ken-ichi Kawarabayashi. 2019. Stochastic submodular maximization with performance-dependent item costs. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33. 1485–1494.
- Golovin and Krause (2011) Golovin, Daniel, Andreas Krause. 2011. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research 42 427–486.
- Gupta et al. (2011) Gupta, Anupam, Ravishankar Krishnaswamy, Marco Molinaro, R Ravi. 2011. Approximation algorithms for correlated knapsacks and non-martingale bandits. 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science. IEEE, 827–836.