Adaptive Regularized Submodular Maximization
In this paper, we study the problem of maximizing the difference between an adaptive submodular (revenue) function and a non-negative modular (cost) function in the adaptive setting. The input of our problem is a set of n items, where each item has a particular state drawn from some known prior distribution p. The revenue function g is defined over items and states, and the cost function c is defined over items, i.e., each item has a fixed cost. The state of each item is initially unknown; one must select an item in order to observe its realized state. A policy π specifies which item to pick next based on the observations made so far. Let g_avg(π) and c_avg(π) denote the expected revenue and the expected cost of π, respectively. Our objective is to identify the best policy π^o ∈ argmax_π g_avg(π) - c_avg(π) under a k-cardinality constraint. Since our objective function can take on both negative and positive values, the existing results of submodular maximization may not be applicable. To overcome this challenge, we develop a series of effective solutions with performance guarantees. For the case when g is adaptive monotone and adaptive submodular, we develop an effective policy π^l such that g_avg(π^l) - c_avg(π^l) ≥ (1-1/e-ϵ) g_avg(π^o) - c_avg(π^o), using only O(nϵ^-2 log ϵ^-1) value oracle queries. For the case when g is adaptive submodular but not necessarily adaptive monotone, we present a randomized policy π^r such that g_avg(π^r) - c_avg(π^r) ≥ (1/e) g_avg(π^o) - c_avg(π^o).
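To make the setting concrete, the following is a minimal sketch of a plain adaptive greedy heuristic for the revenue-minus-cost objective. It is not the policy π^l or π^r from the paper. It assumes a toy stochastic coverage revenue: each item is independently "active" with a known probability and, if active, covers a fixed set of elements. The names `coverage`, `p_active`, `cost`, `expected_gain`, and `adaptive_greedy` are hypothetical and introduced only for illustration.

```python
import random


def expected_gain(item, covered, coverage, p_active):
    """Expected marginal revenue of picking `item`, given the elements covered so far."""
    new_elements = coverage[item] - covered
    return p_active[item] * len(new_elements)


def adaptive_greedy(coverage, p_active, cost, k, rng):
    """Select at most k items, one at a time, using observed states.

    At each step, pick the item with the largest expected marginal
    revenue minus its fixed cost; stop early if no item has a positive
    expected net gain. Returns the picked items and the realized
    revenue minus cost.
    """
    covered, picked, total_cost = set(), [], 0.0
    remaining = set(coverage)
    while remaining and len(picked) < k:
        # Sort first so ties are broken deterministically by item name.
        best = max(
            sorted(remaining),
            key=lambda i: expected_gain(i, covered, coverage, p_active) - cost[i],
        )
        if expected_gain(best, covered, coverage, p_active) - cost[best] <= 0:
            break
        remaining.remove(best)
        picked.append(best)
        total_cost += cost[best]
        # The state is observed only after the item is selected.
        if rng.random() < p_active[best]:
            covered |= coverage[best]
    return picked, len(covered) - total_cost


if __name__ == "__main__":
    coverage = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}
    p_active = {"a": 0.9, "b": 0.5, "c": 0.8}
    cost = {"a": 1.0, "b": 0.5, "c": 2.0}
    print(adaptive_greedy(coverage, p_active, cost, k=2, rng=random.Random(0)))
```

Because the cost is subtracted, the net objective can be negative, which is why the sketch stops as soon as no remaining item has a positive expected net gain. This stopping rule is a design choice of the sketch and carries none of the approximation guarantees stated above.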