Improved Regret Bounds for Online Submodular Maximization
In this paper, we consider an online optimization problem over T rounds where at each step t∈[T], the algorithm chooses an action x_t from a fixed convex and compact domain 𝒦. A utility function f_t(·) is then revealed and the algorithm receives the payoff f_t(x_t). This problem has previously been studied under the assumption that the utilities are adversarially chosen monotone DR-submodular functions, and 𝒪(√(T)) regret bounds have been derived. We first characterize the class of strongly DR-submodular functions and then derive regret bounds for the following new online settings: (1) {f_t}_t=1^T are monotone strongly DR-submodular and chosen adversarially, (2) {f_t}_t=1^T are monotone submodular (while the average 1/T∑_t=1^T f_t is strongly DR-submodular) and chosen by an adversary, but arrive in a uniformly random order, (3) {f_t}_t=1^T are drawn i.i.d. from some unknown distribution, f_t∼𝒟, where the expected function f(·)=𝔼_f_t∼𝒟[f_t(·)] is monotone DR-submodular. For (1), we obtain the first logarithmic regret bounds. For the second setting, we show that similar logarithmic bounds hold with high probability. Finally, for the i.i.d. model, we provide algorithms with 𝒪̃(√(T)) stochastic regret bounds, both in expectation and with high probability. Experimental results demonstrate that our algorithms outperform previous techniques in all three settings.
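The online protocol described above (pick an action x_t, then observe f_t and collect the payoff f_t(x_t)) can be sketched as follows. This is a hypothetical illustration, not the paper's algorithm: f_t(x) = log(1 + ⟨a_t, x⟩) with a_t ≥ 0 is one standard example of a monotone DR-submodular utility on the box [0, 1]^d, and projected online gradient ascent stands in for the (unspecified) methods of the paper.

```python
import numpy as np

def project_box(x):
    """Euclidean projection onto the box domain K = [0, 1]^d."""
    return np.clip(x, 0.0, 1.0)

def run_online_rounds(T=200, d=3, seed=0):
    """Simulate T rounds of the online protocol.

    Illustrative choices (not from the paper): the adversary plays
    f_t(x) = log(1 + <a_t, x>) with a_t >= 0, which is monotone
    DR-submodular on [0, 1]^d, and the learner runs projected
    online gradient ascent with step size ~ 1/sqrt(t).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)            # initial action x_1 in K
    total_payoff = 0.0
    for t in range(1, T + 1):
        a = rng.uniform(0.5, 1.5, size=d)      # adversary's hidden parameter
        total_payoff += np.log1p(a @ x)        # payoff f_t(x_t) is revealed
        grad = a / (1.0 + a @ x)               # gradient of f_t at x_t
        x = project_box(x + grad / np.sqrt(t)) # gradient step, then project
    return x, total_payoff
```

For the strongly DR-submodular settings (1) and (2), a more aggressive step size on the order of 1/t would be the natural analogue of the logarithmic-regret schedule; the 1/√t schedule above matches the generic 𝒪(√(T)) regime.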