Stochastic Bandits for Multi-platform Budget Optimization in Online Advertising
We study the problem of an online advertising system that wants to optimally spend an advertiser's given budget for a campaign across multiple platforms, without knowing the value for showing an ad to the users on those platforms. We model this challenging practical application as a Stochastic Bandits with Knapsacks problem over T rounds of bidding with the set of arms given by the set of distinct bidding m-tuples, where m is the number of platforms. We modify the algorithm proposed in Badanidiyuru et al., to extend it to the case of multiple platforms to obtain an algorithm for both the discrete and continuous bid-spaces. Namely, for discrete bid spaces we give an algorithm with regret O(OPT √(mn/B)+ √(mn OPT)), where OPT is the performance of the optimal algorithm that knows the distributions. For continuous bid spaces the regret of our algorithm is Õ(m^1/3·min{ B^2/3, (m T)^2/3}). When restricted to this special-case, this bound improves over Sankararaman and Slivkins in the regime OPT ≪ T, as is the case in the particular application at hand. Second, we show an Ω (√(m OPT)) lower bound for the discrete case and an Ω( m^1/3 B^2/3) lower bound for the continuous setting, almost matching the upper bounds. Finally, we use a real-world data set from a large internet online advertising company with multiple ad platforms and show that our algorithms outperform common benchmarks and satisfy the required properties warranted in the real-world application.
READ FULL TEXT