Combinatorial Sleeping Bandits with Fairness Constraints

by Fengjiao Li, et al.

The multi-armed bandit (MAB) model has been widely adopted for studying many practical optimization problems (resource allocation, ad placement, crowdsourcing, etc.) with unknown parameters. The goal of the player is to maximize the cumulative reward in the face of uncertainty. However, the basic MAB model neglects several factors that matter in many real-world applications: multiple arms can be played simultaneously, and an arm can sometimes be "sleeping" (unavailable). Ensuring fairness is also a key design concern in practice. To that end, we propose a new Combinatorial Sleeping MAB model with Fairness constraints, called CSMAB-F, which addresses these modeling issues. The objective is to maximize the reward while satisfying a fairness requirement: a minimum selection fraction for each individual arm. To tackle this new problem, we extend an online learning algorithm, called UCB, to handle the tradeoff between exploitation and exploration, and we employ the virtual queue technique to handle the fairness constraints. By carefully integrating these two techniques, we develop a new algorithm, called Learning with Fairness Guarantee (LFG), for the CSMAB-F problem. Further, we rigorously prove that LFG is not only feasibility-optimal but also has a time-average regret upper bounded by N/(2η) + (β_1 √(mNT log T) + β_2 N)/T, where N is the total number of arms, m is the maximum number of arms that can be simultaneously played, T is the time horizon, β_1 and β_2 are constants, and η is a design parameter that we can tune. Finally, we perform extensive simulations to corroborate the effectiveness of the proposed algorithm. Interestingly, the simulation results reveal an important tradeoff between the regret and the speed of convergence to a point satisfying the fairness constraints.
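The combination of UCB estimates and virtual queues described in the abstract can be illustrated with a minimal sketch. It is not the authors' LFG implementation. The abstract does not give the update rules, so the following are assumptions made for illustration: the confidence bonus sqrt(3 log t / (2 n_i)), Bernoulli rewards, independent arm availability with probability p_awake, unit arm weights, a plain cardinality constraint of at most m arms per round, and the hypothetical names lfg_style_step and simulate.

```python
import math
import random


def lfg_style_step(t, available, queues, means, counts, min_frac, eta, m):
    """Pick up to m available arms by queue length plus eta-weighted UCB index.

    Illustrative only: the index form and queue update are assumptions,
    not taken from the paper's text.
    """
    def ucb(i):
        if counts[i] == 0:
            return 1.0  # optimistic value for arms never played
        bonus = math.sqrt(3.0 * math.log(t) / (2.0 * counts[i]))
        return min(means[i] + bonus, 1.0)

    # Larger queue = larger fairness debt; larger eta = more weight on reward.
    ranked = sorted(available, key=lambda i: queues[i] + eta * ucb(i), reverse=True)
    chosen = set(ranked[:m])

    # Virtual queue: grows by the required fraction, shrinks by 1 when served.
    for i in range(len(queues)):
        served = 1.0 if i in chosen else 0.0
        queues[i] = max(queues[i] + min_frac[i] - served, 0.0)
    return chosen


def simulate(true_means, min_frac, eta=10.0, m=2, horizon=5000, p_awake=0.9, seed=0):
    """Run the sketch on Bernoulli arms; return each arm's selection fraction."""
    rng = random.Random(seed)
    n = len(true_means)
    queues, means, counts = [0.0] * n, [0.0] * n, [0] * n
    for t in range(1, horizon + 1):
        # Each arm is awake independently this round (assumed sleeping model).
        available = [i for i in range(n) if rng.random() < p_awake]
        chosen = lfg_style_step(t, available, queues, means, counts, min_frac, eta, m)
        for i in chosen:
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # running average
    return [c / horizon for c in counts]


fractions = simulate([0.9, 0.8, 0.2, 0.1], min_frac=[0.3, 0.3, 0.3, 0.3])
print(fractions)
```

In this sketch, eta plays the role of the tunable parameter η: a larger value weights estimated reward more heavily against accumulated fairness debt, which matches the tradeoff between regret and convergence speed noted at the end of the abstract.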


Thompson Sampling for Combinatorial Semi-bandits with Sleeping Arms and Long-Term Fairness Constraints

We study the combinatorial sleeping multi-armed semi-bandit problem with...

Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits

Multi-player multi-armed bandit is an increasingly relevant decision-mak...

Contextual Combinatorial Volatile Bandits with Satisfying via Gaussian Processes

In many real-world applications of combinatorial bandits such as content...

Stochastic Multi-armed Bandits with Arm-specific Fairness Guarantees

We study an interesting variant of the stochastic multi-armed bandit pro...

Achieving Fairness in the Stochastic Multi-armed Bandit Problem

We study an interesting variant of the stochastic multi-armed bandit pro...

Data-Driven Bandit Learning for Proactive Cache Placement in Fog-Assisted IoT Systems

In Fog-assisted IoT systems, it is a common practice to cache popular co...

Planning to Fairly Allocate: Probabilistic Fairness in the Restless Bandit Setting

Restless and collapsing bandits are commonly used to model constrained r...
