Sequential Fair Resource Allocation under a Markov Decision Process Framework

by Parisa Hassanzadeh, et al.

We study the sequential decision-making problem of allocating a limited resource to agents that reveal their stochastic demands on arrival over a finite horizon. Our goal is to design fair allocation algorithms that exhaust the available resource budget. This is challenging in sequential settings, where information on future demands is not available at the time of decision-making. We formulate the problem as a discrete-time Markov decision process (MDP). We propose a new algorithm, SAFFE, that makes fair allocations with respect to all demands revealed over the horizon by accounting for expected future demands at each arrival time. The algorithm introduces regularization that enables prioritizing currently revealed demands over potential future demands, depending on the uncertainty in agents' future demands. Using the MDP formulation, we show that SAFFE optimizes allocations based on an upper bound on the Nash Social Welfare fairness objective, and we bound its gap to optimality using concentration bounds on total future demands. Using synthetic and real data, we compare the performance of SAFFE against existing approaches and a reinforcement learning policy trained on the MDP. We show that SAFFE leads to fairer and more efficient allocations and achieves close-to-optimal performance in settings with dense arrivals.
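To make the fairness objective concrete, the following is a minimal sketch of the offline (hindsight) version of the problem, in which all demands are known up front. It is not SAFFE itself, and the abstract does not state the exact utility used in the paper. The sketch assumes that an agent's utility is its allocation capped at its demand, and that Nash Social Welfare is measured in log form as the sum of log-utilities over agents with positive demand. The function names `water_fill` and `log_nsw` are hypothetical. Under these assumptions, maximizing the sum of logs subject to the budget and the demand caps is solved by water-filling: every agent receives either its full demand or a common level.

```python
import math


def water_fill(demands, budget):
    """Allocate min(d_i, level) to each agent so the budget is exhausted.

    Assumption: demands are fully known (hindsight setting), which is
    exactly what a sequential algorithm does not have.
    """
    if sum(demands) <= budget:
        # Enough resource for everyone: each agent gets its full demand.
        return list(demands)
    remaining = budget
    alloc = [0.0] * len(demands)
    # Serve agents from smallest to largest demand.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for pos, i in enumerate(order):
        # Equal split of what is left among agents not yet served.
        share = remaining / (len(order) - pos)
        alloc[i] = min(demands[i], share)
        remaining -= alloc[i]
    return alloc


def log_nsw(alloc, demands):
    """Sum of log-allocations over agents with positive demand (assumed form)."""
    total = 0.0
    for x, d in zip(alloc, demands):
        if d <= 0:
            continue  # agents with no demand do not enter the objective
        if x <= 0:
            return float("-inf")  # a starved agent drives the objective to -inf
        total += math.log(min(x, d))
    return total


demands = [1, 4, 10]
budget = 9
fair = water_fill(demands, budget)
# Baseline for comparison: split the budget in proportion to demand.
proportional = [budget * d / sum(demands) for d in demands]
print(fair, log_nsw(fair, demands))
print(proportional, log_nsw(proportional, demands))
```

In this toy instance the water-filling allocation scores higher on the assumed log objective than the proportional split. The sequential setting is harder because the allocator must commit to each allocation before later demands are revealed.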




