In this paper, we study the problem of allocating indivisible items so that the minimum happiness among agents is maximized. Let us consider a toy instance. Suppose that Alice and Bob are trying to share bite-sized snacks that arrive sequentially. As soon as each snack arrives, one of them will receive and eat it. If each snack is picked by the one who values it more than the other, the outcome will become an imbalanced one (Table 2). In contrast, if they pick the items alternately, the outcome will become an inefficient one (Table 2). The question then arises as to what kind of rule would satisfy fairness and efficiency simultaneously, and moreover, what would be the best possible rule.
The fair allocation of resources or items to agents has been a central problem in economic theory for several decades. In classical fair allocation problems, we are given all the items in advance. Recently, the problem of allocating items in an online fashion has been studied in the areas of combinatorial optimization, algorithmic game theory, and artificial intelligence. In online problems, indivisible items arrive one by one, and they need to be allocated immediately and irrevocably to agents. The study of online fair allocation is motivated by its wide range of applications, such as the allocation of donor organs to patients, donated food to charities, and electric vehicles to charging stations; we refer the reader to the survey for details.
Throughout the paper, we denote the sets of agents and indivisible items by and , respectively.
We use the symbol to denote .
Each agent has a valuation function that assigns a value to each item.
For simplicity, unless otherwise stated, we assume that the value of each item is normalized to .
We assume that each agent has an additive preference over the items, and we write to denote the utility of agent when obtains .
For an item , we call the vector of the agents' values for it the value vector of . An allocation is a partition of (i.e., and for any distinct ). For , we denote and .
Our goal is to find an allocation that maximizes the minimum utility among the agents. This minimum utility is called the egalitarian social welfare of the allocation. The problem of maximizing the egalitarian social welfare when items arrive one by one is called the online max-min fair allocation problem. Here, we assume that the number of items is unknown in advance. Max-min fairness (that is, maximizing the egalitarian social welfare) is one of the most commonly used notions for measuring fairness and efficiency, and it has been studied extensively in the area of fair allocation [23, 8, 16, 27, 25]. Thus, our problem naturally models the above applications using the notion of max-min fairness. We measure the performance of online algorithms using the competitive ratio, which is the ratio of the egalitarian social welfare obtained by an online algorithm to the offline optimal value. Furthermore, we consider two types of competitive ratio: strict and asymptotic. In the strict setting, we consider the worst-case ratio over every possible input sequence, whereas in the asymptotic setting, we consider the worst-case ratio over input sequences with sufficiently large optimal values. Section 2 presents the formal definitions of these terms. Note that the asymptotic competitive ratio represents an intrinsic performance ratio that does not depend on initial behavior. We consider two arrival models: adversarial, in which the items are chosen arbitrarily, and independent and identically distributed (i.i.d.), in which the value vectors of the items are drawn independently from an unknown or known distribution. Note that a value vector can be a continuous random variable in the i.i.d. arrival model.
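To make the objective concrete, the following minimal sketch computes the egalitarian social welfare of an allocation, a brute-force offline optimum, and the resulting ratio for two simple online rules. The function names and the two toy instances are hypothetical illustrations chosen here; they are not the instances in the tables of this paper.

```python
from itertools import product


def egalitarian_welfare(values, assignment):
    """Return the minimum utility over agents.

    values[j][i] is agent i's value for item j (additive valuations);
    assignment[j] is the index of the agent who receives item j.
    """
    n = len(values[0])
    utility = [0.0] * n
    for item_values, agent in zip(values, assignment):
        utility[agent] += item_values[agent]
    return min(utility)


def offline_optimum(values):
    """Brute-force the best egalitarian welfare (exponential time, toy sizes only)."""
    n = len(values[0])
    return max(
        egalitarian_welfare(values, assignment)
        for assignment in product(range(n), repeat=len(values))
    )


def greedy_by_value(values):
    """Online rule: give each item to the agent who values it most (lowest index on ties)."""
    return [max(range(len(v)), key=lambda i: v[i]) for v in values]


def round_robin(values):
    """Online rule: agents take the arriving items in a fixed cyclic order."""
    n = len(values[0])
    return [j % n for j in range(len(values))]


# Hypothetical instances for two agents (columns: Alice, Bob).
skewed = [(1.0, 0.5)] * 4                    # Alice always values the snack more
alternating = [(0.0, 1.0), (1.0, 0.0)] * 2   # each snack is wanted by one agent only

for name, values in [("skewed", skewed), ("alternating", alternating)]:
    opt = offline_optimum(values)
    for rule in (greedy_by_value, round_robin):
        alg = egalitarian_welfare(values, rule(values))
        print(name, rule.__name__, "welfare:", alg, "ratio:", alg / opt)
```

On these assumed instances, the value-greedy rule leaves one agent with nothing on the skewed input, and the cyclic rule does the same on the alternating input, which mirrors the imbalance and inefficiency discussed in the toy example.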
1.1 Related work
A class of the online max-min fair allocation problem with identical agents (i.e., ) has also been studied as the online machine covering problem in the context of scheduling [18, 30, 9, 29, 21, 22]. Here, an agent’s utility corresponds to a machine load. The problem of maximizing the minimum machine load was initially motivated by modeling the sequencing of maintenance actions for modular gas turbine aircraft engines . For this case, it is known that any online deterministic algorithm has a strict competitive ratio of at most and that the greedy algorithm is strictly -competitive . Besides, there exists a strictly -competitive randomized algorithm, which is a best possible algorithm up to logarithmic factors .
In addition to the online max-min fair allocation problem, online fair allocation problems with other fairness and efficiency notions have been studied [1, 3, 2, 4, 28, 12, 5, 6, 14]. For example, Benade et al. focused on an online problem of allocating all the indivisible items to minimize the maximum envy. They designed a deterministic online algorithm such that the maximum envy is sublinear with respect to the number of items; the algorithm outputs an allocation such that for any . Unlike our setting, they assumed that the number of items is known in advance. Their algorithm is based on a random allocation, where each item is allocated to an agent chosen uniformly at random. The authors first prove that the maximum envy in the allocation obtained by the random allocation algorithm is sublinear. Then, they derandomize the algorithm by using a potential function that pessimistically estimates the future allocation. For more models of online fair allocation, see for a comprehensive survey.
The offline version of the max-min allocation problem has also been studied under the name of the Santa Claus problem [13, 23, 20, 17, 24]. The problem is NP-hard even to approximate within a factor of better than . Bansal and Sviridenko proposed an -approximation algorithm for the restricted case when for all and . Asadpour and Saberi provided the first polynomial-time approximation algorithm for the general problem, which was improved by Haeupler et al.
1.2 Our results
Although the online max-min fair allocation problem is a fundamental problem, almost nothing is known about the competitive analysis for nonidentical agents to the best of our knowledge.
Our main results show the asymptotic competitive ratios of optimal online algorithms for the adversarial and i.i.d. arrival models. In addition, we roughly identified the strict competitive ratios of optimal online algorithms, which are much smaller than those of the asymptotic ones. We summarize our results in Table 3.
| |Adversarial (det.)|Adversarial (rand.)|Unknown i.i.d.|Known i.i.d.|
|Strict|(Thm. 5.3)|(Thms. 3.2, 5.4)|(Thms. 3.2, 6.1)|(Thms. 3.2, 6.1)|
|Asympt.|(Thms. 3.3, 5.1)|(Thms. 3.3, 5.1)|(Thm. 4.1)|(Thm. 4.1)|
1.2.1 Adversarial arrival model
A main result for the adversarial arrival model is a polynomial-time deterministic algorithm with an asymptotic competitive ratio of nearly $1/n$, where $n$ is the number of agents (Theorem 3.3), which is the best possible.
We first observe an impossibility result: the asymptotic competitive ratio is at most $1/n$, where $n$ is the number of agents (Theorem 5.1). Thus, our aim is to construct an asymptotically $1/n$-competitive algorithm. If randomization is allowed, we can achieve this by simply allocating each item to an agent chosen uniformly at random. We refer to this randomized algorithm as Random. Note that Random is not strictly $1/n$-competitive because the expected value of the minimum of random variables is not equal to the minimum of their expected values. We show that Random attains a guarantee, stated in terms of the offline optimal value, even against the adaptive-offline adversary (Theorem 3.1). (The adaptive-offline adversary chooses the next item based on the allocation chosen by the online algorithm thus far, and it obtains an offline optimal value for the resulting sequence of requested items.) Interestingly, this fact implies the existence of a deterministic algorithm with the same guarantee. However, the construction is not obvious. In fact, natural greedy algorithms are far from asymptotically $1/n$-competitive (Theorem A.1 in the Appendix). Moreover, the natural round-robin procedure fails. (In an offline setting, a round-robin procedure means that agents take turns choosing their most preferred unallocated item. Because we are dealing with an online setting, we use this term to refer to a procedure in which the agents take the arriving items in a fixed cyclic order.) These algorithms fail because they output allocations that are too imbalanced and too balanced, respectively. Moreover, it is unclear whether such a deterministic algorithm can be implemented to run in polynomial time.
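The gap between the expected minimum and the minimum of the expectations can be checked exactly on a tiny instance. The sketch below enumerates every outcome of uniform random allocation; the helper name `random_allocation_stats` and the instance (two agents, two items, every value equal to 1) are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product


def random_allocation_stats(values):
    """Enumerate all outcomes of uniform random allocation.

    values[j][i] is agent i's value for item j. Returns the pair
    (expected minimum utility, minimum expected utility) as exact fractions.
    """
    n = len(values[0])
    m = len(values)
    prob = Fraction(1, n ** m)  # every assignment is equally likely
    expected_min = Fraction(0)
    expected_utility = [Fraction(0)] * n
    for assignment in product(range(n), repeat=m):
        utility = [Fraction(0)] * n
        for item_values, agent in zip(values, assignment):
            utility[agent] += item_values[agent]
        expected_min += prob * min(utility)
        for i in range(n):
            expected_utility[i] += prob * utility[i]
    return expected_min, min(expected_utility)


# Hypothetical instance: two agents, two items, all values equal to 1.
stats = random_allocation_stats([[1, 1], [1, 1]])
print(stats)
```

In this assumed instance each agent has expected utility 1, yet with probability 1/2 both items go to the same agent, so the expected egalitarian social welfare is only 1/2.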
We propose a novel derandomization method to obtain a polynomial-time deterministic algorithm with almost the same performance as Random. Our algorithm is based on the spirit of giving way to each other. Upon the arrival of an item, our algorithm gives agents a chance to take it in ascending order of their valuations of the item. Each agent generously passes the chance in consideration of the agent's past assigned units. In this way, we achieve the golden mean between allocations that are too balanced and those that are too imbalanced, and we obtain the main result. We believe that this technique is novel and will have further applications. An advantage of our algorithm is that it requires neither the number of items nor an upper bound on the value of the items. In addition, our analysis has a consequence for another fairness notion called proportionality (each of the $n$ agents receives a fraction of at least $1/n$ of the entire item set according to her valuation) in an asymptotic sense.
As an impossibility result, we prove a stronger bound for deterministic algorithms: no deterministic online algorithm can attain for any where is the offline optimal value (Theorem 5.2). This bound implies that the performance of Random is nearly optimal even when additive terms are taken into consideration.
1.2.2 Unknown/known i.i.d. arrival models
Our main result for the i.i.d. arrival models is to provide an algorithm that outputs an asymptotically near-optimal allocation. Our algorithm is the following simple one: upon the arrival of each item, allocate the item to the agent with the highest discounted value, where each agent’s value of the item is exponentially discounted with respect to the total value received so far. We prove that this algorithm with exponential base is -competitive if the expected optimal value is larger than a certain value (Theorem 4.1).
We remark that our algorithm is based on an idea similar to that of Devanur et al., but it is not a naive application. Devanur et al. provided an asymptotically near-optimal algorithm for a large class of resource allocation problems. However, we face two difficulties when applying their algorithm to our problem. One is that their algorithm requires the number of items to estimate the expected optimal value, but this number is unknown in our setting. The other is that their setting deals with finitely many types of online items (i.e., each item is drawn from a discrete distribution) and their algorithm utilizes a linear programming (LP) solution; by contrast, in our setting, there may exist infinitely many types of value vectors (i.e., the distribution can be continuous). Our contribution is to resolve these difficulties. In fact, unlike previous algorithms, we do not use the LP in the algorithm; we use it only in the analysis. This makes our algorithm quite simple. Note that our algorithm also requires neither the total number of items nor an upper bound on the value of the items.
For the strict competitive ratio, we show that even for the known i.i.d. setting, the strict competitive ratio of any algorithm must be exponentially small with respect to the number of agents (Theorem 6.1).
The rest of this paper is organized as follows. We formally define competitive ratios in Section 2. We present our main algorithmic results for the adversarial and i.i.d. arrival models in Sections 3 and 4, respectively. Then, in Sections 5 and 6, we present the impossibility results, which complement the algorithmic results. We provide our concluding remarks in Section 7.
To evaluate the performance of online algorithms, we use strict and asymptotic competitive ratios. For an input sequence , let and respectively denote the egalitarian social welfares of the allocations obtained by an online algorithm and an optimal offline algorithm (here, is a random variable if is a randomized algorithm). Then, the strict competitive ratio and the asymptotic competitive ratio for the adversarial arrival model are defined as
respectively. Here, the competitive ratios for randomized algorithms are defined by using an oblivious adversary. The competitive ratios are at most , and the larger values indicate better performance. By the definition, the asymptotic competitive ratio of is at least if for any input sequence . Note that, in some literature (e.g., ), the asymptotic competitive ratio of is at least only when there is a constant such that for any input sequence . We refer to this as the classical definition.
For the i.i.d. arrival model, we consider the distribution of input sequences determined by a number of items and a distribution of value vectors . The strict competitive ratio and the asymptotic competitive ratio for the i.i.d. arrival model are similarly defined as
3 Algorithms for Adversarial Arrival
In this section, we provide algorithms for the adversarial arrival model. We first show a randomized algorithm that is asymptotically $1/n$-competitive, where $n$ is the number of agents, in Section 3.1 and then provide a deterministic algorithm with nearly the same competitive ratio in Section 3.2.
3.1 Randomized Algorithm
A simple way to allocate items “fairly” is to allocate each item uniformly at random among all the agents. We refer to this randomized algorithm as Random. One might think that it would be better to choose an agent who has a positive valuation for an item. However, this does not perform better than Random in the worst case scenario. Furthermore, it turns out that Random is a nearly optimal algorithm for the adversarial arrival model.
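As a minimal sketch of Random, the function below allocates each arriving item to an agent chosen uniformly at random. The name `random_allocate` and the optional `rng` argument are hypothetical conveniences introduced here; the stream is consumed item by item, so the number of items need not be known in advance.

```python
import random


def random_allocate(item_stream, n, rng=None):
    """Allocate each arriving item to one of n agents uniformly at random.

    item_stream may be any iterable (including a generator of unknown length);
    each item is assigned immediately and irrevocably on arrival.
    """
    rng = rng or random.Random()
    bundles = [[] for _ in range(n)]
    for item in item_stream:
        agent = rng.randrange(n)  # uniform over agents 0, ..., n-1
        bundles[agent].append(item)
    return bundles


# Example with a seeded generator so that the run is reproducible.
bundles = random_allocate(iter(range(10)), n=3, rng=random.Random(0))
print(bundles)
```

Note that the rule never inspects the valuations, which is why it needs neither the number of items nor an upper bound on their values.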
First, we prove that the asymptotic competitive ratio of Random is at least $1/n$, where $n$ is the number of agents, by showing a slightly stronger statement.
For any adaptive adversary, Random satisfies
where is the input sequence chosen by the adversary (depending on the stochastic behavior of Random) and .
The adaptive adversary decides whether to request the next item or to terminate depending on the sequence of allocations so far. We use the symbol to denote the next item when the allocation sequence at the moment is . Let denote the set of all allocation sequences such that the adversary requests the next item. For each , let be a random variable such that if Random allocates to agent , and otherwise. In addition, let be a random variable such that if is requested (i.e., the allocation sequence chosen by Random is at some moment), and otherwise. As each item is allocated uniformly at random, we have for all , and , where denotes the length of (i.e., the number of items allocated so far).
The total utility of agent is , and the expected utility of is . Let for each , and let . Then, the expected optimal value is at most
We apply the Chernoff bound: since each () satisfies , we have
for all . By setting in (4), we see that
Furthermore, by the union bound, the probability that holds for some is at most . Without loss of generality, we may assume that since we are analyzing asymptotic behavior. As is monotone increasing for , we obtain
In the classical definition of the asymptotic competitive ratio, Random is at least -competitive for any constant against adaptive-offline adversaries.
We also analyze the strict competitive ratio of Random. For the strict competitive ratio, a deterministic algorithm can do almost nothing, but Random attains fraction of the optimal value. Intuitively, this is because each agent obtains fraction of the value received in the optimal allocation with probability .
The strict competitive ratio of Random is at least in the adversarial arrival model.
Fix an input sequence , let be an offline optimum allocation for the instance and let be the random variable that indicates that is allocated to by Random. Note that . For , consider the event such that Random allocates at most fraction of in terms of her valuation (i.e., ). If none of occurs, then
In addition, by Markov’s inequality, we have
As the events are independent, we obtain
It is well-known that there is no advantage to use randomization against adaptive-offline adversaries with respect to the competitive ratio . This implies the existence of a deterministic algorithm with the same guarantee as Random. However, the proof is not constructive, and hence it is not straightforward to obtain such a deterministic algorithm. Moreover, there is no implication about running time.
A natural way to derandomize Random is a simple round-robin. However, this fails due to the example in the Introduction (see Table 2). Another approach is to estimate the optimal value, but this is impossible in the adversarial setting. Moreover, we can prove that allocating each newly arriving item to the agent who maximizes is not asymptotically -competitive for any monotone increasing function (see Appendix A for more details).
Our approach is to classify items into (infinitely many) types and aim to allocate almost the same number of items of each type to each agent. Fixing a positive real , we denote , where . We define a type of an item as a vector . Note that an agent with a smaller has a higher valuation. Now, our task is to schedule the order of allocation for each type of items. If there are only two agents, applying the round-robin procedure independently for each type (in which the first item is allocated to the agent who wants it more than the other) achieves the desired asymptotic competitive ratio. However, in general, such a simple round-robin within a particular type may result in an overly unbalanced allocation, as shown in Table 5. Thus, we introduce a more sophisticated procedure to avoid such an unbalanced allocation.
We describe our novel technique of derandomization. Suppose that the type of an arriving item is with . By the definition of , we have . Our algorithm gives agent , who has the smallest value for , a chance to receive . She obtains if she has passed previous chances to receive items of type , and passes the chance otherwise. If agent passed the chance, then the algorithm gives agent a chance. Agent obtains if she has passed previous chances to receive items of type with some . Note that can vary. For example, and if agent passes an item of type , and agent passes a next item that has type , then agent obtains the item. Our algorithm repeats this procedure. In general, if agents passed the chances, then the algorithm gives agent a chance to receive . Agent obtains if she has passed the previous chances to receive items of type with some . Note that the item is allocated to some agent. At least, agent obtains if she receives the chance. See Table 5 for an example of allocation by our algorithm. We present a formal description in Algorithm 1.
It is not difficult to see that Algorithm 1 can be implemented to run in polynomial time. We prove the following statement.
For any positive real and any input sequence , Algorithm 1 returns an allocation such that for all where is the set of items requested in .
This theorem implies that Algorithm 1 is asymptotically -competitive because
The theorem also indicates that Algorithm 1 finds a nearly proportional allocation, i.e., each agent receives nearly at least a $1/n$ fraction of her valuation of the entire item set, where $n$ is the number of agents.
To prove the theorem, we show that the allocation is almost balanced regarding the number of items. For a permutation , index , and with , we denote . We remark that forms a partition of the entire item set for every .
For any permutation , index , and with , it holds that
We only discuss the chances regarding the items in . The number of chances that agent receives is because the algorithm gives the chance to first for every item in . As takes at most fraction of the chances, the number of chances that agent receives is at least . Also, as takes at most fraction of the chances, the number of chances that agent receives is at least (if ). Continuing the same argument, we can conclude that the number of chances that agent receives is at least for every because whether she passes a chance or not is not affected by the items not in . As agent receives an item if she has passed previous chances, we obtain .
Now we are ready to prove Theorem 3.3.
Proof of Theorem 3.3.
Let be an agent, be a permutation, and be an index such that . Also, let with . Note that for every . By Lemma 1, we have
By summing up (20) for all with , we have
Finally, by summing up for all , and , we obtain
Here, the second last equality holds because, for any with ,
Algorithm 1 works even if the upper bound of valuations is more than one and the algorithm does not know the upper bound. Let . Note that is a negative integer if . Then, by summing up (20) for all types with for all , we can obtain
for each . Note that this bound is also useful for the case when because it implies a better guarantee.
One may expect to design better performing algorithms by dynamically changing the value according to the current objective value. However, such a method does not work for the online max-min fair allocation problem. In fact, if an agent values for the items that come for a while at first, we essentially need to solve the problem for the other agents with a static .
Finally, we discuss the difference between our results and the results of Benade et al. Recall that their deterministic algorithm outputs an allocation such that . This implies for each , and hence . However, their algorithm has two drawbacks compared to ours. One is that the additive term can be quite large compared to , for example, when the values of most items are almost zero for everyone. The other is that their algorithm requires fine-tuned parameters that depend on the number of items and an upper bound on the value of the items. In contrast, our algorithm can be run independently of the number of items and the upper bound on the value of the items, and our evaluation is independent of .
4 Algorithm for i.i.d. Arrival
In this section, we provide an algorithm for the i.i.d. arrival model, i.e., the value vector of each item is drawn independently from a given distribution . We assume that the distribution and the total number of items are unknown to the algorithm.
For the strict competitive ratio, Theorem 3.2 carries over to this case. In what follows, we analyze the asymptotic case.
One may expect that the round-robin procedure works well, but unfortunately it does not: even if and is a distribution that takes with probability , the optimal value is but the round-robin can achieve only . We provide a simple algorithm that is asymptotically near-optimal. Let be a fixed small constant. When a new item arrives, our algorithm virtually discounts its value for each agent by a factor , where is the set of items allocated to so far. Then, the algorithm allocates the item to the agent with the highest discounted value, i.e., . The discount factor gives priority to an agent whose utility is currently small. We formally describe our algorithm in Algorithm 2. Note that the algorithm can be viewed as an application of the multiplicative weight update method, which is used to solve the experts problem. However, the goals of the experts problem and the online max-min fair allocation problem are different, and no direct relationship can be found between them. In addition, our algorithm does not use the information about the number of items, unlike the allocation algorithm given by Devanur et al.
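The discounting rule can be sketched as follows. This is an illustration under stated assumptions rather than a transcription of Algorithm 2: we assume the discount for an agent is `base` raised to her current utility with `base` in (0, 1), and we break ties in favor of the lowest index. The name `discounted_greedy` and the default `base=0.9` are hypothetical; the exact exponential base used in the analysis is not reproduced here.

```python
def discounted_greedy(item_stream, n, base=0.9):
    """Allocate each arriving value vector to the agent with the highest discounted value.

    item_stream yields value vectors (values[i] is agent i's value for the item).
    Assumption: the discount is base ** (utility received so far), 0 < base < 1,
    so agents with larger current utility are discounted more heavily.
    """
    utility = [0.0] * n
    assignment = []
    for values in item_stream:
        # max() returns the first maximizer, so ties go to the lowest index.
        agent = max(range(n), key=lambda i: values[i] * base ** utility[i])
        utility[agent] += values[agent]
        assignment.append(agent)
    return assignment, utility


# Hypothetical streams for two agents.
alternating = [(1.0, 0.0), (0.0, 1.0)] * 2
identical = [(1.0, 1.0)] * 4
print(discounted_greedy(alternating, n=2))
print(discounted_greedy(identical, n=2))
```

The rule uses only the utilities accumulated so far, so it can run on a stream of unknown length; on the identical stream the discount is what forces the agents to alternate.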
For any positive , Algorithm 2 is -competitive if the expected optimal value is at least .
We now prepare to prove Theorem 4.1. We evaluate the performance of Algorithm 2 by using a linear program that gives an upper bound on the optimal value. For any realization of an input sequence , the optimal value equals the optimal value of the following integer linear program: