Let $P$ be a set of players. Let $R$ be a set of indivisible resources. Resource $r \in R$ is worth a non-negative integer value $v_{pr}$ for player $p \in P$. An allocation is a partition of $R$ into disjoint subsets $\{C_p : p \in P\}$ so that player $p$ is assigned the resources in $C_p$. The max-min fair allocation problem is to distribute resources to players so that the minimum total value of resources received by any player is maximized. We define the value of an allocation to be $\min_{p \in P} \sum_{r \in C_p} v_{pr}$. Equivalently, we want to find an allocation with maximum value.
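As a small illustration of the objective, the following sketch computes the value of a given allocation. The function name and the dictionary-based representation of values and bundles are assumptions made for this example only; they do not come from the paper.

```python
def allocation_value(values, allocation):
    """Return the value of an allocation: the minimum, over all players,
    of the total value of the resources that player receives.

    values:     dict mapping player -> dict mapping resource -> value
                (a missing resource is treated as worth 0 to that player)
    allocation: dict mapping every player -> iterable of assigned resources
    """
    return min(
        sum(values[player].get(resource, 0) for resource in bundle)
        for player, bundle in allocation.items()
    )


# Hypothetical instance with two players and three resources.
values = {
    "alice": {"r1": 4, "r2": 1, "r3": 2},
    "bob": {"r1": 3, "r2": 5, "r3": 2},
}
allocation = {"alice": ["r1"], "bob": ["r2", "r3"]}
print(allocation_value(values, allocation))
```

Here the first player receives total value 4 and the second receives 7, so the allocation has value 4.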
Bezáková and Dani  attacked the problem using the techniques of Lenstra et al.  for the min-max version: the problem of scheduling on unrelated machines to minimize makespan. Bezáková and Dani proved that no polynomial-time algorithm can give an approximation ratio less than 2 unless P = NP. However, the assignment LP used in  cannot be rounded to give an approximation for the max-min allocation problem because the integrality gap is unbounded. Later, Bansal and Sviridenko  proposed a stronger LP relaxation, the configuration LP, for the max-min allocation problem. They showed that although the configuration LP has exponentially many constraints, it can be solved to any desired accuracy in polynomial time. They also showed that there is an integrality gap of . Asadpour and Saberi  developed a polynomial-time rounding scheme for the configuration LP that gives an approximation ratio of . Saha and Srinivasan  improved it to . Chakrabarty, Chuzhoy, and Khanna  showed that an -approximate allocation can be computed in time for any .
In this paper, we focus on the restricted max-min fair allocation problem. In the restricted case, each resource is desired by some subset of players, and has the same value for those who desire it and value 0 for the rest. Even in this case, no approximation ratio better than 2 can be obtained unless P = NP. Bansal and Sviridenko  proposed a polynomial-time -approximation algorithm which is based on rounding the configuration LP. Feige  proved that the integrality gap of the configuration LP is bounded by a constant (large and unspecified). His proof was made constructive by Haeupler et al. , and hence, a constant approximation can be found in polynomial time. Adapting Haxell’s techniques for hypergraph bipartite matching , Asadpour et al.  proved that the integrality gap of the configuration LP is at most 4. Therefore, by solving the configuration LP approximately, one can estimate the optimal solution value within a factor of in polynomial time for any constant . However, it is not known how to construct a -approximate allocation in polynomial time. Inspired by the ideas in  and , Annamalai et al.  developed a purely combinatorial algorithm that avoids solving the configuration LP. It runs in polynomial time and guarantees an approximation ratio for any constant . Nevertheless, the analysis still relies on the configuration LP. There is quite a gap between the current best estimation ratio (recently improved independently in [8, 14]) and the current best approximation ratio . Few problems exhibit such a gap between estimation and approximation.
If one constrains the restricted case further by requiring for some fixed constant , then it becomes the -restricted case. Golovin proposed an -approximation algorithm for this case . Chan et al.  showed that it is still NP-hard to obtain an approximation ratio less than 2 and that the algorithm of Annamalai et al.  achieves an approximation ratio of in this case. The analysis in  does not rely on the configuration LP.
We propose an algorithm for the restricted max-min fair allocation problem that achieves an approximation ratio of for any constant . It runs in polynomial time for any such constant. Our algorithm uses the same framework as Annamalai et al. : we maintain a stack of layers to record the relation between players and resources, and use lazy update and a greedy strategy to achieve a polynomial running time.
Let be the optimal solution value. Let be the target approximation ratio. To obtain a -approximate solution, the value of resources a player needs is . Our first contribution is a greedy strategy that is much more aggressive than that of Annamalai et al. . Their greedy strategy considers a player greedy if that player claims at least worth of resources, which is more than needed. In contrast, we consider a player greedy if it claims (nearly) the largest total value among all the candidates. When building the stack, as in , we add greedy players and the resources claimed by them to the stack. Intuitively, our more aggressive greedy strategy leads to a faster growth of the stack, and hence a significantly smaller approximation ratio can be achieved.
Our aggressive strategy poses challenges for the analysis that previous approaches [1, 7] cannot cope with. Our second contribution is a new analysis tool: an injection that maps many players in the stack to their competing players who can access resources of large total value. Since players added to the stack must be greedy, they claim more than their competing players. Therefore, such an injection allows us to conclude that players in the stack claim resources of large total value. By incorporating competing players into the analysis framework of Chan et al. , we improve the approximation ratio to . Our analysis does not rely on the configuration LP, and it is purely combinatorial.
Let be the optimal solution value. Let denote our target approximation ratio. Given any value , our algorithm returns an allocation of value in polynomial time. We will show how to combine this algorithm with binary search to obtain an allocation of value at least in the end. We assume that is no more than in the rest of this section.
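The combination with binary search mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `try_allocate` is a hypothetical oracle standing in for the algorithm (it returns an allocation when it succeeds for the given target and `None` otherwise), and `upper` is any integer upper bound on the optimal value, such as the total value of all resources. The sketch relies only on the guarantee that the oracle succeeds whenever the target is at most the optimal value; it does not need the oracle to be monotone above that point.

```python
def search_target(try_allocate, upper):
    """Binary search for a target on which the (hypothetical) oracle succeeds.

    Invariant: the oracle succeeded at `lo` (target 0 is trivially met),
    and `hi` is either upper + 1 or a target on which the oracle failed.
    Since the oracle is assumed never to fail at or below the optimum,
    `hi` always exceeds the optimum, so the final `lo = hi - 1` is at
    least the optimum.
    """
    lo, best = 0, None
    hi = upper + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        result = try_allocate(mid)
        if result is not None:
            lo, best = mid, result
        else:
            hi = mid
    return lo, best


# Toy oracle: succeeds for every target up to 7, and also (by luck) at 10.
def toy_oracle(target):
    return {"target": target} if target <= 7 or target == 10 else None


print(search_target(toy_oracle, upper=12))
```

With integer values the search makes a logarithmic number of oracle calls in `upper`, so the overall running time stays polynomial whenever a single call does.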
2.1 Fat edges, thin edges and partial allocations
A resource is fat if , and thin otherwise. For a set of thin resources, we define . For any player , and any fat resource that is desired by , is a fat edge. For any player , and any set of thin resources, is a thin edge if desires all the resources in and . For a thin edge , we say player and the resources in are covered by , and define . We use uppercase calligraphic letters to denote sets of thin edges. Given a set of thin edges, we say covers a player or a thin resource if some edge in covers that player or resource, and define to be the total value of the thin resources covered by . That is, .
Since our target approximation ratio is , a player will be satisfied if it receives either a single fat resource it desires, or at least worth of thin resources it desires. Hence, it suffices to consider allocations that consist of two parts, one being a set of fat edges and the other being a set of thin edges.
Let be the bipartite graph formed by all the players, all the fat resources, and all the fat edges. We will start with an arbitrary maximum matching of (which is a set of fat edges) and an empty set of thin edges, and iteratively update and grow and into an allocation that satisfies all the players. We call the intermediate solutions partial allocations and formally define them as follows.
A partial allocation consists of a maximum matching of and a subset of thin edges such that (i) no two edges in and satisfy (i.e., cover) the same player, (ii) no two edges in share any resource, (iii) every edge is minimal in the sense that every proper subset has value less than .
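The three conditions in this definition can be checked mechanically. The sketch below is a hedged illustration: the data layout and the function name are assumptions, and the parameter `need` stands for the threshold that a thin edge's value must reach (its exact expression is given by the definition of thin edges and is simply passed in here). The check does not verify that the matching is maximum. Because values are non-negative, condition (iii) holds exactly when removing any single resource drops the value below `need`.

```python
def is_partial_allocation(matching, thin_edges, value, need):
    """Check conditions (i)-(iii) of a partial allocation.

    matching:   set of (player, fat_resource) pairs (assumed to be a matching)
    thin_edges: list of (player, frozenset_of_thin_resources) pairs
    value:      dict mapping thin resource -> non-negative value
    need:       threshold a thin edge must reach (passed in as an assumption)
    """
    # (i) no player is covered twice, by the matching or by thin edges.
    covered = [p for p, _ in matching] + [p for p, _ in thin_edges]
    if len(covered) != len(set(covered)):
        return False
    # (ii) no two thin edges share a resource.
    used = [r for _, bundle in thin_edges for r in bundle]
    if len(used) != len(set(used)):
        return False
    # (iii) every thin edge reaches the threshold and is minimal.
    for _, bundle in thin_edges:
        total = sum(value[r] for r in bundle)
        if total < need:
            return False
        if any(total - value[r] >= need for r in bundle):
            return False
    return True


value = {"a": 2, "b": 3, "c": 1}
matching = {("p2", "f1")}
print(is_partial_allocation(matching, [("p1", frozenset({"a", "b"}))], value, need=4))
```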
In Section 3, we present an algorithm which, given a partial allocation and an unsatisfied player , computes a new partial allocation that satisfies and all the players that used to be satisfied. Repeatedly invoking this algorithm returns an allocation that satisfies all the players.
2.2 A problem of finding node-disjoint paths
We define a family of networks and a problem of finding node-disjoint paths in these networks. These networks and the node-disjoint paths problem are used heavily in our algorithm and analysis.
2.2.1 The problem
Recall that is a bipartite graph formed by all the players, all the fat resources, and all the fat edges. With respect to any maximum matching of , we define to be a directed bipartite graph obtained from by orienting edges of from to if the edge is in , and from to if is not in . See Figure 1(a) and (b) for an example.
We use and to denote the subsets of players matched and unmatched in , respectively. Given and , we use to denote the problem of finding the maximum number of node-disjoint paths from to in . This problem will arise in this paper for different choices of and . A feasible solution of is just any set of node-disjoint paths from to in . An optimal solution maximizes the number of such paths. Let denote the size of an optimal solution of . In the cases that , a feasible solution may contain a path from a player to itself, i.e., a path with no edge. We call such a path a trivial path. Any path with at least one edge is non-trivial.
Let be any feasible solution of . The paths in originate from a subset of , which we call the sources, and terminate at a subset of , which we call the sinks. We denote the sets of sources and sinks by and , respectively. A trivial path has only one node which is both its source and sink. From now on, we use to denote the subset of non-trivial paths in .
2.2.2 Solving the problem
An optimal solution of can be found by solving a maximum - flow problem. Let be the - flow network obtained from by adding a super source and directed edges from to all vertices in , adding a super sink and directed edges from all vertices in to , and setting the capacities of all edges to 1. It suffices to find an integral maximum flow in . The paths in used by this maximum flow form an optimal solution of . Node-disjointness is guaranteed because, in , every player has its in-degree at most one and every resource has its out-degree at most one.
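The reduction to unit-capacity maximum flow can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name and data layout are assumptions, and the orientation used (matched edges point from resource to player, unmatched edges from player to resource) is inferred from the degree remark above. The sketch also assumes that every source is an unmatched player, so that the extra edge from the super source does not break the in-degree bound.

```python
from collections import defaultdict


def max_node_disjoint_paths(fat_edges, matching, sources, sinks):
    """Maximum number of node-disjoint paths from `sources` to `sinks`
    in the directed graph induced by a matching, via unit-capacity flow.

    fat_edges: iterable of (player, resource) pairs
    matching:  set of (player, resource) pairs, a subset of fat_edges
    sources, sinks: sets of players (sources assumed unmatched)
    """
    matched_players = {p for p, _ in matching}
    if any(p in matched_players for p in sources):
        raise ValueError("this sketch assumes sources are unmatched players")

    cap = defaultdict(int)   # residual capacities
    adj = defaultdict(set)   # neighbours in the residual graph

    def add_edge(u, v):
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)        # reverse residual edge, capacity 0 initially

    for p, r in fat_edges:
        if (p, r) in matching:
            add_edge(("R", r), ("P", p))   # matched: resource -> player
        else:
            add_edge(("P", p), ("R", r))   # unmatched: player -> resource
    s, t = ("source",), ("sink",)
    for p in set(sources):
        add_edge(s, ("P", p))
    for p in set(sinks):
        add_edge(("P", p), t)

    def augment():
        # Depth-first search for one augmenting path in the residual graph.
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return False
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        return True

    flow = 0
    while augment():
        flow += 1
    return flow


edges = [("p1", "r1"), ("p2", "r1"), ("p3", "r1"), ("p4", "r2")]
M = {("p3", "r1"), ("p4", "r2")}
print(max_node_disjoint_paths(edges, M, {"p1", "p2"}, {"p2", "p3"}))
```

In the hypothetical instance, one path is the trivial path at `p2` and the other runs from `p1` through `r1` to `p3`, so the answer is 2.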
Figure 1 gives an example. The squares represent players. The circles represent fat resources. In (a), the bold undirected edges form the maximum matching . The two lower square nodes are unmatched and they form . The two upper nodes are matched and they form . An optimal solution of can be computed by finding an integral maximum - flow in the network in (d). The shaded nodes and bold edges in (d) form a maximum - flow. If you ignore , , and the edges incident to them, the remaining shaded nodes and bold edges form an optimal solution of , which contains one trivial path and one non-trivial path.
2.2.3 Non-trivial paths and the operator
Let be a non-trivial path from to in . If we ignore the directions of edges in , then is called an alternating path in the matching literature : the first edge of does not belong to , every other edge of belongs to , and has an even number of edges. We use to denote the result of flipping , i.e., removing the edges in from the matching and adding the edges in to the matching. is a maximum matching of . Moreover, is unmatched in but it becomes matched in , and is matched in but it becomes unmatched in .
We can extend the above operation to any set of node-disjoint non-trivial paths from to in . can be regarded as a set of edges. We can form as in the previous paragraph, i.e., ignore the directions of edges in , remove the edges in from the matching, and add the edges in to the matching. is a maximum matching of . Players in are unmatched in but they become matched in , and players in are matched in but they become unmatched in .
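The flipping operation amounts to exchanging matched and unmatched edges along each path. The following sketch illustrates it under assumed conventions: a path is a list of nodes that alternates player, resource, player, and so on, starting at the source player; the function name and representation are hypothetical.

```python
def flip_paths(matching, paths):
    """Flip a set of node-disjoint alternating paths in a matching.

    matching: set of (player, resource) pairs
    paths:    list of node lists [p0, r0, p1, r1, ..., pk], where each
              (p_i, r_i) is an unmatched edge and each (p_{i+1}, r_i)
              is a matched edge; a one-node list is a trivial path
    Returns a new matching of the same size.
    """
    flipped = set(matching)
    for path in paths:
        for i in range(0, len(path) - 1, 2):
            p, r, q = path[i], path[i + 1], path[i + 2]
            flipped.add((p, r))      # unmatched edge becomes matched
            flipped.remove((q, r))   # matched edge is dropped
    return flipped


M = {("p3", "r1"), ("p4", "r2")}
print(sorted(flip_paths(M, [["p1", "r1", "p3"]])))
```

In the example, the source `p1` becomes matched and the sink `p3` becomes unmatched, while the size of the matching is unchanged.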
2.2.4 Feasible solutions of
The preliminary background above is sufficient for understanding how our algorithm works. However, in order to carry out a rigorous analysis, we have to delve into the feasible solutions of .
First let’s discuss the operation further. Let be any set of node-disjoint non-trivial paths from to in . is a maximum matching of . Now consider , the directed bipartite graph defined for the maximum matching as in section 2.2.1. We claim that can be interpreted as a graph obtained from by reversing the edges used by : when edges in are removed from the matching and edges in are added to the matching, their counterparts in are reversed. Figure 2 gives an example. In (a), the bold edges form a maximum matching . (b) shows , and consists of the two bold paths, one from to and the other from to . In (c), the bold edges form a maximum matching which is obtained from by flipping the edges in . is shown in (d). Comparing (b) and (d), it is easy to see that can be obtained from by reversing the edges in .
Now we are ready to establish a few properties for feasible solutions of that will be used later in the analysis of our algorithm. As we explained in section 2.2.2, computing an optimal solution of can be reduced to computing an integral maximum - flow. Consequently, feasible solutions of have some properties that are similar to those in the max-flow literature. Claims 2.1, 2.2 and 2.3 are very much like testing the optimality of a flow, augmenting a flow, and rerouting a flow, respectively. Recall that for a set of node-disjoint paths, is the subset of non-trivial paths in .
Claim 2.1. Let be a feasible solution of . is an optimal solution of if and only if contains no path from to .
Let be the - flow network constructed from as in section 2.2.2. One can extend to a flow in with value by pushing unit flows from to , along , and then from to . Denote this flow by . Let be the residual graph of with respect to . can be obtained from by reversing the edges used by . Recall that can be obtained from by reversing the edges in . Hence, if you ignore , , and the edges incident to them, the remainder of the residual graph is exactly . If there is a path in from to , then the concatenation is a path in the residual graph . This means that we can augment using to increase the flow value and obtain more node-disjoint paths from to in . (The augmentation may produce some unit-flow cycle(s) in , and such cycles can be simply ignored when extracting the node-disjoint paths in from to .) If such a path does not exist, then is a maximum flow, which proves the optimality of . ∎
Claim 2.2. Let be a feasible solution of . Suppose that contains a path from to . We can use to augment to a feasible solution of such that , the vertex set of is a subset of the vertices in , , and .
Figure 3 illustrates the proof of Claim 2.1. The maximum matching consists of the bold edges in (a). In (b), the bold edges form an - flow . The bold edges other than those incident to and form a feasible solution of , which consists of a single path from to . Note that in this case. The residual graph of the flow network in (b) with respect to is shown in (c). If we ignore and in (c), the subgraph is exactly . The bold edges form an augmenting path , where is a path in . In (d), the bold edges form an - flow which is obtained from by augmenting along . naturally induces a set of two node-disjoint paths from to in : one trivial path from to itself and one non-trivial path from to . The cycle in is ignored. One can check that and satisfy Claim 2.2.
Claim 2.3. Let be a feasible solution of . Suppose that there is a non-trivial path in from to . Then it must be that , and we can use to convert to another feasible solution of such that , the vertex set of is a subset of the vertices in , , and .
Every node in is unmatched in , and hence has zero in-degree in . Therefore, they cannot be the sink of a non-trivial path in . .
As in the proof of Claim 2.1, let be the - flow network constructed from , let be the flow in corresponding to , and let be the residual graph of with respect to . Since is a subgraph of , the path is also a path in from a player in to a player in . Since , there is an edge directed from to in . Since and , there is an edge directed from to in . Therefore, is a cycle in . We update to another flow by sending a unit flow along . After removing all cycle(s) of flows in and removing all edges in incident to and , we obtain a set of node-disjoint paths from to in .
Since sending flow around a cycle does not change the total flow, the values of and are equal, implying that . By sending the unit flow around , we do not update the flow on directed edges incident to in . Thus, every player who received a unit flow from before the update still receives a unit flow from afterwards, so . Since we push a flow from to , no longer sends a unit flow to in , and is no longer a sink after the update. As we push a flow from to , becomes a new sink. All the other sinks are not affected. We conclude that . ∎
Figure 4 gives an example of the proof of Claim 2.3. In (a), consists of the bold edges. In (b), the bold edges form a flow . The bold edges other than those incident to and form a feasible solution of consisting of two paths: one from to and the other from to . Note that . The residual graph of the flow network in (b) with respect to is shown in (c). The subgraph of the residual graph that excludes and is exactly . The bold edges form a cycle where is a path in . In (d), the bold edges form an - flow which is obtained from by pushing a unit flow along . induces a set of two node-disjoint paths from to in : a trivial one from to itself and a non-trivial one from to . The cycle in is ignored. and satisfy Claim 2.3.
2.2.5 More properties
We derive some relations between ’s for different choices of , , and .
Claim 2.4. For any maximum matchings and of ,
for every subset of players, .
We first prove (i). Consider the symmetric difference . It consists of cycles and alternating paths of even lengths . All these alternating paths are node-disjoint and appear as directed paths in . Since these paths have even lengths, they are either from players to players or from resources to resources. Any node in (i.e., matched by but not by ) must be an endpoint of some alternating path, and the other endpoint of the path must be a node in (i.e., matched by but not by ). Any node in has no incident edge in , so it is a trivial path. Putting things together, there are node-disjoint paths (trivial or non-trivial) in from all nodes in to . So .
Next we prove (ii). Let be an optimal solution of . Let be the maximum matching obtained from by flipping the alternating paths in , i.e., . After flipping the alternating paths, players in become matched and players in become unmatched. Thus, . The last step is due to the fact that each trivial path is a single vertex in which serves as a source and a sink simultaneously. So . By (i), , which implies that (Just take an optimal solution of and delete the paths ending at ). As by definition, . Recall that is an optimal solution of , so . We can similarly prove the other direction that . ∎
Claim 2.5 below states that if adding a player to increases , adding to any subset increases too.
Claim 2.5. Let be a maximum matching of . Let be any subset of . Let be any subset of . Let be an arbitrary player in . If , then for every , .
Let be an optimal solution of . Note that is also a feasible solution of . Let be an optimal solution of obtained by augmenting (using Claim 2.2). Then, . If , then , implying that there are node-disjoint paths from to , and thus establishing the claim. If , then is a feasible solution of . But then , a contradiction to the assumption. ∎
3 The Algorithm
In this section, we present an algorithm which, given a partial allocation and an unsatisfied player , computes a new partial allocation that satisfies and all the players that used to be satisfied. Recall that a partial allocation consists of a maximum matching of and a subset of thin edges such that (i) no two edges in and satisfy (i.e., cover) the same player, (ii) no two edges in share any resource, (iii) every edge is minimal in the sense that every proper subset has value less than .
Let and be the maximum matching of and the set of thin edges in the current partial allocation, respectively. Let be an arbitrary player who is not yet satisfied.
To satisfy , the simplest case is that we can find a minimal thin edge such that excludes all the resources covered by . (Recall that by definition of thin edges, .) We can extend the partial allocation by adding to .
More generally, we can use any thin edge such that meets the above requirements even if , provided that there is a path from to in . If , such a path is an alternating path in with respect to , and is matched by . We can flip this path to match with a fat resource and then include in to satisfy .
The thin edge mentioned above may not always exist. In other words, some edges in may share resources with . Let be such thin edges in . In this situation, we say is blocked by . In order to free up the resources held by and to make unblocked, we need to satisfy each player with resources other than those in . Afterwards, we can satisfy as before. To record the different states of the algorithm, we initialize a stack to contain as the first layer and then create another layer on top that stores the sets and among other things for bookkeeping. We change our focus to satisfy the set of players .
To satisfy a player in (by a new edge), we need to identify a minimal thin edge such that excludes the resources already covered by thin edges in the current stack because we don’t want to block or be blocked by any edges in the current stack. As to , we require that contains two node-disjoint paths from to . If is blocked by some thin edges in , we initialize a set ; otherwise, we initialize a set . Ideally, if is unblocked, we could immediately make some progress. Since there are two node-disjoint paths from to , is reachable from either or a player in . In the former case, we can satisfy . In the latter case, the path from to must be node-disjoint from the path from to , so we can satisfy some player without affecting the alternating path from to , and free up the resources previously held by that player. But we would not do so because, as argued in , in order to achieve a polynomial running time, we should let grow bigger so that a larger progress can be made at once.
Since there are multiple players in to be satisfied, we continue to look for another minimal thin edge . We require that excludes the resources covered by thin edges in the current stack (including and ), and that contains one more node-disjoint path from after adding to the destination set. If is blocked by some thin edges in , we add to ; otherwise, we add it to . After collecting all such thin edges in and , we construct the set of thin edges in the current partial allocation that block . Then, we add a new top layer to the stack that stores and among other things for bookkeeping. Then, in order to free up the resources held by edges in and to make edges in unblocked, we turn our attention to satisfying the players covered by with new edges and so on. These repeated additions of layers to the stack constitute the build phase of the algorithm.
The build phase stops when we have enough thin edges in to satisfy a predetermined fraction of players covered by for some . We shrink the -th layer and delete all layers above it. The above is repeated until is not large enough to satisfy the predetermined fraction of players covered by any in the stack. These repeated removals of layers constitute the collapse phase of the algorithm. At the end of the collapse phase, we switch back to the build phase.
The alternation of build and collapse phases continues until we succeed in satisfying player , our original goal, which is stored in the bottommost layer in the stack.
The lazy update (i.e., wait until is large enough before switching to the collapse phase) is not sufficient for achieving a polynomial running time. A greedy strategy is also needed. In , when a blocked thin edge is picked and added to for some , is required to be a minimal set of value at least , which is more than . Intuitively, if such an edge is blocked, it must be blocked by many edges. Hence, the strategy leads to a fast growth of the stack. We use a more aggressive strategy: we allow the value of to be as large as , and among all candidates, we pick the thin edge with (nearly) the largest value. Our strategy leads to a faster growth of the stack, and hence, a polynomial running time can be achieved for a smaller .
3.2 Notation and definitions
Let and denote the maximum matching of and the set of thin edges, respectively, that are used in the current partial allocation. Let denote the next player we want to satisfy.
A state of the algorithm consists of several components, namely, , , a stack of layers, and a global variable that stores a set of unblocked thin edges. The layers in the stack are indexed starting from 1 at the bottom. For , the -th layer is a 4-tuple , where and are sets of thin edges, and and are two numeric values that we will explain later. We use , and to denote the sets of players covered by edges in , and , respectively. The set grows during the build phase and shrinks during the collapse phase, and changes correspondingly. The same is true for , , , and . For any , let denote . , , and are similarly defined.
The sets and are defined inductively. At the beginning of the algorithm, , , , and . The first layer in the stack is thus .
Let be the index of the topmost layer in the stack. Consider the construction of the -th layer in an execution of the build phase. When it first starts, is initialized to be empty. We say that a player is addable if
Note that this definition depends on , so adding edges to and may affect the addability of players.
Given an addable player , we say that a thin edge is addable if
An addable thin edge is unblocked if there exists a subset such that and excludes resources used in . Otherwise, is blocked. During the construction of the -th layer, the algorithm adds some blocked addable thin edges to and some unblocked addable thin edges to . When the growth of stops, the algorithm constructs as the set of thin edges in that share resource(s) with some edge(s) in .
After constructing and and growing , the values and are defined as
The values and do not change once computed unless the layer is destructed in the collapse phase. That is, and record the values and at the time of construction. (Note that and may change subsequently.) The values and are introduced only for the analysis. They are not used by the algorithm.
Whenever we complete the construction of a new layer in the stack, we enter the collapse phase to check whether any existing layer is collapsible. If so, shrink the stack and update the current partial allocation ( and ). We stay in the collapse phase until no layer is collapsible. If the stack has become empty, we are done as the player has been satisfied. Otherwise, we reenter the build phase. We give the detailed specification of the build and collapse phases in the following subsections.
3.3 Build phase
Let be the index of the topmost layer in the stack. Let and denote the maximum matching in and the set of thin edges in the current partial allocation, respectively. We call the following routine Build to construct the next layer .
1. Initialize to be the empty set.
2. If there is an addable player and an unblocked addable edge , then:
   (a) take a minimal subset such that and excludes the resources covered by (we call a minimal unblocked addable edge),
   (b) add to ,
   (c) repeat step 2.
3. When we come to step 3, no unblocked addable edge is left. If there is no (blocked) addable edge, go to step 4. For each addable player who is incident to at least one addable edge, identify one maximal blocked addable edge such that for any blocked addable edge . Among the maximal blocked addable edges identified, pick the one with the largest value, and add it to . Then repeat step 3.
4. At this point, the construction of is complete. Let be the set of the thin edges in that share resource(s) with some thin edge(s) in .
5. Compute and .
6. Push the new layer onto the stack. .
Build differs from its counterpart in  in several places, particularly in step 3. First, we require blocked addable edges to be maximal while only minimal addable edges of value at least are considered in . Second, when adding addable edges to , we pick the one with (nearly) the largest value. In contrast, one arbitrary addable edge is picked in .
One may wonder, instead of identifying a maximal blocked addable edge for each player, whether it is better to identify a maximum blocked addable edge (i.e., the blocked addable edge with the largest value). However, finding the blocked addable edge with the largest value for is an instance of the NP-hard knapsack problem. Maximal blocked addable edges are sufficient for our purposes.
Is it possible that and so ? We will establish the result, Lemma 4.1 in Section 5.2, that if for some , then some layer below is collapsible. Therefore, if is empty, then some layer below must be collapsible, the algorithm will enter the collapse phase next, and will be removed.
Build runs in time.
It suffices to show that steps 2 and 3 run in polynomial time. Two maximum flow computations tell us whether a player is addable. Suppose so. We start with the thin edge where . Let denote the set of thin resources that are desired by . First, we incrementally insert to thin resources from that appear in neither the current partial allocation nor . If becomes greater than or equal to , then we must be in step 2 and is a minimal unblocked addable edge that can be added to . Suppose that after the incremental insertion stops. Then, has no unblocked addable edge. If we are in step 3, we continue to add to thin resources from that appear in the current partial allocation but not in . If when the incremental insertion stops, then has no addable edge. Otherwise, we continue until is about to exceed or we have examined all thin resources in , whichever happens earlier. In either case, the final is in the range and is a maximal blocked addable edge. ∎
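The incremental insertion described in this proof can be sketched as follows. This is a hedged illustration, not the authors' procedure verbatim: `need` and `cap` are hypothetical parameters standing for the lower and upper limits on the value of the resource set, `free` lists the player's thin resources that appear in neither the current partial allocation nor the stack, and `held` lists those that appear in the partial allocation but not in the stack. Two details are choices of this sketch: a final pruning pass that enforces minimality of the unblocked edge, and skipping (rather than stopping at) a resource that would push the total above `cap`.

```python
def pick_thin_edge(free, held, need, cap):
    """Greedy construction of a thin edge for one addable player.

    free, held: lists of (resource, value) pairs, as described above
    need, cap:  assumed lower and upper limits on the edge's total value
    Returns ("unblocked", resources), ("blocked", resources) or (None, []).
    """
    chosen, total = [], 0
    # Phase 1: use only resources that nobody currently holds.
    for r, v in free:
        chosen.append((r, v))
        total += v
        if total >= need:
            # Pruning pass (an addition of this sketch): drop every
            # resource whose removal keeps the total at or above `need`.
            for item in list(chosen):
                if total - item[1] >= need:
                    chosen.remove(item)
                    total -= item[1]
            return "unblocked", [r for r, _ in chosen]
    # Phase 2: also use held resources, never exceeding `cap`.
    for r, v in held:
        if total + v <= cap:
            chosen.append((r, v))
            total += v
    if total < need:
        return None, []          # the player has no addable edge
    return "blocked", [r for r, _ in chosen]


print(pick_thin_edge([("a", 1)], [("x", 2), ("y", 9), ("z", 3)], need=5, cap=7))
```

Both phases scan each resource at most once, which is consistent with the polynomial bound claimed for steps 2 and 3.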
Table 1 shows the invariants that will be used in the analysis of the algorithm. Clearly, they all hold at the start of the algorithm, i.e., , , , , and . We show that they are maintained by Build.
Build maintains invariants 1–7 in Table 1.
Suppose that the invariants hold before Build constructs the new topmost layer . It suffices to check the invariants after the construction of . Invariants 1 and 2 are clearly preserved by the working of Build.
Consider invariant 3. It holds for because none of ,