We consider a model with a finite set of agents who can be repeatedly matched in each of a finite number of rounds. Each pair of agents is compatible with a known probability. When a pair is first matched, their compatibility is realized: the match succeeds with that probability and fails otherwise. This compatibility persists through all future rounds. This model captures learning dynamics in platforms that match workers with repeated tasks (dog-walking, babysitting, private chefs), mentorship programs, and kidney exchange. In such settings, agents may only have an estimate of their compatibility with potential partners and typically learn their compatibility with others through being matched.
The goal of a matching platform is to maximize a weighted sum of the size of the matching in each round. For example, a platform may want to maximize the total number of successful matches, such as the total number of dog-dogwalker days or babysitting events. Or, it may only be interested in the number of successfully matched pairs by the end of the matching process. In mentorship and kidney exchange programs, delays are costly for agents, so matches made earlier in the process may be more valuable than matches made later. In contrast, the goal of each agent is to selfishly maximize the number of rounds in which they are matched.
Consider two matching processes. The optimal centralized matching algorithm, OPT, solves an NP-hard Bayesian optimization problem to maximize the expected reward, by prescribing matchings adaptively across rounds. In contrast, in the Stable Matching or SM process, the pairs that were successfully matched to each other in previous rounds remain matched and the rest of the matching is formed by pairing the agents with the highest success probability in a greedy fashion. Ties are broken arbitrarily. We show that SM provides a constant-factor approximation of OPT.
Theorem 1.
The expected reward of SM is at least a 0.316-approximation of the expected reward of the optimal online algorithm.
In Section 2, we provide a more detailed explanation of the SM process and its relationship with stable matching. Informally, if agents form preference lists over all other agents based on probability of compatibility, then SM forms a stable matching with respect to these preferences.
Stable matching has a number of attractive properties. While the optimal centralized matching OPT is NP-hard to compute, stable matching can be computed in polynomial time and reached in a decentralized manner. Additionally, also in contrast with OPT, stable matching is incentive compatible: it does not match agents against their will by forcing unmatched agents with high compatibility to match with low-compatibility partners, or break up compatible matches in future rounds. Stable matching is also attractive in settings such as kidney exchange or mentor matching where time is valuable, as it does not sacrifice present potential matches for future payoffs.
Our analysis implies a better approximation factor for another previously known matching algorithm, Greedy-Commit (chen2009approximating). Similar to SM, Greedy-Commit commits to known compatible pairs by matching them in all future rounds, but unlike SM, it selects a maximum weight matching between the remaining agents to maximize the immediate expected reward. We can interpret Greedy-Commit as an off-the-shelf centralized matching service that can be computed by the platform in polynomial time if it knows the compatibility probabilities but does not want to solve the optimal online repeated matching problem. We show that Greedy-Commit provides a better constant-factor approximation to OPT.
Theorem 2.
The expected reward of Greedy-Commit is at least a 0.43-approximation of the expected reward of the optimal online algorithm.
Finally, we consider settings where agents have the capacity to be bilaterally matched with multiple other agents, or grouped in teams. Many gig economy markets and matching settings are built on many-to-one bipartite markets, where agents on one side (e.g. workers) can be matched to multiple agents on the other side of the bipartition (e.g. tasks). In this setting SM provides a -approximation to OPT. In the general matching problem where all agents can be matched to multiple other agents, SM provides a -approximation to OPT, and in the teams problem where we form teams of up to agents, SM provides a -approximation to OPT.
Together, our results contribute to the discussion of the benefits of investing in optimal centralized matching. We show that if the platform allows myopic, self-interested agents to determine a decentralized matching, the outcome attains a constant factor of the expected reward of the optimal online matching. If the platform instead offers a centralized matching service, then in addition to determining match probabilities it must also force agents into matches that are not incentive-compatible, and may even need to solve an NP-hard problem.
1.1. Technical Ingredients
A key technical contribution of this paper is to compare the decentralized stable matching process to the optimal online algorithm for centralized repeated matching. In the study of online Bayesian optimization, an extremely common benchmark is the optimal offline algorithm, which knows at the outset which edges are successful and unsuccessful. In our setting, optimal offline is not an interesting benchmark, because for a certain family of inputs any online algorithm achieves an arbitrarily poor approximation to the optimal offline algorithm (see Appendix A).
We compare the stable matching SM with the optimal online algorithm OPT by coupling edges selected by OPT with those selected by SM. Since the two algorithms uncover different information as they progress, care must be taken in the coupling process not to condition analysis of the reward achieved by one algorithm on information that is acquired by the other.
A pivotal piece of our analysis is a Domination Lemma, which uses the greedy structure of the stable matching to bound the reward generated by a subset of edges selected by OPT by twice the reward generated by SM. The subset of edges is carefully chosen so that SM approximately improves upon OPT, despite the fact that OPT is able to use information from prior rounds in a more sophisticated manner. Specifically, note that SM greedily selects max-weight edges between agents who have not yet been matched. Now fix a round, and consider the edges selected by OPT in that round that do not share an endpoint with successful edges selected by SM in earlier rounds. Such edges can also be selected by SM in that round, and so the greediness of SM implies that the expected number of such edges that are successful is bounded by twice the expected number of successful edges newly selected by SM in that round. While the intuition behind the Domination Lemma is straightforward, care needs to be taken to ensure that expectations are taken appropriately, since OPT and SM explore different parts of the sample space and so their performance is conditional on different histories.
We bound the remaining edges using a Charging Lemma that charges edges selected by OPT to adjacent edges selected by SM. For matching on general graphs, the bounds provided by the Charging Lemma and a refined version of the Domination Lemma define a factor-revealing LP that yields the 0.316-approximation. In the other settings, generalized versions of the Charging Lemma and Domination Lemma show the constant-factor approximations.
Our analysis also provides some intuition for why stable matching, which utilizes a myopic, decentralized matching process in each round, and which limits its use of information by committing to successful matches, is nonetheless able to achieve a constant-factor approximation to the optimal online algorithm, which can select optimal matchings in each round, and can also adaptively make use of information across rounds. In the uncapacitated matching setting, the Domination Lemma shows that in any given round, agents who have not previously been matched can match themselves via stable matching at least half as well as OPT, and the Charging Lemma observes that in any given round, agents who are matched in that round via stable matching are collectively matched at least half as well compared to OPT. Therefore stable matching cannot do much worse than OPT, even though it makes use of information in a much less sophisticated manner.
1.2. Relationship with Prior Work
There is a large body of work on centralized and decentralized matchings in online and two-sided platforms. Most of this literature focuses on designing optimal information structures in centralized matchings, or optimal procedures for centralized matching by the platform. Our question also motivates a variant of repeated stochastic matching that is related to existing literature on the query-commit problem and stochastic matching with rewards. Interestingly, our analysis is sufficiently general to also handle previously studied centralized algorithms in the query-commit setting, and consequently we are able to provide improved bounds on prior algorithms.
Optimal matching in matching platforms
There is also a substantial literature on finding optimal or near-optimal matching policies in two-sided platforms with long-lived agents, such as matching in ridesharing (gurvich2015dynamic; banerjee2015pricing; banerjee2016pricing; hu2020dynamic), volunteer platforms (manshadi2020online) and blood donation (mcelfresh2020matching). We focus on such platforms where match compatibility is difficult to determine and can be learned exactly only through matching. A well-known example of such a setting is kidney exchange, which has been studied from a repeated matching perspective (ashlagi2019matching; akbarpour2020thickness) and a failure-aware perspective (dickerson2013failure).
Online and Stochastic Matching.
The online bipartite matching problem introduced in (karp1990optimal), where vertices on one side of a bipartite graph arrive online, is foundational to the literature on online matching problems. Many variations have been studied, including the adwords problem (mehta2007adwords; goel2008online), matching with stochastic rewards (mehta2012online; mehta2014online; goyal2019online; huang2020online) (where edge realizations are stochastic) and the random-order and i.i.d. online matching problems (devanur2009adwords; feldman2009online; korula2009algorithms; karande2011online; mahdian2011online; manshadi2012online; kesselheim2013optimal; jaillet2014online). While almost all of these papers provide competitive guarantees compared to the optimum offline algorithm, we provide guarantees compared to the optimum online algorithm in a multi-round environment.
Motivated in part by the application of kidney exchange, there is a large body of literature studying a variant of stochastic matching known as the query-commit problem (chen2009approximating; adamczyk2011improved; bansal2012lp; molinaro2011query; goel2012matching; gupta2017adaptivity; gamlath2019beating). Here, edges have fixed realization probabilities and can be queried in sequence, with the constraint that successful edges queried must be accepted, and accepted edges must form a matching. The objective is to maximize the size of the resulting matching. (chen2009approximating) introduced this setting and proved that the greedy algorithm (which simply queries feasible edges in decreasing order of probability) is a -approximation to the optimal online algorithm even when vertices have patience parameters, i.e. limits on how many incident edges can be queried. The authors additionally showed that the optimal online algorithm is NP-hard to compute. (adamczyk2011improved) improved on this result by showing that the greedy algorithm is in fact a -approximation (with patience parameters), and (bansal2012lp) provided constant-factor approximations on weighted graphs through an LP-based approach. Later work provided constant-factor approximations in query-commit settings for feasibility constraints significantly generalizing the matching constraint (e.g., (gupta2013stochastic), (gupta2016algorithms), (adamczyk2016submodular)).
A less common assumption in the query-commit literature is the ability to query a matching in each round, instead of a single edge. (chen2009approximating) define this setting, propose the Greedy-Commit algorithm, and prove that it achieves a -approximation to the optimal online algorithm which is forced to commit. Their analysis can furthermore be extended to a capacitated matching setting, where in each round we can select a matching that must respect integral capacity constraints on each vertex. (bansal2012lp) also prove a -approximation in the weighted setting, where the constraint is that a matching of size at most can be queried in each round (for any parameter ).
The main differences between this line of work and our contribution are two-fold. Most importantly, while Greedy-Commit operates globally in each round, the decentralized matching process we analyze cannot replicate such a centralized greedy approach. Secondly, we compare to algorithms that are not forced to commit to successful edges. (If OPT-Commit denotes the optimal online algorithm that is forced to forever match any successful edges it finds, simple examples show that OPT is a more powerful algorithm than OPT-Commit; see Appendix C.) Because we compare to the optimal online algorithm that is not restricted to committing, our coupling argument is substantively different from, and more delicate than, the one in (chen2009approximating). Despite this challenge, our analysis of the decentralized setting still improves upon (chen2009approximating); we show greedy achieves a 0.316-approximation against a non-committing (more powerful) OPT, tightening the previous analysis of a -approximation to committing OPT.
Learning by Matching.
Prior work has also explored several models of learning in matching settings. Many of these papers consider a multi-armed bandit setting where rewards are stochastic and redrawn i.i.d. from an unknown distribution, with algorithms of an explore-exploit nature (Channel_Allocations; johari2016matching; Bandits_Matching_Markets). Efficient algorithms that approximate the optimal online algorithm were also studied by (price_of_information) in a stochastic setting with a price for querying each edge, relying on a Gittins Index characterization for the optimal online algorithm (gittins1974dynamic; weitzman1979optimal).
2. Stable Matching in General Graphs
We begin with a setting where each agent is matched with at most one other agent in each round. This setting captures bipartite matching problems such as matching mentors and mentees, as well as matching in general graphs such as matching peer mentors, roommates and kidney exchange.
There is a finite set of vertices, representing agents. For every pair of distinct agents, there is an edge between them with a known probability, independently of all other edges, representing the compatibility of the two agents. The set of agents and the edge probabilities are known at the outset. We will find it useful to view the entire graph as being generated randomly from the outset: before the first round, nature samples a graph (the sample graph) by including each edge independently with its probability. The platform and agents do not have direct access to the sample graph. Instead, in each round they can determine a matching between the vertices and observe whether the edges in this matching are in the sample graph or not.
For each edge, consider the Bernoulli random variable indicating whether the edge is in the sample graph; we say an edge is successful if this variable equals 1. Each agent’s goal is to maximize the expected number of rounds in which they are successfully matched. The platform’s reward in each round is equal to the number of successful edges selected in that round. Given a weight for each round, the platform’s goal is to maximize the weighted sum of the rewards collected in all rounds. This captures a platform whose goal is to maximize the size of the matching in the last round (all weight on the final round), to maximize the total number of successful matches across rounds (equal weight on every round), or to favor matches in earlier rounds (weights that decrease over time).
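As a small illustration of this objective, the following Python sketch computes the platform's weighted reward from per-round counts of successful edges. The function name `weighted_reward`, the example counts, and the geometric discount of 0.5 are assumptions made purely for illustration; they are not part of the model.

```python
def weighted_reward(weights, successes_per_round):
    """Weighted sum of per-round rewards.

    successes_per_round[t] is the number of successful edges selected in round t.
    """
    if len(weights) != len(successes_per_round):
        raise ValueError("need exactly one weight per round")
    return sum(w * r for w, r in zip(weights, successes_per_round))


# Hypothetical run with four rounds: 0, 1, 1, and 2 successful edges selected.
successes = [0, 1, 1, 2]

last_round_only = [0, 0, 0, 1]                 # only the final matching counts
every_round = [1, 1, 1, 1]                     # total number of successful matches
discounted = [0.5 ** t for t in range(4)]      # assumed discount: earlier rounds weigh more

print(weighted_reward(last_round_only, successes))
print(weighted_reward(every_round, successes))
print(weighted_reward(discounted, successes))
```

The three weight vectors correspond to the three platform goals described above; only the weights change, while the per-round rewards are the same.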
The (deterministic) optimal online algorithm for maximizing the total reward can be derived by an exponential-size dynamic program (bertsekas1995dynamic). However, exact optimization is NP-hard (as we discuss later), and for large numbers of agents it is hence infeasible for a matching platform to compute this algorithm. With this in mind, we focus on simple matching processes that are computable in polynomial time, and give approximations to the optimal online algorithm.
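To make the exponential-size dynamic program concrete, here is a minimal brute-force sketch for very small instances. The state encoding (one status per edge: unknown, failed, or successful), the function names, and the three-edge example are assumptions made for illustration and are not the formulation in (bertsekas1995dynamic). The sketch enumerates every matching in every information state, so it is only usable for a handful of edges.

```python
from functools import lru_cache
from itertools import product


def all_matchings(edges):
    """Return every set of pairwise vertex-disjoint edges, as tuples of edge indices."""
    def extend(start, used, current):
        yield tuple(current)
        for k in range(start, len(edges)):
            u, v = edges[k]
            if u not in used and v not in used:
                yield from extend(k + 1, used | {u, v}, current + [k])
    return list(extend(0, frozenset(), []))


def opt_expected_reward(edges, probs, weights):
    """Expected weighted reward of the optimal online algorithm (weights assumed nonnegative)."""
    matchings = all_matchings(edges)
    num_rounds = len(weights)

    @lru_cache(maxsize=None)
    def value(t, status):
        # status[k] is None (never selected), 0 (known failure), or 1 (known success).
        if t == num_rounds:
            return 0.0
        best = 0.0
        for m in matchings:
            if any(status[k] == 0 for k in m):
                continue  # selecting a known failure never helps
            unknown = [k for k in m if status[k] is None]
            known = sum(1 for k in m if status[k] == 1)
            total = 0.0
            for outcome in product((0, 1), repeat=len(unknown)):
                pr, nxt = 1.0, list(status)
                for k, x in zip(unknown, outcome):
                    pr *= probs[k] if x else 1.0 - probs[k]
                    nxt[k] = x
                if pr > 0.0:
                    reward = weights[t] * (known + sum(outcome))
                    total += pr * (reward + value(t + 1, tuple(nxt)))
            best = max(best, total)
        return best

    return value(0, (None,) * len(edges))


# Hypothetical instance: one sure edge and two uncertain edges that conflict with it.
edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
print(opt_expected_reward(edges, [1.0, 0.5, 0.5], [1, 1]))  # 2.25
```

In this instance, always selecting the sure edge earns 2, while exploring the two uncertain edges in the first round earns 2.25 in expectation, which shows why the optimal algorithm may need to explore.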
We assume that each agent knows his or her compatibility probabilities with every other agent. This could be enabled, for example, by search functionality in the platform that allows agents to view information about other agents. Fix a given round and an agent, and assume that through prior matches the agent has updated their compatibility probabilities with other agents (i.e., all unsuccessful matches are updated to 0 and all successful matches are updated to 1). The agent wants to be matched as soon as possible, so in this round would like to match with the remaining agent who is most likely to be compatible with them. Hence, the agent can form a preference list over all other agents in this round by sorting their current compatibility probabilities.
In the SM process, we assume that in each round agents choose a matching with no blocking pair under these preferences (two agents who are not matched to each other but prefer each other to their matched partners). We additionally assume that once a pair of agents is successfully matched they will continue matching with each other in future rounds. It is straightforward to see that the decentralized stable matching, where there are no blocking pairs, can be formalized as follows.
Stable Matching (SM)
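Initialize the set of successful edges to be empty, and let the set of available agents be the set of all agents.
For each round:
While there are available agents who are compatible with positive probability, find the most compatible available pair, match the two agents with each other, and remove both from the set of available agents. Let the new matching be the matching determined once no available agents are compatible with positive probability.
Output the new matching, together with the successful edges found so far, as the selected matching for the round.
If an edge in the new matching is successful, add it to the set of successful edges. Otherwise, add both its endpoints back to the set of available agents and set its compatibility probability to 0.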
In other words, the stable matching can be determined by greedily selecting pairs of agents who are most likely to be compatible, matching them, and committing to matching them in future rounds if their edge is successful. The stable matching is incentive-compatible within rounds, in the sense that if in a given round some agent prefers another agent to their assigned match, then that other agent is matched to a partner they prefer. The stable matching is also incentive-compatible across rounds, in the sense that agents who are successfully matched in previous rounds prefer staying matched to matching again, and agents always prefer to be matched as soon as possible.
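The SM process can be simulated with a short Python sketch. The representation (edges as `frozenset` pairs, a dictionary of current beliefs, compatibility sampled the first time an edge is selected) and the names `stable_matching_process`, `belief`, and `committed` are assumptions made for illustration; ties are broken by dictionary order, which is one arbitrary choice among many.

```python
import random


def stable_matching_process(agents, prob, num_rounds, rng):
    """Simulate SM and return the number of successful edges selected in each round.

    prob maps frozenset({i, j}) to the prior compatibility probability of the pair.
    """
    belief = dict(prob)   # current probabilities; set to 0 after an observed failure
    committed = set()     # successful edges, kept in every later round
    rewards = []
    for _ in range(num_rounds):
        matched = {v for e in committed for v in e}
        free = set(agents) - matched
        new_edges = []
        while True:
            # Pairs of still-free agents that are compatible with positive probability.
            candidates = [e for e, p in belief.items() if p > 0 and e <= free]
            if not candidates:
                break
            best = max(candidates, key=lambda e: belief[e])  # most compatible pair
            new_edges.append(best)
            free -= best
        for e in new_edges:
            if rng.random() < belief[e]:
                committed.add(e)     # success: the pair stays matched from now on
            else:
                belief[e] = 0.0      # failure: the pair is never tried again
        rewards.append(len(committed))
    return rewards


# Hypothetical four-agent instance.
prob = {
    frozenset("ab"): 0.9,
    frozenset("bc"): 0.8,
    frozenset("cd"): 0.7,
    frozenset("ad"): 0.2,
}
print(stable_matching_process("abcd", prob, 5, random.Random(0)))
```

Because successful edges are never dropped, the per-round reward in this sketch is non-decreasing across rounds.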
For additional clarity, consider when all compatibility probabilities are distinct and strictly smaller than 1. In this case, the stable matching in the first round is unique: the pair with the highest compatibility match with each other (otherwise they would form a blocking pair), the pair with the highest probability among the remaining agents match, and so on. No matter the realizations, in all rounds there is a unique stable matching based on the updated preferences. When some edge probabilities are the same the stable matching may not be unique, but our analysis applies no matter which stable matching is selected, as long as previously matched edges stay matched.
We compare the result of SM with a platform that optimizes its matching process centrally. One downside to this approach is that determining this optimal online algorithm (OPT) is NP-hard.
Proposition 0.
Computing the optimal online algorithm is NP-hard.
This was originally proved by Chen et al. (chen2009approximating); for completeness we provide a full proof for our setting in Appendix D. (Footnote 1: (chen2009approximating) state the hardness result for a slightly different setting, where the optimal online algorithm is forced to commit to edges that are successful. Although OPT is not required to commit in our setting, the same proof holds.) This hardness result motivates the analysis of matching algorithms which can be computed efficiently. Our main result is the following.
Theorem 1.
The expected reward of stable matching is at least a 0.316-approximation of the expected reward of the optimal online algorithm.
The proof of Theorem 1 relies on coupling edges selected by OPT with those selected by SM. A subset of these edges are bounded using a Domination Lemma, which uses the greediness of SM to bound the expected reward generated by a subset of edges selected by OPT by twice the expected reward generated by SM. This coupling requires a delicate comparison of two algorithms that may have, through different histories, discovered very different pieces of information about the sample graph . The remaining edges are bounded using a Charging Lemma that charges edges selected by OPT to adjacent edges selected by SM. The bounds provided by the Charging Lemma and the Domination Lemma define a factor-revealing LP that yields the 0.316-approximation. In Section 2.4, we additionally show an upper bound of on the approximation ratio that SM achieves to OPT.
2.1. Coupling Edges Selected by Stable Matching and OPT
The main technical innovation in our paper is to couple the edges selected by OPT with the edges selected by SM in a way that admits a constant-factor bound in expectation for each round despite the fact that SM and OPT learn different information in each round. To specify the subsets of edges that are coupled in the Charging and Domination Lemmas, we introduce some additional notation.
Throughout the paper, for a random set we will write as a shorthand for . We use to denote a union of sets that are disjoint.
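For each round, consider the set of successful edges selected by SM in that round. (Footnote 2: This is a random set that is a deterministic function of the sample graph.) Also consider the new successful edges of a round, meaning the successful edges that SM selects for the first time in that round. Because SM keeps every successful edge in all later rounds, the set of successful edges selected by SM in a round is the disjoint union of the new successful edges of that round and of all earlier rounds.
Similarly, consider the set of successful edges that OPT selects in each round.
Fix a round. To bound the expected number of successful edges that OPT selects in this round, we will partition these edges based on whether or not they can be added to the set of successful edges selected by SM in the same round. In particular, we write the successful OPT edges as a disjoint union of two sets, Aug and Adj, where Aug denotes all the edges that are vertex-disjoint from the successful SM edges (and hence can augment this matching), and Adj denotes the remaining edges, which share an endpoint with some successful SM edge. (Footnote 3: As the successful SM and OPT edges are deterministic functions of the sample graph, Aug and Adj are hence also random sets entirely determined by the sample graph.) It will also be useful to categorize these edges according to the round in which OPT first selected them; hence, we partition Aug into parts, one for each round up to the fixed round, where each part consists of all edges in Aug that OPT first selected in the corresponding round. Similarly, we partition Adj into parts, where each part consists of all edges in Adj that OPT first selected in the corresponding round. Note that Aug and Adj and their corresponding partitions are defined with respect to the successful SM and OPT edges of the fixed round, and therefore they all depend on that round. We have dropped the round index from their notation to simplify the exposition.
In the rest of this section, we will use the Domination Lemma to couple the parts of Aug with the new successful edges selected by SM, and the Charging Lemma to couple the edges in Adj with the successful edges selected by SM. This coupling provides bounds that define the factor-revealing LP.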
2.2. The Domination Lemma
Our main technical lemma is the Domination Lemma, which makes use of the greediness of SM to bound its approximation to OPT. In particular, in each round SM greedily selects edges to add to the successful edges it found in earlier rounds. The Domination Lemma lower bounds the expected reward of this step against some of the edges selected by OPT.
Lemma 0 (Domination Lemma).
The expectations in the lemma are over all sample graphs. In the proof, we require a consideration of what each algorithm knows up to round , and through this split up the probability space the expectations are being taken over. To do so, we define the notion of a history induced on an algorithm, and analyze the expected size of and conditioned on these histories.
Fix some . For a sample graph and an algorithm , we define , the history that induces on , as follows. The history consists of the set of matchings selects in the first rounds, along with the observed outcomes of all the edges in these matchings, conditioned on the sample graph being . We next partition the probability space over sample graphs based on the distinct histories they induce on SM in the first rounds. In particular, for any possible history , let denote the set of all sample graphs such that . We use as a shorthand for the expected size of , conditioned on the sample graph being in , and use for the probability that our sample graph is an element of . Note then that
where the sum is over all possible histories . It hence suffices to show that for any history .
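The notion of a history can be illustrated with a short sketch. Assuming a deterministic algorithm is represented as a function from the history so far to the next matching (a hypothetical encoding chosen for illustration, as are all names below), the history induced by a sample graph is obtained by replaying the algorithm against that graph. Two sample graphs that differ only on edges the algorithm never selects induce the same history, which is why sample graphs can be grouped by the history they induce.

```python
def induced_history(algorithm, sample_graph, num_rounds):
    """Replay a deterministic algorithm against a fixed sample graph.

    The history is a tuple with one entry per round; each entry lists the
    selected edges together with their observed outcomes.
    """
    history = ()
    for _ in range(num_rounds):
        matching = algorithm(history)
        observed = tuple((edge, edge in sample_graph) for edge in matching)
        history += (observed,)
    return history


def toy_algorithm(history):
    """Hypothetical rule: try edge ab first; keep it if it succeeded, otherwise try ac."""
    ab, ac = frozenset("ab"), frozenset("ac")
    if not history:
        return (ab,)
    _first_edge, first_success = history[0][0]
    return (ab,) if first_success else (ac,)


graph_1 = {frozenset("ab")}
graph_2 = {frozenset("ab"), frozenset("bc")}   # differs from graph_1 only on an unqueried edge
graph_3 = {frozenset("ac")}

same = induced_history(toy_algorithm, graph_1, 2) == induced_history(toy_algorithm, graph_2, 2)
different = induced_history(toy_algorithm, graph_1, 2) != induced_history(toy_algorithm, graph_3, 2)
print(same, different)
```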
To do so, we analyze by partitioning based on the different histories these sample graphs induce on OPT. Specifically, for a fixed history , let denote all sample graphs such that and . Note that and therefore, to complete the proof of Lemma 3, it suffices to show that
for any possible histories , . The rest of the proof is devoted to this inequality.
Conditioned on our sample graph being in , let be the new edges selected by OPT in round that can augment the matching . As we mentioned earlier, we can assume that OPT is deterministic. So is uniquely determined by and . Also, by definition is disjoint from all edges in and all successful edges in .
Let denote those edges in that belong to neither nor . Because edges are independent, for each edge outside history or , the probability that a random sample graph in includes edge is exactly . Therefore, the expected number of edges in that are successful is given by as all edges in are guaranteed to be unsuccessful.
Furthermore, observe that the set of edges in can feasibly augment as every edge in is vertex-disjoint from , and the edges in form a matching disjoint from . Note that SM chooses edges greedily to augment ; as the greedy algorithm gives a 1/2-approximation to maximum weight matching, we hence have
Additionally, note that conditioned on our sample graph being in , is guaranteed to be a subset of the successful edges in , as edges augmenting must also augment . Therefore,
as desired. ∎
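The greedy step used in this proof relies on the standard fact that a greedy matching has at least half the weight of a maximum weight matching. The following sketch checks this on small graphs by brute force, with edge weights playing the role of success probabilities. The function names and test instances are assumptions made for illustration, and the brute-force search is only practical for tiny graphs.

```python
import itertools
import random


def greedy_matching(weights):
    """Scan edges in decreasing weight, keeping each edge whose endpoints are both free."""
    chosen, used = [], set()
    for (u, v), w in sorted(weights.items(), key=lambda item: -item[1]):
        if w > 0 and u not in used and v not in used:
            chosen.append((u, v))
            used.update((u, v))
    return chosen


def max_matching_weight(weights):
    """Brute-force maximum total weight over all matchings."""
    edges = list(weights)

    def search(start, used):
        best = 0.0
        for k in range(start, len(edges)):
            u, v = edges[k]
            if u not in used and v not in used:
                best = max(best, weights[edges[k]] + search(k + 1, used | {u, v}))
        return best

    return search(0, frozenset())


# Hypothetical path a-b-c-d: greedy takes the middle edge, the optimum takes the two outer edges.
path = {("a", "b"): 0.5, ("b", "c"): 0.6, ("c", "d"): 0.5}
greedy_weight = sum(path[e] for e in greedy_matching(path))
best_weight = max_matching_weight(path)
print(greedy_weight, best_weight)
```

On the path instance greedy collects weight 0.6 against an optimum of 1.0, so the factor of one half cannot be improved by much in general.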
To obtain a bound of 0.316 on the approximation factor achieved by SM we provide a refinement of the Domination Lemma. To prove the Domination Lemma, roughly we showed that for any fixed round , if we look at the successful edges that OPT selects for the first time in round which can augment , the expected size of this set is no more than . As fits this description, this shows that . However, this lemma is loose; does not necessarily comprise all the edges that OPT selects for the first time in round which can augment . In particular, certain edges in might fit this description as well — they certainly cannot augment by definition, but for small values of many of them might be able to augment .
To refine the Domination Lemma, we categorize the edges in Adj according to which of , , , they are incident to; this is motivated by the goal of finding some edges in that can augment . In particular, we break up
where denotes the subset of Adj consisting of edges incident with but not . This is well-defined because the sets , , , are disjoint and form a matching. Similarly, for any fixed we break up
where is all edges in that OPT first selected in round .
The informal statement of the refined Domination Lemma is that if we fix a round , and look at all edges that OPT first discovered in round that can be added to , the expected size of this set is at most twice the expected size of . Our notation above lets us write this formally.
Lemma 0 (Refined Domination Lemma).
Fix a round . We have for all that
The proof of the Refined Domination Lemma only requires small changes from the proof of the Domination Lemma, and can be found in Appendix E.1.
2.3. Proof of Theorem 1
We next state and prove the Charging Lemma, related to the fact that two maximal matchings in a graph always have sizes within a factor of 2.
Lemma 0 (Charging Lemma).
Fix a round . For all ,
Fix a sample graph. Consider any edge . By definition, it is incident to at least one edge in ; charge to one of the edges in it is incident to. Note that because forms a matching, every edge in is charged at most twice. Hence the result holds sample graph by sample graph, as well as in expectation over all sample graphs.∎
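The charging argument can be sketched in a few lines. Given a matching of OPT edges, each of which touches some SM edge, we charge every OPT edge to an SM edge it touches and count the charges; since each SM edge has two endpoints and each endpoint touches at most one OPT edge, no SM edge is charged more than twice. The function name `charge` and the example edges are hypothetical.

```python
from collections import Counter


def charge(adj_edges, sm_edges):
    """Charge each edge of adj_edges to an SM edge sharing an endpoint with it.

    adj_edges and sm_edges are assumed to each form a matching, and every edge
    in adj_edges is assumed to touch at least one edge in sm_edges.
    """
    owner = {v: e for e in sm_edges for v in e}   # endpoint -> the SM edge covering it
    charges = Counter()
    for u, v in adj_edges:
        target = owner.get(u) or owner.get(v)
        if target is None:
            raise ValueError("edge is not adjacent to any SM edge")
        charges[target] += 1
    return charges


# Hypothetical example: three OPT edges adjacent to two SM edges.
sm_edges = [("a", "b"), ("c", "d")]
adj_edges = [("a", "x"), ("b", "c"), ("d", "y")]
charges = charge(adj_edges, sm_edges)
print(dict(charges))
```

Summing the charges gives the size of the adjacent OPT set, so that set has at most twice as many edges as the SM matching.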
The task remaining is to use the structure we have found to give an upper bound on
; we do so via a factor-revealing linear program, with constraints corresponding to the Refined Domination Lemma and the Charging Lemma.
Factor-revealing LP (Primal)
In this linear program, the variables correspond to , the variables correspond to , and the variables correspond to , and the bounds from the lemmas are encoded as constraints.
Given any instance of our matching problem, we can construct a feasible solution for this LP by setting:
Indeed, note that when we set the variables in this way, (1) directly states the Refined Domination Lemma, and (2) directly states the Charging Lemma. Also, because we have that (3) holds. For the suggested feasible solution, we can note that the objective simplifies to:
Hence, the maximum objective obtained by our LP gives a lower bound on the worst-case competitive ratio that SM achieves against OPT.
To give a bound on the maximum value obtained by this LP, we take the dual, noting that by weak duality it suffices to analyze the value obtained by a specific feasible solution. The dual of our LP is given below.
Factor-revealing LP (Dual)
Consider a feasible solution for the dual LP given by , and . (Footnote 4: This is the optimal dual solution. Note that if we change our objective, we might be able to get better approximation factors; e.g. if we change the objective to be the cumulative reward, which sums the successful selected edges over every round instead of finding a round-by-round approximation guarantee, we will be able to achieve a 2.96-approximation factor.) The objective value is increasing in and
Hence by weak duality, the objective value of our primal is at most . It follows that the expected reward of SM in round is at least a fraction of the expected reward of OPT in round . This completes the proof of Theorem 1.
2.4. Discussion: How much better can a matching platform do?
So far, we have shown that a decentralized stable matching process achieves at least a proportion of the reward of the optimal matching service. In this section, we provide some additional discussion on what else is achievable by the platform.
First, suppose the platform could invest in an off-the-shelf matching process. In particular, we consider the Greedy-Commit algorithm that was previously proposed by (chen2009approximating). The Greedy-Commit algorithm proceeds by proposing in the first round a matching of vertices in that maximizes the sum of edge probabilities, and in subsequent rounds committing to keeping all successful edges from the previous rounds, and augmenting them with a matching that maximizes the sum of edge probabilities in the remaining graph. We describe Greedy-Commit formally in Appendix B.
Note that Greedy-Commit is computable in polynomial-time, and is incentive-compatible across rounds but not within rounds. Hence Greedy-Commit can be thought of as a matching service that hides information and myopically dictates a maximum matching in each round, but cannot incentivize matched agents to be rematched in future rounds.
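For comparison with SM, here is a minimal sketch of Greedy-Commit as described above. It is not the formal description from Appendix B: the maximum weight matching is found by brute force rather than by a polynomial-time algorithm, and the names `max_weight_matching` and `greedy_commit`, the `frozenset` edge encoding, and the example instance are assumptions made for illustration.

```python
import random


def max_weight_matching(edges, weight):
    """Brute-force maximum weight matching over a small list of frozenset edges."""
    best_value, best_set = 0.0, ()

    def search(start, used, total, chosen):
        nonlocal best_value, best_set
        if total > best_value:
            best_value, best_set = total, chosen
        for k in range(start, len(edges)):
            e = edges[k]
            if weight[e] > 0 and not (e & used):
                search(k + 1, used | e, total + weight[e], chosen + (e,))

    search(0, frozenset(), 0.0, ())
    return list(best_set)


def greedy_commit(prob, num_rounds, rng):
    """Simulate Greedy-Commit and return the number of successful edges per round."""
    belief = dict(prob)   # set to 0 after an observed failure
    committed = set()     # successful edges, kept in every later round
    rewards = []
    for _ in range(num_rounds):
        matched = {v for e in committed for v in e}
        available = [e for e in belief if belief[e] > 0 and not (e & matched)]
        # Augment the committed edges with a matching maximizing the sum of probabilities.
        for e in max_weight_matching(available, belief):
            if rng.random() < belief[e]:
                committed.add(e)
            else:
                belief[e] = 0.0
        rewards.append(len(committed))
    return rewards


# Hypothetical path a-b-c-d: SM would start with the middle edge,
# whereas Greedy-Commit selects the two outer edges.
prob = {frozenset("ab"): 0.6, frozenset("bc"): 1.0, frozenset("cd"): 0.6}
first_round = max_weight_matching(list(prob), prob)
print(sorted(sorted(e) for e in first_round))
```

The example highlights the within-round difference from SM: agents b and c would both prefer each other, yet the centralized choice pairs them with less compatible partners to raise the expected size of the matching.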
Chen et al. showed that Greedy-Commit gives a -approximation to OPT. (Footnote 5: (chen2009approximating) provide a proof for a slightly different result which works to prove Lemma 4; see footnote 1.) As an extension to our other results in this section, we show that our techniques improve on the analysis of Chen et al. In particular, we show the following result.
Theorem 2 ().
The expected reward of Greedy-Commit is at least a 0.43-approximation to OPT.
The proof proceeds very similarly to the proof of the approximation factor for SM. The Charging Lemma continues to hold, and a stronger version of the Refined Domination Lemma holds, improved by a factor of 2. These constraints yield a different factor-revealing LP which we use to prove the claimed 0.43 factor. Details can be found in Appendix E.2.
Another natural question is what upper bound we can give on how well SM and Greedy-Commit can perform compared to the optimum online algorithm. Although computing OPT is NP-hard, perhaps one of them could be very close to optimal, in the sense of guaranteeing a -approximation to OPT for some small ? We show this is not the case; in particular, we show that both SM and Greedy-Commit provide at best a -approximation to the optimal online algorithm.
Lemma 0 ().
The expected reward of SM and Greedy-Commit are at most a -approximation to OPT.
Consider the bipartite graph with bipartition edges , the edge has probability 1 of being successful, and every other edge has probability . We illustrate the graph in Figure 1 in the appendix.
Suppose there are rounds. SM matches to in the first round. Since the edge is successful, will be matched to in every subsequent round. Hence SM obtains a reward of .
We now define a different algorithm that selects the matching in round , for . With probability , both edges are successful in at least one of these rounds. In this case the algorithm can select this matching from round to round , obtaining payoff in these rounds. In expectation, the reward of the above algorithm is at least As SM only obtains a reward of , taking sufficiently large proves the claim.
The same example proves that Greedy-Commit is at most a -approximation to OPT. ∎
Finally, one might wonder how much of this gap between stable matching or Greedy-Commit and OPT is due to the fact that both algorithms commit to previously matched successful pairs, while OPT does not. Consider the optimal online algorithm that is restricted to committing to successfully matched pairs (OPT-Commit). OPT-Commit can be thought of as the optimal matching service that improves on Greedy-Commit by dictating a matching in each round in an optimal non-myopic manner, but again cannot incentivize matched agents to be rematched in future rounds. We note that the expected reward achieved by OPT-Commit is not always the same as that of OPT (see Appendix C), so committing causes some loss of reward.
Furthermore, we show that OPT-Commit achieves at least a 0.5-approximation to OPT. Since SM and Greedy-Commit achieve at best a -approximation to OPT, if they do not achieve this then part of the loss is due to the fact that both SM and Greedy-Commit are myopic and do not account for the effect of current queries on possible matchings in future rounds, rather than the fact that both commit to previously matched successful pairs.
Proposition 0 ().
The expected reward of OPT-Commit is at least a 0.5-approximation to OPT.
The proof of Proposition 5 relies on constructing an algorithm that commits to successfully matched pairs, and that is a 0.5-approximation to OPT. The algorithm (which we denote by ) proceeds as follows: in round , selects all the successful edges it has found so far, as well as the edges that OPT selects in round which can augment this matching. We show that in every round the expected size of the matching found by is at least a 0.5-approximation to the expected size of the matching selected by OPT. We provide a complete proof of Proposition 5 in Appendix C.
In sum, we have considered a number of different matching algorithms. SM is decentralized, while the rest are centralized matching platforms that determine and prescribe either a myopic maximum matching (Greedy-Commit) or a non-myopic optimal matching (OPT-Commit, OPT). All of the centralized matching platforms must hide some amount of information from participants, as the prescribed matchings in each round are not necessarily stable and hence not within-round incentive-compatible. In the case of OPT, the matching proposed by the platform is in addition not across-round incentive-compatible, as it needs to coerce successfully matched agents to return and be rematched in future rounds. Neither type of incentive constraint is without loss, and in addition, in order to implement either OPT-Commit or OPT the platform must solve an NP-hard problem. Nonetheless, all of these platform designs attain an expected reward within a constant factor of that of OPT.
3. Capacitated Matching
In this section, we consider the difference between stable matching and optimal online matching in more general settings, such as where agents can be in bilateral matches with several other agents at once, or where agents can be grouped in teams. For example, styling and clothing rental companies may have limited inventory of different items to send to customers, and students may be able to take on multiple projects, or may be grouped into project teams.
Formally, we have a general graph where vertices have capacities given by arbitrary positive integers. In each of rounds, we can select a capacitated matching among these agents, i.e. some such that each has at most edges in incident to it. As in the previous section, each agent’s goal is to maximize the expected number of successful matches they participate in over all rounds, and the platform’s goal is to maximize the weighted sum of the rewards collected in all rounds.
How can we define an appropriate decentralized outcome in this general setting? Suppose agents can observe their compatibility with every other agent. A similar argument to the previous section shows that each agent maximizes their individual expected number of successful matches by matching in each round with the agents who are most likely to be compatible with (including agents to whom has been successfully matched in the past). Hence, in the decentralized matching, there again will be no blocking pair of two agents who are not matched to each other but both prefer the other to at least one of their matched partners. Moreover, once a pair of agents is successfully matched they will continue matching with each other in future rounds. This determines a decentralized capacitated stable matching process, which we will again denote by SM.
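As an illustration of the process just described, the following is a minimal sketch of a single round of a greedy capacitated matching: committed successful edges are kept, and remaining edges are added in decreasing order of success probability whenever both endpoints have spare capacity. The function name, the data layout, and the choice to skip edges already known to be unsuccessful are assumptions made for this sketch; ties are broken by vertex name here, whereas any tie-breaking rule would do.

```python
def capacitated_greedy_round(prob, capacity, committed, failed=frozenset()):
    """One round of a greedy capacitated matching (illustrative sketch).

    prob:      dict mapping frozenset({u, v}) -> success probability
    capacity:  dict mapping vertex -> positive integer capacity
    committed: edges already known to be successful (always kept)
    failed:    edges already known to be unsuccessful (assumed skipped)
    """
    load = {v: 0 for v in capacity}
    chosen = set()
    # Keep every previously successful edge.
    for e in committed:
        chosen.add(e)
        for v in e:
            load[v] += 1
    # Consider the remaining edges from most to least likely to succeed.
    candidates = [e for e in prob if e not in committed and e not in failed]
    candidates.sort(key=lambda e: (-prob[e], sorted(e)))
    for e in candidates:
        if all(load[v] < capacity[v] for v in e):
            chosen.add(e)
            for v in e:
                load[v] += 1
    return chosen


# Hypothetical instance: agent "a" has capacity 2, the others capacity 1.
ab, ac, bc = frozenset("ab"), frozenset("ac"), frozenset("bc")
prob = {ab: 0.9, ac: 0.8, bc: 0.5}
capacity = {"a": 2, "b": 1, "c": 1}
first_round = capacitated_greedy_round(prob, capacity, committed=set())
after_commit = capacitated_greedy_round(prob, capacity, committed={bc})
```

In the first call no pair of unmatched agents with spare capacity is left over, which is the no-blocking-pair property for this small instance. In the second call the committed edge uses up the capacity of both of its endpoints, so nothing else can be added.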
In this general setting, we show that SM gives a constant-factor approximation to OPT.
Theorem 1 ().
In the setting where agents have arbitrary capacities, the expected reward achieved by the SM algorithm is at least a fraction of the expected reward of the optimal online algorithm.
Moreover, in many practical matching settings (e.g. gig economy applications, mentorship) matches occur between two sides of a market, and agents on one side of the market (e.g. jobs, mentees) only have capacity 1. In this setting, we can make use of the structure of the underlying graph to refine our approximation factor.
Theorem 2 ().
In the many-to-one setting, the expected reward obtained by SM is at least a fraction of the expected reward of the optimal online algorithm.
Next we consider a setting where for some set of agents , each subset of size at most has an associated probability of being “compatible” as a team. Generalizing the previous setting, in each of rounds, the agents are partitioned into teams so that each agent is in at most one team. We can equivalently think of the problem as constructing round-by-round matchings in a random hypergraph. We use modifications of the Domination and Charging Lemmas to show that the decentralized greedy formation of teams of constant size provides a constant-factor approximation to the optimal online algorithm. Formally, we show the following result, and provide a full proof in Appendix E.5.
Theorem 3 ().
In a hypergraph where all hyperedges have cardinality at most , the expected reward obtained by SM is at least of the expected reward of OPT.
3.1. What changes with capacities?
We first provide some intuition for what differentiates the capacitated setting from the setting in Section 2, where agents can only be matched with one other agent every round. This intuition will motivate the new proof techniques used in the capacitated setting, and also highlight the nuances in the decomposition used to prove Theorem 1.
The difficulty in extending the techniques in Section 2 lies in the definition of what it means to be an augmenting edge, which in turn determines the decomposition of the edges OPT selects into Aug and Adj. A crucial property underlying the proof of the domination lemma is that in the non-capacitated setting, whether a matching augments another matching can be determined by checking whether each augments . (Footnote 6: An edge augments if and only if it doesn’t share any endpoints with edges in ; a matching augments if and only if it doesn’t share any endpoints with edges in .) Specifically, if each individually can augment , and the edges in form a matching, then all edges in can jointly augment . For natural definitions of augmentation in the capacitated setting (e.g. augments if is a capacitated matching), a statement of this sort is no longer true.
Hence, it is natural that in the capacitated case we might give a definition of constructed by jointly considering a group of edges in that can augment , instead of considering whether edges in could augment individually. In particular, we might define to be a maximum subset of edges in which can augment , and hope that we could claim for some constant . This natural approach unfortunately fails; the reason is that can only be formed using information that SM did not have available at the start of round . (Footnote 7: In the previous domination lemma, the fact that we could augment edge-by-edge let us argue that could actually be bounded only using information that SM had available at round . This subtlety reinforces the need for the careful conditioning in the previous proof.)
Given that these natural generalizations to the augmentation approach fail, we instead decompose successful edges OPT selects based on a notion of occupancy for each agent. We say that a node is heavily occupied if at least half their capacity is taken up by edges in . We decompose , the successful edges that OPT selects in round , into edges that are in , edges that are not in but have at least one of their endpoints heavily occupied by , and the remaining edges. Formally, we write
where Occ denotes all edges in with an endpoint that is filled to at least half-capacity from edges in , and Rem denotes the remaining edges in .
Later, in Subsection 3.3, we show that a tighter analysis of this same decomposition technique for bipartite many-to-one matchings yields a better approximation factor. In both Sections 3.2 and 3.3, we use a generalized version of the Charging Lemma to bound edges in Occ incident with nodes that are heavily occupied, and a generalized version of the Domination Lemma to bound edges in Rem incident only with nodes that are not heavily occupied.
3.2. A -approximation for Capacitated Matchings on General Graphs
In this section, we prove Theorem 1, and show that SM achieves at least a fraction of the expected reward of OPT in the general capacitated setting.
Recall that and denote the successful edges selected by SM and OPT, respectively, in round . The proof of Theorem 1 will follow from proving for all
Recall that we decomposed . This decomposition immediately lets us write a generalized Charging Lemma.
Lemma 0 (Generalized Charging Lemma).
We have .
Fix any sample graph . For each edge , charge it to an endpoint of with at least half of its capacity occupied by edges in . As Occ forms a valid capacitated matching, the charge placed on each vertex is at most its capacity. Additionally, note that we can cover each of these charges by having every edge in pay two charges to each of its endpoints. The result follows by averaging over all sample graphs. ∎
The main claim for which we require different ideas is a generalized Domination Lemma. Recall that and denote the new successful edges selected in round by SM and OPT, respectively. As before, we need to analyze the successful edges selected by OPT based on the round they were discovered; with this in mind we write
where denotes those edges in Rem which OPT selected for the first time in round . The Generalized Domination Lemma bounds against .
Lemma 0 (Generalized Domination Lemma).
For all we have .
Using the same notation in the previous proof of the Domination Lemma, for fixed histories , it suffices to show that
Conditioned on our sample graph being in , there is a fixed set of edges that OPT selects in round which are not in , and furthermore have neither endpoint filled to half-capacity by edges in . Using the same logic as before, we have that if denotes those edges in disjoint from and , then
as is some (random) subset of . In contrast to previous settings, it is not the case that the set of edges in can feasibly augment . However, we show there exists a subset of edges from with at least one-third of the total weight which can feasibly augment . To prove this, we need the following graph theoretic claim.
Claim 1 ().
In any weighted graph where each vertex has degree , there exists a subgraph whose edges contain at least of the total weight of all edges in , such that each vertex has degree at most in .
Consider the following algorithm for constructing . Of all remaining edges in , add the one of maximum weight to . Call this edge ; remove from . Furthermore, if has remaining incident edges in , remove one (chosen arbitrarily). If has remaining incident edges in , remove one (chosen arbitrarily). Continue until has no edges remaining.
Note that for any vertex , there are at most edges of incident to . Indeed, after adding any edge to that is incident to , we delete an edge in incident to if possible. Note also that in each step, the weight of the edge we add to is at least the weight of each edge we delete. Hence in each step, we add to at least a fraction of the weight of the edge we added and the edges we deleted. This holds in aggregate over all rounds. ∎
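The greedy construction in this proof can be sketched in a few lines of code. The sketch below assumes a simple graph given as a list of `(u, v, weight)` triples; the function name is hypothetical, and "remove one (chosen arbitrarily)" is implemented by removing the lightest remaining incident edge, which is only one of the permitted choices. The check at the end verifies the one-third weight guarantee and a per-vertex degree bound of ⌈d/2⌉, where d is the vertex's degree in the input graph; that particular form of the degree bound is our reading of the argument and should be treated as an assumption.

```python
from math import ceil


def greedy_half_degree_subgraph(edges):
    """Greedily keep heavy edges, deleting one incident edge per endpoint.

    edges: list of (u, v, weight) triples of a simple graph.
    Returns the list of kept edges.
    """
    remaining = sorted(edges, key=lambda e: -e[2])  # heaviest first
    kept = []
    while remaining:
        u, v, w = remaining.pop(0)  # heaviest remaining edge
        kept.append((u, v, w))
        for x in (u, v):
            # Delete one remaining edge incident to x, if any
            # (here: the lightest one, scanning from the end).
            for i in range(len(remaining) - 1, -1, -1):
                a, b, _ = remaining[i]
                if x in (a, b):
                    del remaining[i]
                    break
    return kept


def degree(edge_list, x):
    return sum(1 for a, b, _ in edge_list if x in (a, b))


# Hypothetical weighted graph on four vertices.
edges = [("a", "b", 5), ("a", "c", 4), ("b", "c", 3), ("c", "d", 2), ("a", "d", 1)]
kept = greedy_half_degree_subgraph(edges)
total_weight = sum(w for _, _, w in edges)
kept_weight = sum(w for _, _, w in kept)
```

Each step keeps one edge and deletes at most two edges that are no heavier, which is where the factor of one-third comes from.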
To finish the proof of the lemma, we use this claim to find a subgraph of with at least of the weight; we claim this subgraph is feasible to augment (and can be selected by SM in round ). Indeed, note it is disjoint from , and each vertex is filled to at most capacity in (because is a valid capacitated matching). Furthermore, every endpoint of an edge in is filled to at most capacity by edges in . Thus, is a feasible set of edges to augment in round . Well-known results also imply that in a fixed round, the SM algorithm gives a -approximation to maximum weight augmenting edges (see, e.g., (mestre2006greedy)). Hence,
This demonstrates the result. ∎
3.3. A -approximation for Many-to-One Matchings
In many practical matching applications, matches occur between two sides of a market, where agents on one side can accept multiple matches while agents on the other side can be matched at most once. For example, workers may take on multiple jobs which each only need one worker, and mentors are frequently matched with multiple mentees who each only have one mentor. In this section, we consider the performance of SM in this more restricted setting of many-to-one bipartite matchings. In particular, we show our previous decomposition achieves a better approximation factor.
Formally, we assume our agents are broken into two disjoint sets (think “students” and “mentors”); for each pair where and we are given a “success probability” representing the chance that an edge between and is realized when nature samples a random graph. Vertices in are on the left and vertices in are on the right. Each agent has capacity , and each agent has capacity . We show Theorem 5, which we restate here.
Theorem 5 ().
In the many-to-one setting, the expected reward obtained by SM is at least a fraction of the expected reward of the optimal online algorithm.
The proof will proceed by showing for all . Partition the edges of as
where Occ denotes edges in whose left endpoint is incident to some edge in , or whose right endpoint is occupied to at least half capacity by edges in , and Rem denotes the remaining edges in . (Footnote 8: Note this is an extremely similar decomposition to that in the previous section. We do not have a separate category for edges in that are also in , as every is automatically in Occ.)
Lemma 0 (Many-to-one Charging Lemma).
Fix any sample graph. For each edge in Occ, if it is adjacent to an edge in along its left endpoint, charge it to the unique edge in it is adjacent to. Otherwise, place a charge on its right endpoint. Each right vertex that is charged by this process is charged at most times by this process, and has at least edges in incident to it. So the charges on right vertices can be covered by charging each edge in at most twice. In all, we have charged each edge in at most three times; the result follows by averaging over all sample graphs. ∎
Lemma 0 (Many-to-one Domination Lemma).
For all we have
The intuition behind this lemma is that at round , there exists a feasible subset of the edges in that augments , with at least half the total weight. As before, the greedy property of the SM algorithm gives a -approximation to maximum weight matching on a round-by-round basis. The rigorous proof follows the structure of previous domination lemmas very closely; details are available in Appendix E.4. ∎
4. Conclusion and Future Directions
This paper contributes to the literature on centralized and decentralized matching in platforms by providing an additional justification for focusing on decentralized algorithms. In particular, we consider matching platforms with repeated interactions between long-lived agents who have unknown but persistent preferences, such as gig economy applications, mentorship matching, kidney exchange, and team formation. We show that letting demand and supply myopically reach a stable matching in a decentralized manner approximates the outcome of computing and imposing a centralized matching. This is despite the fact that the centralized matching processes we consider are not incentive-compatible, and may even be NP-hard to compute.
We focused on a setting for matching with stochastic rewards and learning dynamics. While this setting has been studied in the query-commit literature, the primary motivation in that literature was failure-aware kidney exchange, and the primary focus of that literature was on querying individual edges in sequence to optimize some last-round objective. Beyond kidney exchange, this setting reflects natural learning dynamics that are present in a wide variety of markets that repeatedly match the same agents and provide persistent rewards, such as gig economy job markets, mentorship programs and team formation, and our focus on querying matchings and maximizing rewards received in all rounds is motivated by these settings. Further work can be done to identify when techniques from one approach can be transferred to the other.
Another difference from the query-commit literature is that we ask how a decentralized greedy algorithm compares to an optimal online algorithm that is not restricted to commit to past successes. While we show that OPT-Commit is a 0.5-approximation to OPT, further characterization of the relationship between OPT and OPT-Commit remains open. In addition, our results hold for any objective that takes a weighted sum of rewards across rounds. This is because the analysis throughout the paper was performed on a round-by-round basis. Objectives of potential interest include the sum of rewards across different rounds, expected discounted rewards, as well as the size of the maximum successful matching that can be identified at the end of rounds, and we believe our analysis can be tightened for these specific objectives to produce sharper bounds. We also believe that there are sharper bounds for the capacitated settings. We leave the study of tighter approximation guarantees to future work.
More broadly, our findings raise a number of follow-up questions. Can platforms be designed in a way to nudge agents towards a decentralized outcome that approximates the optimal centralized matching achievable by the platform? What is the approximation gap between incentive-compatible matching platforms and what is achievable by a centralized authority? This paper also explores the problem of learning through prior assignments, a natural direction that is relatively less studied in the literature. Variations on this theme merit future exploration; for example, matching programs in highly relational and idiosyncratic settings like mentorship matching frequently face a cold start problem, where the program designer has some prior over which features best predict a successful match, and can refine this prior by observing the outcomes of prior matches. In mentorship or rotation programs an additional feature is that edge successes are often correlated; for example a student may think she has a general interest in economics and computation but learn that she is more interested in theory generally. We hope that this paper will motivate future work in these directions.
Appendix A Optimum Offline vs Optimum Online
To see that the approximation to the optimal offline algorithm can be arbitrarily bad, consider a bipartite graph where each edge has probability of being successful, and where we only have one round. Clearly no online algorithm gets expectation better than . However, the optimal offline algorithm can simply see the realization of each edge, and then select the maximum matching; we claim that the expected size of this maximum matching is .
We give a loose bound; observe that for any edge , the probability it is included in the maximum matching of the realized graph is at least the probability that is realized and no edges adjacent to it are realized. In particular, this probability is at least . By linearity of expectation, the expected size of the maximum matching in the realized graph is hence at least .
Appendix B The Greedy-Commit Algorithm
Here we provide a formal description of the Greedy-Commit algorithm.
The Greedy-Commit Algorithm
Initialize the set of successful edges . Let
For rounds :
Find a matching between vertices in maximizing
Output as the selected matching for round .
If an edge in is successful, then add it to and remove both its endpoints from . Otherwise, set .
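The steps above can be sketched as follows. This is a minimal illustration, not the implementation of (chen2009approximating): the maximum-weight matching is found by brute force (suitable only for tiny instances), all function and variable names are hypothetical, and we assume that an edge found to be unsuccessful has its probability set to zero so that it is never queried again.

```python
def max_weight_matching(vertices, weight):
    """Brute-force maximum-weight matching on a small vertex set.

    weight: dict mapping frozenset({u, v}) -> nonnegative weight.
    Returns (total weight, list of edges).
    """
    vs = sorted(vertices)
    if len(vs) < 2:
        return 0.0, []
    u, rest = vs[0], vs[1:]
    # Option 1: leave u unmatched.
    best_val, best = max_weight_matching(rest, weight)
    # Option 2: match u to some v via a positive-weight edge.
    for v in rest:
        e = frozenset((u, v))
        if weight.get(e, 0.0) > 0:
            val, m = max_weight_matching([x for x in rest if x != v], weight)
            if weight[e] + val > best_val:
                best_val, best = weight[e] + val, [e] + m
    return best_val, best


def greedy_commit(vertices, prob, realized, rounds):
    """Return the matching selected in each round.

    realized: dict edge -> bool, the sample graph; an entry is only
    inspected after the corresponding edge has been selected.
    """
    successful = []          # committed successful edges
    free = set(vertices)     # vertices not covered by a successful edge
    weight = dict(prob)
    history = []
    for _ in range(rounds):
        _, new_edges = max_weight_matching(free, weight)
        history.append(successful + new_edges)
        for e in new_edges:
            if realized[e]:
                successful.append(e)
                free -= e
            else:
                weight[e] = 0.0  # assumption: known failures are dropped
    return history


# Hypothetical 2x2 instance and sample graph.
E = lambda u, v: frozenset((u, v))
prob = {E("a1", "b1"): 1.0, E("a1", "b2"): 0.6, E("a2", "b1"): 0.6, E("a2", "b2"): 0.1}
realized = {E("a1", "b1"): True, E("a1", "b2"): True, E("a2", "b1"): False, E("a2", "b2"): True}
history = greedy_commit(["a1", "a2", "b1", "b2"], prob, realized, rounds=2)
```

In this instance the first round selects the two probability-0.6 edges; one of them succeeds and is kept, and the only remaining edge between the free vertices is the one that just failed, so the second round consists of the committed edge alone.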
Appendix C OPT-Commit vs OPT
Recall that OPT-Commit is the optimal matching algorithm subject to the restriction that any edges it matches that are successful must also be matched in all future rounds. We first demonstrate that the OPT-Commit algorithm is not the same as OPT.
Proposition 0 ().
The OPT-Commit algorithm is not the same as OPT.
Consider a complete bipartite graph with bipartition such that each edge is realized with probability ; say we are matching over two rounds. We show that OPT-Commit and OPT perform differently on this problem instance.
In the first round, OPT and OPT-Commit both select a matching of size 2. Without loss of generality we assume they both select edges and in the first round. If both and are realized, or if neither nor is realized, both algorithms will behave identically in the second round.
However, if only one edge is successful (say, ), OPT-Commit selects only edge in the second round, receiving reward 1 for the second round, while OPT selects edges and in the second round, hence receiving expected reward in this round. Hence OPT and OPT-Commit behave differently, and the expected reward achieved by OPT can be higher than that achieved by OPT-Commit. Note that the exact same example additionally shows that the Greedy algorithm, which queries the matching with the highest expected reward in each round, is different from the Greedy-Commit algorithm. ∎
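The second-round comparison in this example can be checked numerically. The sketch below treats the common edge probability as a parameter `p` (the value 0.75 used in the example is an assumption chosen purely for illustration) and considers the state in which exactly one of the two first-round edges succeeded. The function name is hypothetical.

```python
def second_round_rewards(p):
    """Expected second-round rewards after exactly one first-round success.

    OPT-Commit must keep the successful edge; the only edge that could
    augment it is the other first-round edge, which is known to have failed.
    OPT may instead switch to the two edges it has not yet queried, each of
    which succeeds independently with probability p.
    """
    opt_commit = 1.0
    switch = 2 * p              # expected reward of the two unqueried edges
    opt = max(1.0, switch)      # OPT takes the better of keeping and switching
    return opt_commit, opt


commit_reward, opt_reward = second_round_rewards(0.75)
```

Under these assumptions the two algorithms differ in the second round exactly when `2 * p > 1`, that is, when each edge is more likely than not to be realized.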
Proposition 1 ().
The expected reward of OPT-Commit is at least a 0.5-approximation to OPT.
We consider the algorithm , which in round , selects all the successful edges it has found so far, as well as the edges that OPT selects in round which can augment this matching. For any set of edges , we let denote the set of all vertices incident to at least one edge in .
Let and denote the successful edges selected by and OPT respectively in round . The proof proceeds by showing for all ; the result will follow as clearly OPT-Commit performs at least as well as .
Let and denote the new edges selected by and OPT respectively in round , and let and denote the successful edges in and respectively. Fix any sample graph . For each round , observe that the new edges selects in round are precisely
Note that each successful edge OPT selects in round is either selected by as well, in which case , or edge is adjacent to a node in . (If edge was selected for the first time by OPT in round , and if edge was not adjacent to any successful edge selected by up to but not including round , it would have been selected in round by as well.) As the edges of form a valid matching, for any fixed sample graph . Averaging over all sample graphs, . ∎
Appendix D Computing OPT is NP-hard
In this section, for completeness, we review the reduction that (chen2009approximating) use to show that computing OPT-Commit is NP-hard. We briefly note that the same ideas show that computing OPT is NP-hard.
Chen et al. reduce from the problem of determining whether a graph is -edge-colorable. Given a graph with , they construct a stochastic matching instance on over rounds where each edge has probability of being realized; our objective is the total number of successful edges queried on the th round. If is -edge-colorable, a feasible strategy is to commit to all the successful edges we have seen thus far, and augment with all edges of color that we can in the th round. The expected reward of OPT in this case is at least
Indeed, note that gives the expected number of total successful edges found if we simply queried all edges of color in round . The expected number of edges this overcounts compared to our actual strategy is at most the expected number of pairs of edges that are adjacent, which is at most .
If is not -edge-colorable, in rounds we know at most total edges can be queried. Hence the expected reward of OPT would certainly be at most
as this upper bounds the number of distinct successful edges OPT could find across all rounds. Hence, computing the expected reward of OPT suffices to determine whether is -edge-colorable.
Appendix E Additional Omitted Proofs
e.1. Proof of Lemma 4 (Refined Domination Lemma)
The edges in
are all selected by OPT for the first time in round , and form a matching. Furthermore, they can augment . For convenience, denote these edges by .
We only require small changes from the proof of the Domination Lemma in the previous section. In particular, let be any possible history on the first rounds, and let be any possible history on the first rounds. If denotes the set of all base graphs which induce a history of on SM (for rounds), and denotes the subset of those that additionally induced a history of on OPT (for rounds), it suffices to show that
Conditioned on our sample graph being in , there is a fixed set of edges that OPT selects in round which can augment . Using the same logic as before, we have that if denotes those edges in disjoint from and , then
as is clearly a subset of the edges that OPT selects in round which can augment . We also note that SM in round selects edges with at least half the total weight of those in (they are all disjoint from ), so
This demonstrates the result. ∎
e.2. Proof of Theorem 2 (lower bound for Greedy-Commit)
Here we prove that Greedy-Commit is at least a -approximation to OPT. As the proof is very similar to the proof that SM is a -approximation, we only mention the key places where the proof differs. When analyzing Greedy-Commit we let denote the successful edges that Greedy-Commit selected in round , and let denote the new successful edges that Greedy-Commit selected in round . All other sets of edges are defined verbatim as before.
The Charging Lemma, which stated that , holds with the same proof. However, we prove a stronger version of the Refined Domination Lemma. In particular, we show that for any we have
The proof in Section E.1 holds verbatim, with the exception of the penultimate sentence. Instead, we claim that Greedy-Commit in round selects edges with at least the total weight of those in (they are all disjoint from , and hence Greedy-Commit selects something at least as good as ), so
This proves a strengthened domination lemma. With these lemmas in place, we get the following factor-revealing LP, following the techniques we used to analyze SM against OPT.
Factor-revealing LP (Primal)