1 Introduction
Online matching is a fundamental problem in e-commerce and online advertising, introduced in the seminal work of Karp, Vazirani, and Vazirani [21]. While classical offline matching has a long history in the economics and computer science literature, online matching exploded in popularity with the ubiquity of the internet and the emergence of online marketplaces.
A common scenario in e-commerce is the online sale of unique goods due to the ability to reach niche markets via the internet (e.g., eBay). Typical products include rare books, trading cards, art, crafts, and memorabilia. We will use this as a motivating example in describing our setting. However, our problem can also model job search/hiring, assigning workers to tasks, online advertising, and other online matching problems.
In classical online matching, we are given a known set of offline vertices that may represent items for sale or ads to be allocated. There is also an unknown set of online vertices, which may represent customers, users, or visitors to a webpage. The online vertices arrive in some fashion. Generally, the customers (or users, webpage visitors, etc.) arrive one-by-one and a decision to match each customer or not (and if so, to which item) must be made irrevocably before the next customer is revealed. The original problem was introduced by Karp, Vazirani, and Vazirani [21] in 1990 and became a foundational work in e-commerce with the subsequent rise of internet marketplaces and online advertising. In this original formulation, customers arrive in adversarial order; we focus on this model since the results extend to other arrival models, which we describe in Section 1.1.
Since the introduction of online matching, many generalizations and variants have been studied to capture emerging problems in the real world. The major generalizations we consider include vertex weights (posted prices), stochastic rewards, and patience. Vertex weights on the offline vertices correspond to item prices/profits or the reward for making a match. In the stochastic rewards model (sometimes called stochastic matching) [5, 27], each edge is given a known, independent, and unique probability of existing. When an online vertex arrives, the probabilities of each of its incident edges are revealed. In e-commerce, this may model the probability that a customer will purchase a given item. In online advertising, this corresponds to the pay-per-click
model, in which ad revenue is only earned when a user clicks on an ad (the probabilities may be inferred or estimated based on historical data of the user). See
[26] for further discussion and models. In both the e-commerce and advertising settings, we only discover whether a customer or user would purchase an item or click an ad after it has been presented to them and they have done so (or not): that is, we cannot later choose to “revoke” the item offer or ad placement. This situation is captured by the probe-commit model: if a stochastic edge is probed and found to exist, it must be matched irrevocably. In the most basic stochastic rewards setting, we are allowed to probe at most one edge adjacent to each arriving vertex, while offline vertices may have many edges probed until they are matched and become unavailable [27, 10]. Think of a single banner ad on a website, for example. However, in this paper we consider a further generalization called patience constraints (also known as timeouts in the literature), where an online vertex has a known patience value and we may probe up to that many of its neighbors (stopping early if it is successfully matched) [5, 1, 12]. This corresponds to a user browsing multiple items until they either find something to buy or lose patience and exit the marketplace. Alternatively, this may correspond to the number of ad impressions a user may be shown while browsing a website or mobile app, and thus, the number of opportunities to show an ad that the user ultimately clicks. In our model, we make the standard assumption that offline vertices have unlimited patience.
Note that although this problem is “patience-constrained”, it is actually more general than the classical online matching problem or the stochastic rewards variant [27], since the latter two essentially have patience values of 1 for online vertices, while patience can be arbitrary in the “patience-constrained” problem.
1.1 Problem Landscape and Related Work
1.1.1 Overview of Online Matching Variants
We describe and define several variants of online matching and survey some relevant works for these variants. We group variations into the following categories which can be combined.

Arrival models: how online vertices arrive (adversarial, random order, or known IID).

Weights: unweighted, vertex-weighted, or edge-weighted matching.

Stochastic rewards/stochastic matching: classical deterministic edges, stochastic edges, or patience constraints.

b-matching: offline vertices are allowed to be matched multiple times.
Whenever we do not explicitly specify a variant, our default will be the original definition from [21] of adversarial arrivals, unweighted matching, and classical (non-stochastic) edges.
Arrival models. Three common arrival models for online matching are: adversarial, random order, and known IID. The model first studied by [21] used adversarial arrivals, where the online vertices arrive in adversarial order. They showed that the Ranking algorithm (randomly permute the offline set and match each online vertex to its first available neighbor in the permutation) achieves a tight competitive ratio of 1 − 1/e and that no online algorithm can do better.
Subsequently, less pessimistic arrival models were introduced to capture real world problems with more friendly input. In random arrivals, the online vertices arrive in random order and the best known competitive ratio is 0.696, due to the analysis of Ranking for this problem by Mahdian and Yan [23]. There is also a hardness of 5/6 due to Goel and Mehta [15]. In known IID, we are given a bipartite graph upfront with the online partition representing vertex “types.” Each arrival is sampled with replacement from a known distribution over these online vertices. Multiple copies of the same online vertex from this original graph may arrive and are treated as separate distinct vertices. Thus, the online vertex that arrives at each stage is independent and identically distributed (IID) with respect to all other online vertices. This arrival model simulates prior knowledge about expected user behavior. Here, the current best result is due to Brubach et al. [10].
Observe that an algorithm for adversarial arrivals will perform the same or better in the other two models and similarly, an algorithm for random arrivals can be applied to known IID [26]. We note that known IID has sometimes been called “online stochastic matching” due to having stochastic arrivals. However, we use the term “stochastic matching” exclusively to refer to stochastic edges (aka stochastic rewards) and explicitly mention arrival models (e.g. online matching with known IID arrivals).
Weights. We may be interested in more general settings in which offline items may have different posted prices. This is captured by the vertex-weighted variant, in which each offline vertex is given a nonnegative weight. The best known results are 1 − 1/e [2], 0.6534 [18], and 0.7299 [10] for adversarial, random order, and known IID, respectively. This setting is further generalized by edge weights, in which each potential buyer may offer to pay a different price for a given item (i.e., there is a different price for each item-buyer pair that is revealed with the buyer). With weights specified for each edge of the bipartite graph, the problem in the adversarial arrival model can have an arbitrarily bad competitive ratio. For random order and known IID, the best known results are 1/e [22] and 0.705 [10], respectively.
Stochastic edges. While the classical problem is useful for modeling several scenarios, there are many more problems in e-commerce and online advertising which involve notions of uncertainty in the reward or utility achieved from decisions. The traditional notion of “deterministic edges” does not capture these more complex settings. For example, in the pay-per-click model of advertising, we do not know a priori that presenting a given ad to a user will result in a click-through (and thus, ad revenue). This uncertainty is captured by the notion of stochastic edges, in which there is some probability that a given edge will or will not exist. This notion was introduced in [5] as stochastic matching with timeouts (patience). They considered a model where each edge has a known, distinct, and independent probability of existing. In the probe-commit model, we first probe an edge to find out if it exists and, if so, we must match it irrevocably. In the patience/timeouts version of [5], each online vertex has a patience value and we can probe up to that many of its neighbors, but must stop if a match is made. In this and most related work, offline vertices have unlimited patience (although two-sided patience has been studied [13]). The work of [5] considered known IID arrivals and showed a constant competitive ratio, which was subsequently improved by [13]. We refer to this model simply as matching with patience.
Later, [27] introduced a special case of stochastic edges called online matching with stochastic rewards. The stochastic rewards model can be seen as a special case where each online vertex has a patience of 1. They studied this problem under adversarial arrivals. Under the restricted case of uniform edge probabilities, they showed that 0.567 is possible (by choosing the better of the two algorithms studied, although this is not explicitly stated). This was extended to show a ratio of 0.534 for unequal, but vanishingly small, probabilities [28]. However, for arbitrary edge probabilities, a trivial 1/2 ratio is the best known. There is also a hardness result in [27] which claims that no algorithm for stochastic rewards with adversarial arrivals can achieve a competitive ratio greater than 0.621 (strictly less than 1 − 1/e), but we argue that this result arises from a different definition of competitive ratio which is too pessimistic. Therefore, we claim this hardness result does not hold under the common definition of competitive ratio for this problem.
Observe that patience generalizes stochastic rewards and that both generalize the classical non-stochastic model. Another, more general model of stochasticity is presented in [16]. In their model, when a vertex (viewed as a customer) arrives online, an online algorithm chooses a set of potential matches for it (viewed as an offering of products to the customer). Each customer (online vertex) has a general choice model which specifies the probability of the customer purchasing each item when offered any given product assortment. We discuss this model in more detail in Section 1.1.2, but note that in this setting, a set of potential matches is chosen all at once rather than probed sequentially, with the outcome being determined by the full set (the offered product assortment).
The b-matching variant. One may further consider the case where we allow offline vertices to be matched multiple times. This captures the notion that we allow ads to be presented to multiple users (in the non-stochastic case) or clicked by multiple users (in the stochastic case). Similarly, in e-commerce applications, this corresponds to having multiple items of the same type which can be sold to multiple users. This is captured by the notion of b-matching, as studied in [20]. In this generalization, each offline vertex u has a capacity b_u, and we allow u to be matched up to b_u times. Standard online matching can be seen as a special case of this problem where b_u = 1 for all u. Note that we can extend results for the classical matching problem to that of b-matching by making b_u copies of each offline vertex u. Note that the capacities, which restrict the number of times a vertex may be successfully matched, are different from patience constraints, which restrict the number of attempts each vertex has to be matched (that is, patience constraints count the number of failed attempts at a match, while capacities only care about the number of successful matches). The online b-matching problem was first introduced in [20], which considered the unweighted, non-stochastic setting in the adversarial model, and presented an optimal (1 − 1/(1 + 1/b)^b)-competitive algorithm for the case where all offline vertices have capacity at least b (note that for large b, this approaches 1 − 1/e). For the known IID arrival model with stochastic rewards and edge weights, Brubach et al. [10] showed a competitive ratio for any capacity b which, contrary to the adversarial model, approaches 1 for large b. While our competitive algorithm (Theorem 1.6) extends to the b-matching case (by duplicating vertices as described above), we leave as an open problem whether this can be improved; we note however that the result of [20] provides an upper bound of 1 − 1/e for large b.
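The capacity-duplication reduction just described is mechanical; the following short sketch makes it concrete. The function name and data layout are illustrative, not taken from the paper.

```python
# Sketch of the standard reduction from b-matching to ordinary matching:
# replace each offline vertex u (capacity b_u) with b_u identical copies.

def expand_capacities(weights, probs, capacities):
    """weights[u]: weight of offline vertex u; probs[u][v]: edge probability
    p_{u,v}; capacities[u]: how many times u may be matched. Returns the
    duplicated instance, in which every offline copy has capacity 1."""
    new_weights, new_probs = [], []
    for u, b in enumerate(capacities):
        for _ in range(b):  # b_u copies of u, each matchable at most once
            new_weights.append(weights[u])
            new_probs.append(list(probs[u]))
    return new_weights, new_probs

# Example: one offline vertex of capacity 3 becomes three unit-capacity copies.
w, p = expand_capacities([5.0], [[0.2, 0.7]], [3])
```

Any matching algorithm run on the expanded instance matches each copy at most once, which is exactly matching the original vertex at most b_u times.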
Further Generalizations. In [29], Meir et al. consider a deterministic model in which the online vertices are rational agents who make matching choices: they will choose the offline vertex which maximizes their utility (defined as the difference between their preference valuation of the choice and the posted price of the choice). The problem is then to design a mechanism for setting the posted prices of each alternative so as to maximize the social welfare (the sum of the valuations of all the agents’ final choices). Our model differs significantly in that matching decisions are made by the algorithm rather than by agents, edge rewards are stochastic, and the goal is to maximize the expected total weight (profit) of the matching rather than the expected welfare. While [29] models problems such as a parking mechanism with the goal of maximizing the benefit of all agents, our setting models problems such as e-commerce and online advertising. The Prophet Inequality Matching problem [3] may be viewed as a variant of edge-weighted matching with an arrival model similar to known IID, but in which each stage of the online arrivals may have a different (though still independent) known distribution. Alaei et al. [3] also considered the Budgeted Prophet Inequality Matching problem, where offline vertices instead have budgets limiting the total amount of weight that may be allocated to them, rather than the number of matched vertices (note that in the special case of vertex weights, the budgeted version is equivalent to the version with capacities). We note that this variant does not consider stochastic rewards or patience constraints.
1.1.2 Our Setting and Related Works
In this work, we consider the setting with vertex weights, stochastic edges in the probe-commit model, and patience constraints. In what follows, we review some related works which are more closely tied to this setting.
The work of [16] considers a model in which online vertices represent customers and offline vertices represent products, and a merchant wishes to offer products to consumers so as to maximize profit. This setting differs from our own in that the merchant offers a collection of several products all at once. The customer then either chooses to purchase some product (and in fact, may purchase multiple products at once) based on the products offered to her, or chooses to purchase nothing. By contrast, in our model the algorithm (the “merchant” in our setting) attempts one match at a time, stopping when a successful match occurs or the number of unsuccessful attempts equals the patience constraint.
In the setting of [16], each customer has a “general choice model” that specifies the probability that the customer purchases a given item when offered a given set of items. More generally, since the model allows a customer to purchase more than one item, the choice model specifies the probability that the customer will purchase exactly a given subset of the offered items (the probability of purchasing a particular item is then obtained by summing over the subsets containing it). It is assumed that the customer will only purchase products that were offered to her as part of the assortment (that is, subsets containing unoffered items have probability zero).
The algorithm they propose for their model can be viewed as a greedy algorithm which presents an online-arriving customer with the set that maximizes the expected profit of the items purchased. Doing so would guarantee a competitive ratio of at least 1/2, though this maximization step is not necessarily solvable in polynomial time for arbitrary choice models (they present only a specific family of choice models for which this step can be solved in polynomial time).
Their results do not immediately extend to our setting, as their stochastic model is somewhat different. Extending their results to our setting requires a reduction from our sequential probing with the probe-commit model to this all-at-once model by construction of appropriate choice models. Further, such a reduction would not necessarily yield a polynomial-time result without also designing an algorithm for solving the aforementioned maximization in polynomial time.
One contribution of the present work is Algorithm 1, which indeed can be viewed as greedily maximizing the expected weight (or profit) of each arriving vertex’s match (or purchase). However, without also constructing a reduction from our sequential probing model to this all-at-once model, the result of [16] does not extend to give a competitive ratio of 1/2 for our problem. Rather, in the present work, we present clean, self-contained analyses of Algorithm 1 and Algorithm 2 to achieve a competitive ratio of 1/2 for our problem without relying on the results of [16] and without the need for a messy or complicated reduction.
Another work closely related to our model is that of [17], which considers a model very similar to ours, with stochastic rewards and vertex weights. They do not consider arbitrary patience constraints (i.e., they consider only the special case where the patience is 1 for every online vertex). They present a (1 − 1/e)-competitive algorithm for the special case of decomposable probabilities: that is, the case where p_{u,v} = p_u · p_v for every edge (u, v). They further show their algorithm achieves an improved ratio for the case of vanishing probabilities, where p_{u,v} → 0 for all edges. They do not consider the more general setting of patience constraints.
Adversarial  Unweighted  Vertex-weighted  Edge-weighted
Non-stochastic  [21] (tight)  [2] (tight)  –
Stochastic rewards  [27] ( [27] )  –
Patience  –
Random order  Unweighted  Vertex-weighted  Edge-weighted
Non-stochastic  [23] ( [15])  [18] ( [15])  [22]
Stochastic rewards  [27]
Patience
Known IID  Unweighted  Vertex-weighted  Edge-weighted
Non-stochastic  [11] ()  [11] ()  [10]
Stochastic rewards  [11]  [11]  [11]
Patience  [13]  [13]  [13]
1.2 Our Contributions
Here, we give a rough outline of the paper and our contributions.
1.2.1 Clarified competitive ratio
Our first contribution in Section 3 is to argue for a unified definition of competitive ratio for online matching problems with stochastic rewards. We give the following definition which aligns with the prior work of [5, 1, 10, 12], but differs crucially from [27].
Definition 1.1 (Competitive Ratio for Online Matching with Stochastic Rewards).
This competitive ratio is defined as the ratio of the expected value of an online algorithm’s solution to the expected value of the solution of an optimal algorithm for the corresponding offline stochastic matching problem.
An important consequence of this definition, stated in Observation 1.2, is that the hardness result of [27] does not apply. The definition of competitive ratio in [27] compares the online algorithm to the solution of the Budgeted Allocation LP (equivalent to the LP in Section 3.1 with all t_v = 1) rather than to the offline stochastic matching problem. It is known that the Budgeted Allocation LP upper bounds the offline stochastic matching problem. Thus, the positive results of [27] are unaffected by Definition 1.1, since their LP formulation still serves to upper bound the optimal offline stochastic matching solution under Definition 1.1.
Observation 1.2.
We further show in Section 3 that Definition 1.1 allows for a more granular comparison between online algorithms. The significance of Observation 1.2 is that it reopens the question of whether a tight bound on the competitive ratio can be achieved in the stochastic rewards with adversarial arrivals setting.
1.2.2 New LP stochasticity gap
In the process of discussing the competitive ratio, we also show that the standard LP formulation for the stochastic matching problem with patience (timeout) constraints [5, 1, 12] is a fairly weak upper bound on the optimal solution. We call this a stochasticity gap (defined formally in Section 3.2) of the LP relaxation, analogous to the familiar concept of an integrality gap. Theorem 1.3 states that the natural LP (LP 1 in Section 3.1) for the offline stochastic matching problem with patience (timeout) constraints has a stochasticity gap bounded strictly below 1. This is similar to the stochasticity gap described in [12] for the LP of the more specific problem of online matching with stochastic rewards (equivalent to the Budgeted Allocation LP).
Theorem 1.3.
This implies that the competitive ratio achieved in [12] for the online stochastic matching problem with patience (timeout) constraints and known IID arrivals is somewhat tight with respect to the LP used in that paper to upper bound the optimal solution and guide the online algorithm. In particular, the fraction of the LP value their online algorithm achieves is close to the largest fraction of the LP value that any offline algorithm can achieve. Thus, serious improvements to that problem will only be possible with a tighter upper bound, and only measurable under Definition 1.1, as we will see in Section 3.
1.2.3 Optimal offline stochastic matching on star graphs
In Section 4, we introduce a dynamic programming algorithm that solves offline stochastic matching on star graphs optimally (the bestknown approximations for offline stochastic matching with patience constraints in bipartite graphs and general graphs are 0.35 [1] and 0.31 [6], respectively). This algorithm will be used as a subroutine in our final online algorithm, where we will view an arriving online vertex as the center of a star graph, and its optimality — stated as Theorem 1.4 — will be used in our analysis.
Theorem 1.4.
There exists an algorithm for the offline stochastic matching with patience problem on vertex-weighted star graphs that finds the optimal probing strategy in time polynomial in n and t, where n is the number of vertices and t is the patience of the center vertex.
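To make the structure of such a dynamic program concrete, here is a self-contained sketch (our own illustration, not the paper’s pseudocode) under the stated model: for any fixed set of probes, an adjacent-swap argument shows that probing in non-increasing order of weight is optimal, so after sorting the leaves by weight, a knapsack-style DP over the patience budget chooses which leaves to probe.

```python
# Sketch: optimal probing strategy for a vertex-weighted star in the
# probe-commit model. The center may probe up to t leaves, stopping at the
# first success. Function name and structure are illustrative.

def optimal_star_value(neighbors, t):
    """neighbors: list of (probability, weight) pairs for the leaves.
    t: patience of the center. Returns the optimal expected reward."""
    # For a fixed probe set, non-increasing weight order maximizes the
    # expected reward, so sort by weight once up front.
    nbrs = sorted(neighbors, key=lambda pw: pw[1], reverse=True)
    # best[k] = optimal expected reward with k probes left over the suffix
    # of the sorted list processed so far (updated back to front).
    best = [0.0] * (t + 1)
    for p, w in reversed(nbrs):
        for k in range(t, 0, -1):
            # either skip this leaf, or probe it (a success ends the process)
            best[k] = max(best[k], p * w + (1.0 - p) * best[k - 1])
    return best[t]
```

Note that skipping a high-weight leaf can be strictly better: with patience 1 and leaves (p = 0.01, w = 10) and (p = 1, w = 9.9), probing the second leaf yields 9.9 versus 0.1 for the first. The sort costs O(n log n) and the DP table has O(n · t) entries.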
1.2.4 Greedy algorithm for vertexweighted online matching with stochastic rewards and patience constraints
In Section 5, we start by showing that an obvious, naive greedy algorithm for this problem can be arbitrarily bad as stated in Theorem 1.5.
Theorem 1.5.
Probing the neighbors of an arriving vertex in non-ascending order of expected weight (p_{u,v} · w_u) leads to a worst-case competitive ratio of O(1/n), where n is the number of offline vertices in the underlying graph.
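A small concrete illustration of why ordering by expected weight alone is suboptimal (this is our own toy instance, not the worst-case family from the theorem): with patience 2, sequencing matters, because a success on a low-weight edge irrevocably blocks the later, heavier probe.

```python
from itertools import permutations

def seq_value(seq):
    """Expected reward of probing (probability, weight) pairs in order,
    stopping at the first success (probe-commit)."""
    value, alive = 0.0, 1.0
    for p, w in seq:
        value += alive * p * w   # reach this probe only if all earlier failed
        alive *= 1.0 - p
    return value

nbrs = [(0.9, 1.0), (0.1, 5.0)]  # (p, w): expected weights 0.9 and 0.5
# Naive greedy probes the higher p*w edge first; brute force tries all orders.
greedy = seq_value(sorted(nbrs, key=lambda pw: pw[0] * pw[1], reverse=True))
best = max(seq_value(perm) for perm in permutations(nbrs))
print(round(greedy, 2), round(best, 2))  # 0.95 1.31
```

Here greedy earns 0.9 + 0.1 · (0.1 · 5) = 0.95, while probing the heavy edge first earns 0.5 + 0.9 · 0.9 = 1.31, since the likely success on the light edge no longer forecloses the heavy one.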
We then demonstrate how to achieve a 1/2 competitive ratio by locally optimizing for each arriving vertex and its neighborhood using the dynamic programming algorithm for star graphs from Section 4. We formalize this in Theorem 1.6.
Theorem 1.6.
There exists an algorithm which achieves a 1/2 competitive ratio for the vertex-weighted online matching problem with stochastic rewards and patience in the adversarial arrival model.
Since the model in Theorem 1.6 is quite general and adversarial, the result extends to the unweighted problem as well as random order and known IID arrival models as stated in Corollary 1.7.
Corollary 1.7.
There exists an algorithm which achieves a 1/2 competitive ratio for the vertex-weighted and unweighted online stochastic matching with patience problems in the adversarial, random order, and known IID arrival models.
We note further that the performance of our algorithm is tight with respect to greedy algorithms which optimize the performance of each arriving vertex locally instead of attempting to make globally optimal decisions across all arrivals.
While greedy matching algorithms generally have poorer worst-case performance theoretically, Figure 1 illustrates that our result is currently the best known for a number of natural online matching problems. We also stress that greedy algorithms can be useful in practice and it is important to establish the difference between our greedy algorithm and the naive greedy approach, which one might be tempted to implement. The recent empirical work of [8] for non-stochastic online matching under known IID arrivals gives evidence that simple greedy algorithms perform well on this problem in practice. They further observe that the theoretically superior algorithms of [10, 19, 24, 4, 14] can be augmented with greedy choices to add additional adaptivity, which improves empirical performance. Indeed, many of their best results come from these greedy-augmented algorithms. Thus, greedy algorithms can play an important role in practical solutions to this problem and it is useful to understand their behavior.
Finally, since we are not subject to the hardness result of [27] under Definition 1.1, one might ask if the problem actually gets easier when restricted to vanishingly small edge probabilities (or even just probabilities strictly less than 1). For example, the algorithms proposed for the stochastic rewards problem with adversarial arrivals in [27] and [28] achieve their best results for vanishing probabilities. Further, standard adversarial inputs where a matched edge “blocks” a future potential match do not present the same bounds when the “blocking” edge has a low expected value before being realized. In one step toward addressing this question, we show in Theorem 1.8 that the hardness result of 1/2 for greedy algorithms extends to the special case of stochastic rewards with small edge probabilities. We use “SimpleGreedy” to refer to the algorithm which always probes an arbitrary available neighbor if one exists. We note that [27] showed that SimpleGreedy achieves a ratio of at least 1/2 in the stochastic rewards with adversarial arrivals setting.
Theorem 1.8.
There exists a family of unweighted graphs under stochastic rewards (online vertices with patience of 1) and adversarial arrivals for which SimpleGreedy achieves a competitive ratio of at most 1/2, even when all edges have a uniform probability p.
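For reference, SimpleGreedy as described above fits in a few lines; the sketch below is our own rendering of it (names are illustrative), probing one arbitrary available neighbor per arrival under probe-commit.

```python
import random

# Sketch of SimpleGreedy for stochastic rewards (patience 1): on each
# arrival, probe the first available neighbor, if any, and commit on success.

def simple_greedy(neighbors_of, num_offline, rng):
    """neighbors_of: list over arrivals of [(offline_id, probability), ...].
    Returns the number of successful matches."""
    available = [True] * num_offline
    matches = 0
    for nbrs in neighbors_of:
        for u, p in nbrs:
            if available[u]:           # probe the first available neighbor
                if rng.random() < p:   # edge exists: commit irrevocably
                    available[u] = False
                    matches += 1
                break                  # patience 1: at most one probe
    return matches

rng = random.Random(0)
# Two arrivals: the second falls back to offline vertex 1 when 0 is taken.
print(simple_greedy([[(0, 1.0)], [(0, 1.0), (1, 1.0)]], 2, rng))  # 2
```

With deterministic (probability-1) edges, this degenerates to the classical greedy matching algorithm, whose 1/2 guarantee mirrors the ratio quoted above.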
2 Preliminaries and Notation
We use G = (U ∪ V, E) to denote the bipartite graph with vertex set U ∪ V and edge set E. For a given bipartite graph, let U represent the offline vertices and V represent the online vertices. Let w_u denote the weight of offline vertex u ∈ U. We assume, w.l.o.g., that w_u > 0 for all u ∈ U. For each edge (u, v), let p_{u,v} denote the given probability that edge (u, v) exists when probed. We also use p_e for the given probability of edge e when the indices u and v are not required. For simplicity, we may assume G is the complete bipartite graph on U ∪ V by allowing p_{u,v} = 0 for non-existent edges. Thus, from here on, when we refer to an edge (u, v) as incident to a vertex, or to u being adjacent to or a neighbor of v, we mean that p_{u,v} > 0 (i.e., that the edge has a positive probability of existence). We are further given a patience value t_v for each online vertex v that signifies the number of times we are allowed to probe different edges incident on v when it arrives. Note that each edge may be probed at most once and, if it exists, we must match it and stop probing (probe-commit model).
Strictly speaking, the probabilities p_{u,v} specify a probability distribution on input graphs, the true realization of which is initially unknown. We denote this realization graph by G_R = (U ∪ V, E_R), where E_R consists only of edges which exist when probed.
We consider the online vertices as arriving in stages. Specifically, we may assume (without loss of generality) that the online vertices arrive in the order v_1, v_2, …, v_{|V|} and number the stages 1 through |V| correspondingly. When the online vertex v_i arrives at stage i, we attempt to match it to an available offline vertex. We are allowed to probe edges incident to v_i one-by-one, stopping as soon as an edge is found to exist, at which point the edge is included in the matching and we receive a reward of w_u, the weight of the matched offline vertex u. We are allowed to probe a maximum of t_{v_i} edges; if t_{v_i} edges are probed and none of the edges exist, then vertex v_i remains unmatched and we receive no reward. If we successfully match v_i to u, we say that w_u is the value or reward of v_i’s match; if v_i remains unmatched, we say it has a value or reward of 0.
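The probing process for a single stage can be sketched as follows (our own illustration under the model above; the probing order is an input, since choosing it well is the subject of the paper).

```python
import random

# One arrival stage in the probe-commit model: probe the chosen offline
# vertices one-by-one (at most t of them), committing to the first success.

def process_arrival(probe_order, probs, weights, available, t, rng):
    """probe_order: offline vertices to try, in order.
    Returns (matched_vertex, reward); (None, 0.0) if no probe succeeds."""
    probes = 0
    for u in probe_order:
        if not available[u]:
            continue                    # unavailable vertices cost no probe
        if probes == t:                 # patience exhausted
            break
        probes += 1
        if rng.random() < probs[u]:     # edge exists: match irrevocably
            available[u] = False
            return u, weights[u]
    return None, 0.0

rng = random.Random(1)
avail = [True, True]
print(process_arrival([0, 1], [1.0, 1.0], [3.0, 7.0], avail, 1, rng))
```

A success both ends the stage and removes the matched offline vertex from future availability; a failed probe consumes one unit of patience but reveals nothing about the other edges (they are independent).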
3 Unifying and clarifying the competitive ratio
A key argument in this paper is that we should unify and clarify the definition of competitive ratio for the problem of online matching with stochastic rewards. Currently, the most common definition [5, 1, 10, 12] compares an online algorithm to the offline optimal for the corresponding offline stochastic matching problem introduced in [5]. However, in [27], they compare an online algorithm to a specific non-stochastic offline packing problem called Budgeted Allocation. Budgeted Allocation is equivalent to LP 1 in Section 3.1, but with all t_v = 1, since it addresses the stochastic rewards problem without patience. We note that the Budgeted Allocation LP and its natural extension to patience constraints (LP 1) both upper bound the corresponding offline stochastic matching problems. However, we argue that these LPs should not be used as tight bounds to prove hardness results, as with Budgeted Allocation in [27].
The first definition (Definition 1.1) compares to a stochastic offline problem (offline stochastic matching), while the definition from [27] compares to a non-stochastic offline problem (Budgeted Allocation). In this section, we advocate for Definition 1.1 as the canonical definition of competitive ratio for online matching problems with stochastic rewards. Use of the term “competitive ratio” in the analysis of future sections refers to Definition 1.1.
We give the following reasons for choosing Definition 1.1:

Definition 1.1 is more in line with the standard concept of competitive ratio: we compare an online algorithm for a problem to the offline optimal for that same problem.

It enables finer-grained comparison between algorithms. We will describe an example below where an online algorithm which is optimal under the definition of [27] can be improved upon under Definition 1.1.
In [27], they argue against three specific potential definitions of competitive ratio. We agree with those arguments. However, we further argue that their definition is too pessimistic. To support point (2) above, we note that in [11], they show an algorithm for online matching with stochastic rewards and known IID arrivals that achieves the same competitive ratio under both definitions. Under the definition of [27], this result would be tight with no further improvement possible. However, under Definition 1.1 (the definition used in [11]), they also show that for the case of uniform constant edge probabilities, a strictly better competitive ratio is possible using a more constrained LP to guide the algorithm. This finer granularity of analysis supports the development of improved algorithms that would be impossible to see under the definition of [27].
In Section 3.1, we describe the natural LP which is often used as an upper bound on the offline optimal under Definition 1.1 and as the definition of the offline optimal under the definition of [27] (called Budgeted Allocation in [27]). Then, in Sections 3.2 and 3.3, we show that there is a large gap between the solution to this LP and the optimal offline stochastic matching solution. This implies that under the definition of [27], no online algorithm for stochastic matching with patience could achieve a competitive ratio better than this gap, even under the more tractable known IID arrival model. We believe this is far too pessimistic and conjecture that algorithms which exceed that ratio under Definition 1.1 will be found in the future.
3.1 Standard Linear Programing Relaxation
Below is a natural extension of the “standard” LP formulation (e.g., as in both [5, 1, 10, 12] and [27]) to vertex weights and non-unit patience values.
maximize    Σ_{u ∈ U, v ∈ V} w_u · p_{u,v} · x_{u,v}                      (1)
subject to  Σ_{v ∈ V} p_{u,v} · x_{u,v} ≤ 1    for all u ∈ U              (1a)
            Σ_{u ∈ U} p_{u,v} · x_{u,v} ≤ 1    for all v ∈ V              (1b)
            Σ_{u ∈ U} x_{u,v} ≤ t_v            for all v ∈ V              (1c)
            0 ≤ x_{u,v} ≤ 1                    for all u ∈ U, v ∈ V       (1d)
We note that the Budgeted Allocation problem LP of Mehta and Panigrahi [27] uses different notation, but is equivalent if all t_v are set to 1. The notation we use here is in keeping with [5, 1, 10, 12]. Some formulations such as [10, 12] use edge weights instead of vertex weights in the objective or consider additional patience constraints on the offline vertices (two-sided timeouts in [12]).
In most cases, such an LP is used to upper bound the optimal solution. The first two constraints are non-stochastic relaxations of the matching constraints. The third constraint enforces the patience requirement that a vertex v can be probed at most t_v times. However, this is not a tight bound on the optimal solution, as we will see in Section 3.2, which defines the concept of a stochasticity gap between an optimal stochastic matching algorithm and the optimal solution to LP 1. Later, in Section 5.3, we present a new LP formulation (LP 3) that introduces additional new constraints. Being more constrained, this new LP provides a tighter bound on OPT, although it is open whether it achieves a provably better stochasticity gap than LP 1.
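As a concrete sanity check, the relaxation can be written down and solved with an off-the-shelf LP solver. The tiny instance below (one offline item of weight 1, four online arrivals, each edge probability 1/4, patience 1) is a hypothetical example of our own, assuming `scipy` is available:

```python
from scipy.optimize import linprog

# LP 1 for one offline vertex u (weight w = 1) and n online vertices,
# each with a single incident edge of probability p = 1/n and patience 1.
n, w, p = 4, 1.0, 0.25
# Variables x_1..x_n, one per edge.  Objective: maximize sum_v w * p * x_v,
# which linprog expresses as minimizing the negated coefficients.
c = [-w * p] * n
# Constraint (1a): sum_v p * x_v <= 1 for the single offline vertex u.
A_ub = [[p] * n]
b_ub = [1.0]
# Constraints (1b)-(1d) reduce here to the variable bounds 0 <= x_v <= 1.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
lp_value = -res.fun
print(lp_value)            # LP optimum: every x_v = 1, value = n * p = 1.0
opt = 1 - (1 - p) ** n     # true optimum: probe all edges, match if any exists
print(opt)                 # 0.68359375 < LP value, previewing the gap of Sec. 3.2
```

The gap between the two printed values is exactly the phenomenon formalized in the next subsection.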
3.2 Stochasticity gap
In keeping with our mission, we now formalize the definition of a stochasticity gap for matching problems. The term was first used informally in this context in [12] without a rigorous definition. There, it was shown that an optimal algorithm's solution can be as small as a (1 − 1/e) fraction of the LP solution; in other words, OPT(G) ≤ (1 − 1/e) · LP(G) for some graphs G.
For the offline problem, this gap arises with a star graph on n + 1 vertices, with each edge having probability 1/n and the center vertex having unlimited patience (or equivalently, a patience of n). To create a similar problem instance for online matching with stochastic rewards, let the center of the star be the offline set (with unlimited patience) and the remaining vertices be the online set. In both cases, the LP can assign 1 to all variables to get a value of 1, while OPT = 1 − (1 − 1/n)^n → 1 − 1/e for large n, since there is a (1 − 1/n)^n ≈ 1/e probability that no edge exists. We give a more general definition below which captures this concept.
Definition 3.1 (Stochasticity Gap).
The ratio of the optimal algorithmic solution for a stochastic packing problem to the optimal solution of a linear programming relaxation which treats probabilities as deterministic fractional size coefficients.
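The star-graph gap above is easy to verify numerically; the sketch below uses our own helper names:

```python
import math

# Star with n leaves: each edge has probability 1/n, center has unlimited
# patience.  LP 1 sets every x_e = 1, so its value is n * (1/n) = 1 for all n.
# The optimal algorithm probes every edge and succeeds unless all n fail.
def opt_star(n: int) -> float:
    return 1.0 - (1.0 - 1.0 / n) ** n

lp_value = 1.0
for n in (10, 1000, 10 ** 6):
    print(n, opt_star(n) / lp_value)  # ratio approaches 1 - 1/e ~ 0.632
print(1 - 1 / math.e)
```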
3.3 A larger stochasticity gap
As stated in our problem definitions, online matching with stochastic rewards and patience constraints (aka online matching with timeouts) generalizes the online matching with stochastic rewards problem: it allows each online vertex v to probe up to t_v neighbors when it arrives instead of just one.
To see a larger gap for this problem, consider the unweighted, complete bipartite graph K_{n,n} (n vertices in each partition) with p_{u,v} = 1/n for all edges and unlimited patience (or equivalently t_v = n) on all vertices. In this case, LP 1 has a value of n, achieved by assigning a value of 1 to each variable x_{u,v}. However, we can upper bound OPT by the expected size of the maximum matching in the realization graph of all edges that actually exist. In other words, imagine we probed every edge and sought a maximum matching in the graph of edges that were found to exist.
To do this, we first mention the following result, due to [7], for random graphs. Note that Theorem 14 of [7] is slightly more general, bounding the size of the largest independent set for a range of edge probabilities; Lemma 3.2 states the special case we need.
Lemma 3.2 ([7]).
Let G be a random bipartite graph with both partitions of size n and where each edge exists independently with probability 1/n. Let γ ≈ 0.567 be the solution to the equation γe^γ = 1. Then, the largest independent set of G has size (2γ + γ²)n + o(n) with probability 1 − o(1).
The proof of Theorem 1.3 can be derived from Lemma 3.2 as follows. Lemma 3.2 implies that, almost surely, a minimum vertex cover for G has size (asymptotically, as n → ∞) (2 − 2γ − γ²)n, and by Kőnig's theorem this equals the size of the maximum matching in G (see also [25, 9]). It follows, then, that no online or offline algorithm can achieve an expected matching size greater than roughly (2 − 2γ − γ²)n ≈ 0.544n on this graph. This shows a stochasticity gap of 2 − 2γ − γ² ≈ 0.544 for LP 1 on this family of graphs.
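The constant in this bound can be computed by solving γe^γ = 1 numerically; the fixed-point iteration below is our own sketch of that calculation:

```python
import math

# Solve gamma * e^gamma = 1, i.e. gamma = e^{-gamma}, by fixed-point
# iteration (the map g -> e^{-g} is a contraction near the solution).
g = 0.5
for _ in range(100):
    g = math.exp(-g)
print(g)  # ~0.567143 (the omega constant)

# Matching fraction on G(n, n, 1/n) implied by Lemma 3.2 and Konig's
# theorem: vertex cover size ~ (2 - 2g - g^2) n = maximum matching size.
frac = 2 - 2 * g - g * g
print(frac)  # ~0.544, versus the LP 1 value of n on the same instance
```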
Thus, Linear Program 1 can overestimate the true optimal value quite drastically. This matters because the stochasticity gap of LP 1 upper-bounds the competitive ratio of any algorithm analyzed against that LP, and we hope for competitive ratios higher than 0.544 for many online matching problems. In particular, the algorithm in [12] for the edge-weighted, known IID arrivals variant was analyzed using LP 1 as an upper bound (as well as to guide the algorithm). We can see here that no such analysis can establish a ratio beyond 0.544, even on unweighted graphs, without using a tighter upper bound.
Another problem with defining the competitive ratio relative to OPT_LP is that it would imply that no online algorithm can ever achieve a ratio better than this 0.544 gap. Using the definition OPT instead opens up the possibility of using a tighter bound on the offline optimal solution than LP 1.
4 Optimal Offline Stochastic Matching Strategy for Star Graphs
Here, we prove Theorem 1.4 by describing a dynamic programming algorithm for solving edge-weighted offline stochastic matching with patience on star graphs, that is, bipartite graphs in which one side consists of a single vertex. Note that in this case, the edge-weighted and vertex-weighted problems are equivalent, and the only patience constraint we need to consider is on the center vertex.
To see the relationship between this problem and online matching, observe that when an online vertex v arrives, we are given the star graph of v and its offline neighbors in U. A greedy algorithm seeks a probing strategy which maximizes the expected weight of v's match in that star graph. Thus, offline matching on a star graph is equivalent to online matching with a single online vertex and a single online stage.
Let G be a star graph with center vertex v, and let N(v) ⊆ U denote the set of v's neighbors. Suppose |N(v)| = n and v has patience t. The optimal strategy for matching v is the one which maximizes the expected weight of v's match. We show a dynamic programming approach which finds this optimal solution. For each neighbor u, let w_u denote its weight and let p_u denote the probability that edge (u, v) exists when probed.
A crucial observation is that any optimal probing strategy probes in nonincreasing order of edge/vertex weight, regardless of the edge probabilities. Intuitively, this results in matching v to its highest-weight neighbor whose edge actually exists among the probed subset of N(v). Claim 4.1 states this more formally.
Claim 4.1.
Let S ⊆ N(v) be a subset of the offline vertices. Consider querying all edges between v and S according to some ordering. Ordering the edges by nonincreasing weight maximizes the expected weight of v's match (with respect to all other orderings of S).
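Claim 4.1 can be spot-checked by brute force over all probe orders of a small star; the weights and probabilities below are a hypothetical example of our own:

```python
from itertools import permutations

def expected_weight(order):
    """Expected matched weight when probing (weight, prob) edges in this
    order under probe-commit: stop at the first edge that exists."""
    total, fail = 0.0, 1.0
    for w, p in order:
        total += fail * p * w   # match w iff all earlier probes failed
        fail *= 1.0 - p
    return total

edges = [(10.0, 0.1), (7.0, 0.5), (3.0, 0.9), (1.0, 0.99)]
by_weight = sorted(edges, key=lambda e: -e[0])  # nonincreasing weight
best = max(expected_weight(list(perm)) for perm in permutations(edges))
print(expected_weight(by_weight), best)  # the two values coincide
```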
Given Claim 4.1, we may restrict our probing strategies to those which probe edges in nonincreasing order of weight. For ease of exposition, let the neighbors u_1, …, u_n be sorted in nonincreasing order of weight so that w_1 ≥ w_2 ≥ ⋯ ≥ w_n. For our dynamic program, we define W(i, k) to be the maximum possible expected value of any decreasing-weight probing strategy that is allowed k probes and probes edge (u_i, v) first. The full definition of W(i, k) is given in (2).
(2)   W(i, k) = p_i w_i + (1 − p_i) · max_{i < j ≤ n} W(j, k − 1),   where W(i, 1) = p_i w_i and a maximum over the empty set is taken to be 0.
The following lemma states that this formulation does indeed provide the expected weight of the optimal probing strategy.
Lemma 4.2.
For a patience of t, the value max_{1 ≤ i ≤ n} W(i, t) is equal to the expected value of the optimal probing algorithm.
Proof.
We begin by showing that for all k ≥ 1 and for all i, the value W(i, k) equals the maximum possible expected matching weight with k probes if we first probe edge (u_i, v) and proceed probing edges in decreasing order of weight. We proceed by induction on k. Clearly, this holds for k = 1 (for all i). Now suppose that for all i, W(i, k − 1) equals the maximum possible expected matching weight with k − 1 probes if we first probe edge (u_i, v) (and subsequently probe edges of lower weight).
If we probe (u_i, v) first, we achieve an expected matching weight of p_i w_i + (1 − p_i) R, where R is the expected matching weight achieved by the remaining k − 1 probes. By the inductive hypothesis, R is maximized (over all vertices whose weight is at most w_i) by max_{j > i} W(j, k − 1).
It follows from the above that given a patience t, the value max_i W(i, t) represents the maximum possible expected matching weight if we are restricted to probing edges in order of decreasing weight. However, it follows from Claim 4.1 that such a strategy is also optimal over all possible probing orders, and thus max_i W(i, t) is the expected value of an optimal probing algorithm. ∎
Clearly, we can construct a table storing all values of W(i, k) with 1 ≤ i ≤ n and 1 ≤ k ≤ t using at most O(nt) space. Computing each cell of the table requires at most O(n) time to find the maximum, resulting in at most O(n²t) time. Since t is at most n, the time and space are guaranteed to be polynomial in the size of the input. This procedure is stated explicitly in Algorithm 1.
5 Greedy Algorithms For Online Stochastic Matching
While greedy algorithms can provide powerful heuristics for online matching problems [8], it is not obvious how to behave greedily in the presence of vertex weights, stochastic edges, and patience constraints. We first illustrate how a naive greedy approach fails. We then show how to optimally probe a star graph. Finally, we analyze this greedy algorithm to bound its competitive ratio at 1/2, which is tight for worst-case analysis of greedy algorithms for online matching.
5.1 A naive greedy approach that fails
One natural idea, which may appear to generalize the common greedy approaches to similar problems, is to sort the neighbors of an arriving vertex in nonincreasing order of expected weight w_u p_u. However, the competitive ratio of this approach can be arbitrarily bad, as described in Theorem 1.5.
Proof.
Consider the following underlying bipartite graph. Let the offline vertex set contain one vertex u₀ of weight 1 and n vertices u₁, …, u_n of weight a, where a is some very large number. Let there be exactly one online vertex v with an edge of probability 1 to u₀ and edges of probability 1/(2a) to the remaining offline vertices u₁, …, u_n. Let v have patience t_v = n + 1, meaning that it can probe as many neighbors as we want until it is matched or we run out of neighbors to probe. Note that we can always add “dummy” vertices (vertices with no neighbors) to the online set if we want to capture the setting where both the online and offline sets are large.
The strategy of sorting by expected weight will first probe the edge (v, u₀) because it has the largest expected weight, 1, while the other edges have an expected weight of 1/2. Since the edge (v, u₀) has probability 1 of existing and we are in the probe-commit model, this would match v to u₀ deterministically, earning a weight of 1. However, the optimal algorithm would probe (v, u₀) last (probing first the vertices u₁, …, u_n), earning an expected weight of at least a(1 − (1 − 1/(2a))^n), which grows without bound as a and n grow. ∎
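The gap in this proof can be checked numerically. The concrete parameters below (heavy weight a = 10⁴, edge probability 1/(2a), n = a heavy vertices) are one hypothetical instantiation of the construction:

```python
def expected_weight(order):
    # Expected matched weight under probe-commit for (weight, prob) edges.
    total, fail = 0.0, 1.0
    for w, p in order:
        total += fail * p * w
        fail *= 1.0 - p
    return total

a = 10 ** 4
n = a                                     # number of heavy offline vertices
light = (1.0, 1.0)                        # weight 1, probability 1: expected weight 1
heavy = [(float(a), 1.0 / (2 * a))] * n   # expected weight 1/2 each

# Naive greedy sorts by expected weight, so it probes `light` first and,
# under probe-commit, is matched immediately for a reward of exactly 1.
naive = expected_weight([light] + heavy)
# The optimal order saves the probability-1 edge for last.
optimal = expected_weight(heavy + [light])
print(naive, optimal)  # 1.0 versus roughly (1 - e^{-1/2}) * a
```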
Thus, to properly generalize greedy approaches to this setting, we need to fix a probing order that maximizes the expected weight achieved by an arriving vertex. In Section 4, we show how to do this using dynamic programming.
5.2 A 1/2-Competitive Online Algorithm
We present a greedy algorithm which achieves a 1/2-competitive ratio for online matching with vertex weights, stochastic rewards, and patience constraints in the adversarial arrival model. In this setting, the vertices of V arrive in an online fashion. If vertex v is the i-th vertex to arrive online, we say v arrives at time i.
The algorithm is as follows. When a vertex v arrives at time i, let A_i ⊆ U denote the set of offline vertices which are still unmatched. Solve dynamic program (2) on the star subgraph induced by v and A_i. Probe edges in the order given by the dynamic program (v is matched to the first vertex for which the edge exists when probed). This procedure is stated explicitly in Algorithm 2.
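A minimal simulation sketch of this procedure under probe-commit semantics is given below. The memoized skip/take recursion is an equivalent variant of DP (2), and all names and the toy instance are ours:

```python
import random
from functools import lru_cache

def probe_order(edges, t):
    """Optimal probe order for a star: edges = [(weight, prob, id)], patience t.
    Probes go in nonincreasing weight order (Claim 4.1); f(i, k) is the best
    expected value using edges es[i:] with k probes remaining."""
    es = sorted(edges, key=lambda e: -e[0])

    @lru_cache(maxsize=None)
    def f(i, k):
        if i == len(es) or k == 0:
            return 0.0
        w, p, _ = es[i]
        return max(f(i + 1, k), p * w + (1 - p) * f(i + 1, k - 1))

    order, i, k = [], 0, t
    while i < len(es) and k > 0:          # re-trace the argmax choices
        w, p, _ = es[i]
        if p * w + (1 - p) * f(i + 1, k - 1) >= f(i + 1, k):
            order.append(es[i])
            k -= 1
        i += 1
    return order

def greedy_online(offline_w, arrivals, seed=0):
    """arrivals: list of (neighbor -> prob dict, patience).  Probe-commit."""
    rng = random.Random(seed)
    avail = set(offline_w)
    total = 0.0
    for nbrs, t in arrivals:
        edges = [(offline_w[u], p, u) for u, p in nbrs.items() if u in avail]
        for w, p, u in probe_order(edges, t):
            if rng.random() < p:          # edge exists: match and commit
                total += w
                avail.remove(u)
                break
    return total

# Deterministic sanity check: p = 1 everywhere, patience 1, weights 5 and 3.
w = {0: 5.0, 1: 3.0}
print(greedy_online(w, [({0: 1.0, 1: 1.0}, 1), ({0: 1.0, 1: 1.0}, 1)]))  # 8.0
```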
Let ALG(G) denote the expected weight of the matching produced by this algorithm on the graph G. Let OPT(G) denote the expected weight of the matching produced by an optimal offline algorithm. Our main result is given by Theorem 5.1.
Theorem 5.1.
For any bipartite graph G, ALG(G) ≥ OPT(G)/2.
5.3 A new LP with a tighter upper bound on OPT
We formulate a new LP by adding a new family of constraints to LP 1. This new LP gives a tighter upper bound on the offline optimal solution, OPT. Note that our algorithm does not need to explicitly solve this new LP; we simply use it in our analysis.
This new constraint is motivated by the observation that the dynamic program (2) is optimal for star graphs. Intuitively, for any star subgraph of G centered at an online vertex v with leaves S ⊆ U, the optimal solution for G cannot gain more weight in expectation from the edges of that star than the optimal probing strategy for the star alone. We can represent this as a set of constraints which restrict the LP so that the expected weight matched on any star subgraph of G does not exceed the optimal value given by the dynamic program (2) for that same star subgraph. This is captured in the new constraint, (3d), in Linear Program 3 below. In this constraint, we slightly abuse notation and write DP(v, S) to denote the optimal value max_i W(i, t_v) for the star graph on v and S. Recall that W is given by the dynamic program (2).
maximize   Σ_{u ∈ U} Σ_{v ∈ V} w_u p_{u,v} x_{u,v}   (3)
subject to   Σ_{v ∈ V} p_{u,v} x_{u,v} ≤ 1   ∀ u ∈ U   (3a)
Σ_{u ∈ U} p_{u,v} x_{u,v} ≤ 1   ∀ v ∈ V   (3b)
Σ_{u ∈ U} x_{u,v} ≤ t_v   ∀ v ∈ V   (3c)
Σ_{u ∈ S} w_u p_{u,v} x_{u,v} ≤ DP(v, S)   ∀ v ∈ V, ∀ S ⊆ U   (3d)
0 ≤ x_{u,v} ≤ 1   ∀ u ∈ U, v ∈ V   (3e)
Let LP3(G) denote the value of Linear Program 3 on the graph G. Lemma 5.2 states that this new LP is still a valid upper bound on the optimal solution.
Lemma 5.2.
For any bipartite graph G, OPT(G) ≤ LP3(G).
Proof.
Consider an adaptive offline algorithm which is optimal. Let y_{u,v} be the probability that this strategy probes edge (u, v). For any offline vertex u, the probability that u is successfully matched is Σ_{v} p_{u,v} y_{u,v} ≤ 1, and similarly for the probability of successfully matching any online vertex v. Thus, the assignment x = y satisfies constraints (3a) and (3b). By the definition of patience, the strategy cannot probe more than t_v edges incident on an online vertex v, so constraint (3c) is satisfied.
Finally, we argue that the new constraint (3d) is satisfied by this assignment. Suppose instead there is some vertex v and some S ⊆ U for which Σ_{u ∈ S} w_u p_{u,v} y_{u,v} > DP(v, S). Then, we can define a new offline probing strategy on the star graph over v and S which simply simulates our original algorithm on G and probes only those edges which are in the star. This achieves an expected matching weight on the star graph of at least Σ_{u ∈ S} w_u p_{u,v} y_{u,v}, but this contradicts the fact that DP(v, S) is the optimal expected matching weight for the star graph. Thus, this assignment must satisfy constraint (3d). It follows that LP 3 has objective value at least as large as the expected matching weight of the optimal offline algorithm. ∎
5.4 Analysis of the DPbased Greedy Algorithm
We will bound the performance of our greedy algorithm relative to the solution of Linear Program 3. Lemma 5.2 then implies that this bounds the competitive ratio. In particular, the following lemma, along with Lemma 5.2, implies Theorem 5.1.
Lemma 5.3.
For any bipartite graph G, ALG(G) ≥ LP3(G)/2.
Proof.
For the sake of analysis, suppose we have solved LP 3 on the graph G, and let x be the optimal assignment it produces. Let LP3(G) = Σ_{u ∈ U} Σ_{v ∈ V} w_u p_{u,v} x_{u,v} be the value of the objective function for G, and let LP3(v) = Σ_{u ∈ U} w_u p_{u,v} x_{u,v} be the value “achieved” by a given online vertex v. So LP3(G) = Σ_{v ∈ V} LP3(v).
We will make the following charging argument. Imagine that when a vertex v is matched to some u ∈ U, we assign w_u/2 to v, and for every online vertex v′ (including v itself) we assign w_u p_{u,v′} x_{u,v′}/2 to v′. Note we have assigned at most w_u weight in total, since Σ_{v′} p_{u,v′} x_{u,v′} ≤ 1 by LP constraint (3a).
Let M(v) for online vertex v be equal to the weight of the offline vertex which v is matched to, or 0 if v is unmatched at the end of the arrivals. Let U* ⊆ U be the set of offline vertices which are matched at the end of the arrivals. We define C(v) = M(v)/2 + Σ_{u ∈ U*} w_u p_{u,v} x_{u,v}/2 as the weight assigned to v in our imaginary assignment. Since the total weight assigned is at most the total matched weight, linearity of expectation gives ALG(G) ≥ Σ_{v ∈ V} E[C(v)]. Thus, to complete the proof, we must show that E[C(v)] ≥ LP3(v)/2 for every v ∈ V.
Consider an online vertex v arriving at time i. Let A_i ⊆ U be the set of vertices available (unmatched) when v arrives, and let B_i = U ∖ A_i be the set of vertices already matched. Note that when v arrives, it has already been assigned a value of Σ_{u ∈ B_i} w_u p_{u,v} x_{u,v}/2. After attempting to match v according to DP (2), we have assigned an additional expected value to v of at least Σ_{u ∈ A_i} w_u p_{u,v} x_{u,v}/2, since the DP is optimal on the star over A_i and constraint (3d) with S = A_i bounds Σ_{u ∈ A_i} w_u p_{u,v} x_{u,v} by that optimum. Thus, we have E[C(v)] ≥ (Σ_{u ∈ B_i} w_u p_{u,v} x_{u,v} + Σ_{u ∈ A_i} w_u p_{u,v} x_{u,v})/2 = LP3(v)/2.
∎
Lemma 5.3, together with Lemma 5.2, now implies the main result, a competitive ratio of 1/2.
5.5 A 1/2 Upper Bound for Greedy Under Stochastic Rewards
In [27], Mehta and Panigrahi studied the unweighted stochastic rewards problem (a patience of 1 for all online vertices) and the class of “opportunistic” algorithms. As per [27], an opportunistic algorithm for the stochastic rewards setting is an algorithm which always attempts to probe an edge incident to an arriving online vertex if one exists. They show that any opportunistic algorithm achieves a competitive ratio of at least 1/2.
The simplest opportunistic algorithm is the one which, when v arrives online, chooses a neighbor u of v arbitrarily and probes the edge (u, v). We call this algorithm “SimpleGreedy”. The result of [27] shows that SimpleGreedy achieves a competitive ratio of at least 1/2. Theorem 1.8, proven below, shows that this is tight even when restricted to small, uniform edge probabilities.
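For reference, SimpleGreedy is just a one-probe opportunistic rule. This sketch (our own naming and data layout) makes the “arbitrary neighbor” choice explicit by always probing the first available neighbor:

```python
import random

def simple_greedy(neighbors, probs, seed=0):
    """neighbors[v] lists online vertex v's offline neighbors, v in arrival
    order.  probs[(u, v)] is the edge probability.  Each online v probes at
    most one arbitrarily chosen available neighbor (opportunistic, patience 1)."""
    rng = random.Random(seed)
    avail = set(u for nbrs in neighbors for u in nbrs)
    matched = 0
    for v, nbrs in enumerate(neighbors):
        live = [u for u in nbrs if u in avail]
        if not live:
            continue          # no available neighbor: nothing to probe
        u = live[0]           # "arbitrary" choice: first available neighbor
        if rng.random() < probs[(u, v)]:
            avail.remove(u)
            matched += 1
    return matched

# With certain edges (p = 1) on a perfect matching, SimpleGreedy matches everyone.
nbrs = [[0], [1], [2]]
print(simple_greedy(nbrs, {(0, 0): 1.0, (1, 1): 1.0, (2, 2): 1.0}))  # 3
```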
Proof.
Let k be a fixed positive integer constant. Let the offline set be the disjoint union of two parts, and let the online set likewise be the disjoint union of two parts, with all edges sharing the same small, uniform probability.
For the bipartite graph G, an offline algorithm can achieve a matching of large expected size by first probing the edges between the first offline part and the first online part, stopping once all such edges are probed or the maximum possible number of successful matches is achieved. Then, the offline optimum probes all remaining edges in any order. Summing the expected numbers of successes from the two phases gives the total expected size of the achieved matching.
On the other hand, an adversary in the online setting may expose all vertices of the first online part before any of the second. SimpleGreedy, choosing arbitrarily, may in the worst case probe blocking edges first, preventing some offline vertices from being matched later. We consider the case where SimpleGreedy chooses such a blocking edge for each online vertex whenever one is available on arrival, and we calculate the expected size of the matching produced by this strategy.
Let M be a random variable corresponding to the size of the final matching, and let M₁ and M₂ be random variables corresponding to the number of matched vertices in the two offline parts, respectively, so that E[M] = E[M₁] + E[M₂]. We now consider E[M₂]. If the early online vertices are matched successfully, then when the later online vertices arrive, only a small number of offline vertices remain to be matched, and greedy will almost surely match all of them successfully. Thus, we get the following as n → ∞.
Finally, we observe that the remaining error term vanishes for large n, due to Stirling's formula.
Thus, we get a competitive ratio approaching 1/2:
(4)
∎
With a more intricate argument, we can also show that the same upper bound of 1/2 holds even for the “random greedy” algorithm, where an online vertex gets matched to a random available neighbor (if any).
6 Conclusion and future directions
For the problem with edge weights, stochastic rewards, patience constraints, and known IID arrivals (aka online matching with timeouts), our work suggests that the current best ratio, due to [12], can be greatly improved. First, we have shown that the LP used in [12] has a large stochasticity gap, suggesting that a better LP or a non-LP-based approach is needed. Second, we have shown that the vertex-weighted version of this problem in the adversarial arrival model admits a ratio of 1/2, and we conjecture that edge-weighted known IID arrivals should be “easier” than vertex-weighted adversarial arrivals.
References
 [1] Adamczyk, M., Grandoni, F., and Mukherjee, J. Improved approximation algorithms for stochastic matching. In Algorithms – ESA 2015: 23rd Annual European Symposium, Patras, Greece, September 14–16, 2015, Proceedings (Berlin, Heidelberg, 2015), N. Bansal and I. Finocchi, Eds., Springer Berlin Heidelberg, pp. 1–12.
 [2] Aggarwal, G., Goel, G., Karande, C., and Mehta, A. Online vertex-weighted bipartite matching and single-bid budgeted allocations. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (2011), SIAM, pp. 1253–1264.
 [3] Alaei, S., Hajiaghayi, M., and Liaghat, V. Online prophet-inequality matching with applications to ad allocation. In Proceedings of the ACM Conference on Electronic Commerce (June 2012).
 [4] Bahmani, B., and Kapralov, M. Improved bounds for online stochastic matching. In European Symposium on Algorithms (ESA). Springer, 2010, pp. 170–181.
 [5] Bansal, N., Gupta, A., Li, J., Mestre, J., Nagarajan, V., and Rudra, A. When LP is the cure for your matching woes: Improved bounds for stochastic matchings. In Algorithms – ESA 2010: 18th Annual European Symposium, Liverpool, UK, September 6–8, 2010. Proceedings, Part II (Berlin, Heidelberg, 2010), Springer Berlin Heidelberg, pp. 218–229.
 [6] Baveja, A., Chavan, A., Nikiforov, A., Srinivasan, A., and Xu, P. Improved bounds in stochastic matching and optimization. In APPROXRANDOM 2015, LIPIcsLeibniz International Proceedings in Informatics (2015), vol. 40, Schloss DagstuhlLeibnizZentrum fuer Informatik.
 [7] Bollobás, B., and Brightwell, G. The width of random graph orders. The Mathematical Scientist 20 (1995), 69–90.
 [8] Borodin, A., Karavasilis, C., and Pankratov, D. An experimental study of algorithms for online bipartite matching, 2018.
 [9] Borodin, A., Karavasilis, C., and Pankratov, D. Greedy bipartite matching in random type Poisson arrival model. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018) (Dagstuhl, Germany, 2018), E. Blais, K. Jansen, J. D. P. Rolim, and D. Steurer, Eds., vol. 116 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, pp. 5:1–5:15.
 [10] Brubach, B., Sankararaman, K. A., Srinivasan, A., and Xu, P. New algorithms, better bounds, and a novel model for online stochastic matching. In European Symposium on Algorithms (ESA) (2016).
 [11] Brubach, B., Sankararaman, K. A., Srinivasan, A., and Xu, P. New algorithms, better bounds, and a novel model for online stochastic matching. CoRR abs/1606.06395 (2016).
 [12] Brubach, B., Sankararaman, K. A., Srinivasan, A., and Xu, P. Attenuate locally, win globally: An attenuation-based framework for online stochastic matching with timeouts. In Proceedings of the 16th Conference on Autonomous Agents and Multi-Agent Systems (2017), International Foundation for Autonomous Agents and Multiagent Systems, pp. 1223–1231.
 [13] Brubach, B., Sankararaman, K. A., Srinivasan, A., and Xu, P. Attenuate locally, win globally: An attenuation-based framework for online stochastic matching with timeouts. In Proceedings of the 16th Conference on Autonomous Agents and Multi-Agent Systems (Richland, SC, 2017), AAMAS ’17, International Foundation for Autonomous Agents and Multiagent Systems, pp. 1223–1231.
 [14] Feldman, J., Mehta, A., Mirrokni, V., and Muthukrishnan, S. Online stochastic matching: Beating 1 − 1/e. In Foundations of Computer Science (FOCS) (2009), IEEE, pp. 117–126.
 [15] Goel, G., and Mehta, A. Online budgeted matching in random input models with applications to adwords. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms (Philadelphia, PA, USA, 2008), SODA ’08, Society for Industrial and Applied Mathematics, pp. 982–991.
 [16] Golrezaei, N., Nazerzadeh, H., and Rusmevichientong, P. Realtime optimization of personalized assortments. Management Science 60, 6 (2014), 1532–1551.
 [17] Goyal, V., and Udwani, R. Online Matching with Stochastic Rewards: Optimal Competitive Ratio via Path Based Formulation. CoRR (May 2019), arXiv:1905.12778.
 [18] Huang, Z., Tang, Z. G., Wu, X., and Zhang, Y. Online vertex-weighted bipartite matching: Beating 1 − 1/e with random arrivals. CoRR abs/1804.07458 (2018).
 [19] Jaillet, P., and Lu, X. Online stochastic matching: New algorithms with better bounds. Mathematics of Operations Research 39, 3 (2013), 624–646.
 [20] Kalyanasundaram, B., and Pruhs, K. R. An optimal deterministic algorithm for online b-matching. Theoretical Computer Science 233, 1 (2000), 319–325.
 [21] Karp, R. M., Vazirani, U. V., and Vazirani, V. V. An optimal algorithm for on-line bipartite matching. In Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing (1990), ACM, pp. 352–358.
 [22] Kesselheim, T., Radke, K., Tönnis, A., and Vöcking, B. An optimal online algorithm for weighted bipartite matching and extensions to combinatorial auctions. In European Symposium on Algorithms (ESA). Springer, 2013, pp. 589–600.
 [23] Mahdian, M., and Yan, Q. Online bipartite matching with random arrivals: An approach based on strongly factor-revealing LPs. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing (2011), ACM, pp. 597–606.
 [24] Manshadi, V. H., Gharan, S. O., and Saberi, A. Online stochastic matching: Online actions based on offline statistics. Mathematics of Operations Research 37, 4 (2012), 559–573.
 [25] Mastin, A., and Jaillet, P. Greedy online bipartite matching on random graphs. CoRR abs/1307.2536 (2013).
 [26] Mehta, A. Online matching and ad allocation. Foundations and Trends in Theoretical Computer Science 8, 4 (2012), 265–368.
 [27] Mehta, A., and Panigrahi, D. Online matching with stochastic rewards. In Foundations of Computer Science (FOCS) (2012), IEEE, pp. 728–737.
 [28] Mehta, A., Waggoner, B., and Zadimoghaddam, M. Online stochastic matching with unequal probabilities. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (2015), SIAM.
 [29] Meir, R., Chen, Y., and Feldman, M. Efficient parking allocation as online bipartite matching with posted prices. In Proceedings of the 2013 International Conference on Autonomous Agents and Multiagent Systems (Richland, SC, 2013), AAMAS ’13, International Foundation for Autonomous Agents and Multiagent Systems, pp. 303–310.