# Improved Approximation Algorithms for Stochastic-Matching Problems

We consider the Stochastic Matching problem, which is motivated by applications in kidney exchange and online dating. In this problem, we are given an undirected graph. Each edge is assigned a known, independent probability of existence and a positive weight (or profit). We must probe an edge to discover whether or not it exists. Each node is assigned a positive integer called a timeout (or a patience). On this random graph we are executing a process, which probes the edges one-by-one and gradually constructs a matching. The process is constrained in two ways. First, if a probed edge exists, it must be added irrevocably to the matching (the query-commit model). Second, the timeout of a node v upper-bounds the number of edges incident to v that can be probed. The goal is to maximize the expected weight of the constructed matching. For this problem, Bansal et al. (Algorithmica 2012) provided a 0.33-approximation algorithm for bipartite graphs and a 0.25-approximation for general graphs. We improve the approximation factors to 0.39 and 0.269, respectively. The main technical ingredient in our result is a novel way of probing edges according to a not-uniformly-random permutation. Patching this method with an algorithm that works best for large-probability edges (plus additional ideas) leads to our improved approximation factors.

## 1 Introduction

Maximum-weight matching is a fundamental problem in combinatorial optimization and has applications in a wide range of areas such as market design [46, 1][18, 54], computational biology [52], and machine learning [50]. The basic version of this problem is well-understood; exact polynomial-time algorithms are available for both bipartite and general graphs due to the celebrated results of [37] and [26], respectively. However, in many applications there are uncertainties associated with the input and typically, the problem of interest is more nuanced. A common approach to model this uncertainty is via randomness; we assume that we have a distribution over a collection of graphs. There are many such models ranging from stochastic edges [20, 12] to stochastic vertices [29, 14].

In this paper, we study the stochastic-matching model where edges in the graph are uncertain and the exact realization of the graph is obtained by probing the edges. This model has many applications in kidney exchange, online dating, and labor markets (see Subsection 1.3 for details). Further, we study a more general version of the problem, introduced by [19], where the algorithm is constrained by the number of probes it can make on the edges incident to any single vertex. This is used to model the notion of timeouts (also called patience) which naturally arises in many applications.

The formulation (described formally in subsection 1.1) is similar to many other well-studied stochastic-optimization problems such as stochastic knapsack [22], stochastic packing [8, 16], and stochastic shortest-path problems [44].

### 1.1 Definitions and Notation

In the stochastic-matching problem, we are given a graph $G=(V,E)$, where $V$ denotes the set of vertices and $E$ denotes the set of potential edges. Additionally, we are given the following functions.

• $p : E \to [0,1]$ associates every edge $e$ with an independent probability of existence, $p_e$. When an edge $e$ is probed, it will exist with probability $p_e$ and must be added to the matching if it exists. Thus, we can only probe edges whose endpoints are currently unmatched.

• $w : E \to \mathbb{R}_{\ge 0}$ denotes a weight function that assigns a non-negative weight (or profit) $w_e$ to each edge $e$.

• $t : V \to \mathbb{Z}^{+}$ is the timeout (or patience) function, which sets an upper bound $t_v$ on the number of times a vertex $v$ can have one of its incident edges probed.

An algorithm for this problem probes edges in a possibly adaptive order. When an edge $e$ is probed, it is present with probability $p_e$ (independent of all other edges), in which case it must be included in the matching under construction (query-commit model) and provides a weight of $w_e$. We can probe at most $t_v$ edges among the set $\delta(v)$ of edges incident to a node $v$. Furthermore, when an edge $e$ is added to the matching, no edge $f \in \delta(e)$ (i.e., incident on $e$) can be probed in subsequent steps. Each edge may only be probed once. Our goal is to maximize the expected weight of the constructed matching.
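The probing rules above can be made concrete with a small simulation. The following is a minimal sketch, not the algorithm of this paper: it assumes a fixed (non-adaptive) probing order without repeated edges, and the names `run_probing`, `edges`, `timeouts`, and `order` are hypothetical.

```python
import random


def run_probing(edges, timeouts, order, rng):
    """Probe edges in a fixed order under the query-commit and timeout rules.

    edges:    dict edge_id -> (u, v, p_e, w_e)
    timeouts: dict node -> t_v (max number of probes on edges incident to it)
    order:    sequence of edge ids without repetitions (hypothetical input)
    Returns (total weight of the matching, list of matched edge ids).
    """
    remaining = dict(timeouts)   # probes still allowed at each node
    matched = set()              # nodes already covered by the matching
    total, chosen = 0.0, []
    for e in order:
        u, v, p, w = edges[e]
        # An edge may be probed only if both endpoints are unmatched
        # and both endpoints still have patience left.
        if u in matched or v in matched:
            continue
        if remaining[u] == 0 or remaining[v] == 0:
            continue
        remaining[u] -= 1
        remaining[v] -= 1
        if rng.random() < p:     # the edge exists, so we must commit to it
            matched.update((u, v))
            total += w
            chosen.append(e)
    return total, chosen


if __name__ == "__main__":
    demo_edges = {"ab": ("a", "b", 0.5, 2.0), "bc": ("b", "c", 0.9, 5.0)}
    demo_timeouts = {"a": 1, "b": 2, "c": 1}
    print(run_probing(demo_edges, demo_timeouts, ["bc", "ab"], random.Random(0)))
```

Averaging the returned weight over many runs gives an empirical estimate of the expected weight of one particular probing order, which is the quantity the problem asks to maximize.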

A naive approach to solve this problem is to construct an exponential-sized Markov Decision Process (MDP) and solve it optimally using dynamic programming. However, there are no known algorithms to solve this exactly in polynomial time. In fact, the exact complexity of computing the optimal solution is unknown. The naive solution above is in ; it is unknown if the problem is in either or in . Thus, following prior works [19, 8, 3], we aim at finding an approximation to the optimal solution in polynomial time. To measure the performance of any algorithm, we use the standard notion of approximation ratio which is defined as follows.

###### Definition 1 (Approximation ratio)

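For any instance $I$, let $\mathbb{E}[\mathcal{A}(I)]$ denote the expected weight of the matching obtained by the algorithm $\mathcal{A}$ on $I$. Let $\mathbb{E}[OPT(I)]$ denote the expected weight of the matching obtained by the optimal probing strategy. Then the approximation ratio of $\mathcal{A}$ is defined as $\min_{I} \mathbb{E}[\mathcal{A}(I)] / \mathbb{E}[OPT(I)]$.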

We remark that the measure used in [3] is the reciprocal of the ratio defined in Definition 1. Thus, the ratios in [3] are always greater than $1$ while the ratios in this paper are at most $1$. Bansal et al. [8] provide an LP-based $0.33$-approximation when $G$ is bipartite, and via a reduction to the bipartite case, a $0.25$-approximation for general graphs (see also [4]).

For the related stochastic-matching problem without patience constraints (or equivalently, all equal infinity), the best-known algorithm achieves an approximation ratio of [30] and no algorithm can perform better than [20]. Since the problem without patience constraints is a special case of the problem studied here, the latter hardness result applies to our setting as well.

Chen et al. [19] formulated and initiated the study of the problem with patience constraints and gave a probing scheme that achieves an approximation ratio of in unweighted bipartite graphs. Later, Adamczyk [2] showed that the simple greedy algorithm (probe edges in non-increasing order of probability) achieves a 0.5-approximation for the unweighted case. This model was extended to weighted bipartite graphs by [8] who improved the ratio to . [3] provided a new algorithm that further improved this ratio to which is the current state-of-the-art. For general graphs, the current best approximation is  [10].

We note that our work and the most-recent prior work [8, 3] use a natural linear program (LP), namely (LP-BIP) in Section 2, to upper bound the optimal solution. However, it was shown in [13] that no algorithm can achieve an approximation better than using this LP even for the unweighted problem.

Online Stochastic Matching with Timeouts. We also consider the Online Stochastic Matching with Timeouts problem introduced in [8]. Here we are given as input a complete bipartite graph , where nodes in are buyer types and nodes in are items that we wish to sell. Like in the offline case, edges are labeled with probabilities and profits, and nodes are assigned timeouts. However, in this case timeouts on the item side are assumed to be unbounded. Then a second bipartite graph is constructed in an online fashion. Initially this graph consists of only. At each time step one random buyer of some type is sampled (possibly with repetition) from a given probability distribution. The edges between and are copies of the corresponding edges in . The online algorithm has to choose at most unmatched neighbors of , and probe those edges in some order until some edge turns out to be present (in which case is added to the matching and we gain the corresponding profit), or when all the mentioned edges have been probed. This process is repeated times (with a new buyer being sampled at each iteration), and our goal is to maximize the final total expected profit. (Footnote 1: As in [8], we assume that the probability of each buyer-type is an integer multiple of .)

For this problem, Bansal et al. [8] present a -approximation algorithm. In his Ph.D. thesis, Li [38] claims an improved -approximation. However, his analysis contains a mistake [39]. After fixing it, he still achieves a -approximation ratio, improving over [8]. Although a corrected version of this result was never published, in essence it would have followed the same lines as the result of Mukherjee [43], who independently obtained exactly the same approximation ratio.

### 1.2 Our Results

Our main result is an approximation algorithm for bipartite Stochastic Matching which improves the $0.33$-approximation of Bansal et al. [8] (see Section 2).

###### Theorem 1.1

There is an expected $0.39$-approximation algorithm for Stochastic Matching in bipartite graphs.

A -approximation, originally presented in [3], can be obtained as follows. We build upon the algorithm in [8], which works as follows. After solving a proper LP and rounding the solution via a rounding technique from [31], Bansal et al. probe edges in uniform random order. Then they show that every edge $e$ is probed with probability at least $x_e \cdot g(p_{\max})$, where $x_e$ is the fractional value of $e$ assigned by the LP, $p_{\max}$ is the largest probability of any edge incident to $e$ ($e$ excluded), and $g(\cdot)$ is a decreasing function with $g(1) = 1/3$.

Our idea is to instead consider edges in a carefully chosen non-uniform random order. This way, we are able to show (with a slightly simpler analysis) that each edge $e$ is probed with probability at least $x_e \cdot g(p_e)$. Observe that we have the same function $g(\cdot)$ as in [8], but depending on $p_e$ rather than $p_{\max}$. In particular, according to our analysis, small-probability edges are more likely to be probed than large-probability ones (for a given value of $x_e$), regardless of the probabilities of edges incident to $e$. Though this approach alone does not directly imply an improved approximation factor, we further patch it with a greedy algorithm that behaves best for large-probability edges, and this yields an improved approximation ratio altogether. The greedy algorithm prioritizes large-probability edges for the same value of $x_e$.

We further improve the approximation factor for the bipartite case with the above-mentioned updated patching algorithm, hence improving the ratio of from [3] up to the current-best-known ratio of $0.39$ stated in Theorem 1.1.

We also improve on the $0.25$-approximation for general graphs in [8] (see Section 3).

###### Theorem 1.2

There is an expected $0.269$-approximation algorithm for Stochastic Matching in general graphs.

This is achieved by reducing the general case to the bipartite one as in prior work, but we also use a refined LP with blossom inequalities in order to fully exploit our large/small probability patching technique.

Similar arguments can also be successfully applied to the online case.

###### Theorem 1.3

There is an expected -approximation algorithm for Online Stochastic Matching with Timeouts.

By applying our idea of non-uniform permutation of edges we would get a -approximation (the same as in [38], after correcting the mentioned mistake). However, due to the way edges have to be probed in the online case, we are able to finely control the probability that an edge is probed via dumping factors. This allows us to improve the approximation from to . Our idea is similar in spirit to the one used by Ma [40] in his elegant -approximation algorithm for correlated non-preemptive stochastic knapsack. Further application of the large/small probability trick gives an extra improvement up to (see Section 4). We remark that since the publication of the conference version of this work, this has been improved to ([15]) and more recently to ([27]).

### 1.3 Applications

As previously mentioned, the stochastic-matching problem is motivated by various applications such as kidney exchange and online dating. In this subsection, we briefly consider these applications and show how stochastic matching can be used as a tool to solve these problems.

Kidney exchange in the United States. Kidney transplantation usually occurs from deceased donors; however, unlike other organs, another possibility is to obtain a kidney from a compatible living donor since people need only one kidney to survive. There is a large waiting list of patients within the US who need a kidney for their survival. As of July 2019, the United Network for Organ Sharing (UNOS) estimates that the current number of patients who need a transplant is 113,265 with only 7,743 donors in the pool (data obtained from https://unos.org/data/transplant-trends/). One possibility is to enter the waitlist as a pair, where one person needs a kidney while the other person is willing to donate a kidney. Viewed as a stochastic-matching problem, the vertices are donor-patient pairs. An edge between two vertices $u$ and $v$ exists if the donor in $u$ is compatible with the patient in $v$ and the donor in $v$ is compatible with the patient in $u$. The probability on the edge is the probability that the exchange will take place. Before every transplant takes place, elaborate medical tests are usually performed, which are very expensive. More specifically, as described in [19], a test called crossmatching is performed, which combines the recipient's blood serum with some of the donor's red blood cells and checks if the antibodies in the serum kill the cells. This test is both expensive and time-consuming. Moreover, each exchange requires the transplants to happen simultaneously since organ donation within the United States is at will; donors are legally allowed to withdraw at any time, including after agreeing to a donation. These constraints impose that for each donor-patient pair, the number of exchange initiations that can happen has to be small, which is modeled by the patience values at each vertex. Given the long wait-lists and the number of lives that depend on these exchanges, even small improvements to the accuracy of the algorithm have drastic effects on the well-being of the population. Prior works (e.g., [12] and references therein) have empirically applied the variants of the stochastic-matching problem to real-world datasets; in fact, the current model that runs the US-wide kidney exchange is based on a stochastic-matching algorithm (e.g., [24] and references therein).

Online dating. Online dating is quickly becoming the most popular form for couples to meet each other [45]. Platforms such as Tinder, eHarmony and Coffee meets Bagel generated about 1.7 billion USD in revenue in the year 2019. Suppose we have the case of a pool of heterosexual people, represented as the two vertex sets of the bipartite graph. For every pair of individuals, one from each side, the system learns their compatibility based on the questions they answer. The goal of the online-dating platform is to suggest couples that maximize the social welfare (i.e., the total number of matched couples). Each individual in a platform has a limited patience and thus, the system wants to ensure that the number of suggestions provided is small and limited. The stochastic-matching problem models this application where the probabilities on the edges represent the compatibility and the time-out function represents the individual patience.

Online labor markets. In online labor markets such as Mechanical Turk, the goal is to match workers to tasks [48]. Each worker-task pair has a probability of completing the task based on the worker’s skills and the complexity of the task. The goal of the platform is to match the pool of tasks to workers such that the (weighted) number of completed tasks is maximized. To keep workers in continued participation, the system needs to ensure that the worker is not matched with many tasks that they are incapable of handling. This once again fits in the model of the stochastic-matching problem where the workers and tasks represent the two sides of the bipartite graph, the edge-probability represents the probability that the task will be completed, and the time-out function represents the patience level of each worker.

### 1.4 Other Related Work

Stochastic-matching problems come in many flavors and there is a long line of research for each of these models. The literature on the broader stochastic combinatorial optimization is (even more) vast (see [51] for a survey) and here we only mention some representative works.

When the graph is unweighted and the time-out at every vertex is infinite, the classic algorithm of [35] gives an approximation ratio of for bipartite graphs; this in fact works even if the graph is unknown a priori. For general graphs, the work of [20] gives an algorithm that achieves a ratio of ; moreover, it shows that no algorithm can get a ratio better than for general graphs. The work of [42] gives an optimal algorithm in the special case of sparse graphs in this model. The paper [30] considers the weighted version of this problem in bipartite graphs and designs algorithms that achieves an approximation ratio of . (Recall that the timeouts are infinite in all of these works.)

The other line of research deals with the stochastic-matching problem where instead of a time-out constraint, the algorithm has to minimize the total number of queried edges. The work of [12] first proposed this model which was later considered and improved (by reducing the number of required queries) in many subsequent follow-up works including [5, 6, 11, 55].

Online variants of the stochastic matching problems have been extensively studied due to their applications in Internet advertising and other Internet based markets. The paper [29] introduced the problem of Online Matching with Stochastic Inputs. In this model, the vertices are drawn repeatedly and i.i.d. from a known distribution. The algorithm needs to find a match to a vertex each time one is presented, immediately and irrevocably. The goal is to maximize the expected weight of the matching compared to an algorithm that knows the sequence of realizations a priori. The work of [29] gave an algorithm that achieves a ratio of , which was subsequently improved by [41, 34, 14]. Later work extended further to fully-online models of matching where both partitions of the vertex set in the bipartite graph are sampled i.i.d. from a known distribution [25, 53]. The matching problem has also been studied in the two-stage stochastic-optimization model [36].

The stochastic-matching problem is also related to the broader stochastic-packing literature, where the algorithm only knows a probability distribution over the item costs, and once it commits to include the item sees a realization of the actual costs [21, 22, 8, 9, 10, 16]. Stochastic packing has also been studied in the online [28, 23, 17] and bandit [32, 7, 33] settings.

## 2 Stochastic Matching in Bipartite Graphs

In this section we present our improved approximation algorithm for Stochastic Matching in bipartite graphs. We start by presenting a simpler -approximation in Section 2.1, and then refine it in Section 2.2.

### 2.1 An Improved Approximation

In this section we prove the following result.

###### Theorem 2.1

There is an expected -approximation algorithm for Stochastic Matching in bipartite graphs.

Let $OPT$ denote an optimal probing strategy and let $\mathbb{E}[OPT]$ denote its expected value. Consider the following LP:

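$$
\begin{aligned}
\max\ & \sum_{e \in E} w_e p_e x_e && \text{(LP-BIP)} && (1)\\
\text{s.t.}\ & \sum_{e \in \delta(v)} p_e x_e \le 1, && \forall v \in V; && (2)\\
& \sum_{e \in \delta(v)} x_e \le t_v, && \forall v \in V; && (3)\\
& 0 \le x_e \le 1, && \forall e \in E. && (4)
\end{aligned}
$$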
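To make the three constraint families of (LP-BIP) tangible, here is a minimal sketch that checks whether a given fractional vector satisfies (2), (3), and (4) and evaluates objective (1). It does not solve the LP; the function names and the dictionary-based input format are assumptions made for illustration.

```python
def lp_bip_value(edges, x):
    """Objective (1): sum over edges of w_e * p_e * x_e.

    edges: dict edge_id -> (u, v, p_e, w_e); x: dict edge_id -> x_e.
    """
    return sum(w * p * x[e] for e, (_, _, p, w) in edges.items())


def lp_bip_feasible(edges, timeouts, x, tol=1e-9):
    """Check constraints (2), (3) and (4) of (LP-BIP) for the vector x."""
    load_p = {v: 0.0 for v in timeouts}   # left-hand side of (2) per node
    load_x = {v: 0.0 for v in timeouts}   # left-hand side of (3) per node
    for e, (u, v, p, _) in edges.items():
        if not (-tol <= x[e] <= 1.0 + tol):          # constraint (4)
            return False
        for node in (u, v):
            load_p[node] += p * x[e]
            load_x[node] += x[e]
    return all(
        load_p[v] <= 1.0 + tol and load_x[v] <= timeouts[v] + tol
        for v in timeouts
    )


if __name__ == "__main__":
    star = {"ca": ("c", "a", 0.5, 1.0), "cb": ("c", "b", 0.5, 1.0)}
    patience = {"a": 1, "b": 1, "c": 1}
    half = {"ca": 0.5, "cb": 0.5}
    print(lp_bip_feasible(star, patience, half), lp_bip_value(star, half))
```

In the small star example, setting both variables to 1 would violate the timeout constraint (3) at the center, while the half-half vector is feasible.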

The proof of the following lemma is already quite standard [4, 8, 22]: just note that $x_e = \Pr[OPT \text{ probes } e]$ is a feasible solution of (LP-BIP).

###### Lemma 1

[8] Let $LP_{bip}$ be the optimal value of (LP-BIP). It holds that $LP_{bip} \ge \mathbb{E}[OPT]$.

Our approach is similar to the one of Bansal et al. [8] (see also Algorithm 1 in the figure). We solve (LP-BIP): let $x = (x_e)_{e \in E}$ be the optimal fractional solution. Then we apply to $x$ the rounding procedure by Gandhi et al. [31], which we shall call just GKPS. Let $\hat{E}$ be the set of rounded edges, and let $\hat{x}_e = 1$ if $e \in \hat{E}$ and $\hat{x}_e = 0$ otherwise. GKPS guarantees the following properties of the rounded solution:

1. (Marginal distribution) For any $e \in E$, $\Pr[\hat{x}_e = 1] = x_e$.

2. (Degree preservation) For any $v \in V$, $\sum_{e \in \delta(v)} \hat{x}_e \le \lceil \sum_{e \in \delta(v)} x_e \rceil \le t_v$.

3. (Negative correlation) For any $v \in V$, any subset $S \subseteq \delta(v)$ of edges incident to $v$, and any $b \in \{0,1\}$, it holds that $\Pr[\bigwedge_{e \in S} (\hat{x}_e = b)] \le \prod_{e \in S} \Pr[\hat{x}_e = b]$.

Our algorithm sorts the edges in $\hat{E}$ according to a certain random permutation and probes each edge $e \in \hat{E}$ according to this order, provided that the endpoints of $e$ are not matched already. It is important to notice that, by the degree-preservation property, $\hat{E}$ has at most $t_v$ edges incident to each node $v$. Hence, the timeout constraint of $v$ is respected even if the algorithm probes all the edges in $\hat{E}$.

Our algorithm differs from [8] and subsequent work in the way edges are randomly ordered. Prior work exploits a uniformly-random order on $\hat{E}$. We rather use the following, more complex strategy. For each $e \in \hat{E}$ we draw a random variable $Y_e$ distributed on the interval $\left[0, \frac{1}{p_e}\ln\frac{1}{1-p_e}\right]$ according to the following cumulative distribution:

$$\Pr[Y_e \le y] = \frac{1}{p_e}\left(1 - e^{-p_e y}\right).$$

Observe that the density function of $Y_e$ in this interval is $e^{-p_e y}$ (and zero otherwise). Edges of $\hat{E}$ are sorted in increasing order of the $Y_e$'s, and they are probed according to that order. We let $Y$ denote the vector $(Y_e)_{e \in \hat{E}}$, wherein the elements of $\hat{E}$ are ordered in some fixed manner.
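The non-uniform order can be sampled by inverse-transform sampling. The sketch below assumes the density $e^{-p_e y}$ on $[0, \frac{1}{p_e}\ln\frac{1}{1-p_e}]$, which is what the integral (5) in the proof of Lemma 2 integrates against; the function names are hypothetical, and edge probabilities are assumed to lie in $(0, 1]$.

```python
import math
import random


def horizon(p):
    """Right end of the support of Y_e: (1/p) * ln(1/(1-p)); infinite if p == 1."""
    return math.inf if p == 1.0 else -math.log1p(-p) / p


def y_from_uniform(p, u):
    """Invert the CDF F(y) = (1 - exp(-p*y)) / p for 0 < p <= 1 and 0 <= u < 1."""
    return -math.log1p(-p * u) / p


def probing_order(probs, rng):
    """Draw an independent Y_e per edge and sort the edge ids by increasing Y_e.

    probs: dict edge_id -> p_e (only the rounded edges should be passed in).
    """
    ys = {e: y_from_uniform(p, rng.random()) for e, p in probs.items()}
    return sorted(ys, key=ys.get)


if __name__ == "__main__":
    print(probing_order({"e1": 0.2, "e2": 0.9, "e3": 0.5}, random.Random(1)))
```

Since `random.random()` returns values in $[0, 1)$, the argument of the logarithm stays positive even when $p_e = 1$.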

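Define $\hat{\delta}(e)$ as the set of edges of $\hat{E}$ that share an endpoint with $e$ ($e$ itself excluded). We say that an edge $e \in \hat{E}$ is safe if, at the time we consider $e$ for probing, no other edge $f \in \hat{\delta}(e)$ has already been taken into the matching. Note that the algorithm can probe $e$ only in this case, and that if we do probe $e$, it gets added to the matching with probability $p_e$ independent of all other events.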

The main ingredient of our analysis is the following lower-bound on the probability that an arbitrary edge is safe.

###### Lemma 2

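For every edge $e$ it holds that $\Pr[e \text{ is safe} \mid e \in \hat{E}] \ge g(p_e)$, where

$$g(p) := \frac{1}{2+p}\left(1 - \exp\left(-(2+p)\,\frac{1}{p}\ln\frac{1}{1-p}\right)\right).$$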
###### Proof

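In the worst case every edge $f \in \hat{\delta}(e)$ that is before $e$ in the ordering can be probed, and each of these probes has to fail for $e$ to be safe. Thus

$$\Pr[e \text{ is safe} \mid e \in \hat{E}] \ge \mathbb{E}_{\hat{E} \setminus e,\, Y}\left[\prod_{f \in \hat{\delta}(e) :\, Y_f < Y_e} (1 - p_f) \;\middle|\; e \in \hat{E}\right].$$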

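Now we take expectation on $Y$ only, and using the fact that the variables $Y_f$ are independent, we can write the latter expectation as

$$\mathbb{E}_{\hat{E} \setminus e}\left[\int_0^{\frac{1}{p_e}\ln\frac{1}{1-p_e}} \left(\prod_{f \in \hat{\delta}(e)} \Big(\Pr[Y_f \le y](1 - p_f) + \Pr[Y_f > y]\Big)\right) e^{-p_e \cdot y}\, dy \;\middle|\; e \in \hat{E}\right]. \qquad (5)$$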

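Observe that $\Pr[Y_f \le y](1 - p_f) + \Pr[Y_f > y] = 1 - p_f \Pr[Y_f \le y]$. When $y > \frac{1}{p_f}\ln\frac{1}{1-p_f}$, then $\Pr[Y_f \le y] = 1$, and moreover, $\frac{1}{p_f}\left(1 - e^{-p_f y}\right)$ is an increasing function of $y$. Thus we can upper-bound $\Pr[Y_f \le y]$ by $\frac{1}{p_f}\left(1 - e^{-p_f y}\right)$ for any $y \ge 0$, and obtain that $1 - p_f \Pr[Y_f \le y] \ge e^{-p_f y}$. Thus (5) can be lower-bounded by

$$\mathbb{E}_{\hat{E} \setminus e}\left[\int_0^{\frac{1}{p_e}\ln\frac{1}{1-p_e}} e^{-\sum_{f \in \hat{\delta}(e)} p_f \cdot y - p_e \cdot y}\, dy \;\middle|\; e \in \hat{E}\right] = \mathbb{E}_{\hat{E} \setminus e}\left[\frac{1}{\sum_{f \in \hat{\delta}(e)} p_f + p_e}\left(1 - e^{-\left(\sum_{f \in \hat{\delta}(e)} p_f + p_e\right)\frac{1}{p_e}\ln\frac{1}{1-p_e}}\right) \;\middle|\; e \in \hat{E}\right].$$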

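We know from the negative-correlation and marginal-distribution properties that $\mathbb{E}[\hat{x}_f \mid e \in \hat{E}] \le \Pr[\hat{x}_f = 1] = x_f$ for every edge $f$ sharing an endpoint with $e$, and therefore $\mathbb{E}\left[\sum_{f \in \hat{\delta}(e)} p_f \mid e \in \hat{E}\right] \le \sum_{f \in \delta(e)} p_f x_f \le 2$, where the last inequality follows from the LP constraints. Consider the function $\phi(s) := \frac{1}{s + p_e}\left(1 - e^{-(s + p_e)\frac{1}{p_e}\ln\frac{1}{1-p_e}}\right)$. This function is decreasing and convex. From Jensen's inequality we know that $\mathbb{E}[\phi(S)] \ge \phi(\mathbb{E}[S])$ for any random variable $S$. Thus

$$\mathbb{E}\left[\phi\Big(\sum_{f \in \hat{\delta}(e)} p_f\Big) \;\middle|\; e \in \hat{E}\right] \ge \phi\left(\mathbb{E}\Big[\sum_{f \in \hat{\delta}(e)} p_f \;\Big|\; e \in \hat{E}\Big]\right) \ge \phi(2) = g(p_e).$$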

From Lemma 2 and the marginal distribution property, the expected contribution of edge $e$ to the profit of the solution is

$$w_e p_e \cdot \Pr[e \in \hat{E}] \cdot \Pr[e \text{ is safe} \mid e \in \hat{E}] \ge w_e p_e x_e \cdot g(p_e) \ge w_e p_e x_e \cdot g(1) = \frac{1}{3} w_e p_e x_e.$$

Therefore, our analysis implies a $1/3$-approximation, matching the result in [8]. However, by working with the probabilities appropriately, we can do better as described next.
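The behaviour of the bound $g$ from Lemma 2 can be checked numerically. The following minimal sketch evaluates $g$ on $(0, 1]$; treating $p = 1$ as the limit value $1/3$ (where the exponential term vanishes) is a choice made here for illustration, and the function name is hypothetical.

```python
import math


def g(p):
    """Evaluate g(p) = (1 - exp(-(2+p) * (1/p) * ln(1/(1-p)))) / (2+p) on (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return 1.0 / 3.0          # ln(1/(1-p)) diverges, so the exponential is 0
    horizon = -math.log1p(-p) / p  # (1/p) * ln(1/(1-p)), computed stably
    return (1.0 - math.exp(-(2.0 + p) * horizon)) / (2.0 + p)


if __name__ == "__main__":
    for p in (1e-6, 0.2, 0.5, 0.9, 1.0):
        print(p, g(p))
```

On these sample points the values decrease from roughly $\frac{1}{2}(1 - e^{-2})$ near $p = 0$ down to $1/3$ at $p = 1$, which is why small-probability edges enjoy a better guarantee than the uniform $1/3$.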

#### Patching with Greedy.

We next describe an improved approximation algorithm, based on the patching of the above algorithm with a simple greedy one. Let be a parameter to be fixed later. We define as the set of (large) edges with , and let be the remaining (small) edges. Recall that denotes the optimal value of (LP-BIP). Let also and be the fraction of due to large and small edges, respectively, i.e., and . Define such that . By refining the above analysis, we obtain the following result.

###### Lemma 3

Algorithm 1 has an expected approximation ratio .

###### Proof

The expected profit of the algorithm is at least:

A greedy algorithm (). Consider the following greedy algorithm . Compute a maximum weight matching in with respect to edge weights , and probe the edges of in any order. Note that the timeout constraints are satisfied since we probe at most one edge incident to each node (and timeouts are strictly positive by definition and w.l.o.g.).

###### Lemma 4

has an expected approximation ratio of at least .

###### Proof

It is sufficient to show that the expected profit of the obtained solution is at least . Let be the optimal solution to (LP-BIP). Consider the solution that is obtained from by setting to zero all the variables corresponding to edges in , and by multiplying all the remaining variables by . Since for all , is a feasible fractional solution to the following matching LP:

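$$
\begin{aligned}
\max\ & \sum_{e \in E} w_e p_e z_e && \text{(LP-MATCH)} && (6)\\
\text{s.t.}\ & \sum_{e \in \delta(u)} z_e \le 1, && \forall u \in V; \\
& 0 \le z_e \le 1, && \forall e \in E. && (7)
\end{aligned}
$$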

The value of in the above LP is by construction. Let be the optimal profit of (LP-MATCH). Then . Given that the graph is bipartite, (LP-MATCH) defines the matching polyhedron, and we can find an integral optimal solution to it. But such a solution is exactly a maximum weight matching according to weights , i.e. . The claim follows since the expected profit of the greedy algorithm is precisely the weight of .

A hybrid algorithm of and . The overall algorithm, denoted by is stated as follows. For a given , we simply compute the value of , and run if , and otherwise. (Footnote 4: Note that we cannot run both algorithms and take the better solution, due to the probe-commit constraint.)

The approximation factor of is given by , and the worst case is achieved when the two quantities are equal, i.e., for , yielding an approximation ratio of . Now we just need to maximize this ratio. Since this function is quite complicated, finding its maximum algebraically seems out of reach, so we compute it numerically. To avoid issues of numerical error, let us just notice that for , the ratio is approximately , which allows us to claim the ratio of from Theorem 2.1.

### 2.2 A Refined Approximation

We now describe the approach to achieve the $0.39$-approximation ratio stated in Theorem 1.1. The main algorithm consists of two sub-routines. One is $\mathrm{ALG}_1$, as described in Algorithm 1 in Section 2.1. The other is a new patching algorithm, denoted by $\mathrm{ALG}_2$, which is described in Algorithm 2.

Let $x$ be the optimal solution to (LP-BIP). Recall the definition of function $g$ from Lemma 2:

$$g(p) := \frac{1}{2+p}\left(1 - \exp\left(-\frac{2+p}{p}\ln\frac{1}{1-p}\right)\right).$$

From the same Lemma 2 we know that the expected total weight achieved by $\mathrm{ALG}_1$ is

$$\mathbb{E}[\mathrm{ALG}_1] \ge \sum_{e \in E} w_e p_e x_e \cdot g(p_e).$$

Consider now vector $z$ defined as $z_e := p_e x_e$. Notice that vector $z$ is a feasible solution to (LP-MATCH). And in Algorithm 2 we can see that it is $z$ that guides $\mathrm{ALG}_2$.

We shall prove in a moment that the expected outcome of $\mathrm{ALG}_2$ is

$$\mathbb{E}[\mathrm{ALG}_2] = \sum_{e \in E} w_e p_e^2 x_e.$$

Our main algorithm is formally stated in Algorithm 3.

To prove Theorem 1.1, it remains to bound the expected outcome of $\mathrm{ALG}_2$. We then use lower bounds on $\mathbb{E}[\mathrm{ALG}_1]$ and $\mathbb{E}[\mathrm{ALG}_2]$ to bound the approximation achieved by the main algorithm. We now lower bound the profit of $\mathrm{ALG}_2$ in the following lemma.

###### Lemma 5

The total expected weight of the matching obtained by $\mathrm{ALG}_2$ is $\sum_{e \in E} w_e p_e^2 x_e$, where $x$ denotes the optimal solution to (LP-BIP).

###### Proof

Recall that $z$, with $z_e = p_e x_e$, is a feasible solution to (LP-MATCH): so, the use of the GKPS dependent-rounding procedure on the polytope from (LP-MATCH) is allowed. Let $\hat{z}$ denote the rounded vector. First, from property (P1) of GKPS (marginal distribution), the probability that an edge $e$ has $\hat{z}_e = 1$ is $z_e = p_e x_e$. Second, from property (P2) (degree preservation) and the fact that $\sum_{e \in \delta(v)} z_e \le 1$, we have that the subgraph induced by the edges with $\hat{z}_e = 1$ has at most one edge incident to any vertex $v$. Thus, given that an edge $e$ has $\hat{z}_e = 1$, it is guaranteed to be probed and the probability that it is eventually chosen into the matching is its probability of existing, $p_e$. Putting these two facts together, the probability that any edge $e$ is included in the final matching is $p_e \cdot p_e x_e = p_e^2 x_e$. Using the linearity of expectation, we obtain that the total expected weight of the matching is $\sum_{e \in E} w_e p_e^2 x_e$.

We now have all the ingredients to prove Theorem 1.1.

###### Proof (of Theorem 1.1)

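The main algorithm chooses either $\mathrm{ALG}_1$ or $\mathrm{ALG}_2$ depending on the maximum of the lower bounds on $\mathbb{E}[\mathrm{ALG}_1]$ and $\mathbb{E}[\mathrm{ALG}_2]$. Hence to find its worst-case behaviour we have to characterize an instance that minimizes

$$\max\left\{\sum_{e \in E} w_e p_e x_e \cdot g(p_e),\ \sum_{e \in E} w_e p_e^2 x_e\right\}.$$

Let $E_q$ denote the set of edges $e$ such that $p_e = q$. Let $h_q = \sum_{e \in E_q} w_e x_e$. Notice that the quantity $\int_0^1 q\, h_q\, dq$ is just the value of (LP-BIP), which is our upper bound on $\mathbb{E}[OPT]$. The outcome of $\mathrm{ALG}_1$ is thus at least $\int_0^1 g(q) \cdot q\, h_q\, dq$. And the value of (LP-MATCH) induced by $z$ is $\int_0^1 q^2 h_q\, dq$, which is at the same time the outcome of $\mathrm{ALG}_2$. Thus, the adversary wants to minimize the following mathematical program.

$$\text{minimize}\ \max\left\{\int_0^1 g(q) \cdot q\, h_q\, dq,\ \int_0^1 q^2 h_q\, dq\right\} \quad \text{such that} \quad \int_0^1 q\, h_q\, dq = 1. \qquad (8)$$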

The normalization constraint ensures that the optimal value to the adversarial program (8) is the approximation ratio for the main algorithm.

Such a mathematical program may be hard to solve in full generality, due to the fact that the variable over which we optimize is in fact a (not necessarily continuous) probability distribution. However, in our case we can characterize the optimal solution algebraically. Let us next formulate our problem in a more compact way, where we define $f(q) := q \cdot h_q$:

$$\min_f \left\{\max\left(\int_0^1 g(q) \cdot f(q)\, dq,\ \int_0^1 q \cdot f(q)\, dq\right)\ \text{s.t.}\ \int_0^1 f(q)\, dq = 1\right\}. \qquad (9)$$

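Since $g$ is a concave function, it is point-wise at least as large as the linear function $(1-q)\,g(0) + q \cdot g(1)$ for all $q \in [0,1]$. Hence, the minimum of (9) is at least as large as the minimum of the following program:

$$
\begin{aligned}
\min_f\ & \max\left(\int_0^1 \big((1-q)\,g(0) + q \cdot g(1)\big) \cdot f(q)\, dq,\ \int_0^1 q \cdot f(q)\, dq\right) && (10)\\
\text{s.t.}\ & \int_0^1 f(q)\, dq = 1.
\end{aligned}
$$

Here we have two linear functions, i.e., $(1-q)\,g(0) + q \cdot g(1)$ and $q$. This allows us to simplify it further:

$$\int_0^1 \big((1-q)\,g(0) + q \cdot g(1)\big) \cdot f(q)\, dq = g(0) + \big(g(1) - g(0)\big) \int_0^1 q \cdot f(q)\, dq.$$

Even though the variable in program (10) is a density function, we can consider the whole integral $\int_0^1 q \cdot f(q)\, dq$ as a single real variable $\alpha$ from $[0,1]$, and the program simplifies to

$$\min_{\alpha}\ \max\big(g(0) + (g(1) - g(0)) \cdot \alpha,\ \alpha\big) \quad \text{s.t.}\ \alpha \in [0,1].$$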

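Since the function $\alpha$ is increasing and the function $g(0) + (g(1) - g(0)) \cdot \alpha$ is decreasing, the minimum is obtained for the $\alpha$ for which $g(0) + (g(1) - g(0)) \cdot \alpha = \alpha$. This yields a value of $\alpha$ such that

$$\alpha = \frac{g(0)}{1 + g(0) - g(1)} = \frac{\frac{1}{2}\left(1 - e^{-2}\right)}{1 + \frac{1}{2}\left(1 - e^{-2}\right) - \frac{1}{3}} = \frac{3e^2 - 3}{7e^2 - 3} \approx 0.39338739.$$

The solution $\alpha$, which is a number, has to be translated into a solution $f$ of program (10), which is a density function. This is however straightforward: $f$ is a density function which places mass $\alpha$ on point 1, and mass $1 - \alpha$ on point 0. At the same time $f$ is a solution to the initial program (9). And since the value of program (9) for such an $f$ is also equal to $\alpha$, we conclude that it is the actual minimal value of it.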
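The closed form for $\alpha$ can be double-checked numerically. The following sketch evaluates the one-variable program, using $g(0) = \frac{1}{2}(1 - e^{-2})$ (the limit of $g$ at $0$) and $g(1) = \frac{1}{3}$ as they appear in the displayed formula; the grid search is only an independent sanity check, and all names are hypothetical.

```python
import math

G0 = (1.0 - math.exp(-2.0)) / 2.0   # limit of g(p) as p -> 0
G1 = 1.0 / 3.0                      # g(1)


def adversary_value(alpha):
    """Objective of the one-variable program: the larger of the two linear pieces."""
    return max(G0 + (G1 - G0) * alpha, alpha)


# The decreasing piece and the increasing piece cross at this point.
alpha_star = G0 / (1.0 + G0 - G1)
closed_form = (3.0 * math.e ** 2 - 3.0) / (7.0 * math.e ** 2 - 3.0)

# Coarse grid search over [0, 1]; it can only overshoot the true minimum slightly.
grid_min = min(adversary_value(i / 10000) for i in range(10001))

if __name__ == "__main__":
    print(alpha_star, closed_form, grid_min)
```

Both expressions agree with the value of roughly $0.3934$ stated above, and the grid minimum approaches it from above.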

## 3 Stochastic Matching in General Graphs

In this section, we present our improved approximation algorithm for Stochastic Matching in general graphs as stated in Theorem 1.2.

We consider the linear program LP-GEN which is obtained from LP-BIP by adding the following blossom inequalities:

$$\sum_{e \in E(W)} p_e x_e \le \frac{|W| - 1}{2} \qquad \forall W \subseteq V,\ |W| \text{ odd}. \qquad (11)$$

Here $E(W)$ is the subset of edges with both endpoints in $W$. We remark that, using standard tools from matching theory, we can solve LP-GEN in polynomial time despite its exponential number of constraints [47]. Also, in this case, $x_e = \Pr[OPT \text{ probes } e]$ is a feasible solution of LP-GEN, hence the analogue of Lemma 1 still holds.

Our stochastic-matching algorithm for the case of a general graph works via a reduction to the bipartite case. First we solve LP-GEN; let be the optimal fractional solution. Second we randomly split the nodes into two sets and , with being the set of edges between them. On the bipartite graph we apply the algorithm for the bipartite case, but using the fractional solution induced by LP-GEN rather than solving LP-BIP. Note that is a feasible solution to LP-BIP for the bipartite graph .

The analysis differs only in two points w.r.t. the one for the bipartite case. First, with being the subset of edges of that were rounded to 1, we have now that . Second, but for the same reason, using again the negative correlation and marginal distribution properties, we have

Repeating the steps of the proof of Lemma 2 and including the above inequality we get the following.

###### Lemma 6

For every edge it holds that , where

$$h(p) := \frac{1}{1+p}\left(1-\exp\left(-(1+p)\,\frac{1}{p}\ln\frac{1}{1-p}\right)\right).$$

Since , we directly obtain a -approximation which matches the result in [8]. Similarly to the bipartite case, we can patch this result with the simple greedy algorithm (which is exactly the same in the general graph case). For a given parameter , let us define analogously to the bipartite case. Similarly to the proof of Lemma 3, one obtains that the above algorithm has approximation factor . Similarly to the proof of Lemma 4, the greedy algorithm has approximation ratio (here we exploit the blossom inequalities that guarantee the integrality of the matching polyhedron). We can conclude similarly that in the worst case , yielding an approximation ratio of . Maximizing (numerically) this function over gives, for , the approximation ratio claimed in Theorem 1.2.

## 4 Online Stochastic Matching with Timeouts

Let be the input graph, with items and buyer types . We use the same notation for edge probabilities, edge profits, and timeouts as in Stochastic Matching. Following [8], we can assume w.l.o.g. that each buyer type is sampled uniformly with probability . Consider the following linear program:

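$$\begin{aligned}
\max\quad & \sum_{a\in A,\, b\in B} w_{ab}\, p_{ab}\, x_{ab} && && \text{(LP-ONL)}\\
\text{s.t.}\quad & \sum_{b\in B} p_{ab}\, x_{ab} \le 1, && \forall a\in A\\
& \sum_{a\in A} p_{ab}\, x_{ab} \le 1, && \forall b\in B\\
& \sum_{a\in A} x_{ab} \le t_b, && \forall b\in B\\
& 0\le x_{ab}\le 1, && \forall ab\in E.
\end{aligned}$$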

The above LP models a bipartite stochastic-matching instance where one side of the bipartition contains exactly one buyer per buyer type. In contrast, in the online case, several buyers of the same buyer type (or none at all) can arrive, and the optimal strategy can allow many buyers of the same type to probe edges. This is not a problem though, since the following lemma from [8] allows us just to look at the graph of buyer types and not at the actual realized buyers.

###### Lemma 7

([8], Lemmas 9 and 11) Let be the expected profit of the optimal online algorithm for the problem. Let be the optimal value of LP-ONL. It holds that .

We will devise an algorithm whose expected outcome is at least , and then Theorem 1.3 follows from Lemma 7.

#### The algorithm.

We initially solve LP-ONL and let be the optimal fractional solution. Then buyers arrive. When a buyer of type is sampled, then: (a) if a buyer of the same type was already sampled before we simply discard her, do nothing, and wait for another buyer to arrive, and (b) if it is the first buyer of type , then we execute the following subroutine for buyers. Since we take action only when the first buyer of type comes, we shall denote such a buyer simply by , as it will not cause any confusion.

Let us consider the step of the online algorithm in which the first buyer of type arrived, if any. Let be the items that are still available when arrives. Our subroutine will probe a subset of at most edges , . Consider the vector . Observe that it satisfies the constraints and . Again using GKPS, we round this vector in order to get with , and satisfying the marginal distribution, degree preservation, and negative correlation properties. (In this case, we have a bipartite graph where one side has only one vertex, and here GKPS reduces to Srinivasan’s rounding procedure for level-sets [49].) Let be the set of items such that . For each , , we independently draw a random variable with distribution: for . Let .

Next we consider items of in increasing order of . Let be a dumping factor that we will define later. With probability we probe edge and as usual we stop the process (of probing edges incident to ) if is present. Otherwise (with probability ) we simulate the probe of , meaning that with probability we stop the process anyway — like if edge were probed and turned out to be present. Note that we do not get any profit from the latter simulation since we do not really probe .
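The probing loop just described can be sketched as follows. This is only an illustration under our own assumptions: `items` is taken to be already sorted in the random order, `p[a]` stands for the existence probability of the edge between item `a` and the current buyer, `alpha[a]` stands for the dumping factor of that edge, and `rng` is a `random.Random` instance. None of these names come from the original text.

```python
import random

def probe_items(items, p, alpha, rng):
    """Sketch of the per-buyer probing loop with dumping factors.

    items: item ids, already sorted in the order in which they are considered.
    p[a]: existence probability of the edge to item a (assumed input).
    alpha[a]: dumping factor of the edge to item a (assumed input).
    Returns the matched item, or None if the buyer ends up unmatched.
    """
    for a in items:
        if rng.random() < alpha[a]:
            # Real probe: the edge is present with probability p[a].
            if rng.random() < p[a]:
                return a            # edge present: commit and stop
        else:
            # Simulated probe: stop with probability p[a], but gain nothing.
            if rng.random() < p[a]:
                return None
    return None

rng = random.Random(0)
print(probe_items([3, 1, 2], {1: 0.5, 2: 0.5, 3: 0.5},
                  {1: 0.8, 2: 0.8, 3: 0.8}, rng))
```

In this sketch each considered item ends the process with probability `p[a]` whether the probe is real or simulated, so the dumping factor changes the probability of probing an edge without changing how earlier edges block later ones.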

#### Dumping factors.

It remains to define the dumping factors. For a given edge , let

 βab:=E^Ab∖a,Y⎡⎢⎣∏a′∈Ab:Ya′b

Using the inequality , by repeating the analysis from Section 2 we can show that

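$$\beta_{ab}\ \ge\ h(p_{ab}) = \frac{1}{1+p_{ab}}\left(1-\exp\left(-(1+p_{ab})\,\frac{1}{p_{ab}}\ln\frac{1}{1-p_{ab}}\right)\right)\ \ge\ \frac{1}{2}.$$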
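The last inequality, $h(p)\ge\frac{1}{2}$, can be checked numerically on a grid. The sketch below is a sanity check rather than a proof; the grid resolution is an arbitrary choice of ours.

```python
import math

def h(p):
    """h(p) as defined in Lemma 6, for 0 < p < 1."""
    rate = (1 + p) * math.log(1 / (1 - p)) / p
    return (1 - math.exp(-rate)) / (1 + p)

# Evaluate h on an open grid inside (0, 1); the endpoints are limits only.
grid = [i / 1000 for i in range(1, 1000)]
values = [h(p) for p in grid]

print(min(values), h(0.001), h(0.999))
```

On this grid the smallest value stays above $\frac{1}{2}$, approached as $p\to 1$, while for small $p$ the value is close to $1-\frac{1}{e}$.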

Let us assume for the sake of simplicity that we are able to compute exactly. We set . Note that is well defined since .

#### Analysis.

Let us denote by $A_b$ the event that at least one buyer of type $b$ arrives. The probability that an edge $ab$ is probed can be expressed as:

$$\Pr[A_b]\cdot\Pr[\text{no } b' \text{ takes } a \text{ before } b \mid A_b]\cdot\Pr[b \text{ probes } a \mid A_b \wedge a \text{ is not yet taken}].$$

The probability that arrives is . We shall show first that

$$\Pr[b \text{ probes } a \mid A_b \wedge a \text{ is not yet taken}]$$

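is exactly $\frac{1}{2}x_{ab}$, and later we shall show that $\Pr[\text{no } b' \text{ takes } a \text{ before } b \mid A_b]$ is at least $\frac{1}{1+\frac{1}{2}\left(1-\frac{1}{e}\right)}$. This will yield that the probability that $ab$ is probed is at least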

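$$\left(1-\frac{1}{e}\right)\cdot\frac{1}{1+\frac{1}{2}\left(1-\frac{1}{e}\right)}\cdot\frac{1}{2}\,x_{ab} = \frac{e-1}{3e-1}\,x_{ab} > 0.24\,x_{ab}.$$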
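The arithmetic behind this constant is easy to verify. The sketch below multiplies the three factors of the displayed product (without $x_{ab}$) and compares the result with the closed form; the variable names are ours.

```python
import math

# The three factors of the displayed product, without x_ab.
first = 1 - 1 / math.e
second = 1 / (1 + 0.5 * first)
third = 0.5

bound = first * second * third
closed_form = (math.e - 1) / (3 * math.e - 1)

print(bound, closed_form)
```

Both values agree and lie slightly above 0.24, matching the displayed inequality.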

Consider the probability that some edge appearing before in the random order blocks edge , meaning that is not probed because of . Observe that each such is indeed considered for probing in the online model, and the probability that blocks is therefore . We can conclude that the probability that is not blocked is exactly .

Due to the dumping factor , the probability that we actually probe edge is exactly . Recall that by the marginal distribution property. Altogether

$$\Pr[b \text{ probes } a \mid A_b \wedge a \text{ is not yet taken}] = \frac{1}{2}\,x_{ab}. \tag{12}$$

Next let us condition on the event that buyer arrived and lower-bound the probability that is not blocked on the ’s side in such a step, i.e., that no other buyer has taken already. The buyers, who are first occurrences of their type, arrive uniformly at random. Therefore, we can analyze the process of their arrivals as if it was constructed by the following procedure: every buyer is given an independent random variable distributed exponentially on , i.e., ; buyers arrive in increasing order of their variables . Once buyer arrives, it probes edge with probability (exactly) — these probabilities are independent among different buyers. Thus, conditioning on the fact that arrives, we obtain the following expression for the probability that is safe at the moment when arrives:

 Pr[no b′ takes a before b∣∣Ab] ≥ E⎡⎢⎣∏b′∈B∖b:Yb′

Now let us upper-bound each of the probability factors in the above product. First of all . Second, just by definition666The </