
# Max-Weight Online Stochastic Matching: Improved Approximations Against the Online Benchmark

In this paper, we study max-weight stochastic matchings on online bipartite graphs under both vertex and edge arrivals. We focus on designing polynomial time approximation algorithms with respect to the online benchmark, which was first considered by Papadimitriou, Pollner, Saberi, and Wajc [EC'21]. In the vertex arrival version of the problem, the goal is to find an approximate max-weight matching of a given bipartite graph when the vertices in one part of the graph arrive online in a fixed order with independent chances of failure. Whenever a vertex arrives we should decide, irrevocably, whether to match it with one of its unmatched neighbors or leave it unmatched forever. There has been a long line of work designing approximation algorithms for different variants of this problem with respect to the offline benchmark (prophet). Papadimitriou et al., however, propose the alternative online benchmark and show that considering this new benchmark allows them to improve the 0.5 approximation ratio, which is the best ratio achievable with respect to the offline benchmark. They provide a 0.51-approximation algorithm which was later improved to 0.526 by Saberi and Wajc [ICALP'21]. The main contribution of this paper is designing a simple algorithm with a significantly improved approximation ratio of (1-1/e) for this problem. We also consider the edge arrival version in which, instead of vertices, edges of the graph arrive in an online fashion with independent chances of failure. Designing approximation algorithms for this problem has also been studied extensively with the best approximation ratio being 0.337 with respect to the offline benchmark. This paper, however, is the first to consider the online benchmark for the edge arrival version of the problem. For this problem, we provide a simple algorithm with an approximation ratio of 0.5 with respect to the online benchmark.


## 1 Introduction

The extensive literature on online Bayesian selection algorithms mainly focuses on the competitive ratio, that is, how well the algorithm performs against the optimal offline solution. Competing against such a strong benchmark often leads to pessimistic outcomes. For example, it is well known that even for the single-item version of the online Bayesian selection problem, the prophet inequality problem, no online algorithm can be better than $1/2$-competitive.

Another natural objective would be to compete with the best online solution. For many variants of online Bayesian selection problems (when the input is generated stochastically) one can write a dynamic program that makes the best decision at any point, so the objective function is well defined. However, these algorithms are rarely computationally efficient. Indeed, Papadimitriou, Pollner, Saberi, and Wajc [DBLP:conf/sigecom/PapadimitriouPS21] show that a variant of the online stochastic matching problem is PSPACE-hard to approximate within some small constant. Thus, they initiate the study of approximation algorithms for this problem with respect to the online benchmark, that is, the solution found by an algorithm that has unlimited computational power but is unaware of the part of the input that has not arrived.

In this paper, we study max-weight stochastic matchings on online bipartite graphs under both vertex and edge arrivals. Our main focus is on designing polynomial time approximation algorithms with respect to the online benchmark.

#### The Vertex arrival model.

The goal in this problem is to find a large-weight matching of a bipartite graph when vertices in one part of the graph are online, arriving in a fixed order, each with an independent chance of failure. The vertices in the other part are present from the beginning thus referred to as the offline vertices. The graph, the arrival order of the online vertices, and their chances of failure are known from the beginning. The only unknown is whether a vertex actually arrives or if it fails. If a vertex does not arrive (i.e., fails), we do nothing about it. Otherwise, we either match it irrevocably to one of its unmatched neighbors or leave it unmatched forever. Papadimitriou et al. [DBLP:conf/sigecom/PapadimitriouPS21] refer to this problem as the RideHail problem due to its applications in ride hailing. However, it also models scenarios in other types of matching markets such as labor markets, online advertising, etc.

There has been a long line of work designing approximation algorithms for this problem (and its variants) with respect to the offline benchmark (see [DBLP:conf/stoc/KarpVV90, DBLP:conf/sigecom/AlaeiHL12, DBLP:conf/focs/GamlathKMSW19, DBLP:journals/geb/KleinbergW19, DBLP:conf/sigecom/EzraFGT20, DBLP:conf/sigecom/PapadimitriouPS21] and the references within). The best known algorithm with respect to this benchmark achieves a tight approximation ratio of $0.5$ [DBLP:conf/soda/FeldmanGL15]. In their recent work, Papadimitriou et al. [DBLP:conf/sigecom/PapadimitriouPS21] show that this ratio can be improved to $0.51$ if one considers the online benchmark instead. The online benchmark here is defined as a max-weight matching found by an algorithm that has unlimited computational power but does not know the arrival/failure of the future vertices. This approximation ratio was later improved to $0.526$ using machinery developed by Saberi and Wajc [DBLP:conf/icalp/SaberiW21] for an online edge coloring problem. In this work, we design a simple algorithm with a significantly improved approximation ratio of $(1-1/e)$ with respect to the online benchmark.

Result 1. (See Theorem 1) There exists a polynomial time algorithm for the online bipartite stochastic matching problem under (one-sided) vertex arrivals which finds a matching of weight at least a $(1-1/e)$ fraction of the one found by the best online algorithm.

Similar to Papadimitriou et al. [DBLP:conf/sigecom/PapadimitriouPS21], we also consider a more general version of the problem, which is also studied by [DBLP:conf/soda/FeldmanGL15, DBLP:conf/sigecom/EzraFGT20, DBLP:journals/siamcomp/DuttingFKL20], where upon arrival of a vertex, weights of its edges are drawn from a known joint distribution. However, weights of edges incident to different online vertices are still independent. In Section 3.6 we explain how our algorithm and analysis can be extended to get the same approximation ratio of $(1-1/e)$ for this more general problem.

#### The Edge arrival model.

The only difference between this problem and the vertex arrival version is that here, instead of the vertices, edges are online. Similarly, the goal in this problem is to find a large-weight matching of a bipartite graph when edges are online, arriving in a fixed order, each with an independent chance of failure. The graph, the arrival order of the edges and their chances of failure are known from the beginning. The only unknown is whether an edge actually arrives or if it fails. If an edge does not arrive (i.e., fails), we do nothing about it. Otherwise, we decide irrevocably whether to add it to our matching or not. See [DBLP:conf/icalp/GravinTW21] for potential applications of this problem.

Designing approximation algorithms for the edge arrival version of the problem has also been studied extensively (see [DBLP:journals/algorithmica/BuchbinderST19, DBLP:conf/ec/GravinW19, DBLP:conf/sigecom/EzraFGT20, DBLP:journals/siamcomp/FeldmanSZ21] and the references within), with the best approximation ratio being $0.337$ with respect to the offline benchmark [DBLP:conf/sigecom/EzraFGT20]. It is also known that with respect to this benchmark it is not possible to achieve an approximation ratio better than $4/9$ [DBLP:conf/ec/GravinW19]. This paper, however, is the first to consider the online benchmark for the edge arrival version of the problem. For this problem, we provide a simple algorithm with an approximation ratio of $0.5$ with respect to the online benchmark.

Result 2. (See Theorem 2) There exists a polynomial time algorithm for the online bipartite stochastic matching problem under edge arrivals which finds a matching of weight at least a $0.5$ fraction of the one found by the best online algorithm.

### 1.1 Our Techniques

For both vertex and edge arrival versions of the problem, we design LP-based algorithms consisting of an LP and a rounding procedure. The LP, which we borrow from [torrico2017dynamic], outputs a fractional solution $x$, where for any edge $e$, $x_e$ can be interpreted as the probability of this edge joining the matching. Papadimitriou et al. [DBLP:conf/sigecom/PapadimitriouPS21] also use the same LP and further give a lower bound on its integrality gap. Other than the basic matching constraints, the LP has an additional natural constraint which is crucial for separating the online and offline solutions. This constraint relies on the fact that whether a vertex/edge fails or not is independent of any decision made by the algorithm for the vertices/edges arriving before that. Thus, for instance, in the vertex arrival version, this constraint states that for any edge $(v,u)$, the offline vertex $u$ should be unmatched with probability at least $x_{(v,u)}/p_v$ before the arrival of vertex $v$. Here, $p_v$ denotes the probability of $v$ not failing (i.e., arriving). We explain this in more detail in Section 3.1. In this section, we focus on discussing our rounding procedure for the vertex arrival model to present the flavor of our work.

#### Our rounding procedure.

To round the solution of the LP, we design a simple random rounding procedure. Upon arrival of a vertex $v$, it may receive matching proposals from its unmatched neighbors. If it receives any proposals, it accepts the best one (the one with the largest weight) and joins the matching. Otherwise, it remains unmatched. The process of sending proposals is as follows: when $v$ arrives, each of its unmatched neighbors decides independently at random whether to send a proposal. The probability of sending these proposals is set in a way that for any edge $e$, the probability of it being proposed is lower-bounded by $x_e$. This is achievable in particular due to the LP constraint discussed above.

To be able to highlight the properties of our algorithm which allow us to improve the algorithm proposed by Papadimitriou et al. [DBLP:conf/sigecom/PapadimitriouPS21], we will give a brief overview of their algorithm below.

#### Algorithm proposed by Papadimitriou et al.

Let us emphasize that this is just a brief and paraphrased overview of the algorithm proposed in [DBLP:conf/sigecom/PapadimitriouPS21] which we include for the sake of comparison. They start with the same LP that we use. However, their rounding procedure is different. Upon arrival of an online vertex $v$, it picks one of its neighbors $u$ randomly, proportional to the probabilities given by the LP, and sends a proposal to it. If the neighbor is unmatched, it accepts the proposal with some probability and matches with $v$. These probabilities are set with the aim that each edge $e$ joins the matching with probability at least a fixed constant fraction of $x_e$. However, some vertices are not able to satisfy this for their edges by sending only a single proposal. The algorithm gives such vertices a second chance to send another proposal if their first proposal is not accepted, and this allows them to guarantee a matching probability of $0.51\,x_e$ for all the edges. In the rest of the paper, we will refer to this algorithm as PPSW (the authors' initials).

As the first difference, in PPSW the offline vertices receive the proposals and need to decide whether to accept a proposal without knowing their future proposals. In our algorithm however, since the online vertices are the ones receiving the proposals they can make a decision while knowing all their options. This is particularly helpful since edges are weighted and an online vertex has the option of picking the one with the highest weight. With this advantage, however, comes a new challenge. We cannot guarantee that all the edges will join the matching with a large probability. Indeed, some low-weight edges may have a very small probability. To overcome this, instead of analyzing the rounding loss for any edge, we lower-bound the loss imposed on any online vertex due to the rounding procedure.

The second difference is in the number of proposals a vertex can send. We do not limit the number of proposals a vertex can send. Indeed, a vertex can send proposals as long as it is unmatched. PPSW, on the other hand, imposes a limit of two proposals per online vertex. This is due to the way their analysis works. The main part of their analysis is upper-bounding the positive correlation between the matching status of the offline vertices, and allowing the vertices to send many proposals worsens the correlation. Let us first explain why the absence of positive correlation is desirable. A key part of our analysis is proving that our algorithm satisfies the following property: whenever an online vertex $v$ arrives, the probability of it not receiving any proposals from any subset $S$ of its neighbors is at most

$$\prod_{u\in S}\left(1-\frac{x_{(v,u)}}{p_v}\right). \tag{1}$$

As mentioned above, the probability of $v$ receiving a proposal from any of its neighbors $u$ is at least $x_{(v,u)}$. Since $v$ arrives with probability $p_v$, the probability of it not receiving a proposal from a neighbor $u$ is at most $1-x_{(v,u)}/p_v$. Thus, property (1) follows directly if we were allowed to assume that the matching statuses of the vertices in $S$ are independent when $v$ arrives. It is not complicated to show that the same holds if they are not positively correlated. Unfortunately though, trying to prove this key property through the absence of positive correlation fails, as we show via an example (see Section 3.4) that these events can indeed be positively correlated. However, we are still able to prove this property via a different method without concerning ourselves with the correlation between the matching status of the offline vertices. We even show that our analysis is tight for an instance of the problem. Interestingly, in this instance our algorithm does not cause any positive correlation between the matching status of the offline vertices (see Section 3.5). This all means that positive correlation by itself is not the enemy. It only hurts us if it decreases the probability of a vertex receiving at least one proposal in comparison to the case of its neighbors being independent. Our approach is to carefully lower-bound the probability of this event using simple mathematical tools. See Lemma 3.4 for more details.

### 1.2 Further Related work

As we mentioned before, most of the literature on online Bayesian selection focuses on designing algorithms with respect to the offline benchmark, which is often referred to as the prophet. It would be impossible to do justice to this extensive literature in this amount of space, thus we just briefly outline some of the most relevant works here. The study of the prophet inequality problem, the single-item version of the online Bayesian selection problem, was initiated by Krengel and Sucheston [krengel1978semiamarts], who give an algorithm with a competitive ratio of $1/2$. Their seminal work was a starting point for studying more general versions of online Bayesian selection problems. The ones most related to our work are multi-item prophet inequalities under matroid constraints [DBLP:journals/geb/KleinbergW19], stochastic matching under vertex arrivals [DBLP:conf/stoc/KarpVV90, DBLP:conf/sigecom/AlaeiHL12, DBLP:conf/focs/GamlathKMSW19, DBLP:journals/geb/KleinbergW19, DBLP:conf/sigecom/EzraFGT20, DBLP:conf/sigecom/PapadimitriouPS21] and stochastic matching under edge arrivals [DBLP:journals/algorithmica/BuchbinderST19, DBLP:conf/ec/GravinW19, DBLP:conf/sigecom/EzraFGT20, DBLP:journals/siamcomp/FeldmanSZ21].

It is worth mentioning that the connection between prophet inequalities and algorithmic mechanism design first discovered by Hajiaghayi, Kleinberg and Sandholm [DBLP:conf/aaai/HajiaghayiKS07], has played a significant role in motivating the study of approximation algorithms for these online Bayesian selection problems. For more detailed related work on the topic of prophet inequality and also its relations to algorithmic mechanism design see  [DBLP:journals/sigecom/Lucier17, DBLP:journals/sigecom/CorreaFHOV18].

Max-weight matching on stochastic graphs has also been studied extensively under the query model. That is, similar to our model, there is an underlying stochastic graph, however, to know whether an edge exists it should be queried. Some works focus on having a small number of queries [DBLP:conf/soda/YamaguchiM18, DBLP:conf/sigecom/BehnezhadR18, DBLP:conf/stoc/BehnezhadDH20, DBLP:conf/focs/BehnezhadD20] while others require any queried and realized edge to join the matching [DBLP:conf/icalp/ChenIKMR09, DBLP:conf/icalp/CostelloTT12, DBLP:journals/algorithmica/BansalGLMNR12, DBLP:conf/soda/GamlathKS19]. The main application of these models is for environments with costly queries such as organ exchange markets.

#### Paper organization.

The rest of the paper is organized as follows. In Section 2, we provide a formal definition of our problems and some notions that we will use throughout the paper. Section 3 is about the vertex arrival model. In this section, we first present the algorithm and its analysis in 3.1 and  3.3 respectively. Later, in 3.4, we provide an example for which our algorithm causes positive correlation between the matching status of the offline vertices, and in 3.5 we show that the analysis of our algorithm is tight. Further, in 3.6 we explain how our results can be extended to a more general version of the problem where the weights of the edges connected to any online vertex are drawn from a joint distribution. Finally, we discuss the edge arrival version of the problem in Section 4 with the algorithm and its analysis being in 4.1 and 4.3 respectively.

## 2 Preliminaries

We are given a bipartite graph $G=(A,B,E)$ and a weight $w_e$ for each edge $e\in E$. In the vertex arrival model, we also have a probability $p_v$ for each $v\in A$ and a fixed order over the vertices in $A$. Vertices in $B$ are initially present, but vertices in $A$ arrive online. At any time $t$, with probability $p_t$ vertex $v_t$ arrives (or is realized). If it does, we are allowed to match $v_t$ irrevocably to one of its unmatched neighbors or else commit to leaving it unmatched forever. If it does not arrive, we do nothing at time $t$. In the edge arrival model, similarly, we are also given a probability $p_e$ for each $e\in E$ and a fixed order over the edges. At any time $t$, with probability $p_t$ edge $e_t$ arrives (or is realized). If it does, we should decide irrevocably whether to add it to our matching or lose it forever. Under both arrival models, the goal is to maximize the total weight of the edges we add to the matching.

In this paper, our focus is on designing polynomial-time algorithms for the above problem under both arrival models. We say that an algorithm $A$ is an $\alpha$-approximation if for any instance $I$ of the problem, it satisfies

$$\mathbb{E}[A(I)] \ge \alpha\,\mathbb{E}[\mathrm{OPT}_{\mathrm{on}}(I)],$$

where $\mathrm{OPT}_{\mathrm{on}}$ is the optimal online algorithm, that is, an algorithm that has unlimited computational power, but whose knowledge about the arrival of future vertices/edges is the same as ours.

For ease of notation, for any pair of edges $e$, $e'$, we say $e<e'$ if $e$ arrives before $e'$. We also use $e<t$ to mean $e$ arrives before time $t$. (Note that in the vertex arrival model, an edge arrives whenever its online end-point arrives.) Also, when it is clear from context, we will use $t$ to refer to the vertex $v_t$ (or edge $e_t$) arriving at time $t$. Finally, we write our edges as ordered pairs, meaning that for any edge $(v,u)$ we always have $v\in A$ and $u\in B$.

## 3 Vertex Arrivals

### 3.1 The Algorithm

We begin by writing a linear program that attempts to model the optimal (omnipotent) online algorithm's behavior. This LP is also used by Papadimitriou et al.; however, for the sake of self-containment we explain it in detail here. For any edge $e$, we have a variable $x_e$ which represents the probability of $e$ joining the matching in $\mathrm{OPT}_{\mathrm{on}}$. Here, the randomness can be over both the stochastic arrivals of the vertices and any random decisions made by the algorithm. We claim then that such $x_e$ are feasible for the following LP:

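$$
\begin{aligned}
\max_{x}\quad & \sum_{e\in E} w_e x_e, && && (2)\\
\text{s.t.}\quad & \sum_{e\ni v} x_e \le p_v && \forall v\in A, && (3)\\
& \sum_{e\ni u} x_e \le 1 && \forall u\in B, && (4)\\
& p_v\cdot\Bigl(1-\sum_{e'\ni u,\ e'<e} x_{e'}\Bigr) \ge x_e && \forall e=(v,u)\in E, && (5)\\
& x_e \ge 0 && \forall e\in E. && (6)
\end{aligned}
$$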

The first two constraints (3 and 4) are standard matching constraints, since each vertex $u\in B$ can be incident to at most one edge in the matching, and each $v\in A$ is unmatched with probability at least $1-p_v$ (when it fails). Constraint 5, however, is specific to the online solution. It asserts that for any edge $e=(v_t,u)$ the probability of $u$ being unmatched before time $t$ (the parenthesized factor on the left-hand side) is at least $x_e/p_t$. This is due to the fact that the arrival of vertex $v_t$ is independent of whether $u$ is matched before time $t$. (All vertices arrive independently, and a non-omniscient algorithm must have made all matching decisions independently of future arrivals.) If this constraint is not satisfied, then $u$ is unmatched with probability less than $x_e/p_t$. In this case, the probability that $v_t$ arrives and $u$ is still unmatched by time $t$ is less than $x_e$, contradicting the definition of $x_e$.
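To make the role of constraint 5 concrete, here is a minimal Python sketch that checks constraints (3) to (5) for a candidate fractional solution using exact rationals. The encoding (online vertices indexed by arrival position `t`, a dict `x` keyed by `(t, u)`) and the name `is_lp_feasible` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction as F

def is_lp_feasible(p, x, offline):
    """Check LP constraints (3)-(5) for a fractional solution.

    p[t]      -- arrival probability of the t-th online vertex (arrival order)
    x[(t, u)] -- LP value of edge (v_t, u); missing edges count as 0
    offline   -- list of offline vertices u
    """
    # (3): LP mass at each online vertex is at most its arrival probability.
    for t in range(len(p)):
        if sum(x.get((t, u), 0) for u in offline) > p[t]:
            return False
    for u in offline:
        alpha = F(0)  # LP mass on edges of u arriving strictly before time t
        for t in range(len(p)):
            xe = x.get((t, u), 0)
            if xe < 0:
                return False
            # (5): x_e <= p_t * (1 - mass already placed on u).
            if xe > p[t] * (1 - alpha):
                return False
            alpha += xe
        # (4): total LP mass at an offline vertex is at most 1.
        if alpha > 1:
            return False
    return True

p = [F(1, 2), F(1, 2)]
ok = is_lp_feasible(p, {(0, "u"): F(1, 2), (1, "u"): F(1, 4)}, ["u"])
bad = is_lp_feasible(p, {(0, "u"): F(1, 2), (1, "u"): F(1, 2)}, ["u"])
print(ok, bad)
```

The second solution satisfies the two standard matching constraints but violates constraint 5, which is exactly the kind of fractional matching that an online algorithm cannot realize.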

###### Observation 3.1 ([torrico2017dynamic]).

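Let $x$ be an optimal solution of the LP. We have $\sum_{e\in E} w_e x_e \ge \mathrm{OPT}_{\mathrm{on}}$, where $\mathrm{OPT}_{\mathrm{on}}$ denotes the expected weight of the matching found by the optimal online algorithm.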

### 3.2 The Rounding Procedure

The next step of the algorithm is rounding the fractional solution of the LP. For that, we design a simple rounding procedure that, given any optimal solution of the LP (which is a fractional matching), outputs an integral matching. Later we will prove that the output of this algorithm is a $(1-1/e)$-approximate solution.

Our rounding procedure is very simple and natural. Whenever a vertex $v_t$ arrives, we construct a random subset $N_t$ of its unmatched neighbors as potential matches. Each unmatched neighbor $u$ decides independently at random whether to send a matching proposal to $v_t$ and join subset $N_t$. If $v_t$ receives at least one proposal (i.e., if $N_t$ is nonempty), it accepts the best one (the one with the largest weight) and joins the matching. Otherwise, it remains unmatched forever. In our algorithm, we set the probability of sending proposals in a way that for any edge $e=(v_t,u)$, it results in

$$\Pr[v_t \text{ receives a proposal from } u] \ge x_e.$$

This is achievable thanks to constraint 5 of the LP, which separates the online and offline benchmarks. In other words, it is not possible to satisfy this inequality for an arbitrary fractional matching, and this is where we use the fact that we are competing with the best online algorithm. To be able to satisfy this property, whenever $v_t$ arrives and $u$ is unmatched we need to send a proposal to $v_t$ with probability at least

$$\frac{x_e}{p_t\,\Pr[u \text{ is unmatched}]}.$$

This is of course achievable only if this number is not larger than one, which will be shown as a consequence of our analysis.

Algorithm 1. Rounding Procedure

1:  Let $x$ be an optimal solution of the LP.
2:  Let $M \leftarrow \emptyset$ be a matching of $G$.
3:  for $t = 1, \dots, |A|$ do
4:     Let set $N(v_t)$ denote neighbors of vertex $v_t$ in graph $G$.
5:     $N_t \leftarrow \emptyset$.
6:     For any vertex $u \in N(v_t)$, define $\alpha_{t,u} := \sum_{e\ni u,\ e<(v_t,u)} x_e$.
7:     For any vertex $u \in N(v_t)$, if $u$ is unmatched in $M$ and $\alpha_{t,u} < 1$, then with probability $\frac{x_{(v_t,u)}}{p_t\,(1-\alpha_{t,u})}$ add edge $(v_t,u)$ to set $N_t$ independently.
8:     if $N_t$ is non-empty and $v_t$ is realized then
9:        Add edge $\arg\max_{e\in N_t} w_e$ to matching $M$.
10:     end if
11:  end for
12:  Return matching $M$.
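The rounding procedure can be sketched in a few lines of Python. This is an illustrative reading of the procedure, assuming that a free offline vertex $u$ proposes with probability $x_{(v_t,u)} / (p_t (1-\alpha_{t,u}))$, where $\alpha_{t,u}$ is the LP mass on earlier edges of $u$ (the quantity used in the analysis of Section 3.3). The function name `round_online` and the dict-based instance encoding are hypothetical.

```python
import random

def round_online(p, w, x, offline, rng=random):
    """Simulate one run of the proposal-based rounding.

    p[t]      -- arrival probability of the t-th online vertex
    w[(t, u)] -- weight of edge (v_t, u)
    x[(t, u)] -- LP value of edge (v_t, u); missing edges count as 0
    Returns the list of matched edges (t, u).
    """
    matched = set()                     # offline vertices already matched
    alpha = {u: 0.0 for u in offline}   # LP mass on earlier edges of u
    matching = []
    for t in range(len(p)):
        proposals = []
        for u in offline:
            xe = x.get((t, u), 0.0)
            if xe > 0 and u not in matched and alpha[u] < 1:
                # Assumed proposal probability, capped at 1 for float safety.
                q = min(1.0, xe / (p[t] * (1 - alpha[u])))
                if rng.random() < q:
                    proposals.append(u)
            alpha[u] += xe
        # v_t arrives independently of the proposals it would receive.
        if proposals and rng.random() < p[t]:
            best = max(proposals, key=lambda u: w[(t, u)])
            matching.append((t, best))
            matched.add(best)
    return matching

demo = round_online([1.0], {(0, "a"): 3.0}, {(0, "a"): 1.0}, ["a"],
                    random.Random(0))
print(demo)
```

Sampling the proposals before deciding whether $v_t$ is realized mirrors the independence between the arrival of $v_t$ and the proposal set that the analysis relies on.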

### 3.3 The Analysis

The purpose of this section is proving that Algorithm 3.2 finds a $(1-1/e)$-approximate matching. Before proceeding with our analysis, we need to define some notation. In the rest of the paper, we use $M$ to represent the weighted matching output by Algorithm 3.2. Moreover, for any vertex $v_t$, if it is matched in $M$ (i.e., $v_t\in M$), we use $W_M(v_t)$ to represent the weight of its matching edge in $M$. Note that $M$ and $W_M(v_t)$ are both random variables. For any vertex $u\in B$ and any time $t$, we define

$$\alpha_{t,u} := \sum_{e\ni u,\ e<(v_t,u)} x_e.$$

Finally, for a given subset of vertices $S\subseteq B$, we define $E^S_t$ to be the event in which all the vertices from $S$ are matched before time $t$, and $F^S_t$ to similarly be the event that all vertices in $S$ are still free (unmatched) just before time $t$.

In our calculations, we will make use of the following lemma. However, to preserve the flow of the paper, we defer its proof to Section 5.

###### Lemma 3.2.

Let be a set of vertices. Suppose we associate each vertex with a number . Then

We can now begin our analysis with a crucial property of our algorithm, namely an upper bound on the probability of all the vertices in $S$ being matched before time $t$, for any $S\subseteq B$.

###### Lemma 3.3.

At any time $t$, for any subset of vertices $S\subseteq B$, we have

$$\Pr[E^S_t] \le \prod_{u\in S}\alpha_{t,u}. \tag{7}$$
###### Proof.

We use proof by induction on $t$. Our claim holds for the base case of $t=1$, as for $t=1$ both sides of Equation 7 are equal to zero for nonempty $S$ (and one for $S=\emptyset$). Assuming that this equation holds for some $t$, we will prove it for $t+1$. In other words, we will prove

$$\Pr[E^S_{t+1}] \le \prod_{u\in S}\bigl(\alpha_{t,u}+x_{(t,u)}\bigr), \tag{8}$$

for any $S\subseteq B$. For now, we assume that

$$\alpha_{t+1,u} = \alpha_{t,u}+x_{(v_t,u)} < 1 \tag{9}$$

holds for all $u\in S$. For such $u$, we may define

$$\beta_{t,u} = \frac{x_{(v_t,u)}}{1-\alpha_{t,u}} \in [0,1). \tag{10}$$

We aim to show that $S$ satisfies Equation 8. For $E^S_{t+1}$ to occur, either all of $S$ was matched already before time $t$, or some vertex $u\in S$ was matched exactly at time $t$, with the others matched before. This lets us compute

$$\Pr[E^S_{t+1}] \le \cdots = \prod_{u\in S}\frac{x_{(v_t,u)}+\alpha_{t,u}\,\bigl(1-\alpha_{t,u}-x_{(v_t,u)}\bigr)}{1-\alpha_{t,u}} = \prod_{u\in S}\bigl(x_{(v_t,u)}+\alpha_{t,u}\bigr), \tag{11}$$

which is exactly Equation 8.

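Before we complete our proof, we must still consider the case where, for at least some $u\in S$, the inequality from Equation 9 is violated. Let $S'\subseteq S$ denote the vertices satisfying Equation 9. Then

$$\Pr[E^S_{t+1}] \le \Pr[E^{S'}_{t+1}] \le \prod_{u\in S'}\alpha_{t+1,u} \le \prod_{u\in S}\alpha_{t+1,u},$$

since for any $u\in S\setminus S'$, we have $\alpha_{t+1,u}\ge 1$ (and all other $\alpha_{t+1,u}$ are non-negative). ∎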

###### Lemma 3.4.

In Algorithm 3.2, at any time $t$, for any subset of vertices $S\subseteq B$, the probability that none of the vertices in $S$ joins $N_t$ is upper-bounded by

$$\prod_{u\in S}\bigl(1-x_{(t,u)}/p_t\bigr).$$
###### Proof.

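Pick some $S\subseteq B$. In order for none of the vertices from $S$ to join $N_t$, each $u\in S$ that is still unmatched must fail to be sampled, where sampling happens with probability $\beta_{t,u}/p_t$, using $\beta_{t,u}$ as defined in the proof of Lemma 3.3. We can thus bound this probability to be at most

$$
\begin{aligned}
& \cdots && \text{(Lemma 3.2)}\\
& \cdots && \text{(Lemma 3.3)}\\
&= \prod_{u\in S}\bigl(\alpha_{t,u}\,\beta_{t,u}/p_t + 1 - \beta_{t,u}/p_t\bigr)\\
&= \prod_{u\in S}\frac{\alpha_{t,u}\,x_{(t,u)}/p_t + (1-\alpha_{t,u}) - x_{(t,u)}/p_t}{1-\alpha_{t,u}}\\
&= \prod_{u\in S}\bigl(1-x_{(t,u)}/p_t\bigr).
\end{aligned}
$$

∎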
###### Lemma 3.5.

For any vertex $v_t\in A$, and any non-negative number $w$, we have:

$$\Pr[W_M(v_t)\ge w] \ge (1-1/e)\sum_{e\ni v_t,\ w_e\ge w} x_e. \tag{12}$$
###### Proof.

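Fix some $v_t$ and $w$. Let $S$ denote the set of all $u$ such that $w_{(v_t,u)}\ge w$. Then as long as some vertex from $S$ is added to $N_t$ at time $t$, and vertex $v_t$ arrives, $W_M(v_t)\ge w$ will hold. Note that the arrival of $v_t$ is independent of $N_t$, so we can compute

$$
\begin{aligned}
\Pr[W_M(v_t)\ge w] &\ge p_t\Bigl(1-\prod_{u\in S}\bigl(1-x_{(t,u)}/p_t\bigr)\Bigr) && \text{(Lemma 3.4)}\\
&= p_t - p_t\prod_{u\in S}\bigl(1-x_{(t,u)}/p_t\bigr)\\
&\ge p_t - p_t\Bigl(1-\frac{\sum_{u\in S}x_{(t,u)}}{|S|\,p_t}\Bigr)^{|S|} && \text{(AM-GM)}\\
&\ge p_t - p_t\Bigl(1-(1-1/e)\,\frac{\sum_{u\in S}x_{(t,u)}}{p_t}\Bigr) && \text{(convexity)}\\
&= (1-1/e)\sum_{u\in S}x_{(t,u)}.
\end{aligned}
$$

The convexity step uses that $z\mapsto(1-z/k)^k$ is convex on $[0,1]$, equals $1$ at $z=0$ and is at most $1/e$ at $z=1$, together with $\sum_{u\in S}x_{(t,u)}\le p_t$ from constraint 3. ∎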
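As a numerical sanity check of the inequality behind Lemma 3.5, the following sketch samples random values with $\sum_{u\in S} x_{(t,u)} \le p_t$ and evaluates the gap $p_t\bigl(1-\prod_{u\in S}(1-x_{(t,u)}/p_t)\bigr) - (1-1/e)\sum_{u\in S}x_{(t,u)}$. The helper name `lemma35_gap` and the sampling scheme are illustrative assumptions; the check is not a substitute for the proof.

```python
import math
import random

def lemma35_gap(p, xs):
    """Return p*(1 - prod(1 - x/p)) - (1 - 1/e)*sum(xs).

    Lemma 3.5's chain of inequalities says this is non-negative
    whenever all x >= 0 and sum(xs) <= p.
    """
    prod = math.prod(1 - x / p for x in xs)
    return p * (1 - prod) - (1 - 1 / math.e) * sum(xs)

rng = random.Random(0)
worst = float("inf")
for _ in range(10000):
    p = rng.uniform(0.01, 1.0)
    k = rng.randint(1, 6)
    raw = [rng.random() + 1e-9 for _ in range(k)]
    scale = rng.random() * p / sum(raw)   # makes sum(xs) <= p
    xs = [r * scale for r in raw]
    worst = min(worst, lemma35_gap(p, xs))

print(worst >= -1e-12)
```

The small negative tolerance only absorbs floating-point rounding when the sampled values are close to zero.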
###### Theorem 1.

Algorithm 3.2 outputs a $(1-1/e)$-approximate matching, that is

$$\sum_{v_t\in A}\mathbb{E}[W_M(v_t)] \ge (1-1/e)\cdot\mathrm{OPT}_{\mathrm{on}}.$$
###### Proof.

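By Observation 3.1 we know that $\sum_{e\in E}w_e x_e$ gives us an upper bound for $\mathrm{OPT}_{\mathrm{on}}$, that is:

$$\sum_{e\in E}w_e x_e = \sum_{v_t\in A}\sum_{e\ni v_t}w_e x_e \ge \mathrm{OPT}_{\mathrm{on}}.$$

As a result, to prove this theorem, it suffices to show that for any vertex $v_t\in A$ we have

$$\mathbb{E}[W_M(v_t)] \ge (1-1/e)\sum_{e\ni v_t}w_e x_e,$$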

which is the same as proving

$$\sum_{e\ni v_t}w_e x_e - \mathbb{E}[W_M(v_t)] \le \frac{1}{e}\sum_{e\ni v_t}w_e x_e. \tag{13}$$

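By definition, for any vertex $v_t$ we can write the left-hand side of this inequality as

$$
\begin{aligned}
\sum_{e\ni v_t}w_e x_e - \mathbb{E}[W_M(v_t)] &= \int_0^\infty\Bigl(\sum_{e\ni v_t,\ w_e\ge w}x_e - \Pr[W_M(v_t)\ge w]\Bigr)\,dw\\
&\le \int_0^\infty\frac{1}{e}\sum_{e\ni v_t,\ w_e\ge w}x_e\,dw && \text{(Lemma 3.5)}\\
&= \frac{1}{e}\sum_{e\ni v_t}w_e x_e.
\end{aligned}
$$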

This proves Equation 13 and concludes the proof of the theorem. ∎

### 3.4 Positive Correlation

Much of the detail needed in our proof of Theorem 1 and related lemmas is due to handling potential correlation between the matched/unmatched status of the vertices in $B$. In particular, the proof of our main lemma (Lemma 3.4) could proceed fairly directly if we were allowed to assume that the events $E^{\{u\}}_t$ for $u\in B$ (the events that vertices in $B$ are matched before any time $t$) are independent from each other. Similarly, if we had the notion of negative dependence used in [DBLP:conf/sigecom/PapadimitriouPS21], namely negative association of the indicator variables for the events $E^{\{u\}}_t$, this would also suffice to arrive at Lemma 3.4. In this section, we will show that a more involved analysis such as ours is in fact necessary, since our algorithm sometimes causes positive correlation between these events. We construct a bipartite graph with $A=\{v_1,v_2,v_3\}$ and $B=\{u_1,u_2\}$, such that before time $3$ the events $E^{\{u_1\}}_3$ and $E^{\{u_2\}}_3$ are in fact positively correlated. The edge set, along with the values of $p_v$ and $x_e$ for $v\in A$ and $e\in E$, are given in the following diagram:

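[Diagram: the online vertices $v_1$, $v_2$, $v_3$ have arrival probabilities $1/2$, $1/4$, $\varepsilon$; the edges $(v_1,u_1)$, $(v_2,u_1)$, $(v_2,u_2)$ carry LP values $1/2$, $1/8$, $1/8$, and the edges $(v_3,u_1)$, $(v_3,u_2)$ carry positive LP values.]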

One can easily verify that our solution satisfies the LP constraints, and is optimal for certain values of (in particular, when , and when and , and small weights incident to ).

With probability $1/2$, the first vertex $v_1$ arrives, and is matched with $u_1$ with probability $1$. Assuming this occurs, $v_2$ matches $u_2$ if it arrives (with probability $1/4$) and $u_2$ is added to $N_2$ in the second step (with probability $1/2$).

The other $1/2$ of the time, the first vertex does not arrive, so $u_1$ is added to $N_2$ in the second step with probability $1$ when $v_2$ arrives with probability $1/4$, and no edges are matched otherwise. Overall, after time $2$, both $u_1$ and $u_2$ are matched with probability $1/16$, neither is matched with probability $3/8$, and just $u_1$ is matched with probability $9/16$. The indicator variables for the events $E^{\{u_1\}}_3$ and $E^{\{u_2\}}_3$ will thus have positive covariance

$$\frac{1}{16} - \frac{10}{16}\cdot\frac{1}{16} = \frac{3}{128} > 0.$$
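The probabilities in this example can be re-derived mechanically. The sketch below computes, with exact rationals, the distribution over sets of matched offline vertices after the first two arrivals. It assumes the proposal probability $x_{(v_t,u)}/(p_t(1-\alpha_{t,u}))$ for a free vertex $u$, and it assumes illustrative weights under which $v_2$ prefers $u_1$ over $u_2$; the weights, the function name `matched_set_distribution`, and the encoding are not taken from the paper.

```python
from fractions import Fraction as F
from itertools import product

def matched_set_distribution(p, w, x, offline, steps):
    """Exact distribution over sets of matched offline vertices."""
    dist = {frozenset(): F(1)}
    alpha = {u: F(0) for u in offline}
    for t in range(steps):
        q = {}  # proposal probability of each offline vertex if still free
        for u in offline:
            xe = x.get((t, u), F(0))
            ok = xe > 0 and alpha[u] < 1
            q[u] = xe / (p[t] * (1 - alpha[u])) if ok else F(0)
        new = {}
        for state, pr in dist.items():
            free = [u for u in offline if u not in state]
            # Enumerate which free vertices propose at time t.
            for bits in product([0, 1], repeat=len(free)):
                prob, props = pr, []
                for u, b in zip(free, bits):
                    prob *= q[u] if b else 1 - q[u]
                    if b:
                        props.append(u)
                if prob == 0:
                    continue
                if props:
                    best = max(props, key=lambda u: w[(t, u)])
                    hit = state | {best}
                    new[hit] = new.get(hit, 0) + prob * p[t]
                    new[state] = new.get(state, 0) + prob * (1 - p[t])
                else:
                    new[state] = new.get(state, 0) + prob
        dist = new
        for u in offline:
            alpha[u] += x.get((t, u), F(0))
    return dist

p = [F(1, 2), F(1, 4)]
x = {(0, "u1"): F(1, 2), (1, "u1"): F(1, 8), (1, "u2"): F(1, 8)}
w = {(0, "u1"): 1, (1, "u1"): 2, (1, "u2"): 1}  # assumed weights
dist = matched_set_distribution(p, w, x, ["u1", "u2"], steps=2)
both = dist.get(frozenset({"u1", "u2"}), F(0))
pr_u1 = sum(v for s, v in dist.items() if "u1" in s)
pr_u2 = sum(v for s, v in dist.items() if "u2" in s)
cov = both - pr_u1 * pr_u2
print(both, pr_u1, pr_u2, cov)
```

Under these assumptions the covariance evaluates to $3/128$, so the two matching events are positively correlated before time $3$.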

### 3.5 Tightness of the Analysis

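First, we show that our algorithm indeed loses the factor of $(1-1/e)$ compared to $\mathrm{OPT}_{\mathrm{on}}$. We construct the graph $G=(A,B,E)$, where $A=\{v_1,\dots,v_n,v^*\}$ and $B=\{u_1,\dots,u_n\}$. For each $i$, there are edges $(v_i,u_i)$ and $(v^*,u_i)$ with weights and respectively. The vertices from $A$ arrive in order $v_1,\dots,v_n,v^*$, and we have for all and . Then, the unique optimal solution to our LP is given in the following diagram (namely, $x_{(v_i,u_i)}=1-1/n$ and $x_{(v^*,u_i)}=1/n$).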

 v1 v2 vn v∗ u1 u2 un … … 1−1/n 1−1/n 1−1/n 1/n 1/n 1/n

Consider what our algorithm would do faced with this graph. For each $i$, it would add $u_i$ to $N_i$ with probability $x_{(v_i,u_i)}/p_{v_i}$, then add $(v_i,u_i)$ to our matching if $v_i$ is realized. Hence, the probability that $u_i$ is unmatched by the time we get to vertex $v^*$ is exactly $1/n$, and is independent of all other vertices from $B$. The probability that $v^*$ will have no neighbors unmatched is thus $(1-1/n)^n$.

We can now bound our algorithm's expected matching weight to be at most

which for large $n$ gets arbitrarily close to . On the other hand, a trivial online algorithm could instead never match any of the edges $(v_i,u_i)$, and always take one of the edges $(v^*,u_i)$, obtaining a matching with weight always. The optimal online algorithm, $\mathrm{OPT}_{\mathrm{on}}$, would thus need to attain at least in expectation, proving that our algorithm can never be $(1-1/e+\epsilon)$-competitive for any $\epsilon>0$.

Note that in this example, for any vertex at the time of its arrival, the matching status of its neighbors are independent. This intuitively means that the loss our algorithm incurs is not due to the correlation it causes between the matching status of the vertices.
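The limit behind this tightness example is easy to check numerically. The sketch below evaluates $1-(1-1/n)^n$, the probability that $v^*$ still has a free neighbor when each $u_i$ is independently unmatched with probability $1/n$; the function name `prob_vstar_has_free_neighbor` is a hypothetical label for this quantity.

```python
import math

def prob_vstar_has_free_neighbor(n):
    """Probability that at least one of n offline vertices is still free,
    when each is independently free with probability 1/n."""
    return 1 - (1 - 1 / n) ** n

for n in (1, 2, 10, 100, 10**6):
    print(n, prob_vstar_has_free_neighbor(n))

limit = 1 - 1 / math.e
print(limit)
```

The values decrease monotonically in $n$ and approach $1-1/e$ from above, which matches the factor lost by the rounding procedure on this instance.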

### 3.6 Generalization of the Algorithm and Analysis

With our analysis complete, we can now extend our algorithm to a more general version of the vertex arrival model, allowing for distributions over edge weights. We will first describe the new model, which Papadimitriou, Pollner, Saberi, and Wajc also considered for their algorithm [DBLP:conf/sigecom/PapadimitriouPS21]. We will then explain the main considerations for adapting our algorithm and analysis from Sections 3.1 to 3.3 for this harder case.

#### General Vertex Arrival Model

Just as in the original vertex arrival model (described in Section 2), we have a known bipartite graph and fixed order over the vertices in . Vertices in are initially present, but vertices in arrive online in this order. However, rather than having fixed weights for all edges, with each vertex arriving with a probability , we instead realize a sample from a distribution over possible weights for all edges incident to .

Formally, for each time , there is a joint distribution with finite support over non-negative assignments of weights for all edges incident to . At time , we draw a sample . This tells us the realized weight for each edge incident to . As before, we may now choose to match irrevocably to one of its unmatched neighbours. The goal is to maximize the total realized weight of all edges in our matching, given by

$$\sum_{(v_t,u)\in M} w_t(v_t,u),$$

where denotes the set of edges in our final matching.

We note that this is indeed a generalization of our original vertex arrival model, which can be represented here by

yielding the vector of values

with probability , and the zero vector with probability .
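As an illustration of the model just described, the following minimal sketch represents each arrival's finite-support distribution as a list of (probability, weight vector) pairs and shows how the original model appears as a two-point special case. All function names and the dictionary-based representation are assumptions made for this example, not notation from the paper.

```python
import random


def original_model_distribution(p_t, weights):
    """Two-point distribution for the original model: the fixed weights
    with probability p_t, and the zero vector with probability 1 - p_t."""
    zero = {u: 0.0 for u in weights}
    return [(p_t, dict(weights)), (1.0 - p_t, zero)]


def sample_weights(distribution, rng):
    """Draw one weight vector (dict: neighbor -> weight) from a
    finite-support distribution given as (probability, vector) pairs."""
    probs = [q for q, _ in distribution]
    outcomes = [w for _, w in distribution]
    return rng.choices(outcomes, weights=probs, k=1)[0]


def realized_weight(matching, samples):
    """Total realized weight of a matching.

    matching: list of (t, u) pairs; samples[t] is the vector drawn at time t.
    """
    return sum(samples[t][u] for t, u in matching)


if __name__ == "__main__":
    rng = random.Random(0)
    dist = original_model_distribution(0.25, {"u1": 3.0, "u2": 1.0})
    print(sample_weights(dist, rng))
```

Representing the zero vector explicitly keeps the "vertex fails to arrive" case inside the same interface as any other weight realization.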

#### Modified Algorithm

To begin, we modify our LP from Section 3.1 to yield another LP relaxation under this more general model, using the same natural extension as given in [DBLP:conf/sigecom/PapadimitriouPS21].

Since our distributions are assumed to be finite, for each , we let denote the probability mass for the -th possibility (and varies from to the size of the support of ), and define to be the weight assigned to edge in this case. We will have variables representing the probability of being matched to with value in . These take the place of from before (representing the probability of being in our matching), so we can now give our modified LP:

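$$
\begin{aligned}
\max_{y}\quad & \sum_{(v_t,u)\in E,\,i} w_{t,i,u}\cdot y_{t,i,u}, && && (14)\\
\text{s.t.}\quad & \sum_{(v_t,u)\in E} y_{t,i,u}\le p_{t,i} && \forall v_t\in A,\ i, && (15)\\
& \sum_{t,i} y_{t,i,u}\le 1 && \forall u\in B, && (16)\\
& p_{t,i}\cdot\Bigl(1-\sum_{t'\ldots}\;\ldots
\end{aligned}
$$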

The actual rounding procedure of Algorithm 3.2 also needs modification. At each iteration of the loop, we will first sample , obtaining some possibility . We can define

$$\alpha_u=\sum_{t'\ldots}\;\ldots$$

instead at Line 7. We will now add each vertex to independently with probability instead of at Line 8. This is again easily seen to be well-defined, by Equation 17 from our modified LP. The maximum weight sampled neighbour will then be chosen based on the sampled weights .

#### Modified Analysis

Our analysis remains largely valid, and applies to this more general model with minor modifications.

To start, must be defined as in Equation 19. This allows the proof of Lemma 3.3 to go through as written, replacing occurrences of with . We must also more carefully expand the probability at Equation 11, observing that

Lemma 3.4 needs slight adjustment to its statement. We can instead show that, for any , if we assume that the -th possible weight vector is drawn from , so the realized weight of is for all , then the probability that no vertex from joins is upper-bounded by

$$\prod_{u\in S}\left(1-y_{t,i,u}/p_{t,i}\right).$$

The proof now still holds, replacing all occurrences of with , and replacing occurrences of with .

Lemma 3.5 can be modified similarly, again conditioning on the realization of , and making the same substitutions, defining as all where for the assumed realization . Finally, to extend Theorem 1 to this more general model, by taking expectations over the drawing of , it suffices to show for any time and realization that

$$\mathbb{E}\left[W_M(v_t)\mid w_{tu}=w_{t,i,u}\right]\ \ge\ (1-1/e)\sum_{(v_t,u)\in E} w_{t,i,u}\,y_{t,i,u}.$$

Again, the proof proceeds as before, using our conditional version of Lemma 3.5, and replacing occurrences of and with and , respectively.

## 4 Edge Arrivals

### 4.1 The Algorithm

Similar to the vertex arrival version, we start with an LP for the online problem, and use its solution to build our matching. Again, for each edge , we have the variable represent the probability of joining the matching in .

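$$
\begin{aligned}
\max_{x}\quad & \sum_{e\in E} w_e x_e, && && (20)\\
\text{s.t.}\quad & \sum_{e\ni u} x_e\le 1 && \forall u\in A\cup B, && (21)\\
& p_e\cdot\Bigl(1-\sum_{e'\ni v,\ e'\ldots}\;\ldots
\end{aligned}
$$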

We would again like to assert that any corresponding to the execution of yields a valid solution to this LP. Constraint 21 is as before.

We now consider Constraint 22. In order for to add to the matching, it cannot have matched any edge to already. This occurs with probability exactly , by definition of , and since the corresponding events are disjoint. Finally, since being realized is independent from all previous realizations (and any randomness used by the algorithm), the probability that has not been matched and is realized is given by the left-hand side of Constraint 22, and so the bound must follow. Constraint 23 is similar, and we obtain an observation analogous to Observation 3.1.

###### Observation 4.1.

Let be an optimal solution of the LP. We have where .

### 4.2 The Rounding Procedure

We give our online rounding procedure in Algorithm 4.2. Here, we think of the vertices as again making proposals to their neighbours with some probability (based on ), as long as the corresponding edge is realized. Then, must decide if it accepts a proposal online. This is as opposed to the vertex arrival model, where knew all its proposals upon arrival. Since the graph is weighted, simply accepting the first proposal may result in a significant loss. To resolve this issue, our algorithm is designed in a way that each edge joins the final matching with probability exactly . In this sense, our algorithm resembles the one designed by Ezra et al. [DBLP:conf/sigecom/EzraFGT20] for the vertex arrival version of the problem.

Before stating the algorithm formally, we give a brief overview. The algorithm starts with all the vertices marked as alive, but as the algorithm proceeds it marks some of them as dead. Vertices in only die when they are matched. However, we sometimes mark a vertex in as dead without it being matched. At any time , when edge arrives, the algorithm needs to decide whether to add this edge to the matching. If is alive at this point, independent of the status of , it randomly (with a probability set in Line 8 of the algorithm) decides whether to send a proposal to . The probability of this event is set in a way that the probability of ever receiving a proposal from is equal to . If is alive, it decides randomly (with a probability set in Line 10 of the algorithm) whether to accept the proposal. If a match happens, we mark as dead to ensure that we do not match it again in the future. However, vertex dies if and only if it sends a proposal, regardless of whether the proposal is accepted. This serves two purposes: first, it ensures that its future edges are not matched with a probability higher than ; second, it ensures that the alive/dead statuses of the vertices in are independent of each other throughout the algorithm.

Algorithm 2. Rounding Procedure

1:  Let be an optimal solution of the LP.
2:  Let be a matching of .
3:  Mark all the vertices in as alive.
4:  for  do
5:     Let where and .
6:     Define
7:     Define
8:     Let be a Bernoulli random variable which is equal to one with probability .
9:     if  is alive, is realized, and  then
10:        If is also alive, then with probability add to and mark as dead.
11:        Mark as dead.
12:     end if
13:  end for
14:  Return matching .

We note that this algorithm necessarily returns a valid matching since whenever we add an edge to , we also mark both and as dead (and will never again add any of their incident edges to ). Otherwise, everything is well-defined (notably, ) by the LP constraints.

### 4.3 The Analysis

The first half of our analysis will focus on showing that the proposals arriving at a given vertex are well-behaved. To begin, we show that a vertex proposes to with probability exactly .

###### Lemma 4.2.

On any iteration of Algorithm 4.2, the probability that the condition at Line 9 holds is .

###### Proof.

We prove this by strong induction for a given vertex . Fix , and suppose this holds for all . That is, for every , the probability that the condition at Line 9 holds (that is, the probability that proposes to ) is . Then, defining as computed at Line 6, the probability that is dead at the start of iteration is exactly , since is marked dead as soon as it makes its first (and thus only) proposal.

Whether is realized and whether at iteration both occur independently of what has occurred so far, and with probabilities and respectively. Thus, the probability that all three conditions from Line 9 hold, and hence that proposes to , is exactly . ∎

Next, we observe that for a fixed , the proposals received from its neighbors are independent.

###### Lemma 4.3.

For an edge , let denote the event that in iteration , the condition at Line 9 holds. Then for any , the events are independent.

###### Proof.

Let . We observe that depends only on randomness from iteration (whether was realized, the value of ), as well as whether is alive or not. However, the aliveness of itself depends on these same variables from the previous edge incident to processed by the algorithm (or is deterministically alive if is the first such edge). Thus, inducting over all such edges, whether is alive or not depends only on realizations of edges and values of from iterations where for some . Importantly, is a deterministic function of these random inputs, which are disjoint from for other . ∎

Now that we know that the proposals are well-behaved, we can prove our main result for edge arrivals.

###### Theorem 2.

Algorithm 4.2 outputs a $1/2$-approximate matching, that is

$$\sum_{e\in E} w_e\,\mathbb{E}[W_M(e)]\ \ge\ \mathrm{OPT}_{\mathrm{on}}/2.$$

###### Proof.

We have already noted in Section 4.2 that Algorithm 4.2 outputs a correct matching. It thus suffices to prove that each edge is added to with probability , by Observation 4.1. For a given , we prove this by induction over all edges incident to , in order of arrival.

Fix some where . Suppose that any with is added with probability . Then, at time , the probability that is already marked dead (equivalently, that an edge incident to has been added to ) is exactly , as these are disjoint events. By Lemma 4.3, the proposals to were independent, so even conditioned on being still alive, the probability that receives a proposal from is as given by Lemma 4.2, namely . Thus, the probability that is alive and proposed to by , and is then added to , is

$$(1-\alpha_v/2)\cdot x_e\cdot\frac{1}{2-\alpha_v}=x_e/2.$$ ∎
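The claim that every edge joins the matching with probability $x_e/2$ can be checked exactly on a toy instance. The sketch below enumerates all coin outcomes of the rounding procedure with rational arithmetic. Several details are assumptions, because the formulas in Lines 6 to 10 did not survive in the text: $\alpha_u$ and $\alpha_v$ are taken to be the sums of $x_{e'}$ over earlier edges at $u$ and $v$, the proposal probability is taken to be $x_e/(p_e(1-\alpha_u))$ (inferred from the proof of Lemma 4.2), and the acceptance probability is taken to be $1/(2-\alpha_v)$ (inferred from the final computation in the proof of Theorem 2). The instance and the function name are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def match_probabilities(edges, p, x):
    """Exact probability that each edge joins M under the assumed rounding.

    edges: list of (u, v) in arrival order; u proposes and v accepts.
           Vertex names must be distinct across the two sides.
    p[k], x[k]: realization probability and LP value of edges[k].
    """
    n = len(edges)
    result = [F(0)] * n
    # For every edge, enumerate three coins: realized, propose, accept.
    for coins in product(product([0, 1], repeat=3), repeat=n):
        prob = F(1)
        dead = set()
        alpha = {}  # sum of x over earlier edges at each vertex
        matched = []
        for k, ((u, v), (realized, propose, accept)) in enumerate(zip(edges, coins)):
            a_u = alpha.get(u, F(0))
            a_v = alpha.get(v, F(0))
            q_propose = x[k] / (p[k] * (1 - a_u))  # assumed Line 8 probability
            q_accept = 1 / (2 - a_v)               # assumed Line 10 probability
            prob *= p[k] if realized else 1 - p[k]
            prob *= q_propose if propose else 1 - q_propose
            prob *= q_accept if accept else 1 - q_accept
            if u not in dead and realized and propose:
                if v not in dead and accept:
                    matched.append(k)
                    dead.add(v)
                dead.add(u)  # u dies after proposing, accepted or not
            alpha[u] = a_u + x[k]
            alpha[v] = a_v + x[k]
        for k in matched:
            result[k] += prob
    return result


# Hypothetical instance; x satisfies the degree and online constraints.
edges = [("u1", "v1"), ("u2", "v1"), ("u1", "v2")]
p = [F(1, 2)] * 3
x = [F(1, 4)] * 3

if __name__ == "__main__":
    print(match_probabilities(edges, p, x))
```

On this instance each edge has $x_e = 1/4$, and the enumeration returns $1/8 = x_e/2$ for every edge, in line with the computation in the proof.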

## 5 Proof of Lemma 3.2

###### Proof.

We begin by considering the events and . Throughout, we assume a single fixed , and drop it from the subscripts. First, for any , the set of events over partition the probability space. In particular, we get the identity

 =1.

Even better, this same identity holds when we condition all probabilities by an arbitrary event, so for we have

 (25)

by conditioning on .

Letting and be arbitrary, we now get

 (Equation 25)