1 Introduction
In bipartite matching problems, agents on one side of a market are paired with agents, contracts, or transactions on the other. Classical matching problems—assigning students to schools, papers to reviewers, or medical residents to hospitals—take place in a static setting, where all agents exist at the time of matching, are simultaneously matched, and then the market concludes. In contrast, many matching problems are dynamic, where one side of the market arrives in an online fashion and is matched sequentially to the other side.
Online bipartite matching problems are primarily motivated by Internet advertising. In the basic version of the problem, we are given a bipartite graph $G=(U,V,E)$ where $U$ represents the offline vertices (advertisers) and $V$ represents the online vertices (keywords or impressions). There is an edge $e=(u,v)$ if advertiser $u$ bids for keyword $v$. When a keyword $v$ arrives, a central clearinghouse must make an instant and irrevocable decision: either reject $v$, or assign $v$ to one of its neighbors $u$ and obtain a profit $w_e$ for the match. When an advertiser is matched, it is no longer available for matches with other keywords (in the most basic case) or its budget is reduced. The goal is to design an efficient online algorithm such that the expected total weight (profit) of the matching obtained is maximized. Following the seminal work of Karp, Vazirani, and Vazirani (1990), there has been a large body of research on related variants (overviewed by Mehta (2012)). One particular flavor is online matching with known identical independent distributions (KIID) [Feldman et al.2009, Haeupler, Mirrokni, and Zadimoghaddam2011, Manshadi, Gharan, and Saberi2012, Jaillet and Lu2013, Brubach et al.2016]. In this flavor, agents arrive over $T$ rounds, and their arrival distributions are assumed to be identical and independent over all rounds; additionally, this distribution is known to the algorithm beforehand.
Apart from the Internet advertising application, online bipartite matching models have been used to capture a wide range of online resource allocation and scheduling problems. Typically, we have an offline and an online party representing, respectively, the service providers (SPs) and the online users; once an online user arrives, we must match it to an offline SP immediately. In many cases, the service is reusable: once an SP is matched to a user, it is unavailable for some time, but then rejoins the system. Moreover, in many real settings the arrival distributions of online users change from time to time (they are not KIID). Consider the following motivating examples.
Taxi Dispatching Services and Ride-Sharing Systems. Traditional taxi services and ride-sharing systems like Uber and Didi Chuxing match drivers to would-be riders [Tong et al.2016a, Lowalekar, Varakantham, and Jaillet2016, Lee et al.2004, Seow, Dang, and Lee2010]. Here, the offline SPs are the vehicle drivers. Once an online request (potential rider) arrives, the system matches it to a nearby driver instantly so that the rider's waiting time is minimized. In most cases, the driver rejoins the system and can be matched again once she finishes the service. Additionally, the arrival rates of requests change dramatically across the day: the arrival rates during peak hours can be much larger than those during off-peak hours.
Organ Allocation. Chronic kidney disease affects tens of millions of people worldwide at great societal and monetary cost [Neuen et al.2013, Saran et al.2015]. Organ donation—either via a deceased or living donor—is a life-saving alternative to organ failure. In the case of kidneys, a donor organ can last for years in a patient before failing again. Various nationwide organ donation systems exist and operate under different ethical and logistical constraints [Bertsimas, Farias, and Trichakis2013, Dickerson and Sandholm2015, Mattei, Saffidine, and Walsh2017], but all share a common online structure: the offline party is the set of patients (who reappear over time as donor organs fail), and the online party is the set of donors or donor organs, who arrive over time.
Similar scenarios arise in other areas such as wireless network connection management (the SPs are wireless access points) [Yiu et al.2008] and online cloud computing service scheduling [Miller2008, Younge et al.2010]. Inspired by the above applications, we generalize the KIID model in the following two ways.
Reusable Resources. Once we assign $v$ to $u$, $u$ rejoins the system after $C_e$ rounds, where $C_e$ (for $e=(u,v)$) is an integral random variable with a known distribution. In this paper, we call $C_e$ the occupation time of $u$ w.r.t. $e$. In fact, we show that our setting can be directly extended to the case when $C_e$ is time sensitive: when matching $v$ to $u$ at time $t$, $u$ rejoins the system after $C_e(t)$ rounds. This extension makes our model adaptive to nuances in real-world settings. For example, consider the taxi dispatching or ride-sharing service: the occupation time of a driver from a matching with an online user depends both on the user type $v$ (such as the destination) and on the time $t$ when the matching occurs (peak hours can differ significantly from off-peak hours).

Known Adversarial Distributions (KAD). Suppose we have $T$ rounds and that for each round $t \in [T]$ (throughout this paper, we use $[N]$ to denote the set $\{1,2,\dots,N\}$ for any positive integer $N$), a vertex is sampled from $V$ according to an arbitrary known distribution whose marginal for $v$ is $p_{v,t}$, with $\sum_{v \in V} p_{v,t} \le 1$ for all $t$. The arrivals at different times are independent (and according to these given distributions). The KAD setting was introduced by [Alaei, Hajiaghayi, and Liaghat2012, Alaei, Hajiaghayi, and Liaghat2013] and is known as Prophet Inequality matching.
We call our new model Online Matching with (offline) Reusable Resources under Known Adversarial Distributions (OM-RR-KAD, henceforth). Note that the classical KIID model with non-reusable resources can be viewed as a special case in which each $C_e$ is a constant exceeding $T$ and the arrival distributions are the same for all $t$.
Competitive Ratio. Let $\mathbb{E}[\mathrm{ALG}(I,\mathcal{D})]$ denote the expected value obtained by an algorithm $\mathrm{ALG}$ on an input $I$ with arrival distribution $\mathcal{D}$. Let $\mathbb{E}[\mathrm{OPT}(I,\mathcal{D})]$ denote the expected offline optimal, which refers to the optimal solution when we are allowed to make choices after observing the entire sequence of online arrivals. The competitive ratio is then defined as $\min_{(I,\mathcal{D})} \mathbb{E}[\mathrm{ALG}(I,\mathcal{D})]/\mathbb{E}[\mathrm{OPT}(I,\mathcal{D})]$. It is a common technique to use the optimal value of a linear program to upper bound $\mathbb{E}[\mathrm{OPT}(I,\mathcal{D})]$ (called the benchmark LP) and hence obtain a valid lower bound on the resulting competitive ratio.
1.1 Our Contributions
First, we propose a new model, OM-RR-KAD, to capture a wide range of real-world applications related to online scheduling, organ allocation, and ride-share dispatch, among others. We claim that this model is tractable enough to admit algorithms with provable guarantees and general enough to capture many real-life instances. Our model assumptions take a significant step forward from the usual assumptions in the online matching literature, where the offline side is assumed to be single-use or disposable. This leads to a larger range of potential applications which can be modeled by online matching.
Second, we show how this model can be cleanly analyzed under a theoretical framework. We first construct a linear program (LP (1), henceforth) which we show is a valid upper bound on the expected offline optimal (note that the latter is hard to characterize directly). Next, we propose an efficient algorithm that achieves a competitive ratio of $\frac{1}{2}-\epsilon$ for any given $\epsilon>0$. This algorithm solves the LP and obtains an optimal fractional solution, which it uses as a guide in the online phase: combining this optimal solution with Monte-Carlo simulations (called simulations henceforth), our algorithm makes the online decisions. The following theorem states our first results formally: LP (1) is a valid benchmark for OM-RR-KAD, and there exists an online algorithm, based on LP (1), achieving an online competitive ratio of $\frac{1}{2}-\epsilon$ for any given $\epsilon>0$.
Third, we show that our simple algorithm is nearly optimal among all non-adaptive algorithms: no non-adaptive algorithm can achieve a competitive ratio better than $\frac{1}{2}+o(1)$ when using LP (1) as the benchmark. Formally: no non-adaptive algorithm, based on benchmark LP (1), can achieve a competitive ratio better than $\frac{1}{2}+o(1)$ (here $o(1)$ is a vanishing term when both $T$ and the number of offline vertices are sufficiently large), even when all occupation times $C_e$ are constants.
Finally, through a data-driven analysis on a massive openly available dataset, we show that our model is robust enough to capture at least the setting of taxi hailing/sharing. Additionally, we provide simpler heuristics which also give good performance; these theoretically grounded algorithms can be combined with such heuristics to obtain further improved ratios in practice. Section 5 provides a detailed qualitative and quantitative discussion.
1.2 Other Related Work
In addition to the arrival assumptions of KIID and KAD, there are several other important, well-studied variants of online matching problems. Under adversarial ordering, an adversary can arrange the arrival order of all items in an arbitrary way (e.g., online matching [Karp, Vazirani, and Vazirani1990, Sun, Zhang, and Zhang2016] and AdWords [Buchbinder, Jain, and Naor2007, Mehta et al.2007]). Under a random arrival order, all items arrive in a random permutation order (e.g., online matching [Mahdian and Yan2011] and AdWords [Goel and Mehta2008]). Finally, under unknown distributions, in each round an item is sampled from a fixed but unknown distribution (e.g., [Devanur et al.2011]). For each of the above categories, we list only a few examples considered under that setting; for a more complete list, please refer to the survey by Mehta (2012).
Despite the fact that our model is inspired by online bipartite matching, it also overlaps with stochastic online scheduling problems (SOS) [Megow, Uetz, and Vredeveld2004, Megow, Uetz, and Vredeveld2006, Skutella, Sviridenko, and Uetz2016]. We first restate our model in the language of SOS: we have non-identical parallel machines and jobs; at every time-step $t$, a single job $v$ is sampled from $V$ with probability $p_{v,t}$; each job has to be assigned immediately upon arrival; additionally, each job can be processed non-preemptively on a specific subset of machines; once we assign $v$ to $u$, we get a profit of $w_e$ and $u$ will be occupied for $C_e$ rounds, where $C_e$ is a random variable with a known distribution. Observe that the key difference between our model and SOS is in the objective: the former is to maximize the expected profit from the completed jobs, while the latter is to minimize the total or the maximum completion time of all jobs.

2 Main Model
In this section, we present a formal statement of our main model. Suppose we have a bipartite graph $G=(U,V,E)$ where $U$ and $V$ represent the offline and online parties respectively. We have a finite time horizon $T$ (known beforehand) and, for each time $t \in [T]$, a vertex $v \in V$ is sampled (we use the term arrives) from a known probability distribution $\{p_{v,t}\}$ such that $\sum_{v\in V} p_{v,t} \le 1$ (thus, with probability $1-\sum_{v\in V} p_{v,t}$, none of the vertices from $V$ arrives at $t$); such a choice is made independently for each round $t$. The expected number of times $v$ arrives across the $T$ rounds, $\sum_{t\in[T]} p_{v,t}$, is called the arrival rate for vertex $v$. Once a vertex $v$ arrives, we need to make an irrevocable decision immediately: either reject $v$ or assign $v$ to one of its neighbors in $U$. For each $u$, once it is assigned to some $v$, it becomes unavailable for $C_e$ rounds (where $e=(u,v)$), and subsequently rejoins the system. Here $C_e$ is an integral random variable whose distribution is known in advance. Each assignment $e$ is associated with a weight $w_e$, and our goal is to design an online assignment policy such that the total expected weight of all assignments made is maximized. Following prior work, we assume that the arrival distributions and the occupation-time distributions are known to the algorithm in advance. Throughout this paper, we use edge $e=(u,v)$ and assignment of $v$ to $u$ interchangeably.

For an assignment $e=(u,v)$ and time $t$, let $x_{e,t}$ be the probability that $e$ is chosen at $t$ in an offline optimal algorithm. For each $u$ (likewise for $v$), let $E_u$ ($E_v$) be the set of edges incident to $u$ ($v$). We use LP (1) below as a benchmark to upper bound the offline optimal. We now interpret the constraints. For each round $t$, once an online vertex $v$ arrives, we can assign it to at most one of its neighbors. Thus we have: if $v$ arrives at $t$, the total number of assignments for $v$ at $t$ is at most $1$; if $v$ does not arrive, the total is $0$. The LHS of (2) is exactly the expected number of assignments made at $t$ for $v$; it should be no more than the probability that $v$ arrives at $t$, which is the RHS of (2). Constraint (3) is the most novel part of our formulation. Consider a given $u$ and $t$. On the LHS, the first term (a sum over $e \in E_u$ and $t' < t$) is the probability that $u$ is not available at $t$, while the second term (a sum over $e \in E_u$) is the probability that $u$ is assigned at $t$, which is no larger than the probability that $u$ is available at $t$. Thus the sum of the first and second terms on the LHS is no larger than $1$. (We note that our LP constraint (3) is inspired by Ma (2014); the proof is similar to those by Alaei, Hajiaghayi, and Liaghat (2012; 2013).) This argument implies that LP (1) forms a valid upper bound on the offline optimal solution, and hence we have Lemma 2.
maximize $\sum_{t\in[T]}\sum_{e\in E} w_e\, x_{e,t}$  (1)
subject to $\sum_{e\in E_v} x_{e,t} \le p_{v,t}, \quad \forall v\in V,\ t\in[T]$  (2)
$\sum_{e\in E_u}\sum_{t'<t} x_{e,t'}\Pr[C_e > t-t'] + \sum_{e\in E_u} x_{e,t} \le 1, \quad \forall u\in U,\ t\in[T]$  (3)
$0 \le x_{e,t} \le 1, \quad \forall e\in E,\ t\in[T]$  (4)
Lemma 2. The optimal value of LP (1) is a valid upper bound on the expected offline optimal.
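To make the benchmark concrete, here is a minimal sketch (ours, not from the paper: the toy instance, variable names, and the choice of SciPy as solver are all assumptions) that assembles LP (1) for a small instance with a deterministic occupation time and solves it.

```python
from itertools import product

import numpy as np
from scipy.optimize import linprog

# Toy instance: two servers, one online type arriving with prob 0.5
# each round, unit weights, deterministic occupation time of 2 rounds.
U, V, T = ["u1", "u2"], ["v1"], 4
edges = list(product(U, V))
p = {("v1", t): 0.5 for t in range(T)}
occupy = 2  # C_e = 2 deterministically

def pr_block(d):
    """Pr[C_e > d]: a match d rounds ago still blocks the server."""
    return 1.0 if d < occupy else 0.0

idx = {(e, t): i for i, (e, t) in enumerate(product(edges, range(T)))}
c = -np.ones(len(idx))  # linprog minimizes, so negate unit weights

A, b = [], []
for v, t in product(V, range(T)):  # constraint (2): per-arrival cap
    row = np.zeros(len(idx))
    for e in edges:
        if e[1] == v:
            row[idx[(e, t)]] = 1.0
    A.append(row); b.append(p[(v, t)])
for u, t in product(U, range(T)):  # constraint (3): availability of u at t
    row = np.zeros(len(idx))
    for e in edges:
        if e[0] != u:
            continue
        row[idx[(e, t)]] += 1.0
        for s in range(t):
            row[idx[(e, s)]] += pr_block(t - s)
    A.append(row); b.append(1.0)

res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
              bounds=(0.0, 1.0), method="highs")
lp_opt = -res.fun
```

On this toy instance the occupation constraints never bind (arrival rates are low), so the LP value equals the trivial bound $\sum_t p_{v,t} = 2$; constraint (3) starts to matter once arrival rates are large relative to occupation times.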
3 Simulationbased Algorithm
In this section, we present a simulation-based algorithm. Proofs for Lemmas 3, 4, and 5 can be found in the supplementary material. The main idea is as follows. Let $\{x^*_{e,t}\}$ denote an optimal solution to LP (1). Suppose we aim to develop an online algorithm achieving a ratio of $\gamma$. Consider an assignment $e=(u,v)$ when some $v$ arrives at time $t$. Let $\mathrm{SF}_{e,t}$ be the event that $e$ is safe at $t$, i.e., $u$ is available at $t$. By simulating the current strategy up to $t$, we can get an estimate of $\Pr[\mathrm{SF}_{e,t}]$, say $\beta_{e,t}$, within an arbitrarily small error. Therefore, in the case that $e$ is safe at $t$, we can sample it with probability $\gamma x^*_{e,t}/(p_{v,t}\,\beta_{e,t})$, which implies that $e$ is sampled with unconditional probability $\gamma x^*_{e,t}$. Hence, we call any algorithm that makes each assignment $e$ at each $t$ with unconditional probability $\gamma x^*_{e,t}$ valid.

The simulation-based attenuation technique has been used previously for other problems, such as stochastic knapsack [Ma2014] and stochastic matching [Adamczyk, Grandoni, and Mukherjee2015]. Throughout the analysis, we assume that we know the exact value of $\beta_{e,t}$ for all $e$ and $t$ (it is easy to see that the sampling error can be folded into the $\epsilon$ loss in the competitive ratio by standard Chernoff bounds, and ignoring it leads to a cleaner presentation). The formal statement of our algorithm is as follows. For each $v$ and $t$, let $S_{v,t}$ be the set of safe assignments for $v$ at $t$.
Lemma 3. The simulation-based algorithm is valid with $\gamma = \frac{1}{2}-\epsilon$.
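The attenuation step can be illustrated on a toy single-server instance (the instance, names, and the deterministic occupation time below are our assumptions, not the paper's): we compute the availability probabilities $\beta_t$ exactly via the recursion they satisfy under the attenuated policy, check validity ($\beta_t \ge \gamma$, so the attenuated probability is legal), and confirm the recursion with a Monte-Carlo replay, which is how the algorithm would estimate $\beta_t$ in general.

```python
import random

# Toy instance: one server, one online type arriving every round
# (p_t = 1), deterministic occupation time of 2 rounds, and a
# hypothetical LP solution x_star; GAMMA is the target ratio.
T = 6
OCCUPY = 2
x_star = [0.5] * T
GAMMA = 0.5

# Exact recursion: conditioned on being free at t, the attenuated
# policy matches with prob GAMMA * x_star[t] / beta[t], so each round's
# unconditional match probability is exactly GAMMA * x_star[t].
beta = []
for t in range(T):
    blocked = sum(GAMMA * x_star[s] for s in range(max(0, t - OCCUPY + 1), t))
    beta.append(1.0 - blocked)

# Validity: beta_t >= GAMMA makes GAMMA * x_star[t] / beta[t] <= 1.
assert all(b >= GAMMA - 1e-9 for b in beta)

def monte_carlo_beta(t, trials=200_000, seed=0):
    """Estimate Pr[server free at round t] by replaying rounds
    0..t-1 under the attenuated policy, as the simulations would."""
    rng = random.Random(seed)
    free = 0
    for _ in range(trials):
        busy_until = 0
        for s in range(t):
            if s >= busy_until and rng.random() < GAMMA * x_star[s] / beta[s]:
                busy_until = s + OCCUPY
        free += busy_until <= t
    return free / trials
```

The Monte-Carlo estimate agrees with the exact recursion up to sampling error, which is the error the analysis folds into the $\epsilon$ term.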
Extension from $C_e$ to $C_e(t)$. Consider the case when the occupation time of $u$ from $e$ is sensitive to $t$; in other words, each $u$ will be unavailable for $C_e(t)$ rounds following an assignment $e$ made at $t$. We can accommodate this extension by simply updating the constraints (3) on $\{x_{e,t}\}$ in the benchmark LP (1) to the following. We have that, for all $u \in U$ and $t \in [T]$,

$\sum_{e\in E_u}\sum_{t'<t} x_{e,t'}\Pr[C_e(t') > t-t'] + \sum_{e\in E_u} x_{e,t} \le 1.$  (5)

The rest of our algorithm remains the same as before. We can verify that (1) LP (1) with constraints (3) replaced by (5) is a valid benchmark, and (2) the algorithm achieves a competitive ratio of $\frac{1}{2}-\epsilon$ for any given $\epsilon>0$ in the new model based on the new valid benchmark LP. The modifications to the analysis transfer through in a straightforward way, and for brevity we omit the details here.
4 Hardness Result
Consider a complete bipartite graph $G=(U,V,E)$ where $|U|=|V|=K$. Suppose we have $T$ rounds and $p_{v,t}=1/K$ for each $v \in V$ and $t \in [T]$; in other words, in each round $t$, a vertex is sampled uniformly from $V$. For each $e$, let $C_e$ be the constant $K$, which implies that each $u$ will be unavailable for $K$ rounds after each assignment. Assume all assignments have a uniform weight ($w_e = 1$ for all $e$). Split the whole online process of $T$ rounds into $T/K$ consecutive windows of length $K$. The benchmark LP (1) then reduces to the following.

max $\sum_{t\in[T]}\sum_{e\in E} x_{e,t}$  (6)
s.t. $\sum_{e\in E_v} x_{e,t} \le 1/K, \quad \forall v\in V,\ t\in[T]$  (7)
$\sum_{e\in E_u}\sum_{t'=\max(1,\,t-K+1)}^{t} x_{e,t'} \le 1, \quad \forall u\in U,\ t\in[T]$  (8)
$x_{e,t}\ge 0, \quad \forall e\in E,\ t\in[T]$  (9)
We can verify that an optimal solution to the above LP is $x_{e,t}=1/K^2$ for all $e$ and $t$, with an optimal objective value of $T$. We now investigate the performance of an optimal non-adaptive algorithm. Notice that the expected number of arrivals of any $v$ in the full sequence of online arrivals is $T/K$. A non-adaptive algorithm $\mathcal{A}$ needs to specify, for each $v$, the allocation distribution used upon each arrival of $v$. Consider a given $\mathcal{A}$ parameterized by $\{\alpha_e\}$ such that $\sum_{e\in E_v}\alpha_e \le 1$ for each $v$; in other words, $\mathcal{A}$ assigns $v$ to $u$ with probability $\alpha_e$ (for $e=(u,v)$) when $v$ arrives and $u$ is available. Let $\gamma_u = \frac{1}{K}\sum_{e\in E_u}\alpha_e$, which is the probability that $u$ is matched in a given round if it is safe at the beginning of that round, when running $\mathcal{A}$. Hence, $\sum_{u\in U}\gamma_u \le 1$.
Consider a given $\mathcal{A}$ with $\{\gamma_u\}$ and let $\beta_{u,t}$ be the probability that $u$ is available at $t$. Then the expected number of matches of $u$ over the $T$ rounds is $\gamma_u\sum_{t\in[T]}\beta_{u,t}$. We have the recursive inequalities on $\{\beta_{u,t}\}$ stated in Lemma 4, with $\beta_{u,1}=1$.

Lemma 4. For all $u\in U$ and $t\in[T]$, we have
$\beta_{u,t} \ge 1-\gamma_u\sum_{t'=\max(1,\,t-K+1)}^{t-1}\beta_{u,t'},$
with equality for all $t>K$.
Note that the optimal value of our benchmark LP is $T$, while the performance of $\mathcal{A}$ is $\sum_{u\in U}\gamma_u\sum_{t\in[T]}\beta_{u,t}$. The resulting competitive ratio achieved by an optimal $\mathcal{A}$ is captured by the following maximization program:

maximize $\frac{1}{T}\sum_{u\in U}\gamma_u\sum_{t\in[T]}\beta_{u,t}$ subject to $\sum_{u\in U}\gamma_u\le 1$ and the recursion of Lemma 4.  (10)
We prove the following lemma, which implies the hardness result stated in Section 1.1.

Lemma 5. The optimal value of program (10) is at most $\frac{1}{2}+o(1)$ when $T$ and $K$ are sufficiently large.
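Under our reconstruction of this hard instance ($K$ symmetric servers, uniform arrivals, constant occupation time $K$, and the symmetric non-adaptive rule $\gamma_u = 1/K$; treat these specific constants as assumptions), the recursion of Lemma 4 can be iterated numerically to watch the ratio settle just above $1/2$:

```python
def availability_ratio(K, T):
    """Iterate beta_t = 1 - gamma * (sum of beta over the K-1
    preceding rounds, whose matches would still occupy the server),
    then return (performance of A) / (LP optimum, roughly T)."""
    gamma = 1.0 / K
    beta = []
    for t in range(T):
        beta.append(1.0 - gamma * sum(beta[max(0, t - (K - 1)):t]))
    # K symmetric servers, each matched w.p. gamma when safe.
    return K * gamma * sum(beta) / T

# The ratio decreases toward 1/2 as K grows (steady state K/(2K-1)).
print(availability_ratio(5, 100_000))   # ~ 5/9
print(availability_ratio(50, 100_000))  # ~ 50/99
```

The steady state of the recursion is $\beta = 1/(1+\gamma(K-1)) = K/(2K-1)$, matching the $\frac{1}{2}+o(1)$ bound of Lemma 5 as $K$ grows.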
Unconditional Hardness. Manshadi, Gharan, and Saberi (2012) prove that for the online matching problem under known distributions (but with disposable offline vertices), no algorithm can achieve a ratio better than $0.823$. Since our setting generalizes theirs, that hardness result directly applies to our problem as well.
5 Experiments
To validate the approaches presented in this paper, we use the New York City yellow cabs dataset (http://www.andresmh.com/nyctaxitrips/), which contains the trip records for trips in Manhattan, Brooklyn, and Queens. The dataset is split into months; for each month we have numerous records, each corresponding to a single trip. Each record has the following structure. We have an anonymized license number, which is the primary key corresponding to a car (for privacy purposes, a long string is used instead of the actual license number). We then have the time at which the trip was initiated, the time at which the trip ended, and the total time of the trip in seconds. This is followed by the starting coordinates (latitude and longitude) of the trip and the destination coordinates of the trip.
Assumptions. We make two assumptions specific to our experimental setup. First, we assume that every car starts and ends at the same location for all trips that it makes: we initially assign every car a location (potentially shared with other cars) which corresponds to its docking position; on receiving a request, the car leaves from this docking position to the point of pickup, executes the trip, and returns to this docking position. Second, we assume that the occupation time distributions (OTD) associated with all matches are identically and independently distributed. Note that this is a much stronger assumption than the one made in the model, and it is directly motivated by the dataset (see Section 5.2). We test our model on two specific distributions, namely a normal distribution and a power-law distribution (see Figure 6). The docking position of each car and the parameters associated with each distribution are learned from the training dataset (described below in the Training discussion).
5.1 Experimental Setup
For our experimental setup, we randomly select a set of cabs (each cab is denoted by $u$). We discretize the Manhattan map into cells of approximately equal size (fixed increments in degrees of latitude and longitude). For each pair of locations $(i,j)$, we create a request type $r$, which represents all trips whose starting and ending locations fall into $i$ and $j$ respectively; the request types play the role of $V$, with arrival distributions estimated from data (day-to-day variation in requests of a given type has low variance). We focus on the month of January. We split the records into parts, each corresponding to a day of January; we choose a random subset of parts for training purposes and use the remaining parts for testing.

The edge weight $w_{u,r}$ (on the edge from a car $u$ to a type $r$) is set as a function of two distances in our setup. The first is the trip distance (the distance from the starting location to the ending location of $r$, denoted $d_1$), while the second is the docking distance (the distance from the docking position of $u$ to the starting/ending location of $r$, denoted $d_2$). We set $w_{u,r} = d_1 - \alpha\, d_2$, where $\alpha$ is a parameter capturing the subtle balance between the positive contribution of the trip distance and the negative contribution of the docking distance to the final profit; $\alpha$ is fixed beforehand for the experiments. We consider each single day as the time horizon and set the total number of rounds $T$ by discretizing the 24-hour period into time-steps of five minutes. Throughout this section, we use time-step and round interchangeably.
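As an illustration of this setup, the discretization and the edge weight can be computed as below (the cell size and $\alpha$ here are placeholders, not the values used in the experiments):

```python
import math

CELL = 0.01  # grid-cell size in degrees (placeholder value)

def cell_of(lat, lon):
    """Map a raw coordinate to its discrete grid cell."""
    return (math.floor(lat / CELL), math.floor(lon / CELL))

def edge_weight(trip_dist, dock_dist, alpha=0.5):
    """w_{u,r} = d1 - alpha * d2: trip distance minus a penalty
    proportional to the docking distance (alpha is a placeholder)."""
    return trip_dist - alpha * dock_dist
```

A request type is then simply the pair (cell_of(pickup), cell_of(dropoff)) of grid cells.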
Training. We use the training dataset to learn various parameters. For the arrival rates: we count the total number of appearances of each request type $r$ at time-step $t$ in the training parts (denote it by $n_{r,t}$) and set $p_{r,t} = n_{r,t}/N$ under KAD, where $N$ is the number of training parts. When assuming KIID, we instead average over all time-steps (the arrival distributions are assumed the same across all time-steps for each $r$). The estimation of the parameters for the two occupation time distributions proceeds as follows. We first compute the average number of seconds between two requests in the dataset (note this was five minutes in the experimental setup) and assume that each time-step of our online process corresponds to a time difference of this average in seconds. We then compute the sample mean and sample variance of the trip lengths (as the number of seconds taken by the trip divided by five minutes) in the training parts. We use the normal distribution with this sample mean and standard deviation as the distribution of the time for which a car is unavailable. We assign the docking position of each car to the location (in the discretized space) in which the majority of the requests matched to this car were initiated (i.e., the starting locations of those requests).
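A sketch of this estimation pipeline (the tiny inline "dataset" and its field layout are ours, purely for illustration):

```python
from collections import Counter
import statistics

# Illustrative rows: (request_type, day, timestep, trip_seconds),
# with five-minute (300-second) timesteps.
records = [
    ("r1", 1, 10, 600), ("r1", 2, 10, 660),
    ("r2", 1, 10, 1500), ("r1", 1, 11, 540),
]
train_days = {1, 2}

# KAD arrival rates: empirical frequency of type r at timestep t,
# averaged over the training days.
counts = Counter((r, t) for r, d, t, _ in records if d in train_days)
p = {key: c / len(train_days) for key, c in counts.items()}

# Occupation-time distribution: trip length in timesteps, summarized
# by the sample mean / standard deviation for the normal fit.
lengths = [s / 300 for _, d, _, s in records if d in train_days]
mu = statistics.mean(lengths)
sigma = statistics.stdev(lengths)
```

The KIID variant would additionally average the `p` values across all timesteps for each request type.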
5.2 Justifying The Two Important Model Assumptions
Known Adversarial Distributions. Figure 6 plots the number of arrivals of a particular type at various times during the day. Notice the significant increase in the number of requests in the middle of the day as opposed to the mornings and nights. This justifies our arrival assumption of KAD, which allows different arrival distributions at different time-steps. Hence the LP (and the corresponding algorithm) can exploit this vast difference in arrival rates and potentially obtain improved results compared to the assumption of Known Identical Independent Distributions (KIID). This is confirmed by our experimental results in Section 5.3.
Identical Occupation Time Distribution. We assume each car becomes available again via an independent and identical random process regardless of the matches it received. The validity of this assumption can be seen in Figure 6, where the x-axis represents the occupation time and the y-axis represents the corresponding number of requests in the dataset responsible for each occupation time. It is clear that for most requests the occupation time is around 2-3 time-steps, dropping drastically beyond that with a long tail. Figure 6 also displays occupation times for two representative cars in the dataset (chosen at random from the many cars we plotted); we see that the distributions roughly coincide with each other, suggesting that such distributions can be learned from historical data and used as a guide for future matches.
5.3 Results
Inspired by the experimental setup of [Tong et al.2016a, Tong et al.2016b], we run five different algorithms on our dataset. The first algorithm is the LP-based non-adaptive algorithm: when a request of type $r$ arrives at time-step $t$, we choose a neighbor $u$ with probability $x^*_{e,t}/p_{r,t}$ if $u$ is available, where $\{x^*_{e,t}\}$ is an optimal solution to our benchmark LP and $p_{r,t}$ is the arrival rate of type $r$ at time-step $t$. The second algorithm is the LP-based adaptive algorithm. Recall that $S_{r,t}$ is the set of "safe" (available) assignments with respect to $r$ when type $r$ arrives at $t$; in this algorithm, we sample a safe assignment $e \in S_{r,t}$ with probability proportional to $x^*_{e,t}$. The next two algorithms are heuristics oblivious to the underlying LP. The third algorithm, Greedy, matches an arriving request to the safe neighbor with the highest edge weight. The fourth algorithm, Uniform, chooses one of the safe neighbors uniformly at random. Finally, we use a combination of the oblivious and LP-based algorithms: when a type $r$ arrives, with some probability we make the greedy choice and otherwise we make the LP-guided choice; the mixing probability was tuned on the training data. We summarize our results in the following plots. Figure 3 shows the performance of the five algorithms and the optimal value of the benchmark LP under the different assumptions on the OTD (normal or power law) and the online arrivals (KIID or KAD). The x-axis represents the test dataset number and the y-axis represents the average weight of the matching.
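The contrast between the two LP-oblivious heuristics can be sketched on a toy head-to-head (the two-server instance, the arrival pattern, and all numbers below are illustrative, not from the experiments):

```python
import random

# One request type; its edge weights to the two servers.
WEIGHTS = {"u1": 5.0, "u2": 1.0}

def dispatch(policy, T=1000, occupy=3, seed=7):
    """Serve one request every 5 rounds; a matched server is busy for
    `occupy` rounds afterwards. Returns the total collected weight."""
    rng = random.Random(seed)
    free_at = {"u1": 0, "u2": 0}
    total = 0.0
    for t in range(T):
        if t % 5 != 0:                      # sparse arrivals
            continue
        safe = [u for u, f in free_at.items() if f <= t]
        if not safe:
            continue
        if policy == "greedy":
            u = max(safe, key=WEIGHTS.get)  # highest-weight safe neighbor
        else:
            u = rng.choice(safe)            # uniform over safe neighbors
        total += WEIGHTS[u]
        free_at[u] = t + occupy
    return total
```

When requests are scarce, Greedy collects the heavy edge at every arrival, while Uniform leaves a large fraction of that weight on the table; when arrivals saturate the servers, the gap shrinks, which is where the LP-guided policies pay off instead.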
Discussion. From the figures, it is clear that both LP-based solutions do better than choosing a free neighbor uniformly at random. Additionally, with distributional assumptions, the LP-based solutions outperform the greedy algorithm as well. We would like to draw attention to a few interesting details in these results. First, compared to the optimal LP value, our LP-based algorithms achieve competitive ratios noticeably below $1$. We believe this is because of our experimental setup: the arrival rates are high in only a few time-steps, while in all other time-steps the rates are very close to $0$, so the instance resembles the structure of the theoretical worst-case example we gave in Section 4. In future experiments, running our algorithms during peak periods (where the request rates are significantly larger) may show competitive ratios approaching the theoretical guarantee. Second, it is notable that our algorithm is fairly robust to the actual distributional assumption we make: the difference between assuming a normal distribution versus a power-law distribution for the unavailability of cars is negligible. This is important since it might not be easy to learn the exact distribution in many cases (cases where the sample complexity is high), and it shows that a close approximation will still be as good.
6 Conclusion and Future Directions
In this work, we provide a model that captures the assignment problem in ride-sharing platforms. One key aspect of our model is the reusability of the offline resources. This helps in modeling many other important applications where agents enter and leave the system multiple times (organ allocation, crowdsourcing markets [Ho and Vaughan2012], and so on). Our work opens several important research directions. The first direction is to generalize the online model to the batch setting: in each round, we allow multiple arrivals from $V$. This assumption is useful in crowdsourcing markets (for example), where multiple tasks—but not all—become available at the same time. The second direction is to consider a Markov model on the driver starting position. In this work, we assumed that each driver returns to her docking position. However, in many ride-sharing systems, drivers start a new trip from the position of the last drop-off. This leads to a Markovian system on the offline types, as opposed to the static types assumed in the present work. Finally, pairing our current work with more applied stochastic optimization and reinforcement learning approaches would be of practical interest to policymakers running taxi and bike-share services [Singhvi et al.2015, O'Mahony and Shmoys2015, Lowalekar, Varakantham, and Jaillet2016, Verma et al.2017, Ghosh et al.2017].

References
 [Adamczyk, Grandoni, and Mukherjee2015] Adamczyk, M.; Grandoni, F.; and Mukherjee, J. 2015. Improved approximation algorithms for stochastic matching. In ESA15.
 [Alaei, Hajiaghayi, and Liaghat2012] Alaei, S.; Hajiaghayi, M.; and Liaghat, V. 2012. Online prophetinequality matching with applications to ad allocation. In EC12.
 [Alaei, Hajiaghayi, and Liaghat2013] Alaei, S.; Hajiaghayi, M.; and Liaghat, V. 2013. The online stochastic generalized assignment problem. In APPROXRANDOM13.
 [Bertsimas, Farias, and Trichakis2013] Bertsimas, D.; Farias, V. F.; and Trichakis, N. 2013. Fairness, efficiency, and flexibility in organ allocation for kidney transplantation. Operations Research 61(1).
 [Brubach et al.2016] Brubach, B.; Sankararaman, K. A.; Srinivasan, A.; and Xu, P. 2016. New algorithms, better bounds, and a novel model for online stochastic matching. In ESA16.
 [Buchbinder, Jain, and Naor2007] Buchbinder, N.; Jain, K.; and Naor, J. S. 2007. Online primaldual algorithms for maximizing adauctions revenue. In ESA07.
 [Devanur et al.2011] Devanur, N. R.; Jain, K.; Sivan, B.; and Wilkens, C. A. 2011. Near optimal online algorithms and fast approximation algorithms for resource allocation problems. In EC11.

 [Dickerson and Sandholm2015] Dickerson, J. P., and Sandholm, T. 2015. FutureMatch: Combining human value judgments and machine learning to match in dynamic environments. In AAAI15.
 [Feldman et al.2009] Feldman, J.; Mehta, A.; Mirrokni, V.; and Muthukrishnan, S. 2009. Online stochastic matching: Beating 1-1/e. In FOCS09.

 [Ghosh et al.2017] Ghosh, S.; Varakantham, P.; Adulyasak, Y.; and Jaillet, P. 2017. Dynamic repositioning to reduce lost demand in bike sharing systems. Journal of Artificial Intelligence Research (JAIR) 58:387–430.
 [Goel and Mehta2008] Goel, G., and Mehta, A. 2008. Online budgeted matching in random input models with applications to adwords. In SODA08.
 [Haeupler, Mirrokni, and Zadimoghaddam2011] Haeupler, B.; Mirrokni, V. S.; and Zadimoghaddam, M. 2011. Online stochastic weighted matching: Improved approximation algorithms. In WINE11.
 [Ho and Vaughan2012] Ho, C.J., and Vaughan, J. W. 2012. Online task assignment in crowdsourcing markets. In AAAI12.
 [Jaillet and Lu2013] Jaillet, P., and Lu, X. 2013. Online stochastic matching: New algorithms with better bounds. Mathematics of Operations Research 39(3).
 [Karp, Vazirani, and Vazirani1990] Karp, R. M.; Vazirani, U. V.; and Vazirani, V. V. 1990. An optimal algorithm for online bipartite matching. In STOC90.
 [Lee et al.2004] Lee, D.H.; Wang, H.; Cheu, R.; and Teo, S. 2004. Taxi dispatch system based on current demands and realtime traffic conditions. Transportation Research Record: Journal of the Transportation Research Board (1882).
 [Lowalekar, Varakantham, and Jaillet2016] Lowalekar, M.; Varakantham, P.; and Jaillet, P. 2016. Online spatiotemporal matching in stochastic and dynamic domains. In AAAI16.
 [Ma2014] Ma, W. 2014. Improvements and generalizations of stochastic knapsack and multiarmed bandit approximation algorithms. In SODA14.
 [Mahdian and Yan2011] Mahdian, M., and Yan, Q. 2011. Online bipartite matching with random arrivals: an approach based on strongly factorrevealing lps. In STOC11.
 [Manshadi, Gharan, and Saberi2012] Manshadi, V. H.; Gharan, S. O.; and Saberi, A. 2012. Online stochastic matching: Online actions based on offline statistics. Mathematics of Operations Research 37(4).
 [Mattei, Saffidine, and Walsh2017] Mattei, N.; Saffidine, A.; and Walsh, T. 2017. Mechanisms for online organ matching. In IJCAI17.
 [Megow, Uetz, and Vredeveld2004] Megow, N.; Uetz, M.; and Vredeveld, T. 2004. Stochastic online scheduling on parallel machines. In WAOA04.
 [Megow, Uetz, and Vredeveld2006] Megow, N.; Uetz, M.; and Vredeveld, T. 2006. Models and algorithms for stochastic online scheduling. Mathematics of Operations Research 31(3).
 [Mehta et al.2007] Mehta, A.; Saberi, A.; Vazirani, U.; and Vazirani, V. 2007. Adwords and generalized online matching. Journal of the ACM (JACM) 54(5).
 [Mehta2012] Mehta, A. 2012. Online matching and ad allocation. Foundations and Trends in Theoretical Computer Science 8(4).
 [Miller2008] Miller, M. 2008. Cloud computing: Webbased applications that change the way you work and collaborate online. Que publishing.
 [Neuen et al.2013] Neuen, B. L.; Taylor, G. E.; Demaio, A. R.; and Perkovic, V. 2013. Global kidney disease. The Lancet 382(9900).
 [O’Mahony and Shmoys2015] O’Mahony, E., and Shmoys, D. B. 2015. Data analysis and optimization for (citi) bike sharing. In AAAI15.
 [Saran et al.2015] Saran, R.; Li, Y.; Robinson, B.; Ayanian, J.; Balkrishnan, R.; BraggGresham, J.; Chen, J.; Cope, E.; Gipson, D.; He, K.; et al. 2015. US renal data system 2014 annual data report: Epidemiology of kidney disease in the United States. American Journal of Kidney Diseases 65(6 Suppl 1).
 [Seow, Dang, and Lee2010] Seow, K. T.; Dang, N. H.; and Lee, D.H. 2010. A collaborative multiagent taxidispatch system. IEEE Transactions on Automation Science and Engineering 7(3).
 [Singhvi et al.2015] Singhvi, D.; Singhvi, S.; Frazier, P. I.; Henderson, S. G.; O’Mahony, E.; Shmoys, D. B.; and Woodard, D. B. 2015. Predicting bike usage for new york city’s bike sharing system. In AAAI15 Workshop on Computational Sustainability.
 [Skutella, Sviridenko, and Uetz2016] Skutella, M.; Sviridenko, M.; and Uetz, M. 2016. Unrelated machine scheduling with stochastic processing times. Mathematics of operations research 41(3).

 [Sun, Zhang, and Zhang2016] Sun, X.; Zhang, J.; and Zhang, J. 2016. Near optimal algorithms for online weighted bipartite matching in adversary model. Journal of Combinatorial Optimization.
 [Tong et al.2016a] Tong, Y.; She, J.; Ding, B.; Chen, L.; Wo, T.; and Xu, K. 2016a. Online minimum matching in realtime spatial data: experiments and analysis. Proceedings of the VLDB Endowment 9(12).
 [Tong et al.2016b] Tong, Y.; She, J.; Ding, B.; Wang, L.; and Chen, L. 2016b. Online mobile microtask allocation in spatial crowdsourcing. In ICDE16.
 [Verma et al.2017] Verma, T.; Varakantham, P.; Kraus, S.; and Lau, H. C. 2017. Augmenting decisions of taxi drivers through reinforcement learning for improving revenues.
 [Yiu et al.2008] Yiu, M. L.; Mouratidis, K.; Mamoulis, N.; et al. 2008. Capacity constrained assignment in spatial databases. In SIGMOD08.
 [Younge et al.2010] Younge, A. J.; Von Laszewski, G.; Wang, L.; LopezAlarcon, S.; and Carithers, W. 2010. Efficient resource management for cloud computing environments. In International Green Computing Conference.
7 Supplementary Materials
7.1 Proof of Lemma 3
We show the claim by induction on $t$. When $t=1$, $\Pr[\mathrm{SF}_{e,1}]=1$ for all $e$, and we are done since each arriving $v$ samples $e\in E_v$ with probability $\gamma x^*_{e,1}/p_{v,1}$, i.e., each $e$ is made with unconditional probability $\gamma x^*_{e,1}$.

Assume the algorithm is valid for all rounds before $t$; in other words, each assignment $e'$ is made with probability $\gamma x^*_{e',t'}$ for all $t'<t$. Now consider a given $e=(u,v)$. Observe that $e$ is unsafe at $t$ iff $u$ was assigned some $e'\in E_u$ at a $t'<t$ such that the assignment makes $u$ unavailable at $t$. Therefore,
$\Pr[\overline{\mathrm{SF}_{e,t}}]=\sum_{e'\in E_u}\sum_{t'<t}\gamma\,x^*_{e',t'}\Pr[C_{e'}>t-t'].$

Thus, from constraint (3) in our benchmark LP, we see that $\Pr[\overline{\mathrm{SF}_{e,t}}]\le\gamma$, i.e., $\beta_{e,t}=\Pr[\mathrm{SF}_{e,t}]\ge 1-\gamma\ge\gamma$ since $\gamma\le 1/2$.

Thus we are done, since the attenuated sampling probability $\gamma x^*_{e,t}/(p_{v,t}\beta_{e,t})$ is at most $1$, and $p_{v,t}\cdot\beta_{e,t}\cdot\gamma x^*_{e,t}/(p_{v,t}\beta_{e,t})=\gamma x^*_{e,t}$.
7.2 Proof of Lemma 4
The inequality for $t=1$ is due to the fact that $u$ is safe at $t=1$ (so $\beta_{u,1}=1$). For each time $t$, let $\mathrm{SF}_{u,t}$ be the event that $u$ is safe at $t$ and $M_{u,t}$ be the event that $u$ is matched at $t$. Observe that within each window of $K$ consecutive time slots, the events $\{M_{u,t'}\}$ are disjoint, since a match occupies $u$ for $K$ rounds. Therefore,
$\Pr[\overline{\mathrm{SF}_{u,t}}]=\sum_{t'=\max(1,\,t-K+1)}^{t-1}\Pr[M_{u,t'}]=\gamma_u\sum_{t'=\max(1,\,t-K+1)}^{t-1}\beta_{u,t'},$
which yields the claimed recursion.
7.3 Proof of Lemma 5
Proof.
Focus on a given $u$. Notice that $\beta_{u,t}=1-\gamma_u\sum_{t'=t-K+1}^{t-1}\beta_{u,t'}$ for all $t>K$. Summing these equations over $t$, and noting that each $\beta_{u,t'}$ appears in at most $K-1$ of the inner sums while the boundary rounds contribute at most $K^2$ in total, we have
$\sum_{t\in[T]}\beta_{u,t}\le T-\gamma_u\Big((K-1)\sum_{t\in[T]}\beta_{u,t}-K^2\Big).$
Therefore we have
$\sum_{t\in[T]}\beta_{u,t}\le\frac{T+\gamma_u K^2}{1+\gamma_u(K-1)}.$
Define $f(\gamma)=\frac{\gamma}{1+(K-1)\gamma}$. From the above analysis, we have $\gamma_u\sum_{t\in[T]}\beta_{u,t}\le T f(\gamma_u)+K$. Thus the objective value in program (10) is at most $\sum_{u\in U}f(\gamma_u)+\frac{K^2}{T}$.

We claim that the optimal value of program (10) can therefore be upper bounded by the following maximization program:
$\max\ \sum_{u\in U}f(\gamma_u)+\frac{K^2}{T}\quad\text{s.t.}\quad\sum_{u\in U}\gamma_u\le 1.$

Under the assumption $T\gg K^2$, the second term can be ignored. For any $K$, $f$ is a concave function, which implies that the maximization of $\sum_u f(\gamma_u)$ subject to $\sum_u\gamma_u\le 1$ is achieved when all $\gamma_u=1/K$. The resulting value is $K\cdot\frac{1/K}{2-1/K}=\frac{1}{2}+o(1)$. Thus we are done. ∎