# Online Policies for Efficient Volunteer Crowdsourcing


## 1 Introduction

We develop two randomized policies that are based on ex ante fractional solutions computable in polynomial time. To assess the performance of our policies, we use a linear program benchmark whose optimal value serves as an upper bound on the value of a clairvoyant solution that knows the sequence of arrivals a priori as well as the state of volunteers at each time (see Program (LP), Proposition 3, and Definition 3). We remark that the platform's objective—maximizing the number of completed tasks—jointly depends on the responses of all volunteers and exhibits diminishing returns. For example, if the platform notifies two active volunteers $u$ and $v$ about a task $s$, then the probability of completion would be $1-(1-p_{u,s})(1-p_{v,s})$, where $p_{u,s}$ and $p_{v,s}$ are the match probabilities of the pairs $(u,s)$ and $(v,s)$, respectively. This objective function presents two challenges: (1) an ex ante solution based on upper bounding such an objective function by a piecewise-linear one can be ineffective in practice, and (2) jointly analyzing volunteers' contributions for an online policy while keeping track of the joint distribution of their states (active or inactive) is prohibitively difficult. We address the former challenge by computing ex ante solutions that "better" approximate the true objective function, as opposed to relying only on the LP solution (see Programs (AA) and (SQ-v), and Proposition 5). We overcome the latter by assuming an artificial priority among volunteers which allows us to decouple their contributions (see Definition 11 and Lemma 12). Attempting to follow the fractional ex ante solution directly can result in poor performance, since volunteers can become inactive at inopportune times (see Appendix 11.2). Therefore, in the design of our policies, we modify the ex ante solution to account for inactivity while guaranteeing a constant-factor competitive ratio. Our first policy, the Scaled-Down Notification (SDN) Policy, relies on computing, a priori, the probability that a volunteer is active when following this policy. Equipped with these probabilities, the SDN policy notifies each volunteer such that the joint probability that a volunteer is active and notified is proportional to the ex ante solution (see Algorithm 1 and the preceding discussion). On the other hand, our second policy, the Sparse Notification (SN) Policy, relies on solving a sequence of Dynamic Programs (DPs)—one for each volunteer—to resolve the trade-off between notifying a volunteer now and saving her for future tasks. We solve the DPs in order of volunteers' artificial priorities, and each subsequent DP is formulated based on the previous solutions (see Algorithm 2 and the preceding discussion). Our policies are parameterized by the minimum discrete hazard rate (MDHR) of the inter-activity time distribution, which serves as a sufficient condition for the level of "activeness" of volunteers (see Definition 5 and the following discussion). We analyze the competitive ratios of both policies as functions of the MDHR. Interestingly, both policies achieve the same competitive ratio (see Theorems 1 and 2). However, the SN policy demonstrates significantly better performance in practice (as shown and discussed in Section 6).
The analysis of both policies relies on decomposing the problem into individual contributions based on our (artificial) priority scheme. Further, the analysis of SDN relies on proving that the probability of being active can be computed in advance and in polynomial time (see Section 4.2 and Appendix 9.3). The analysis of SN crucially uses the dual-fitting framework of Alaei et al. (2012) and it relies on formulating a linear program along with its dual to place a lower bound on the optimal value of each volunteer’s DP (see Section 4.3 and Appendix 9.7). Upper Bound on Online Policies: In order to gain insight into the limitation of online policies when compared to our benchmark, we develop an upper bound on the achievable competitive ratio of any online policy. Like our policies, the upper bound is parameterized by the MDHR (see Theorem 5). As a consequence, the gap between the achievable upper bound and our lower bound (attained through our policies) depends on the MDHR (see Figure 2). When it is small but positive, the gap is fairly small; however, the gap grows as the MDHR increases. Our upper bound relies on analyzing two instances, one of which provides a relatively tight upper bound when the MDHR is small. Testing on FRUS Data:

In order to illustrate the effectiveness of our modeling approach and our policies in practice, we evaluate the performance of our policies by testing them on FRUS's data from different locations. In Section 6, we describe how we estimate model primitives and construct problem instances. Then we numerically show the superior performance of our policies when compared to strategies that resemble the current practice at different locations. The rest of the paper is organized as follows. In Section 2, we review the related literature. In Section 3, we formally introduce the online volunteer notification problem as well as the benchmark and the measure of competitive ratio. Section 4 is the main algorithmic section of the paper and is devoted to describing and analyzing our two online policies. In Section 5, we present our upper bound on the achievable competitive ratio of any online policy. In Section 6, we revisit the FRUS application and demonstrate the effectiveness of our policies by testing them on the platform's data from various locations. Section 7 concludes the paper. For the sake of brevity, we only include proof ideas in the main text. A detailed proof of each statement is provided in the referenced appendix.

## 3 Model

To capture the minimum rate at which volunteers transition from inactive to active, we define the minimum discrete hazard rate of the inter-activity time distribution as follows:

[Minimum Discrete Hazard Rate] For a probability distribution $g$, the minimum discrete hazard rate (MDHR) is given by $q := \min_{t} \frac{g(t)}{1-G(t-1)}$, where $G$ denotes the corresponding CDF. (By convention, if the fraction is $0/0$, we define it to be equal to $1$.)

Note that a large value of $q$ is a sufficient condition to ensure that volunteers' activity level is high. For example, if the inter-activity time distribution is geometric with parameter $q$, then the discrete hazard rate equals $q$ in every period.
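As a concrete illustration (ours, not part of the formal development), the MDHR can be computed directly from a probability mass function; the truncated-geometric pmf below is chosen purely for illustration and also shows that a geometric distribution has a constant hazard rate:

```python
def mdhr(g):
    """Minimum discrete hazard rate of a pmf over periods 1, 2, ....

    g: list with g[t-1] = P(inter-activity time = t).
    Returns min_t g(t) / (1 - G(t-1)); a 0/0 fraction is treated as 1
    by convention.
    """
    q, cdf = float("inf"), 0.0
    for gt in g:
        surv = 1.0 - cdf                      # 1 - G(t-1)
        hazard = 1.0 if surv <= 0.0 else gt / surv
        q = min(q, hazard)
        cdf += gt
    return q

# Geometric(0.3) truncated at T periods, with the remaining mass placed
# at period T+1: the hazard rate is 0.3 in every period (and 1 in the
# final catch-all period), so the MDHR is approximately 0.3.
T = 50
g_geo = [0.3 * 0.7 ** (t - 1) for t in range(1, T + 1)]
g_geo.append(1.0 - sum(g_geo))
print(mdhr(g_geo))
```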

$$1-\prod_{v\in U}\big(1-p_{v,s}\big) \ \le\ \min\Big\{\sum_{v\in U} p_{v,s},\ 1\Big\}$$

In words, we can upper bound the success probability of a subset of notified volunteers with a piecewise-linear function: the minimum of the expected total number of volunteer responses and $1$. Second, recall that the clairvoyant solution only notifies active volunteers and does not know how long notified volunteers will remain inactive. As a consequence, we can upper bound the clairvoyant solution via the following program, which we denote by (LP):
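This bound is easy to verify numerically; the following sketch samples random match probabilities (all values synthetic, for illustration only):

```python
import random

def success_prob(p):
    """1 - prod_v (1 - p_v): probability at least one notified volunteer responds."""
    out = 1.0
    for pv in p:
        out *= 1.0 - pv
    return 1.0 - out

def pl_upper_bound(p):
    """Piecewise-linear bound min{sum_v p_v, 1}."""
    return min(sum(p), 1.0)

random.seed(0)
for _ in range(1000):
    # random match probabilities for a random subset U of volunteers
    p = [random.random() for _ in range(random.randint(1, 6))]
    assert success_prob(p) <= pl_upper_bound(p) + 1e-12
print("piecewise-linear bound holds on all sampled subsets")
```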

$$\mathrm{LP}_{\mathcal{I}} = \max_{x}\ \sum_{t=1}^{T}\sum_{s=1}^{S} \lambda_{s,t}\,\min\Big\{\sum_{v=1}^{V} x_{v,s,t}\,p_{v,s},\ 1\Big\} \tag{LP}$$
$$\text{s.t.}\qquad 0 \le x_{v,s,t} \le 1 \qquad \forall\, v,s,t \tag{1}$$
$$\sum_{\tau=1}^{t}\sum_{s=1}^{S} \lambda_{s,\tau}\, x_{v,s,\tau}\,\big(1-G(t-\tau)\big) \le 1 \qquad \forall\, v,t \tag{2}$$

With a slight abuse of terminology, we refer to this program with a piecewise-linear objective as (LP) because it can be expressed as a linear program by adding a constraint which ensures the linearity of the objective function. The decision variables $x_{v,s,t}$ represent the probability of notifying volunteer $v$ when a task of type $s$ arrives at time $t$. Constraint (1) ensures that $x_{v,s,t}$ is a valid probability. Constraint (2) limits the frequency with which volunteers can be notified according to the inter-activity time distribution. In particular, the clairvoyant solution will only notify an active volunteer, who will then become inactive for a random number of periods. Thus, in expectation the clairvoyant solution must meet constraint (2). For ease of reference, in the following, we define the set of all feasible solutions to (LP); this definition proves helpful in the rest of the paper.

[Feasible Set] For any $x$, we write $x \in \mathcal{F}$ if and only if $x$ satisfies constraints (1) and (2).

The following proposition, which we prove in Appendix 8.1, establishes the relationship between the clairvoyant solution and $\mathrm{LP}_{\mathcal{I}}$: [Upper Bound on the Clairvoyant Solution] For any instance $\mathcal{I}$ of the online volunteer notification problem, $\mathrm{LP}_{\mathcal{I}}$ is an upper bound on its clairvoyant solution. In light of Proposition 3, we use $\mathrm{LP}_{\mathcal{I}}$ as a benchmark against which we compare the performance of any policy. Consequently, we define the competitive ratio of an online policy as follows: [Competitive Ratio] An online policy is $\alpha$-competitive for the online volunteer notification problem if for any instance $\mathcal{I}$, we have $\mathrm{ALG}_{\mathcal{I}} \ge \alpha \cdot \mathrm{LP}_{\mathcal{I}}$, where $\mathrm{ALG}_{\mathcal{I}}$ represents the expected number of tasks completed by the online policy for instance $\mathcal{I}$. We will use the competitive ratio to quantify the performance of an online policy. For each of our two policies (presented in the following section), the competitive ratio is parameterized by the MDHR, $q$, and it improves as $q$ increases.

## 4 Online Policies

In this section, we present and analyze two policies for the online volunteer notification problem. Both policies are randomized and rely on a fractional solution we compute ex ante using the instance primitives. Thus, we begin this section by introducing the ex ante solution in Section 4.1. We then proceed to describe our algorithms and analyze their competitive ratios in Sections 4.2 and 4.3.

### 4.1 Ex Ante Solution

As stated in Section 1, both of our online policies rely on an ex ante solution, which we denote by $x^*$. Given our benchmark, we focus our attention on solutions that are feasible in (LP), i.e., solutions in $\mathcal{F}$ (see Definition 3). Clearly, $x^*_{LP}$—the solution to (LP) in Section 3—is a potential ex ante solution. However, in practice, such a solution can prove ineffective because it does not take into account the diminishing returns of notifying an additional volunteer about a task. As a result, it may ignore some tasks while notifying an excessive number of volunteers about others (e.g., see Appendix 11.1). Given any $x \in \mathcal{F}$, suppose for a moment that volunteers are always active. Then, if we notify each volunteer independently according to $x$, the expected number of completed tasks would be (since a task can only be completed if one arrives, we limit all sums to task types indexed from $1$ to $S$):

$$f(x) := \sum_{t=1}^{T}\sum_{s=1}^{S} \lambda_{s,t}\Big(1-\prod_{v=1}^{V}\big(1-x_{v,s,t}\,p_{v,s}\big)\Big) \tag{3}$$
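In code, the objective in (3) can be evaluated directly; the tiny two-period instance below is synthetic and only illustrates the computation:

```python
def f(x, lam, p):
    """Expected number of completed tasks, eq. (3), assuming volunteers
    are always active and notified independently per x.

    x[t][s][v] : notification probabilities
    lam[t][s]  : arrival probabilities
    p[v][s]    : match probabilities
    """
    total = 0.0
    for t, lam_t in enumerate(lam):
        for s, ls in enumerate(lam_t):
            none_respond = 1.0
            for v, pv in enumerate(p):
                none_respond *= 1.0 - x[t][s][v] * pv[s]
            total += ls * (1.0 - none_respond)
    return total

# Tiny instance: T=2 periods, S=1 task type, V=2 volunteers.
lam = [[0.5], [1.0]]
p = [[0.4], [0.8]]
x_all = [[[1.0, 1.0]], [[1.0, 1.0]]]   # notify everyone, always
print(f(x_all, lam, p))
```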

Because $x^*_{LP}$ is the optimal solution of a program with a piecewise-linear objective, it ignores the submodularity in $f$. (We remark that we design our online policies such that they achieve a constant factor of $f$ as defined in (3).) In light of this intuition, we introduce two other candidates that can be computed in polynomial time. First, we aim to find the feasible point that maximizes $f$. We denote this optimization problem by (AA), which stands for Always Active. Even though (AA) is NP-hard (Bian et al. 2017), simple polynomial-time algorithms such as the variant of the Frank-Wolfe algorithm described below (proposed in Bian et al. (2017)) are known to work well in practice. The algorithm iteratively maximizes a linearization of $f$ and returns a convex combination of feasible solutions, which therefore must be feasible. We denote the output of this algorithm by $x^*_{AA}$ and use it as another candidate for the ex ante solution.

$$\max_{x\in\mathcal{F}}\ f(x) \tag{AA}$$

Approximating (AA) via the Frank-Wolfe variant with step size $1/K$: set $x^{(0)} = 0$; for $k$ from $1$ to $K$, solve $v^{(k)} \in \arg\max_{v\in\mathcal{F}}\,\langle \nabla f(x^{(k-1)}),\, v\rangle$ and set $x^{(k)} = x^{(k-1)} + \tfrac{1}{K}\, v^{(k)}$; return $x^*_{AA} = x^{(K)}$.

Note that the expected number of completed tasks, as defined in (3), jointly depends on the contributions of all volunteers. This property makes optimizing such an objective challenging. Further, when assessing any online policy, jointly analyzing volunteers' contributions while keeping track of the joint distribution of their states (active or inactive) is prohibitively difficult. (Recall that $f$ is the objective in the hypothetical case where all volunteers are always active.) We overcome this challenge by defining the following artificial priority scheme among volunteers, which enables us to "decouple" the contributions of volunteers and find our last candidate for the ex ante solution.

[Index-Based Priority Scheme] Under the index-based priority scheme, if multiple volunteers respond to a notification, the one with the smallest index completes the task. (Note that this priority scheme is without loss of generality, since in the online volunteer notification problem, all volunteers who respond to a notification become inactive for a duration drawn from an identical distribution.)

Following the index-based priority scheme allows us to define individual contributions for each volunteer, as shown in the following lemma (proven in Appendix 9.1). [Volunteer Priority-Based Contributions] For any $x$, $f(x) = \sum_{v=1}^{V} f_v(x)$, where $f$ is defined in (3) and
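A hedged sketch of this Frank-Wolfe variant: for simplicity, the feasible set below is only the box $0 \le x \le 1$ (the frequency constraints (2) are dropped), so the linearized subproblem has a coordinate-wise closed-form solution; with the full feasible set, that subproblem would be an LP.

```python
def grad_f(x, lam, p):
    """Closed-form gradient of the multilinear objective (3)."""
    T, S, V = len(lam), len(lam[0]), len(p)
    g = [[[0.0] * V for _ in range(S)] for _ in range(T)]
    for t in range(T):
        for s in range(S):
            for v in range(V):
                prod = 1.0
                for u in range(V):
                    if u != v:
                        prod *= 1.0 - x[t][s][u] * p[u][s]
                g[t][s][v] = lam[t][s] * p[v][s] * prod
    return g

def frank_wolfe(lam, p, K=50):
    """Frank-Wolfe variant with step size 1/K over the box [0,1]^(T*S*V).

    Each iteration adds (1/K) times a maximizer of the linearized
    objective, so the output is a convex combination of feasible points.
    """
    T, S, V = len(lam), len(lam[0]), len(p)
    x = [[[0.0] * V for _ in range(S)] for _ in range(T)]
    for _ in range(K):
        g = grad_f(x, lam, p)
        for t in range(T):
            for s in range(S):
                for v in range(V):
                    if g[t][s][v] > 0:   # box-constrained linear maximization
                        x[t][s][v] += 1.0 / K
    return x

# On a monotone instance with only box constraints, the iterates
# approach the all-ones solution.
lam = [[0.5], [1.0]]
p = [[0.4], [0.8]]
x_fw = frank_wolfe(lam, p)
```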

$$f_v(x) := \sum_{t=1}^{T}\sum_{s=1}^{S} \lambda_{s,t}\Big(\prod_{u<v}\big(1-x_{u,s,t}\,p_{u,s}\big)\Big)\, x_{v,s,t}\, p_{v,s} \tag{4}$$

For any $v$, the summand in (4) represents the probability that, under the index-based priority scheme, volunteer $v$ is the lowest-indexed volunteer to respond positively to a notification about a task of type $s$ at time $t$. Further, this term only depends on the fractional solution of volunteers with a lower index than $v$. In addition, if we treat $x_{u,s,t}$ as fixed for $u < v$, then $f_v$ is linear in $x_{v,s,t}$. In light of these observations, we define our last candidate as the solution of a sequence of linear programs in which volunteers maximize their individual contributions in the order of their priority. This is summarized in the program (SQ-v). For $v$ from $1$ to $V$:

$$\max_{\{x_{v,s,t}:\, s\in[S],\, t\in[T]\}}\ \sum_{t=1}^{T}\sum_{s=1}^{S}\lambda_{s,t}\Big(\prod_{u<v}\big(1-p_{u,s}\,x^{SQ}_{u,s,t}\big)\Big)\, p_{v,s}\, x_{v,s,t} \tag{SQ-v}$$
$$\text{subject to}\qquad 0 \le x_{v,s,t} \le 1 \qquad \forall\, s,t$$
$$\sum_{\tau=1}^{t}\sum_{s=1}^{S}\lambda_{s,\tau}\, x_{v,s,\tau}\,\big(1-G(t-\tau)\big) \le 1 \qquad \forall\, t$$

For a given volunteer $v$, the program (SQ-v) uses the solutions from previous iterations, i.e., $x^{SQ}_{u,s,t}$ for $u < v$. As a result, this solution takes into account the diminishing returns from notifying multiple volunteers. We denote the solution to these sequential LPs by $x^*_{SQ}$. Finally, we remark that the above decoupling idea proves helpful in both designing and analyzing our online policies. Having three candidates, we define

$$x^* := \arg\max_{x\in\{x^*_{LP},\ x^*_{AA},\ x^*_{SQ}\}} f(x) \tag{5}$$

The following proposition establishes a lower bound on $f(x^*)$ based on the benchmark $\mathrm{LP}_{\mathcal{I}}$. [Lower Bound on Ex Ante Solution] For $x^*$ defined in (5),

$$f(x^*) \ \ge\ \Big(1-\frac{1}{e}\Big)\,\mathrm{LP}_{\mathcal{I}}.$$

The above worst-case ratio is achieved by the ratio of $f(x^*_{LP})$ to $\mathrm{LP}_{\mathcal{I}}$, and it is tight. However, we stress that $x^*_{AA}$ and $x^*_{SQ}$ can provide significant improvements. A simple example illustrating this point can be found in Appendix 11.1, while a full proof of Proposition 5 can be found in Appendix 9.2. When testing our policies on FRUS data (as detailed in Section 6), we find that using $x^*$ instead of $x^*_{LP}$ results in an average improvement of 5%, up to a maximum of 23%. We conclude this section by noting that an online policy which directly follows $x^*$ (i.e., a policy that at time $t$, upon arrival of a task of type $s$, notifies each volunteer $v$ independently with probability $x^*_{v,s,t}$) does not achieve a good competitive ratio. This stems from the fact that $x^*$ "respects" the inactivity period of volunteers only in expectation. Consequently, it is possible that volunteers are inactive when high-value tasks arrive (e.g., tasks whose match probability is close to $1$) because they were notified earlier (according to $x^*$) for low-value tasks. We present an illustrative example in Appendix 11.2. Therefore, we develop two policies based on two different modifications of the ex ante solution: (1) properly scaling it down and (2) sparsifying it. The former guides our first policy, which we call the scaled-down notification policy, whereas the latter guides our second policy, referred to as the sparse notification policy. These policies are described and analyzed in the next two sections, respectively.
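The decoupling behind the sequential LPs, namely that the objective splits into per-volunteer contributions under the index-based priority scheme, can be sanity-checked numerically; the instance below is random and synthetic:

```python
import random

def f_total(x, lam, p):
    """Objective (3): expected completed tasks with always-active volunteers."""
    total = 0.0
    for t in range(len(lam)):
        for s in range(len(lam[0])):
            none_respond = 1.0
            for v in range(len(p)):
                none_respond *= 1.0 - x[t][s][v] * p[v][s]
            total += lam[t][s] * (1.0 - none_respond)
    return total

def f_v(x, lam, p, v):
    """Per-volunteer contribution (4): v completes the task only when
    no lower-indexed volunteer responds (index-based priority)."""
    total = 0.0
    for t in range(len(lam)):
        for s in range(len(lam[0])):
            no_higher = 1.0
            for u in range(v):
                no_higher *= 1.0 - x[t][s][u] * p[u][s]
            total += lam[t][s] * no_higher * x[t][s][v] * p[v][s]
    return total

random.seed(1)
T, S, V = 3, 2, 4
lam = [[random.uniform(0, 0.5) for _ in range(S)] for _ in range(T)]
p = [[random.random() for _ in range(S)] for _ in range(V)]
x = [[[random.random() for _ in range(V)] for _ in range(S)] for _ in range(T)]
# telescoping identity: 1 - prod(1-a_v) = sum_v prod_{u<v}(1-a_u) * a_v
assert abs(f_total(x, lam, p) - sum(f_v(x, lam, p, v) for v in range(V))) < 1e-9
```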

### 4.2 The Scaled-Down Notification Policy

In this section, we present our scaled-down notification (SDN) policy, a non-adaptive randomized policy that independently notifies volunteers according to a predetermined set of probabilities based on $x^*$. (Some of the ideas used in our SDN policy are similar to the adaptive algorithm of Dickerson et al. (2018).) The policy relies on the following ideas: (1) Fixing a policy, suppose we can compute the ex ante probability that any volunteer $v$ is active at time $t$ when following that policy. Let us denote such an ex ante probability by $\beta_{v,t}$. Then, if a task of type $s$ arrives at time $t$, we notify $v$ with probability $\alpha\, x^*_{v,s,t}/\beta_{v,t}$ for a scaling factor $\alpha$. As a result, she will be active and notified with probability $\alpha\, x^*_{v,s,t}$. (2) If she were the only notified volunteer, then her probability of completing this task would be simply $\alpha\, x^*_{v,s,t}\, p_{v,s}$. Even though this is not the case, using the index-based priority scheme and the contribution-decoupling idea in Lemma 12, we can show her contribution will be proportional to this quantity. (3) Consequently, we would like to set $\alpha$ as large as possible. However, $\alpha\, x^*_{v,s,t}/\beta_{v,t}$ cannot be larger than $1$ since notification probabilities cannot exceed $1$. Thus, in the design of the policy, we find the largest feasible $\alpha$, which we prove to be a function of $q$, the MDHR of the inter-activity time distribution (see Definition 5). The formal definition of our policy is presented in Algorithm 1. In the rest of this section, we analyze the competitive ratio of the SDN policy. Our main result is the following theorem:

### 4.3 The Sparse Notification Policy

In this section, we present our second policy, the sparse notification (SN) policy, which relies on a different modification of the ex ante solution. Before describing the policy, we briefly discuss our motivation for designing a second policy. Though simple and intuitive, the SDN policy only relies on the ex ante solution to resolve the trade-off between the immediate reward of notifying a volunteer and saving her for a future arrival. To see this, note that even in the last period $T$, the SDN policy follows a scaled-down version of $x^*$. To more accurately resolve this trade-off, in designing the SN policy, we utilize the ex ante solution and the index-based priority scheme (see Definition 11) to formulate a sequence of one-dimensional DPs whose optimal values serve as lower bounds on the contribution of each volunteer according to her priority (as shown in Lemma 2). The solution of these DPs is a sparsified version of the ex ante solution $x^*$. Namely, let $\tilde{x}$ denote the solution of the sequence of DPs. For any $v$, $s$, and $t$, $\tilde{x}_{v,s,t}$ is either $x^*_{v,s,t}$ or $0$. Equipped with $\tilde{x}$, which we compute in advance, the SN policy simply follows $\tilde{x}$ in the online phase. Our DP formulation and its analysis follow the framework developed in Alaei et al. (2012) and Alaei (2014), which is also used in Feng et al. (2019). Next we describe the DP formulation. Consider volunteer $v$ and suppose we have already solved the first $v-1$ DPs, so that $\tilde{x}_{u,s,t}$ is available for all $u < v$. Let us denote the value-to-go of volunteer $v$'s DP at time $t$ by $J_{v,t}$. Clearly $J_{v,T+1} = 0$. We set $v$'s reward at time $t$ for task $s$ to be

$$r_{v,s,t} := p_{v,s}\prod_{u=1}^{v-1}\big(1-\tilde{x}_{u,s,t}\,p_{u,s}\big) \tag{6}$$

(We emphasize that this is not the actual reward, i.e., it is not the probability that volunteer $v$ completes task $s$ under the index-based priority scheme. However, it is a lower bound, as shown in the proof of Lemma 2.)

The actions available when a task of type $s$ arrives at time $t$ are to notify $v$ with probability $x^*_{v,s,t}$ or to not notify $v$. Thus, when deciding on the optimal action (which can be either $\tilde{x}_{v,s,t} = x^*_{v,s,t}$ or $\tilde{x}_{v,s,t} = 0$), we compare the (current and future) reward of notifying $v$ now to the reward of saving her for the next period. Formally,

$$\tilde{x}_{v,s,t} = x^*_{v,s,t}\cdot\mathbb{I}\Big(r_{v,s,t} + \sum_{\tau=t+1}^{T} g(\tau-t)\,J_{v,\tau}\ \ge\ J_{v,t+1}\Big) \tag{7}$$

The term on the left-hand side within the indicator is the reward of notifying $v$ in the current period $t$, which consists of two parts: (1) the immediate reward we get from notifying $v$ (which will make her inactive for a random number of periods), and (2) the future reward once she becomes active again. The right-hand side within the indicator simply represents the reward when $v$ is not notified and remains active in period $t+1$. Given (6), (7), and $J_{v,T+1} = 0$, we can iteratively compute $J_{v,t}$ as follows:

$$J_{v,t} = \sum_{s=1}^{S}\lambda_{s,t}\Big(\big(1-\tilde{x}_{v,s,t}\big)\,J_{v,t+1} + \tilde{x}_{v,s,t}\Big(r_{v,s,t} + \sum_{\tau=t+1}^{T} g(\tau-t)\,J_{v,\tau}\Big)\Big) \tag{8}$$
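The backward recursion (6)–(8) can be sketched as follows; the usage instance is synthetic, and for simplicity we assume exactly one task arrives per period ($\sum_s \lambda_{s,t} = 1$), which matches the form of (8):

```python
def solve_dp(v, lam, p, x_star, x_tilde_prev, g):
    """Backward recursion (6)-(8) for volunteer v's DP.

    lam[t][s]       : arrival probabilities (assumed to sum to 1 over s)
    p[u][s]         : match probabilities
    x_star[t][s][u] : ex ante solution
    x_tilde_prev[t][s][u], u < v : sparsified solutions of earlier DPs
    g[d-1]          : P(inter-activity time = d)
    Returns the sparsified solution for v and the value-to-go list J
    (J[t] for t = 1..T, with J[T+1] = 0).
    """
    T, S = len(lam), len(lam[0])
    J = [0.0] * (T + 2)
    x_tilde_v = [[0.0] * S for _ in range(T)]
    for t in range(T, 0, -1):
        Jt = 0.0
        for s in range(S):
            # eq. (6): reward discounted by higher-priority responses
            r = p[v][s]
            for u in range(v):
                r *= 1.0 - x_tilde_prev[t - 1][s][u] * p[u][s]
            # expected value-to-go after reactivation
            future = sum(g[d - 1] * J[t + d]
                         for d in range(1, T - t + 1) if d - 1 < len(g))
            # eq. (7): keep x* only if notifying beats saving the volunteer
            xt = x_star[t - 1][s][v] if r + future >= J[t + 1] else 0.0
            x_tilde_v[t - 1][s] = xt
            # eq. (8)
            Jt += lam[t - 1][s] * ((1.0 - xt) * J[t + 1] + xt * (r + future))
        J[t] = Jt
    return x_tilde_v, J

# Single volunteer (v = 0), one task type arriving every period, and a
# deterministic 2-period inter-activity time: she can complete tasks
# in periods 1 and 3, for an expected total of 0.7 + 0.7.
T, S = 4, 1
lam = [[1.0] for _ in range(T)]
p = [[0.7]]
x_star = [[[1.0]] for _ in range(T)]
x_prev = [[[] for _ in range(S)] for _ in range(T)]  # no higher-priority volunteers
g = [0.0, 1.0]
x_tilde, J = solve_dp(0, lam, p, x_star, x_prev, g)
print(J[1])
```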

The formal definition of our policy is presented in Algorithm 2. In the rest of this section, we analyze the competitive ratio of the SN policy. Our main result is the following theorem:

[Competitive Ratio of the Sparse Notification Policy] Suppose the MDHR of the inter-activity time distribution is $q$. Then the sparse notification policy, defined in Algorithm 2, is $\alpha(q)$-competitive, for the same function $\alpha(q)$ as in Theorem 1. A few remarks are in order: (1) The competitive ratios of our two policies are identical, implying that in the worst case they guarantee the same performance. However, for practical instances, the SN policy performs significantly better (as shown by our test results in Section 6). Intuitively, this is because the design of the SN policy explicitly aims to optimally resolve the trade-off between notifying a volunteer now and keeping her active for later based on $x^*$. On the other hand, the design of the SDN policy only aims to proportionally follow $x^*$. As a result, the SDN policy's numerical performance is not substantially better than its worst-case guarantee, whereas the SN policy can perform much better than its worst-case guarantee (see Appendix 11.3 for an illustrative example). (2) Similar to the SDN policy, the competitive ratio of the SN policy improves as $q$ increases. However, the design of the SN policy does not directly make use of $q$. The proof of Theorem 2 consists of two main lemmas. First, in the following lemma, we lower bound the contribution of each volunteer $v$ by $J_{v,1}$: [Volunteer Priority-Based Contribution under the SN Policy] Under the index-based priority scheme (in Definition 11) and the SN policy, the contribution of volunteer $v$, i.e., the expected number of tasks she completes, is at least $J_{v,1}$, where $J_{v,t}$ is defined in (8). The proof of this lemma follows from the DP formulation as well as the observation that for any $v$, $s$, and $t$, the probability that a higher-priority volunteer completes the task is suitably upper bounded. A full proof can be found in Appendix 9.6. The second main step of the proof is to compare $J_{v,1}$ to the benchmark $\mathrm{LP}_{\mathcal{I}}$. In order to do so, we follow the dual-fitting approach of Alaei et al. (2012).
In particular, given the inter-activity time distribution, we set up a linear program to find the "worst" possible combination of per-stage rewards that gives rise to the minimum possible value of $J_{v,1}$. Finding the optimal solution to this LP proves difficult. Instead, we find a feasible solution to its dual, which enables us to lower bound $J_{v,1}$. The LP and its dual are presented in Table 1. In the LP formulation, the first two sets of constraints follow from the DP definition. Note that the value of $J_{v,1}$ crucially depends on the per-stage rewards, e.g., if $r_{v,s,t} = 0$ for all $v$, $s$, and $t$, then $J_{v,1} = 0$. This motivates the final constraint, which provides a constant against which we can compare $J_{v,1}$. The following lemma establishes a lower bound on $J_{v,1}$.

[Lower Bounding the Dynamic Program] Under the index-based priority scheme (see Definition 11), for any $x^* \in \mathcal{F}$ and volunteer $v$, $J_{v,1}$ is bounded below by a constant fraction of $f_v(x^*)$, where $f_v$ is defined in (4). The proof of Lemma 1 (presented in Appendix 9.7) amounts to confirming that a suitable assignment of the dual variables, under which the constraints hold with equality, is a feasible solution to (Dual). Given Lemma 1, we complete the proof of Theorem 2 by applying Lemma 12 and Proposition 5. The complete proof is presented in Appendix 9.8.

## 5 Upper Bound on Competitive Ratio

In this section, we provide an upper bound on the competitive ratio of any online policy for the online volunteer notification problem. Like the lower bound achieved by our policies in Section 4, the upper bound is parameterized by the MDHR of the inter-activity time distribution, $q$. The main result of this section is the following theorem: [Upper Bound on Achievable Competitive Ratio] Suppose the MDHR of the inter-activity time distribution is $q$, where $q > 0$. Then no online algorithm can achieve a competitive ratio greater than $\kappa$, where

$$\kappa = \min\left\{\frac{1}{2-q},\ \ \frac{1+q-q(1-q)\log\!\left(\frac{1}{1-q}\right)}{(1+q)\left(1-e^{-1}\right)}\right\} \tag{9}$$

(We remark that the condition imposed on $q$ is added for ease of presentation of the theorem statement as well as its proof. Relaxing the condition amounts to modifying the second term in $\kappa$ by rounding $q$ up appropriately and slightly modifying the instance in the proof. We omit these details for the sake of brevity.) Figure 2 provides a summary of our lower and upper bounds on the achievable competitive ratio for the online volunteer notification problem as a function of $q$. We make the following observations based on the theorem and accompanying plot: (1) the upper bound applies to all policies, even those that cannot be computed in polynomial time; (2) both the upper and lower bounds improve as $q$ increases; and (3) the competitive ratios of our online policies are fairly close to the upper bound when $q$ is small but positive. However, the gap grows for larger values of $q$.
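Equation (9) is straightforward to evaluate numerically; a minimal sketch:

```python
import math

def kappa(q):
    """Upper bound (9) on the achievable competitive ratio, for 0 < q < 1."""
    term1 = 1.0 / (2.0 - q)
    term2 = (1.0 + q - q * (1.0 - q) * math.log(1.0 / (1.0 - q))) \
            / ((1.0 + q) * (1.0 - math.exp(-1.0)))
    return min(term1, term2)

# The bound improves as the MDHR q increases.
for q in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(q, round(kappa(q), 4))
```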

The proof of Theorem 5 relies on analyzing the following two instances, each giving one of the terms in the definition of $\kappa$ as shown in (9). Instance 1 attains the minimum when $q$ is small, whereas Instance 2 attains it when $q$ is large.

Instance 1: Suppose , , , and , where . The arrival probabilities are given by and , where . The volunteer match probabilities are given by and . The left panel of Figure 3 visualizes Instance 1. The following lemma, which we prove in Appendix 10.1, states that no online policy can complete more than a fraction of $\mathrm{LP}_{\mathcal{I}}$. [Upper Bound for Instance 1] In Instance 1, the expected number of completed tasks under any online policy is at most . Before proceeding to the second instance, we make two remarks: (1) The above instance is equivalent to the canonical instance used in the prophet inequality literature to establish an upper bound of $1/2$ (see, e.g., Hill and Kertz (1992)). (2) The $(1-1/e)$ term in the competitive ratio of both policies corresponds to the gap between $f$ (defined in (3)) and the benchmark $\mathrm{LP}_{\mathcal{I}}$, whereas the remaining factor corresponds to the gap between the performance of our online policy and $f(x^*)$ due to the loss in the online phase. In Instance 1, there is only one volunteer and consequently $f(x^*) = \mathrm{LP}_{\mathcal{I}}$. Therefore, Instance 1 shows that the lower bound achieved in the online phase of our policies is tight, as they both attain at least the corresponding fraction of $f(x^*)$. The construction of our second instance is more delicate, as it aims to find an instance for which both the loss in the offline phase (i.e., the gap between $f(x^*)$ and $\mathrm{LP}_{\mathcal{I}}$) and the loss in the online phase (i.e., the gap between the performance of the online policy and $f(x^*)$) are large.

Instance 2: Suppose , , , and the inter-activity time distribution is geometric with parameter $q$, i.e., $g(t) = q(1-q)^{t-1}$. The arrival probabilities are given by and for . The volunteers are homogeneous, with identical match probabilities. The right panel of Figure 3 visualizes Instance 2. The following lemma, which is proven in Appendix 10.2, states that no online policy can complete more than a fraction of $\mathrm{LP}_{\mathcal{I}}$.
[Upper Bound for Instance 2] In Instance 2, the expected number of completed tasks under any online policy is at most . The proof of this lemma involves three steps: (1) placing a lower bound on $\mathrm{LP}_{\mathcal{I}}$ by finding a feasible solution, (2) establishing that always notifying every volunteer is the best online policy, and (3) assessing the performance of this policy relative to $\mathrm{LP}_{\mathcal{I}}$. A full proof can be found in Appendix 10.2.

## 6 Evaluating Policy Performance on FRUS Data

In this section, we use data from FRUS to evaluate the performance of the two online policies described in Section 4. First, we briefly explain how we use the data to determine the model primitives. Then we exhibit the superior performance of our policies compared to policies that resemble the strategies used at various FRUS locations. Estimating model primitives: As explained in Section 3, in order to define an instance of the online volunteer notification problem, we must determine the match probabilities $p_{v,s}$; the arrival rates of tasks $\lambda_{s,t}$; and the inter-activity time distribution $g$.

• Match probabilities: As evidenced in Figure 1, volunteer preferences over tasks are heterogeneous and predictable. To come up with estimates of $p_{v,s}$ for each FRUS location, we first create a feature vector for each task. We then build a $k$-Nearest Neighbors classification model, tuning the parameter $k$ using cross-validation. The AUCs of such classification models range between 0.89 and 0.95 across tested locations.

• Arrival rates: Recall that for FRUS, a task is a food rescue (donation) that remains available on the day of delivery. Most food rescues are repeated on a weekly cycle; therefore, we define a type for each recurring rescue. Empirically, we observe a relationship between the last-minute availability of a rescue of type $s$ and its status over the past six weeks (the correlation coefficient is between and across all tested locations). Therefore, we estimate $\lambda_{s,t}$ as the proportion of times in the past six weeks that a rescue of type $s$ was a last-minute availability.

• Inter-activity time distribution: At FRUS, many site directors follow a policy of waiting at least a week before notifying the same volunteer about another last-minute food rescue. Consequently, we assume the inter-activity time is deterministic and equal to seven days, i.e., $g(7) = 1$.

In the following, we compare the performance of our online policies to strategies that simulate the current practice at various FRUS locations, using instances constructed with data from two different locations as described above. First, we compare our policies against 'notify-1' and 'notify-3' policies that, respectively, notify one and three volunteer(s) chosen uniformly at random among "eligible" volunteers. Note that here a volunteer is eligible if she has not been notified for at least 6 days. The top panels of Figure 4 display the ratio between the performance of each policy and $\mathrm{LP}_{\mathcal{I}}$ across 50 simulations. We highlight that the SN policy significantly outperforms all other policies. Further note that the SN policy's performance far exceeds its competitive ratio, as given in Theorem 2, while the SDN policy performs only slightly above its competitive ratio. (Part of why our policies outperform their competitive ratios is that in the FRUS locations studied, using $x^*$ as an ex ante solution improves on using $x^*_{LP}$ by an average of 5%, up to a maximum of 23%.) Next, we compare our policies against a 'notify-all' policy that sends a notification to all volunteers. This policy clearly does not respect the 7-day gap between two successive notifications. Therefore, here we assume that the inter-activity time distribution is geometric with an expected duration of 7 days. The bottom panels of Figure 4 display the ratio between the performance of each policy and $\mathrm{LP}_{\mathcal{I}}$ across 50 simulations. Here, we also observe that the SN policy significantly outperforms all other policies as well as its worst-case guarantee.

## 7 Conclusion

In this paper, we take an algorithmic approach to a commonly faced challenge on volunteer-based crowdsourcing platforms: how to utilize volunteers for time-sensitive tasks at the "right" pace while maximizing the number of completed tasks. We introduce the online volunteer notification problem to model volunteer behavior as well as the trade-off that the platform faces in this online decision-making process. We develop two online policies that achieve constant-factor guarantees parameterized by the MDHR of the volunteer inter-activity time distribution, which gives insight into the impact of volunteers' activity level. The guarantees provided by our policies are close to the upper bound we establish for the performance of any online policy. In this paper, we measure the performance of an online policy by comparing it to an LP-based benchmark which upper bounds a clairvoyant solution. From a theoretical perspective, considering other (perhaps less strong) benchmarks is an interesting future direction. This work is motivated by our collaboration with FRUS, a leading volunteer-based food recovery platform, analysis of whose data confirms that, by and large, volunteers have persistent preferences. Leveraging historical data, we estimate the match probability between volunteer-task pairs as well as the arrival rate of tasks. This enables us to test our policies on FRUS data from different locations and illustrate their effectiveness compared to common practice. From an applied perspective, studying the robustness of our policies as well as developing decision tools that can be integrated with the FRUS app are immediate next steps that we plan to pursue. Finding other platforms that can benefit from our work is another direction for future work.


## 8 Proofs for Section 3

### 8.1 Proof of Proposition 3

To show that the optimal value of (LP) is an upper bound on the clairvoyant solution, we will construct a feasible solution to (LP) based on the clairvoyant solution. We will then prove that the value of this solution is an upper bound on the value of the clairvoyant solution. Let us define the random realizations of inter-activity times as $\vec{Z} = (z_{v,t})_{v,t}$, where $z_{v,t}$ is the inter-activity time of volunteer $v$ if notified at time $t$. In addition, we denote the random arrival sequence as $\vec{S} = (s_t)_t$, where $s_t$ is the arrival at time $t$. Finally, suppose we have an indicator variable $\omega_{v,t}(\vec{s},\vec{z})$, which is equal to one if and only if the clairvoyant solution contacts volunteer $v$ at time $t$ when the arrival order is given by $\vec{s}$ and the inter-activity times are given by $\vec{z}$. Because the clairvoyant solution does not know $z_{v,t}$ until after time $t$, $\omega_{v,t}(\vec{s},\vec{z})$ cannot depend on $z_{v,t'}$ for $t' \ge t$. For any volunteer $v$, task $j$, and time $t$, we define

$$\hat{x}_{v,j,t} \;=\; \sum_{\vec{s}\in\vec{S}} \sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{S}=\vec{s} \mid s_t = j)\,\mathbb{P}(\vec{Z}=\vec{z})\,\omega_{v,t}(\vec{s},\vec{z}).$$

To show that $\hat{x}$ is feasible for (LP) (see Definition 3), we immediately note that $0 \le \hat{x}_{v,j,t} \le 1$, since we are summing indicator variables over probability distributions. We now need to show that the notification constraint is met, namely that $\sum_{t'=1}^{t}\sum_{j=1}^{S} \lambda_{j,t'}\,\hat{x}_{v,j,t'}\,(1-G(t-t')) \le 1$ for every volunteer $v$ and time $t$. Note that for a given sequence of arrivals $\vec{s}$ and inter-activity times $\vec{z}$, we must have

$$1 \;\ge\; \sum_{t'=1}^{t} \omega_{v,t'}(\vec{s},\vec{z})\,\mathbb{I}(z_{v,t'} > t-t'). \tag{10}$$

This is because both $\omega_{v,t'}(\vec{s},\vec{z})$ and $\mathbb{I}(z_{v,t'} > t-t')$ are indicator variables, and if both equal one at time $t'$, then the volunteer must be inactive until after time $t$. Since the clairvoyant solution only notifies active volunteers, if volunteer $v$ is inactive from $t'$ until after $t$, then $\omega_{v,t''}(\vec{s},\vec{z}) = 0$ for all $t'' \in (t', t]$. Thus, the sum from $t'=1$ to $t$ of the product of these indicator variables cannot exceed one. We now take a weighted sum over all possible arrival sequences and inter-activity times:

$$\begin{aligned}
1 \;\ge\;& \sum_{t'=1}^{t}\sum_{\vec{s}\in\vec{S}} \mathbb{P}(\vec{S}=\vec{s}) \sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{Z}=\vec{z})\,\omega_{v,t'}(\vec{s},\vec{z})\,\mathbb{I}(z_{v,t'}>t-t') && (11)\\
=\;& \sum_{t'=1}^{t}\sum_{\vec{s}\in\vec{S}} \mathbb{P}(\vec{S}=\vec{s}) \Bigl(\sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{Z}=\vec{z})\,\omega_{v,t'}(\vec{s},\vec{z})\Bigr) \Bigl(\sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{Z}=\vec{z})\,\mathbb{I}(z_{v,t'}>t-t')\Bigr) && (12)\\
=\;& \sum_{t'=1}^{t}\sum_{\vec{s}\in\vec{S}} \mathbb{P}(\vec{S}=\vec{s}) \Bigl(\sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{Z}=\vec{z})\,\omega_{v,t'}(\vec{s},\vec{z})\Bigr) \bigl(1-G(t-t')\bigr) && (13)\\
=\;& \sum_{t'=1}^{t}\sum_{j=1}^{S} \lambda_{j,t'} \sum_{\vec{s}\in\vec{S}} \mathbb{P}(\vec{S}=\vec{s} \mid s_{t'}=j) \Bigl(\sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{Z}=\vec{z})\,\omega_{v,t'}(\vec{s},\vec{z})\Bigr) \bigl(1-G(t-t')\bigr) && (14)\\
=\;& \sum_{t'=1}^{t}\sum_{j=1}^{S} \lambda_{j,t'}\,\hat{x}_{v,j,t'}\,\bigl(1-G(t-t')\bigr). && (15)
\end{aligned}$$

In line (12), we use the independence of $\omega_{v,t'}(\vec{s},\vec{z})$ and $\mathbb{I}(z_{v,t'}>t-t')$ to rewrite the expected value of their product as the product of their expectations. We substitute in the expected value of $\mathbb{I}(z_{v,t'}>t-t')$ in line (13). In line (14), we use the law of total probability to sum over all possible arriving tasks in time $t'$. We then substitute in the definition of $\hat{x}_{v,j,t'}$ in line (15). This proves that $\hat{x}$ is feasible for (LP). It remains to be shown that the value of $\hat{x}$ in (LP) exceeds the value of the clairvoyant solution. Let $C_{s,t}$ be the event that task $s$ arrives at time $t$ and is completed when following the clairvoyant solution. We must have $\mathbb{P}(C_{s,t}) \le \lambda_{s,t}$. In addition, since a volunteer must respond in order to complete a task, we must have
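As a sanity check on inequality (10), the following sketch simulates a toy single-volunteer instance in which a greedy policy (a stand-in for the clairvoyant solution, which we cannot simulate directly) notifies the volunteer whenever she is active, draws geometric inter-activity times, and records the largest value of the left-hand sum over all times and sample paths. The horizon, geometric parameter, and policy choice are all illustrative assumptions.

```python
import random

def sample_path(T, q, rng):
    """One realization: the volunteer starts active, and a greedy policy
    notifies her whenever she is active.  Inter-activity times z are
    geometric on {1, 2, ...} with mean 1/q, so P(z > d) = (1-q)^d.
    Returns notification indicators omega[1..T] and draws z[1..T].
    (Toy instance; q and the greedy policy are illustrative choices.)"""
    omega, z = [0] * (T + 1), [0] * (T + 1)
    next_active = 1                       # volunteer is active from time 1
    for t in range(1, T + 1):
        if t >= next_active:              # active: the greedy policy notifies
            omega[t] = 1
            z[t] = 1
            while rng.random() >= q:      # draw a geometric inter-activity time
                z[t] += 1
            next_active = t + z[t]        # inactive through time t + z[t] - 1
    return omega, z

def max_lhs_of_ineq_10(T=30, q=0.4, samples=200, seed=7):
    """Largest value of sum_{t' <= t} omega_{t'} * 1{z_{t'} > t - t'}
    over all times t and all simulated sample paths."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(samples):
        omega, z = sample_path(T, q, rng)
        for t in range(1, T + 1):
            s = sum(omega[tp] for tp in range(1, t + 1) if z[tp] > t - tp)
            worst = max(worst, s)
    return worst
```

Under this greedy policy the sum equals exactly one at every time, since the volunteer is always either covered by a pending notification or re-notified upon becoming active; any policy that notifies only active volunteers satisfies the bound.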

$$\mathbb{P}(C_{s,t}) \;\le\; \lambda_{s,t} \sum_{\vec{s}\in\vec{S}}\sum_{\vec{z}\in\vec{Z}} \mathbb{P}(\vec{S}=\vec{s} \mid s_t = s)\,\mathbb{P}(\vec{Z}=\vec{z}) \sum_{v=1}^{V} \omega_{v,t}(\vec{s},\vec{z})\,p_{v,s} \;=\; \lambda_{s,t}\sum_{v=1}^{V} \hat{x}_{v,s,t}\,p_{v,s}.$$

Combining these two bounds and summing over all tasks and time periods, we see that the value of the clairvoyant solution must be less than