Ride-hailing marketplaces like Uber, Lyft, and Didi match millions of riders and drivers every day. A key component of these marketplaces is a surge (dynamic) pricing mechanism. On the rider side of the market, surge pricing reduces demand to match the level of available drivers and maintains the reliability of the marketplace, cf. Hall et al. (2015), and so allocates rides to the riders with the highest valuations. On the driver side, surge encourages drivers to drive during certain hours and in certain locations, as drivers earn more during surge (Lu et al. 2018, Hall et al. 2017, Chen and Sheldon 2016). Castillo et al. (2017) show that surge balances both sides of this spatial market by moderating the demand and the density of available drivers, hence avoiding so-called “Wild Goose Chase” equilibria in which drivers spend much of their time on long-distance pickups. Due to these effects, surge pricing, along with centralized matching technologies, is often considered the primary reason that ride-sharing marketplaces outperform traditional taxi services on many metrics, including driver utilization and overall welfare (Cramer and Krueger 2016, Buchholz 2017, Ata et al. 2019).
However, variable pricing (across space and time) must be carefully designed, since it can create incentives for “cherry-picking” and rejecting certain trip requests. Such behavior increases earnings of strategic drivers at the expense of other drivers, who may then disproportionately receive such trip requests after they are rejected by others, cf., (Cook et al. 2018). It also reduces overall platform reliability, inconveniencing riders who may have to wait longer before receiving a ride.
Uber recently revamped its driver surge mechanism to simplify such decisions, in an attempt to improve the driver experience and make earnings more dependable (Uber 2019b). The main change is making surge “additive” instead of “multiplicative.” Under multiplicative surge, the payout of the driver from a surged trip scales with the length of the trip. In contrast, under additive surge, the surge component of the payout is constant (independent of trip length), with some adjustment for very long trips (Uber 2019d). Figure 1 depicts the heat-map of surge on the driver app for each type of surge. We show that this change directly addresses the issue that drivers who strategically reject trip requests may earn more than drivers who do not, even as total payments remain the same.
We consider the design of incentive compatible (IC) pricing mechanisms in the presence of surge. Trips differ by their length $\tau$, and the platform sets the payout $w_i(\tau)$ for each trip in each world state $i$ (i.e., surge vs. non-surge). Drivers decide which trip requests to accept in each world state, in response to the payout function $w$. [Footnote 1: Drivers’ level of sophistication and experience varies, cf. Cook et al. (2018).] An IC mechanism aligns the incentives of drivers to accept all trips, for any level of strategic response to pricing strategies. The technical challenge is to design an IC pricing mechanism $w$, for which accepting all trips is an earnings-maximizing strategy for drivers over a long horizon, i.e., where accepting every trip request in each world state maximizes driver earnings.
We first study a continuous-time, infinite-horizon single-state model, where there is only one world state and trip requests arrive over time according to a stationary Poisson process. We show that in this model, multiplicative pricing (where the payout of a trip is proportional to the length of that trip) is incentive compatible. To obtain this result, we show in Theorem 3.1 that the earnings-maximizing best response of a driver to a payout function $w$ is a threshold strategy, in which she accepts all trips with a payout rate above that threshold. Hence, a mechanism that equalizes the payout rate of all trips is incentive compatible.
We then present a model where the world state stochastically transitions between surge and non-surge states, with trip payments, distributions, and intensity varying between states. In such a temporally dynamic system, completing a given trip affects a driver’s earnings beyond just the length of the trip, i.e. imposes a future-time externality on the driver that is a function of the trip length. The driver’s trip opportunity cost thus includes both what occurs during a trip, and a continuation value. This externality implies that multiplicative surge is not incentive compatible in the presence of surge (Theorem 3.3), in contrast to the single-state model. Namely, drivers can benefit from rejecting long trips in a non-surge state, and short trips in the surge state.
In our main result, Theorem 4.2, we propose a class of incentive compatible pricing functions that are given in closed form in terms of the model primitives. The prices account for the driver’s temporal externalities; e.g., during surge, short trips pay more per unit time than do long trips.
Finally, we consider additive and multiplicative surge in our dynamic model. Through numerical simulations, [Footnote 2: Whenever possible, we fit the parameters to Uber’s data.] we show that the new additive surge mechanism has desirable incentive compatibility properties when compared to multiplicative surge. More specifically, we observe that under additive surge, drivers may benefit only from rejecting a small fraction of long trips, supporting the practical remedy that makes adjustments for certain long trips (Uber 2019d).
To our knowledge, ours is the first ride-hailing pricing work to incorporate dynamic (non-constant), stochastic demand and pricing. This component is essential to uncover the way through which a particular trip imposes substantial temporal externalities on a driver’s future earnings.
We discussed some of the related work on surge pricing above. Here, we briefly review the lines of research closest to ours. We refer the reader to a recent survey by Korolko et al. (2018) for a broader overview of the growing literature on ride-hailing markets.
Driver spatio-temporal strategic behavior. Several works model strategic driver behavior in a spatial network structure, and across time in a single state. Ma et al. (2018) develop spatially and temporally smooth prices that are welfare-optimal and incentive compatible in a deterministic model. Their prices form a competitive equilibrium and are the output of a linear program with integer solutions. We similarly seek to develop incentive compatible pricing schemes, and both works broadly construct VCG-like prices that account for driver opportunity costs. Our focus is on structural aspects (e.g., multiplicative in trip length) in a non-deterministic model.
Bimpikis et al. (2016) show how the platform would price trips between locations, taking into account strategic driver re-location decisions, in a single-state model with discrete locations. They show that pricing trips based on the origin location substantially improves surplus, and they demonstrate the benefits of “balanced” demand patterns. Besbes et al. (2018b) consider a continuous state space setting and show how a platform may optimally set prices across the space in reaction to a localized demand shock to encourage drivers to relocate; their model has a driver cost to re-locate, but no explicit time dimension. They find that localized prices have a global impact, and, e.g., the optimal pricing solution incentivizes some drivers to move away from a demand shock. Afèche et al. (2018) consider a two-state fluid model with demand imbalances and strategic drivers, and compare platform levers such as limiting ride requests and directing drivers to relocate. They upper-bound performance under these policies, and find that it may be optimal for the platform to reject rider demand even in over-supplied areas, to encourage driver movement. A similar insight is developed by Guda and Subramanian (2019). Finally, Yang et al. (2018) analyze a mean-field system in which agents compete for a location-dependent, time-varying resource, and decide when to leave a given location. They leverage structural results (agents’ equilibrium strategies depend just on the current resource level and number of agents) to numerically study driver decisions to relocate between locations as a function of the platform commission structure.
Dynamic pricing in ride-sharing and service systems. There is a growing literature on queuing and service systems motivated in part by ride-sharing markets. For example, Besbes et al. (2018a) revisit the classic square root safety staffing rule in spatial settings, cf. Bertsimas and van Ryzin (1991, 1993). Much of the focus of this line of work is on how pricing affects the arrival rate of (potentially heterogeneous) customers, and thus on the trade-off between the price and the rate of customers served in maximizing revenue.
Banerjee et al. (2015) consider a network of queues in which long-lived drivers enter the system based on their expected earnings but cannot reject specific trip requests. Under their model, dynamic pricing cannot outperform the optimal static policy in terms of throughput and revenue, but is more robust. Cachon et al. (2017) argue on the other hand that surge pricing and payments are welfare increasing for all market participants when drivers decide when to work. Similar in spirit to our work, Chen and Hu (2018) consider a marketplace with forward-looking buyers and sellers who arrive sequentially and can wait for better prices in the future. They develop strategy-proof prices whose variation over time matches the participants’ expected utility loss incurred by waiting.
Closest to our work in modeling approach, Kamble (2018) studies how a freelancer can maximize her long-term earnings with job-length-specific prices, balancing on-job payments and utilization time. In his model, a freelancer sets her own prices for a discrete number of jobs of different lengths and, with assumptions similar to our single-state model, it is optimal for the freelancer to set the same price per hour for all jobs. We further discuss the relationship of this work to our single-state model below.
The rest of the paper is organized as follows: In Section 2, we formally present our model. In Section 3, we formulate a driver’s best response strategy to affine pricing functions in each model. In Section 4, we present incentive compatible pricing functions for our surge model. Finally, in Section 5, we numerically compare the IC properties of additive and multiplicative surge.
We consider a large ride-hailing market with decoupled pricing, from the perspective of a single driver. This driver receives trip requests of various lengths, whose rate, distribution, and payment are known to the driver but determined exogenously to her decisions to accept or decline requests. We do not consider spatial heterogeneity in our setting, to abstract away the impact of location and focus on temporal opportunity cost and continuation value based on the length of the trip. [Footnote 3: We believe our insight can be extended to a spatial setting where the price can be decomposed into a time-based component, based on the length of the trip, and a spatial component based on the destination of the trip. However, this would be beyond the scope of this work, cf. Bimpikis et al. (2016).]
In this section, we first describe the primitives of our two models, a single-state model (Section 2.1) and a dynamic model with surge pricing (Section 2.2). Then we describe the driver’s strategy space and the platform’s pricing design challenge (Section 2.3).
2.1 Single-state model
We start with a model where there is a single world state, i.e., all distributions are constant over time. Time is continuous and indexed by $t$. At each time $t$, the driver is either open or busy. While the driver is open, she receives job (trip) requests from riders according to a Poisson process at rate $\lambda$, i.e., the time between requests is exponential with mean $1/\lambda$. Job lengths, denoted by $\tau$, are drawn independently and identically from a continuous distribution $F$.
If the driver accepts a job request of length $\tau$ at time $t$ (as discussed below), she receives a payout of $w(\tau)$ at time $t + \tau$, at which time she becomes open again. If the request is not accepted, the driver remains open. Note that our model of a trip is a simplification from practice, where a given job has two components: the time it takes to pick up the rider, and the time while the rider is in the driver’s vehicle. To simplify the presentation, we combine these two components into trip length. When these components are separated, the space of driver strategies becomes richer, but results remain qualitatively the same.
Pricing function $w$ is assumed to be continuous and non-decreasing, and such that the hourly payment as a function of the trip length is asymptotically bounded, $\limsup_{\tau \to \infty} w(\tau)/\tau < \infty$.
2.2 Dynamic model with surge pricing
A model with fixed pricing and arrival rates of jobs is not a realistic representation of ride-hailing platforms. In particular, rider demand (both in intensity and in distribution) may vary substantially over time. To study how this dynamic nature affects driver decisions, we consider a model with two states, $i \in \{1, 2\}$, where $i = 2$ denotes the surge state. (At a high level, the surge state provides a higher earnings rate to the driver. The precise definition of what distinguishes the surge state is presented in Section 3.2, after we formulate the driver’s earnings rate in each state.)
The world evolves stochastically between the two states, as a Continuous Time Markov Chain (CTMC). When the world is in state $i$, the state changes to $j \neq i$ according to a fixed exponential clock that ticks at rate $\lambda_{i \to j}$, independently of other randomness.
When open in state $i$, the driver receives job requests at rate $\lambda_i$ with lengths $\tau \sim F_i$, and collects payout according to payment function $w_i$, which is presumed to have the same properties as in the single-state model. Note that the state of the world may change while a driver is on a trip. The driver receives payments according to the state of the world when she starts a trip. We will use $w = \{w_1, w_2\}$ to denote the overall pricing mechanism.
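Because the world state can flip while the driver is busy, it is useful to know how likely a trip of a given length is to end in the state in which it started. The following minimal sketch uses the standard closed form for a two-state CTMC; the function and argument names (`stationary_share`, `prob_same_state`, `rate_out`, `rate_back`) are illustrative assumptions and not notation from the paper.

```python
import math

def stationary_share(rate_out, rate_back):
    """Long-run fraction of time spent in a state that is left at rate
    `rate_out` and re-entered (from the other state) at rate `rate_back`."""
    return rate_back / (rate_out + rate_back)

def prob_same_state(rate_out, rate_back, elapsed):
    """Probability that the world is in the starting state again after
    `elapsed` time units (for example, at the end of a trip of that length)."""
    share = stationary_share(rate_out, rate_back)
    # Standard two-state CTMC transition probability: it decays from 1
    # toward the stationary share at rate (rate_out + rate_back).
    return share + (1.0 - share) * math.exp(-(rate_out + rate_back) * elapsed)
```

For a surge state that ends quickly (large `rate_out`), `prob_same_state` drops fast in the trip length, which is the mechanism behind the temporal externalities studied below.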
2.3 Driver strategies and earnings
In our model, the driver can decide whether to accept the trip request, with no penalty. [Footnote 4: This assumption follows Uber’s current practice.] We further discuss the driver’s information set below. In the single-state model, let $\sigma \subseteq \mathbb{R}_+$ denote the driver’s (deterministic) strategy, where $\tau \in \sigma$ implies that a driver accepts job requests of length $\tau$. In the dynamic model, the driver follows deterministic policy $\sigma = \{\sigma_1, \sigma_2\}$, where $\sigma_i \subseteq \mathbb{R}_+$ indicates the jobs she accepts in state $i$. We assume that driver policies are measurable with respect to $F$ (correspondingly, $F_i$ in the dynamic model); additionally, for technical reasons, in the dynamic model we also assume that the $\sigma_i$ consist of a union of open intervals, i.e., are open subsets of $\mathbb{R}_+$. When we write equalities with policies $\sigma$, we mean equality up to changes of measure 0.
The driver is long-lived and aims to maximize her lifetime average hourly earnings on the platform, including both open and busy times. Let $\Pi(w, \sigma, t)$ denote the (random) total earnings from jobs accepted from time $0$ up to time $t$ if she follows policy $\sigma$ and the payout function is $w$; see the appendix for a more formal definition. Then, the driver’s lifetime earnings rate is
$$R(w, \sigma) = \liminf_{t \to \infty} \frac{\Pi(w, \sigma, t)}{t}.$$
This earnings rate is a (deterministic) function of driver policy $\sigma$, pricing function $w$, and the primitives. We can now define notions of an optimal driver policy and incentive compatible pricing.
A driver policy $\sigma^*$ is optimal (best-response) with respect to pricing function $w$ if it maximizes the lifetime earnings rate of the driver among all policies: $R(w, \sigma^*) \geq R(w, \sigma)$ for all valid policies $\sigma$ (i.e., measurable with respect to $F$, with open sets). Then, pricing function $w$ is incentive compatible (IC) if accepting all job requests is optimal with respect to $w$, i.e., $\sigma = (0, \infty)$ in the single-state model or $\sigma = \{(0, \infty), (0, \infty)\}$ in the dynamic model is optimal with respect to $w$. In other words, pricing function $w$ is incentive compatible if an earnings-maximizing driver (who knows all the primitives and $w$) accepts every trip request. In Section 5, we also consider an approximate notion of incentive compatible pricing, defined as the fraction of trips that are accepted under an optimal driver policy with respect to a given pricing function.
We assume that the platform reveals the total trip length to the driver at the time of request, and that the driver can freely reject it without penalty. We note that in current practice in ride-hailing markets, drivers often cannot see the rider’s destination or the trip length until they pick up the rider (but they can reject a request based on the pick-up time to the rider, without penalty). Some drivers call ahead to find out the rider’s destination or even cancel the trip at the pick-up location, creating negative experiences for both the rider and the driver. [Footnote 5: We note that destination discrimination is against Uber’s guidelines and could lead to deactivation (Uber 2019a).] Our notion of incentive compatibility is ex-post, implying that drivers would accept all trips, even if the trip length is not revealed, and so this setting from practice is covered as well.
Our setting is decoupled: rider and driver prices are determined separately. Namely, changes in the driver payout function and decisions do not change the trip request rate or the distribution of the trips. This modeling assumption follows the current practice (Uber 2019e) and furthermore allows us to focus on the drivers’ perspective, without further complicating the analysis. [Footnote 6: Coupled pricing imposes more constraints on the pricing functions chosen by the platform. For example, Bai et al. (2018) find that the platform should adjust its payout ratio with demand (an example of decoupled pricing) to maximize profit or overall welfare.]
3 Driver Reward and Affine Driver Pricing
In this section, we present the driver’s lifetime earnings rate $R(w, \sigma)$, using the renewal reward theorem on an appropriately defined renewal process. For the dynamic model, this process is not immediate from the primitives defined; we break down the driver’s lifetime earnings rate into: (1) a function of the fraction of her time she spends in each such state and (2) the earnings rates while the driver is open in each state or on a job that started in that state. This decomposition allows us to analyze the driver’s (best response) strategy to payout $w$ in our dynamic model.
We are especially interested in affine pricing schemes, where $w_i(\tau) = m_i \tau + a_i$, with $m_i \geq 0$ (in the single-state model: $w(\tau) = m \tau + a$, with $m \geq 0$). Such pricing functions can be clearly communicated as time and distance rates (see, e.g., Uber 2019c), or otherwise be displayed on a surge heat-map. We refer to the case with $a_i > 0$ ($a_i < 0$) as positive (negative) affine pricing.
Below, we first characterize the driver’s best-response strategy with respect to any pricing function in the single-state model. We observe that multiplicative pricing (a special case of affine pricing with zero additive component) is incentive compatible. In contrast, in Section 3.2, we show that in the dynamic model multiplicative pricing may no longer be incentive compatible. We further derive the structure of optimal driver policies in each state with respect to affine or multiplicative pricing, which will enable numerical study of the (approximate) incentive compatibility properties of additive and multiplicative surge in Section 5. Section 3.4 discusses the key differences between the two models, setting up Section 4 where we derive incentive compatible pricing functions.
3.1 Single-state model
In the single-state model, the primitives of our model directly induce a renewal reward process, where a given renewal cycle is defined as being from the time a driver is newly open to the time she is open again after completing a job. Let $W(\sigma)$ denote the mean earnings on trips $\tau \in \sigma$, i.e., the expected earnings in a renewal cycle; let $T(\sigma)$ be the sum of the expected wait time to an accepted trip and the expected length of a trip, and thus the expected renewal cycle length. Then, the lifetime mean hourly earnings (earnings rate) for a driver is given by
$$R(w, \sigma) = \frac{W(\sigma)}{T(\sigma)} = \frac{\frac{1}{F(\sigma)} \int_{\tau \in \sigma} w(\tau)\, dF(\tau)}{\frac{1}{\lambda F(\sigma)} + \frac{1}{F(\sigma)} \int_{\tau \in \sigma} \tau\, dF(\tau)},$$
where $F(\sigma) = \int_{\tau \in \sigma} dF(\tau)$ denotes the probability of the driver receiving a job request in $\sigma$. The first equality follows from the Renewal Reward Theorem, and holds with probability 1 (see, e.g., Gallager (2013)). We view the earnings rate as a constraint for the platform in its pricing. Given some demand model, the platform receives revenue at a rate that depends on the prices it sets for riders. This revenue rate yields an earnings payout rate target $R$, at which the platform needs to pay drivers. Then, the platform’s task is to find an incentive compatible pricing function $w$ such that the actual earnings rate for drivers who accept every trip meets the target, $R(w, (0, \infty)) = R$. This is close to how decoupled surge pricing is set in practice, where the revenue from the rider surge is viewed as a target for average driver surge earnings.
Our first result is that, in the single-state model, the driver’s optimal policy has a simple form.
Theorem 3.1. In the single-state model, for each $w$ there exists a constant $c$ such that the policy $\sigma^* = \{\tau : w(\tau)/\tau \geq c\}$ is optimal for the driver with respect to $w$.
Theorem 3.1 establishes that, in a single-state model with Poisson job arrivals, the length of the job is not important, only the hourly rate while busy on the job. Note, however, that the optimal in the policy is not necessarily : drivers must trade off the earnings rate while on a trip with their utilization rate; the more trips that a driver rejects, the longer she must wait for an acceptable trip. In the appendix we prove the result by, starting at an arbitrary policy , making changes to the policy that increase the earnings rate while on a job without decreasing the utilization rate. Thus, each such change improves the reward , and the sequence of changes results in a policy of the above form, for some . Then, this minimum on-job earnings rate can be optimized, leading to an optimal policy of this form.
An immediate corollary of Theorem 3.1 is that $w(\tau) = m \tau$, for $m > 0$, is IC. In other words, if the platform pays a constant hourly rate to busy drivers, then in the single-state model it is in the driver’s best interest to accept every trip. This result is driven by the following insight: while receiving long trip requests is more beneficial to drivers in the single-state setting as they increase one’s utilization rate (the driver is busy for a longer time until her next open period), rejecting short trips to cherry-pick long trips decreases utilization by the same amount. [Footnote 7: As discussed in the related work, this corollary and insight are similar to a result of Kamble (2018); however, the proof is more involved in our setting as a driver’s strategy is a subset of $\mathbb{R}_+$ denoting the job requests she accepts, as opposed to a discrete set of prices she charges. Further, in our setting, the driver responds to the platform’s prices instead of setting her own prices, enabling a wider range of IC pricing mechanisms.] Further note that, given an earnings rate target $R$, calculating the multiplier $m$ and thus an IC pricing policy is trivial.
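As a numerical illustration of this corollary, the sketch below evaluates the renewal-reward earnings rate under multiplicative pricing for interval policies and backs out the multiplier that meets a given target. It assumes exponentially distributed trip lengths purely for convenience (the paper does not prescribe a distribution here), and the names `earnings_rate`, `multiplier_for_target`, `lam`, and `mean_len` are hypothetical.

```python
import math

def earnings_rate(m, lam, mean_len, lo=0.0, hi=math.inf):
    """Long-run earnings rate under w(tau) = m * tau when the driver accepts
    only trips with length in (lo, hi).

    Assumes Poisson requests at rate `lam` and Exponential trip lengths
    with mean `mean_len` (an illustrative choice of distribution).
    """
    def partial_mean(x):
        # Integral of tau dF(tau) over (0, x) for the exponential distribution.
        if math.isinf(x):
            return mean_len
        return mean_len - (x + mean_len) * math.exp(-x / mean_len)

    # Expected busy time contributed by accepted trips, per request.
    accepted_time = partial_mean(hi) - partial_mean(lo)
    # Renewal reward: expected pay per request over expected time per request.
    return m * accepted_time / (1.0 / lam + accepted_time)

def multiplier_for_target(target, lam, mean_len):
    """Multiplier m such that accepting every trip earns exactly `target`."""
    return target * (1.0 / lam + mean_len) / mean_len
```

Since the rate equals m times x / (1/lam + x), where x is the accepted busy time per request, it is increasing in x, so any interval policy that rejects some trips earns no more than accepting everything. This matches the utilization intuition given above.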
On the other hand, affine pricing may not be incentive compatible because short trips are worth more per hour than are long trips: $w(\tau)/\tau = m + a/\tau$. The optimal policy may be to accept trips in $(0, \bar{\tau})$ for some $\bar{\tau} < \infty$. However, our next proposition establishes that affine pricing is incentive compatible if the additive component stays small enough as a function of the request arrival rate:
Proposition. In the single-state model, $w(\tau) = m \tau + a$ is incentive compatible if $0 \leq a \leq m/\lambda$.
The sufficient condition has a simple intuition: when open, the expected amount of time the driver must wait for her next request is $1/\lambda$; if on-trip time is valued at $m$ per unit-time, then with $a = m/\lambda$ the additive component can be interpreted as paying for the driver’s expected waiting time. Thus, while a driver may earn more per hour for a short trip than a long trip with affine pricing, such a short trip is not worth the time the driver must wait for her next trip request. We further note that the condition in the proposition is not a necessary one; however, deriving necessary and sufficient conditions in closed form requires specifying the trip distribution $F$.
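The role of the additive component can be checked numerically. The following sketch compares "accept only trips shorter than a cutoff" policies against accepting every trip under affine pricing, again assuming exponential trip lengths for illustration. The names `affine_rate`, `best_cutoff`, `m`, `a`, `lam`, and `mean_len` are hypothetical labels for the per-time rate, the additive component, the request rate, and the mean trip length.

```python
import math

def affine_rate(m, a, lam, mean_len, cutoff=math.inf):
    """Earnings rate under w(tau) = m * tau + a when the driver accepts only
    trips shorter than `cutoff` (Exponential lengths, Poisson requests)."""
    if math.isinf(cutoff):
        accept_prob, accepted_time = 1.0, mean_len
    else:
        decay = math.exp(-cutoff / mean_len)
        accept_prob = 1.0 - decay                                # P(tau < cutoff)
        accepted_time = mean_len - (cutoff + mean_len) * decay   # E[tau; tau < cutoff]
    # Expected pay per request divided by expected time per request.
    return (m * accepted_time + a * accept_prob) / (1.0 / lam + accepted_time)

def best_cutoff(m, a, lam, mean_len, grid):
    """Best cutoff among `grid` and infinity (infinity wins exact ties)."""
    candidates = [math.inf] + list(grid)
    return max(candidates, key=lambda c: affine_rate(m, a, lam, mean_len, c))
```

With `m = 1`, `lam = 1`, and `mean_len = 1`, an additive component `a = 0.5` (below the waiting-time value of 1) leaves accepting everything as the best cutoff policy, whereas a much larger `a = 5` makes a finite cutoff strictly better, so the driver cherry-picks short trips.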
As we’ll see in the next sub-section, the structure of optimal driver policies in reaction to affine pricing differs sharply in the dynamic model.
3.2 Dynamic model
We start our analysis of the dynamic model by characterizing the lifetime driver earnings rate, $R(w, \sigma)$. Here, we can no longer directly use the renewal reward theorem as in the single-state model, with a renewal cycle containing just a single trip. The driver’s earnings on a given trip are no longer independent of her earnings on other trips: given a job that starts in the surge state, the driver’s next job is more likely to start in the surge state. Given whether each job started in the surge state, however, job earnings are independent. We can use this property to prove our next lemma, which gives the mean hourly reward overall in the dynamic model.
Lemma. The overall earnings rate can be decomposed into the earnings rate $R_i(w_i, \sigma_i)$ and fraction of time $\mu_i(\sigma)$ spent in state $i$. The following equality holds with probability $1$:
$$R(w, \sigma) = \mu_1(\sigma) R_1(w_1, \sigma_1) + \mu_2(\sigma) R_2(w_2, \sigma_2).$$
As in the single-state model, where
We prove the result as follows. We define a new renewal process, in which a single reward renewal cycle runs from a time at which the driver is open in state 1 until the next time she is open in state 1 after having been open in state 2 at least once. In other words, each renewal cycle is composed of a number of sub-cycles in which the driver is open in state 1 and then is open in state 1 again after a completed trip; one sub-cycle which starts with the driver open in state 1 and ends with her open in state 2 (either after a completed trip or a state transition while open); a number of sub-cycles in which the driver is open in state 2 and then is open in state 2 again after a completed trip; and finally one sub-cycle starting in state 2 and ending with the driver open in state 1.
Now, given the number of such large renewal reward cycles completed up to each time , the total earnings on trips starting in each state (earnings in each sub-cycle) are independent of each other, resulting in the separation. Then, we use Wald’s identity (Wald 1973) to separate and .
Note that is not exactly the expected length of time in a single sub-cycle in a state given , but rather is proportional to it; the multiplicative constant cancels out with the same constant in the expected earnings in a single sub-cycle in a state given . This constant emerges from our surge primitives: when the driver is open in state , there are two competing exponential clocks (with rates and , respectively) that determine whether the driver will accept a trip request in state before the world state changes to state . Furthermore, we can now precisely define what it means for to be the surge state: it has a higher potential earning rate than state ; such that . This assumption is not the same as , and neither implies the other; it is a condition jointly on , (but not ), indicating that a driver can, following some policy, earn more while in the surge state than she can following any policy in the non-surge state. Throughout, we set as the surge state according to this definition. When constructing pricing functions , we will also require the following constraint to be met: , for some exogenous , analogously to the single-state model constraint.
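The competing-clocks observation above has a simple closed form. As a minimal sketch, suppose requests arrive at rate `lam` while the driver is open, a fraction `accept_prob` of them fall in her accepted set, and the world state changes at rate `switch_rate`; these names are illustrative assumptions rather than the paper's notation. Accepted requests then form a thinned Poisson process with rate `lam * accept_prob`, which races against the state-change clock.

```python
def accept_before_switch(lam, accept_prob, switch_rate):
    """Probability that an open driver starts a trip before the world state
    changes: a race between two independent exponential clocks."""
    accept_rate = lam * accept_prob  # rate of requests the driver would accept
    return accept_rate / (accept_rate + switch_rate)

def expected_open_spell(lam, accept_prob, switch_rate):
    """Expected time until the first of the two events (an accepted request
    or a state change) occurs."""
    return 1.0 / (lam * accept_prob + switch_rate)
```

A pickier policy lowers `accept_prob` and therefore the chance of starting any trip before the state flips, which is the trade-off a driver faces when holding out for better requests.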
What does $\mu_i(\sigma)$, the fraction of time attributed to state $i$, look like? We defer showing the exact form to Section 4.1, in advance of developing incentive compatible pricing. Here, we provide some intuition: the trips that a driver accepts in each state determine the portion of her time she spends on trips started in each state. For example, if a driver never accepts trips in the non-surge state, she will be open and thus available for a trip as soon as surge begins. Conversely, if a driver accepts a long surge trip immediately before surge ends, she will be paid according to the surge payment function even though surge has ended. Surprisingly, given the complex formulation of the reward as it depends on $\sigma$, we can find the structure of optimal policies as they depend on the pricing structure $w$, as well as incentive compatible pricing functions. We begin this analysis in the next subsection, deriving optimal driver responses to multiplicative and affine pricing.
3.3 Driver’s Best-Response Strategy to Affine Prices
In the single-state model, multiplicative pricing is incentive compatible; a driver cannot benefit in the future by rejecting certain trips if all trips have the same hourly earning rate. In contrast, we now show that the same insight does not hold for the dynamic model, as a driver can influence her future trips through her decision to accept or reject certain trips.
Theorem 3.3. Consider pricing $w = \{w_1, w_2\}$. Then, there exists an optimal policy $\sigma^* = \{\sigma_1^*, \sigma_2^*\}$, i.e., one that maximizes $R(w, \sigma)$, defined with parameters $t_1, t_2 \in \mathbb{R}_+ \cup \{\infty\}$ (which may differ across the cases below), such that
Non-surge state driver optimal policy $\sigma_1^*$:
If $w_1$ is multiplicative or positive affine, $\sigma_1^*$ rejects long trips, i.e., $\sigma_1^* = (0, t_1)$.
If $w_1$ is negative affine, $\sigma_1^*$ rejects short and long trips, i.e., $\sigma_1^* = (t_1, t_2)$.
Surge state driver optimal policy $\sigma_2^*$:
If $w_2$ is multiplicative or negative affine, $\sigma_2^*$ rejects short trips, i.e., $\sigma_2^* = (t_1, \infty)$.
If $w_2$ is positive affine, $\sigma_2^*$ rejects medium length trips, i.e., $\sigma_2^* = (0, t_1) \cup (t_2, \infty)$.
Furthermore, there exist settings where ’s take positive finite values. Hence, multiplicative pricing is not incentive compatible in general. We discuss the intuition in the next section. In the appendix, we prove the result for each case as follows: fixing for , we start with an arbitrary open set , recalling that open sets can be written as a countable union of such disjoint intervals. Then, we find , the derivative of the set function with respect to one of the interval upper end-points of , i.e. . This derivative is the infinitesimal change in the overall reward if is expanded by increasing , and it has useful properties. In the surge state with multiplicative pricing, for example, has the same sign as a function that is increasing in , for each fixed . With affine pricing, it has the same sign as a quasi-convex (positive affine in the surge state) or quasi-concave (negative affine in the non-surge state) function in , for a fixed . Such properties enable constructing a sequence of changes to that each do not decrease the reward , with the limit being a policy of the appropriate form. In particular, we can show that any policy that is not of the appropriate form above has for some , allowing local improvements until adjacent intervals can be combined or expanded to infinity.
Further note that the results of rejecting long trips in non-surge (and short trips in surge) extend to arbitrary pricing functions where $w_1(\tau)/\tau$ is non-increasing (respectively, $w_2(\tau)/\tau$ is non-decreasing), as the same properties hold. The other two results do not hold with such generality, as the behavior of the derivative may be arbitrarily complex.
3.4 Why is multiplicative surge pricing not incentive compatible?
“I thoroughly dislike short trips ESPECIALLY when I’m picking up in a waning surge zone”
What explains the difference between multiplicative pricing being incentive compatible in the single-state model but not in the dynamic model? In the latter, a driver’s policy affects not just her earnings while she is busy, but also the fraction of her time at which she is busy during the lucrative surge state. In particular, it turns out, accepting short trips during surge may reduce the amount of time that a driver is on a surge trip! Figure 2 shows in an example how the fraction of time in the surge state changes as a function of how many short trips the driver rejects. The anonymous driver we quote above identifies the key effect: when surge is short-lived, a driver may only have the chance to complete one surge trip before it ends. Thus, the driver may be better off waiting to receive a longer trip request, as with multiplicative surge she is paid a higher rate for the full duration of the longer trip. (Of course, there is a trade-off as if she rejects too many trip requests, she may not receive any acceptable request before surge ends). In the surge state, then, multiplicative pricing does not compensate drivers enough to accept short trips that may reduce their future surge earnings. In the non-surge state, analogously, multiplicative pricing under-compensates long trips that may prevent taking advantage of a future surge.
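To experiment with this effect, one can simulate a single driver in a two-state world and compare policies. The sketch below is a simplified Monte Carlo illustration and not the simulation used in Section 5: it assumes exponential trip lengths, multiplicative pay at the rate of the state in which the trip starts, and interval acceptance policies. All names (`simulate_rate`, `policy`, `switch`, `mult`, and so on) are hypothetical.

```python
import math
import random

def simulate_rate(policy, lam, switch, mean_len, mult, horizon, seed=0):
    """Estimate the long-run earnings rate of one driver in a two-state world.

    Every argument indexed by state (0 = non-surge, 1 = surge) is a 2-tuple:
    policy[i] = (lo, hi) interval of accepted trip lengths, lam[i] = request
    rate, switch[i] = rate of leaving state i, mean_len[i] = mean of the
    exponential trip length, mult[i] = pay per unit of trip time.
    """
    rng = random.Random(seed)
    total_rate = switch[0] + switch[1]
    t, state, earned = 0.0, 0, 0.0
    while t < horizon:
        # Competing exponential clocks; memorylessness lets us redraw both.
        to_request = rng.expovariate(lam[state])
        to_switch = rng.expovariate(switch[state])
        if to_switch < to_request:
            t += to_switch
            state = 1 - state  # the world state flips while the driver is open
            continue
        t += to_request
        length = rng.expovariate(1.0 / mean_len[state])
        lo, hi = policy[state]
        if not (lo < length < hi):
            continue  # request rejected; the driver stays open
        earned += mult[state] * length  # paid at the starting state's rate
        t += length
        # Sample the world state at the end of the trip (two-state CTMC).
        share = switch[1 - state] / total_rate
        p_same = share + (1.0 - share) * math.exp(-total_rate * length)
        if rng.random() >= p_same:
            state = 1 - state
    return earned / t
```

Running this with a short-lived surge state and comparing `policy = ((0, inf), (0, inf))` against a policy that rejects short surge trips, e.g. `((0, inf), (0.5, inf))`, gives a rough sense of when cherry-picking pays off; the outcome depends on the chosen parameters, and we do not claim specific numbers here.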
Affine pricing is a first, reasonable attempt at fixing these issues. In the surge state, the additive value makes the previously under-compensated short trips comparatively more valuable, as the earnings per unit time (with ) are now higher for short trips. Unfortunately, with such pricing the structure for the surge optimal policy becomes – if the values are not balanced correctly, the additive value is enough to make accepting extremely short trips profitable; for medium-length trips , however, the additive value is not large enough to make up for the fact that accepting the trip prevents accepting another surged trip before surge ends. Similarly, negative affine pricing in the non-surge state, , (with ) is now too harsh on very short trips but potentially not enticing enough for long trips.
We expand further on such effects in the next section, where we fix these issues by constructing true incentive compatible pricing schemes for both states. Then, in Section 5 we use the structural results derived in this section to perform numerical simulations comparing (approximate) incentive compatibility of additive and multiplicative surge.
4 Incentive Compatible Surge Pricing
In this section, we present our main result regarding the structure of incentive compatible pricing in the dynamic model. As discussed earlier, consistent with decoupled pricing, we distinguish between the two states by assuming that the surge state has a higher potential earnings rate than the non-surge state. We further require earning rate constraints as before: the platform must set pricing function to satisfy , i.e. the state earnings rate for drivers who accept every trip in that state is set to , for (exogenously) given . We focus on the driver’s decision on accepting or rejecting a trip, observing that this decision depends on the “opportunity cost” of performing that trip. A driver’s expected value for accepting a trip is the payout from that trip, plus the continuation value which depends on the world state when the trip is completed and she becomes open again. If the driver rejects a trip, she can accept other trips during the time it would have taken to complete that trip. If the state transitions in the meantime, some of those trips could be lucrative surge trips. Intuitively, we find incentive compatible prices that compensate drivers for such opportunity costs.
To this aim, in Section 4.1, we characterize, , how much time the driver spends in each state. In Section 4.2, we present incentive compatible prices, under mild conditions on the ratio of earning rates between the two states. Section 4.3 contains a discussion on the intuition of the IC pricing structure in terms of the driver’s opportunity cost, and Section 4.4 contains a proof sketch.
4.1 Transition probabilities and expected time spent in each state
The expected fraction of time spent in each state, , depends both on the evolution of the world state and the trips a driver accepts in each state. In order to quantify the effects previewed in Section 3.4, we first need to analyze the evolution of the CTMC that determines the surge state.
 Suppose the world is in state at time . Let denote the probability that the world will be in state at time . Then,
Note that is not just the probability that the world state transitions once during time , but the probability that it transitions an odd number of times. This formulation emerges through a standard analysis of two-state CTMCs, in which this probability can be found through the inverse of the Laplace transform of the inverse of the resolvent of the Q-matrix for the system. Incorporating this value in closed form is the main hurdle in extending our results to general systems with more than two states. Using this formulation, the following lemma characterizes the fraction of time a driver spends in each state.
 Let be as defined in Lemma 3.2. The fraction of time a driver following strategy spends either open in state or on a trip started in state is
We prove this lemma by finding the expected number of sub-cycles in each state , i.e., within a larger renewal reward cycle as defined, the expected number of sub-cycles that start with the driver being open in state . This expectation is the mean of a geometric random variable parameterized by the probability that the driver will next be open in state, given she is currently open in state . is proportional to this probability (as in the case of , there is a normalizing constant ); the larger it is, the fewer sub-cycles are spent in state . It has two components: the first is the probability that the state changes before the driver accepts a trip request; the second is the probability that the world state is when the driver completes a trip. Thus, the numerator in is proportional to the length of a sub-cycle in state , times the fraction of such cycles that are started in state . The larger or is, the more time the driver spends in state .
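For reference, the two-state transition probability used throughout this subsection follows from a standard calculation, sketched here. The notation is an assumption made for illustration: $\lambda_{1\to 2}$ and $\lambda_{2\to 1}$ denote the rates at which the world state switches, $\Lambda$ is their sum, and $q_{i\to j}(t)$ is the probability that the world is in state $j \neq i$ at time $t$ given that it started in state $i$.

```latex
\begin{align*}
Q &= \begin{pmatrix} -\lambda_{1\to 2} & \lambda_{1\to 2} \\ \lambda_{2\to 1} & -\lambda_{2\to 1} \end{pmatrix},
\qquad \Lambda = \lambda_{1\to 2} + \lambda_{2\to 1}, \\
\frac{d}{dt}\, q_{i\to j}(t) &= \lambda_{i\to j}\bigl(1 - q_{i\to j}(t)\bigr) - \lambda_{j\to i}\, q_{i\to j}(t),
\qquad q_{i\to j}(0) = 0, \\
q_{i\to j}(t) &= \frac{\lambda_{i\to j}}{\Lambda}\bigl(1 - e^{-\Lambda t}\bigr), \qquad j \neq i.
\end{align*}
```

As $t \to \infty$ this tends to the stationary probability $\lambda_{i\to j}/\Lambda$, which does not depend on the starting state. The probability of exactly one transition by time $t$ is a different, smaller quantity.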
4.2 Incentive Compatible pricing
How can the platform create incentive compatible pricing given the previously described effects? Our main result establishes when such IC prices exist, and reveals their form.
 Let and be target earning rates during non-surged and surge states, respectively. There exist prices of the form
where but may be either positive or negative, such that the optimal driver policy is to accept every trip in the surge state and all trips up to a certain length in the non-surge state. Furthermore, for , there exist fully incentive compatible prices of this form, where
and , and .
We sketch the proof in Section 4.4, with the complete proof in the appendix. To convey intuition, Figure 3(a) shows pricing functions in each state, plotting against . Compared to multiplicative pricing with constant , IC surge pricing pays more for short trips, to compensate drivers for their opportunity cost, and less for long trips. Conversely, IC non-surge pricing pays more for long trips than it does for short trips. Further, as increases, approaches , reflecting the fact that the opportunity cost for long trips does not depend as strongly on the state in which it started (as discussed in Section 4.3). Next, observe that IC surge pricing is approximately affine: (plotted in Figure 3(b)) is upper bounded by , and so is eventually approximately constant even as increases. The two components of pricing, and , thus balance the comparative benefit of long and short trips.
We note that, rather surprisingly and contrary to the focus of platform designers, it is the non-surge state that is difficult to make incentive compatible. Our result establishes that there always exist pricing schemes, for any target driver earning rates , such that accepting every trip in the surge state is driver optimal; we cannot say the same for the non-surge state. We give further intuition for the form of the pricing function and the range in the next subsection, showing how they emerge from the driver’s opportunity cost.
Finally, for a given feasible , there is a range of that form an incentive compatible pricing scheme. Why? If a driver rejects a trip request, she waits to receive another request, during which time she does not earn. This wait time tilts the driver toward accepting any trip request to maximize earnings. Thus, there is flexibility in the balance between short and long trip earnings. The same insight drives Proposition 3.1; even in the single-state model, trips do not have to have the same earnings per unit time, , as long as they meet some minimum threshold, .
4.3 Opportunity cost intuition for incentive compatible pricing
We now present some intuition to understand Theorem 4.2 and our incentive compatible pricing scheme. To do so, we introduce a weaker property than incentive compatibility, trip indifference: given a specific trip request length , in expectation the driver is at least as well off accepting the request as she is rejecting it, assuming she will accept any future request. (Footnote 8: If is incentive compatible, then it also satisfies trip indifference. The latter criterion is a local condition in which the driver cannot infinitesimally improve her earnings by instead following strategy (“infinitesimally”, as trip length has measure by assumption); in contrast, incentive compatibility is a global condition on the space of driver policies, requiring that the policy that accepts all trips globally maximize the earnings rate, i.e. , for all .)
Trip indifference allows us to illustrate the various features that must be incorporated into any IC pricing scheme: given an accepted trip request, it is simple in our model to formulate theoretically the driver’s counter-factual expected earnings in a certain time window if she had instead rejected the request, i.e. her opportunity cost for accepting the trip. (Footnote 9: Similarly, it is easier to measure empirically: counter-factual driver earnings for an accepted trip request can be approximated by measuring the future earnings of other nearby drivers who did not receive that trip request. Verifying incentive compatibility, on the other hand, requires full off-policy learning and estimation.)
Intuitively, the amount that the driver is paid for the trip must account for this opportunity cost, i.e. in a VCG-like manner. Of course, this opportunity cost itself depends on the pricing scheme . We now break down parts of this opportunity cost.
On-trip opportunity cost.
While the driver is on-trip, the world state continues to evolve: surge might end or start, and such changes affect the opportunity cost. We call this component the “network minutes” cost.
Let be the expected amount of time that the world is in state during time , given that it is in state at time . Then, by integrating from to :
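The integral described above has a simple closed form in the two-state case. The following is a minimal Python sketch, assuming exponential sojourn times; the names `rate_out` (rate of leaving the starting state) and `rate_back` (rate of returning to it) are our own labels, not the paper's notation. It evaluates the closed form and checks it against a midpoint-rule integration.

```python
import math


def prob_other_state(t, rate_out, rate_back):
    """P(world is in the other state at time t | it started in this state)."""
    total = rate_out + rate_back
    return rate_out / total * (1.0 - math.exp(-total * t))


def expected_time_other_state(t, rate_out, rate_back):
    """Closed-form integral of prob_other_state over [0, t]."""
    total = rate_out + rate_back
    return rate_out / total * (t - (1.0 - math.exp(-total * t)) / total)


def expected_time_numeric(t, rate_out, rate_back, steps=20000):
    """Midpoint-rule approximation of the same integral, as a sanity check."""
    h = t / steps
    return h * sum(
        prob_other_state((k + 0.5) * h, rate_out, rate_back) for k in range(steps)
    )


if __name__ == "__main__":
    # Hypothetical rates: the starting state is left at rate 1, re-entered at rate 4.
    for t in (0.5, 2.0, 10.0):
        closed = expected_time_other_state(t, 1.0, 4.0)
        numeric = expected_time_numeric(t, 1.0, 4.0)
        print(f"t={t:5.1f}  closed form={closed:.6f}  numeric={numeric:.6f}")
```

The closed form is a term linear in the trip length plus a term proportional to one minus a decaying exponential, which is the shape the opportunity-cost discussion relies on. The expected time in the starting state is `t` minus this value.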
What does this tell us about the opportunity cost? Let us define as the driver’s earnings rate while the world state is (whether the driver is open, or on a trip that started in either state). is close to but not exactly , which instead is the earnings rate counting open time and trips that start in state . Then, the driver’s opportunity cost during time , starting in state is
Though is not a simple expression in terms of , several insights emerge:
One. The network minutes opportunity cost is of the form, , for some . This matches the shape of our IC pricing scheme, which has different that incorporate complications ignored here.
Two. As trip length , the first component dominates the opportunity cost. This component is the same whether the given trip request starts in state or , i.e. the stationary distribution of a positive recurrent CTMC does not depend on the starting state. This fact implies that we cannot always construct incentive compatible prices, for any : as , the trip’s opportunity cost does not depend on the starting state , and so the trip’s payments must be similar, . When all trips in the non-surge state are long, i.e. is concentrated around large values, the earnings rate in each state must be similar, .
exactly encodes such constraints, as shown in Figure 4. As the mean of goes to , then and so , and so the range of feasible expands. Similarly, also plays a large role. When small, the surge state is long. Thus, regardless of how long a driver’s last non-surge trip is, she will receive many trips during surge – and so long trips during non-surge are no longer constrained to be highly paid compared to short trips.
Continuation value opportunity cost
The previous discussion misses a crucial detail: it is not sufficient to consider just the opportunity cost for the duration of the trip. A driver’s counter-factual earnings from rejecting the trip depend on the future trips that she accepts. Such counter-factual trips both (1) pay the driver according to their starting state even after a world state transition, i.e. the difference between and above; and (2) potentially are still in progress past time , when the current trip ends. This second complication is illustrated in Figure 2, where a driver can extend the time she spends on trips starting in the surge state by rejecting short surge trips. The effect depends on the lengths of future potential trips, i.e. , and state transitions during those trips, , and is incorporated in both and the pricing scheme.
4.4 Proof sketch of Theorem 4.2
The result is shown in the appendix by manipulating the derivative of the reward function with respect to the policy . In particular, when the pricing function is of the given form with the appropriate constants , then any policy can be locally improved by adding more trips to it, i.e. the overall reward is non-decreasing as the driver accepts more trips: . This result follows from , for all , given the constraints, where is an upper endpoint of the policy in a state, .
The key step is finding sufficient constraints for this derivative to be positive with a pricing function of the given form, given any , as opposed to just . This difficulty emerges because incentive compatibility is a global condition on the set function . In particular, we need to express these constraints simply – e.g. as a function of just , instead of the values . The presented in the theorem statement results from such a set of constraints on .
5 Approximate Incentive Compatibility with Additive Surge
We now analyze surge policies that reflect practice at ride-hailing platforms today, as they are simple to communicate through a heat-map. Non-surge pricing is typically purely multiplicative, i.e. , where is the base time (and distance) rate for a ride. We consider two types of affine surge pricing , which differ in their relationship to through a single parameter:
In multiplicative surge, a higher multiplier is used than the base fare , and is reported on the heatmap as in Figure 1(a); in additive surge, the same base fare multiplier on the trip length is used in both surge and non-surge times, with an additive factor during surge that is reported on the heatmap in Figure 1(b). These surge functions are simple to calculate, given fixed primitives and target earnings rate in the surge state: or are determined given these values.
Figure 5 shows these types of pricing, compared to the incentive compatible pricing function. Multiplicative surge has constant and so under-pays short trips and over-pays long trips compared to IC pricing. Additive surge asymptotically (for large ) pays the same as multiplicative non-surge pricing, i.e. . As a result, it over-pays short trips and under-pays long trips compared to IC surge pricing.
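The contrast between the two surge types can be seen by tabulating the payout per unit of trip time. The following is a small Python sketch; the base rate, surge multiplier, and additive bonus are hypothetical values chosen for illustration, not calibrated to any target earnings rate.

```python
def multiplicative_surge(tau, multiplier):
    """Payout scales with trip length: w(tau) = multiplier * tau."""
    return multiplier * tau


def additive_surge(tau, base_rate, bonus):
    """Base fare plus a length-independent bonus: w(tau) = base_rate * tau + bonus."""
    return base_rate * tau + bonus


def pay_per_unit_time(payout, tau):
    """Earnings per unit of on-trip time for a trip of length tau."""
    return payout / tau


# Hypothetical parameters (assumptions for illustration only).
BASE_RATE = 1.0
SURGE_MULTIPLIER = 1.5
BONUS = 5.0

if __name__ == "__main__":
    for tau in (2.0, 10.0, 50.0, 250.0):
        mult = pay_per_unit_time(multiplicative_surge(tau, SURGE_MULTIPLIER), tau)
        add = pay_per_unit_time(additive_surge(tau, BASE_RATE, BONUS), tau)
        print(f"tau={tau:6.1f}  multiplicative={mult:.3f}  additive={add:.3f}")
```

Under multiplicative surge the per-unit-time payout is the same for every trip length, whereas under additive surge it is highest for short trips and falls toward the base rate as trips get longer.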
Uber has recently started a transition from multiplicative to additive surge. In this section, we argue that the additive component is more important than the multiplicative component for incentive compatibility, motivating Uber’s transition.
Computing optimal driver policies
Recall that Theorem 3.3 establishes that multiplicative pricing (and, more generally, affine pricing) may not be incentive compatible. However, we still wish to compare the various types of surge pricing. We thus compare how approximately incentive compatible these pricing functions are, measured by the fraction of trips accepted by an optimal driver policy with respect to the pricing function.
To make this comparison, however, one needs to calculate optimal driver policies with respect to a pricing function. Recall that the optimal driver policy in each state is some subset of . Finding such optimal subsets for general pricing functions is computationally intractable. Thus, Theorem 3.3 is particularly important for computational reasons. It establishes that, for any affine pricing structure in the surge state, there exists a driver optimal policy of the form , for some . Thus, we only need to find the values for these parameters that maximize the driver reward among sets of this form, and the resulting policy is optimal. Deriving closed forms is still intractable, but we can computationally find them through grid search and numeric integration. Note that the theorem does not establish uniqueness of the driver optimal policy; we thus choose the policy that maximizes the fraction of trips accepted in our computations.
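The grid-search idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the computation used for the figures: it replaces numeric integration with Monte Carlo simulation, assumes exponential trip lengths, searches only a one-parameter family of surge policies (reject surge trips shorter than a threshold, as suggested for multiplicative surge), and accepts every non-surge trip. All parameter values and function names are hypothetical.

```python
import math
import random

NON_SURGE, SURGE = 0, 1


def simulate_earnings_rate(accept, pay, switch_rate, request_rate, trip_mean,
                           horizon=50_000.0, seed=0):
    """Monte Carlo estimate of one driver's long-run earnings per unit time.

    accept(state, tau) -> bool is the policy and pay(state, tau) -> float the
    pricing; the other model arguments are 2-element lists indexed by state.
    """
    rng = random.Random(seed)
    total_switch = switch_rate[0] + switch_rate[1]
    state, clock, earned = NON_SURGE, 0.0, 0.0
    while clock < horizon:
        # Driver is open: the next event is a world-state switch or a request.
        event_rate = switch_rate[state] + request_rate[state]
        clock += rng.expovariate(event_rate)
        if rng.random() < switch_rate[state] / event_rate:
            state = 1 - state
            continue
        tau = rng.expovariate(1.0 / trip_mean[state])  # assumed exponential
        if not accept(state, tau):
            continue
        earned += pay(state, tau)  # paid according to the starting state
        clock += tau
        # World state at drop-off: flipped with the two-state CTMC probability.
        p_flip = switch_rate[state] / total_switch * (
            1.0 - math.exp(-total_switch * tau))
        if rng.random() < p_flip:
            state = 1 - state
    return earned / clock


def best_surge_threshold(pay, thresholds, **model):
    """Grid search over policies rejecting surge trips shorter than a threshold."""
    results = {}
    for t in thresholds:
        def accept(state, tau, t=t):
            return state == NON_SURGE or tau >= t
        results[t] = simulate_earnings_rate(accept, pay, **model)
    return max(results, key=results.get), results


if __name__ == "__main__":
    # Hypothetical primitives: short-lived surge, multiplicative surge of 3x.
    model = dict(switch_rate=[0.1, 1.0], request_rate=[2.0, 4.0],
                 trip_mean=[1.0, 1.0])

    def multiplicative_pay(state, tau):
        return (3.0 if state == SURGE else 1.0) * tau

    best, results = best_surge_threshold(
        multiplicative_pay, [0.0, 0.25, 0.5, 1.0], **model)
    for t, rate in results.items():
        print(f"threshold={t:.2f}  earnings rate={rate:.4f}")
    print("best threshold on this grid:", best)
```

The estimates are noisy, so small differences between grid points should not be over-interpreted. A threshold of zero corresponds to accepting every trip, so a strictly positive best threshold indicates that the pricing function is not incentive compatible for the chosen primitives.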
Approximate incentive compatibility
We now study how (approximately) incentive compatible the surge mechanisms are, i.e. the fraction of trips accepted in the surge state by the optimal policy. Figure 6 shows how this fraction changes with the primitives. The shaded regions correspond to areas where the surge pricing function is fully incentive compatible in the surge state ( is optimal), and the lines are contour lines for approximate incentive compatibility, indicating the fraction of trips accepted. For example, when , then about are accepted with additive surge, and are accepted with multiplicative surge.
Overall, we note that additive surge is far closer to incentive compatible in the most common parameter regimes for ride-hailing platforms such as Uber: (1) surge is between and times more valuable than non-surge; (2) surge is short-lived compared to non-surge periods (); and (3) in a typical surge the driver will be able to receive several trip requests (, but small) but may only be able to complete one or two such trips ( mean trip length). In each of these regimes, additive surge is either fully incentive compatible or more approximately IC than is multiplicative surge. For example, with (i.e. a surge multiplier of ), every trip is accepted with additive surge for any in our range, whereas up to of trips are rejected with multiplicative surge. These simulations thus support Uber’s recent shift from multiplicative to additive surge. We can also draw qualitative insights in terms of sensitivity to the primitives, similar in spirit to effects in the form of in Theorem 4.2.
Figure 6(a) shows how the approximate IC properties of additive and multiplicative surge change with and . As the arrival rate of jobs in the surge state, , increases, both types of surge become less incentive compatible: “cherry-picking” becomes easier, as the driver is likely to receive many more trip requests before surge ends. Similarly, as surge becomes increasingly more valuable compared to non-surge ( increases), the incentive to reject non-valuable trips in the surge state increases (short trips with multiplicative surge, long trips with additive surge).
For additive surge, there is an interesting non-monotonicity with : when , the effect above dominates, and long trips are rejected. When the surge state is moderately more valuable than non-surge, additive surge effectively balances the payments for different trip lengths, and so is incentive compatible. When the two states become almost equally valuable, however, the optimal driver policy with additive surge again rejects long trips: the system approximates our single-state model, and so additive surge may not be incentive compatible, cf. Theorem 3.1.
Figure 6(b) shows the effects of the relative lengths of surge and non-surge. Note that, here, the two types of surge are incentive compatible in exactly opposing regimes. When is large, surge is comparatively rare and short, and so short trips are naturally undervalued (accepting them decreases the time spent in the surge state), and additive surge is more incentive compatible. With long-lasting surge (small ), on the other hand, the world almost seems unchanging in the surge state, and so multiplicative surge nears incentive compatibility. In modern ride-hailing platforms, the scenario with short, infrequent surge is more common.
In this work, we studied the problem of designing incentive compatible mechanisms for ride-hailing marketplaces. We presented a dynamic model to capture essential features of these environments. Even though our model is simple and stylized, it highlights how driver incentives, and subsequently dynamic pricing strategies, change in the presence of stochasticity. We hope our work inspires other researchers in this area to incorporate such uncertainty in their models, as it is one of the biggest challenges faced in practice.
An important direction for extending our work is studying matching and pricing policies jointly, i.e. how to best match open drivers to riders in the presence of such effects, cf. (Özkan and Ward 2016, Banerjee et al. 2017, Feng et al. 2017, Zhang et al. 2017, Banerjee et al. 2018, Hu and Zhou 2018, Korolko et al. 2018, Ashlagi et al. 2018, Kanoria and Qian 2019). In this work, we look only at incentive compatible pricing. The platform can, in addition to pricing, use its matching policy to align incentives.
We would like to thank Uber’s driver pricing data science team, in particular Carter Mundell, Jake Edison, Alice Lu, Michael Sheldon, Margaret Tian, Qitang Wang, Peter Cohen, and Kane Sweeney, for their support and suggestions, without which this work would not have been possible. We also thank Leighton Barnes, Ashish Goel, Ramesh Johari, Vijay Kamble, Hannah Li, and Virag Shah. This work was funded in part by the Stanford Cyber Initiative, the Office of Naval Research grant N00014-15-1-2786, and National Science Foundation grant 1544548.
- Afèche et al. (2018) Afèche P, Liu Z, Maglaras C (2018) Ride-Hailing Networks with Strategic Drivers: The Impact of Platform Control Capabilities on Performance. SSRN Electronic Journal ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.3120544.
- Ashlagi et al. (2018) Ashlagi I, Burq M, Jaillet P, Saberi A (2018) Maximizing efficiency in dynamic matching markets. arXiv preprint arXiv:1803.01285 URL https://arxiv.org/pdf/1803.01285.pdf.
- Ata et al. (2019) Ata B, Barjesteh N, Kumar S (2019) Spatial Pricing: An Empirical Analysis of Taxi Rides in New York City. Working Paper .
- Bai et al. (2018) Bai J, So KC, Tang CS, Chen XM, Wang H (2018) Coordinating Supply and Demand on an On-Demand Service Platform with Impatient Customers. Manufacturing & Service Operations Management URL http://dx.doi.org/10.1287/msom.2018.0707.
- Banerjee et al. (2017) Banerjee S, Gollapudi S, Kollias K, Munagala K (2017) Segmenting two-sided markets. Proceedings of the 26th International Conference on World Wide Web, 63–72.
- Banerjee et al. (2018) Banerjee S, Kanoria Y, Qian P (2018) State Dependent Control of Closed Queueing Networks with Application to Ride-Hailing URL http://arxiv.org/abs/1803.04959.
- Banerjee et al. (2015) Banerjee S, Riquelme C, Johari R (2015) Pricing in Ride-Share Platforms: A Queueing-Theoretic Approach. SSRN Electronic Journal ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.2568258.
- Bertsimas and van Ryzin (1991) Bertsimas DJ, van Ryzin G (1991) A Stochastic and Dynamic Vehicle Routing Problem in the Euclidean Plane. Operations Research 39(4):601–615, ISSN 0030-364X, 1526-5463, URL http://dx.doi.org/10.1287/opre.39.4.601.
- Bertsimas and van Ryzin (1993) Bertsimas DJ, van Ryzin G (1993) Stochastic and Dynamic Vehicle Routing in the Euclidean Plane with Multiple Capacitated Vehicles. Operations Research 41(1):60–76, ISSN 0030-364X, 1526-5463, URL http://dx.doi.org/10.1287/opre.41.1.60.
- Besbes et al. (2018a) Besbes O, Castro F, Lobel I (2018a) Spatial Capacity Planning. SSRN Electronic Journal ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.3292651.
- Besbes et al. (2018b) Besbes O, Castro F, Lobel I (2018b) Surge Pricing and Its Spatial Supply Response. SSRN Electronic Journal ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.3124571.
- Bimpikis et al. (2016) Bimpikis K, Candogan O, Saban D (2016) Spatial Pricing in Ride-Sharing Networks. SSRN Scholarly Paper ID 2868080, Social Science Research Network, Rochester, NY, URL https://papers.ssrn.com/abstract=2868080.
- Buchholz (2017) Buchholz N (2017) Spatial Equilibrium, Search Frictions and Efficient Regulation in the Taxi Industry URL https://scholar.princeton.edu/sites/default/files/nbuchholz/files/taxi_draft.pdf.
- Cachon et al. (2017) Cachon GP, Daniels KM, Lobel R (2017) The Role of Surge Pricing on a Service Platform with Self-Scheduling Capacity. Manufacturing & Service Operations Management 19(3):368–384, ISSN 1523-4614, URL http://dx.doi.org/10.1287/msom.2017.0618.
- Castillo et al. (2017) Castillo JC, Knoepfle D, Weyl G (2017) Surge Pricing Solves the Wild Goose Chase. 241–242 (ACM Press), ISBN 978-1-4503-4527-9, URL http://dx.doi.org/10.1145/3033274.3085098.
- Chen and Sheldon (2016) Chen MK, Sheldon M (2016) Dynamic Pricing in a Labor Market: Surge Pricing and Flexible Work on the Uber Platform URL http://dx.doi.org/10.1145/2940716.2940798.
- Chen and Hu (2018) Chen Y, Hu M (2018) Pricing and Matching with Forward-Looking Buyers and Sellers. SSRN Scholarly Paper ID 2859864, Social Science Research Network, Rochester, NY, URL https://papers.ssrn.com/abstract=2859864.
- Cook et al. (2018) Cook C, Diamond R, Hall J, List J, Oyer P (2018) The Gender Earnings Gap in the Gig Economy: Evidence from over a Million Rideshare Drivers URL http://dx.doi.org/10.3386/w24732.
- Cramer and Krueger (2016) Cramer J, Krueger AB (2016) Disruptive Change in the Taxi Business: The Case of Uber. American Economic Review 106(5):177–182, ISSN 0002-8282, URL http://dx.doi.org/10.1257/aer.p20161002.
- Feng et al. (2017) Feng G, Kong G, Wang Z (2017) We are on the way: Analysis of on-demand ride-hailing systems URL https://dx.doi.org/10.2139/ssrn.2960991.
- Gallager (2013) Gallager RG (2013) Stochastic Processes: Theory for Applications (Cambridge University Press), ISBN 978-1-107-03975-9.
- Guda and Subramanian (2019) Guda H, Subramanian U (2019) Your uber is arriving: Managing on-demand workers through surge pricing, forecast communication, and worker incentives. Management Science 65(5):1995–2014, URL http://dx.doi.org/10.1287/mnsc.2018.3050.
- Hall et al. (2017) Hall JV, Horton JJ, Knoepfle DT (2017) Labor Market Equilibration: Evidence from Uber URL https://eng.uber.com/research/labor-market-equilibration-evidence-from-uber/.
- Hall et al. (2015) Hall JV, Kendrick C, Nosko C (2015) The effects of Uber’s surge pricing: A case study URL https://eng.uber.com/research/the-effects-of-ubers-surge-pricing-a-case-study/.
- Hu and Zhou (2018) Hu M, Zhou Y (2018) Dynamic type matching. Rotman School of Management Working Paper (2592622).
- Kamble (2018) Kamble V (2018) Revenue Management on an On-Demand Service Platform URL http://arxiv.org/abs/1803.06797.
- Kanoria and Qian (2019) Kanoria Y, Qian P (2019) Near Optimal Control of a Ride-Hailing Platform via Mirror Backpressure URL http://arxiv.org/abs/1903.02764.
- Korolko et al. (2018) Korolko N, Woodard D, Yan C, Zhu H (2018) Dynamic Pricing and Matching in Ride-Hailing Platforms. SSRN Electronic Journal 40, ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.3258234.
- Lu et al. (2018) Lu A, Frazier PI, Kislev O (2018) Surge Pricing Moves Uber’s Driver-Partners. Proceedings of the 2018 ACM Conference on Economics and Computation, 3–3, EC ’18 (New York, NY, USA: ACM), ISBN 978-1-4503-5829-3, URL http://dx.doi.org/10.1145/3219166.3219192.
- Ma et al. (2018) Ma H, Fang F, Parkes DC (2018) Spatio-Temporal Pricing for Ridesharing Platforms URL http://arxiv.org/abs/1801.04015.
- Özkan and Ward (2016) Özkan E, Ward A (2016) Dynamic Matching for Real-Time Ridesharing. SSRN Electronic Journal ISSN 1556-5068, URL http://dx.doi.org/10.2139/ssrn.2844451.
- Uber (2019a) Uber (2019a) Community Guidelines. URL https://www.uber.com/legal/community-guidelines/us-en/.
- Uber (2019b) Uber (2019b) Dependable Earnings. URL https://www.uber.com/drive/resources/dependable-earnings/.
- Uber (2019c) Uber (2019c) How are fares calculated. URL https://help.uber.com/riders/article/how-are-fares-calculated?nodeId=d2d43bbc-f4bb-4882-b8bb-4bd8acf03a9d.
- Uber (2019d) Uber (2019d) New Driver Surge. URL https://www.uber.com/blog/your-questions-about-the-new-surge-answered/.
- Uber (2019e) Uber (2019e) Service Fee. URL https://marketplace.uber.com/pricing/service-fee.
- Wald (1973) Wald A (1973) Sequential analysis (Courier Corporation).
- Yang et al. (2018) Yang P, Iyer K, Frazier P (2018) Mean field equilibria for resource competition in spatial settings. Stochastic Systems 8(4):307–334.
- Zhang et al. (2017) Zhang L, Hu T, Min Y, Wu G, Zhang J, Feng P, Gong P, Ye J (2017) A taxi order dispatch model based on combinatorial optimization. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2151–2159 (ACM).
7 Proofs of main text theorems and lemmas
In this Appendix, we provide proofs of the theorems and lemmas in the main text. The proofs rely on claims proved in Appendix Section B.
7.1 Notation and assumptions
In general for the dynamic model, we use in the function argument when the function depends on policies in both states, and when it only depends on the policy in state .
In the dynamic model, let .
All policy equalities are up to sets of measure $0$.
We use $\propto$ to denote “has the same sign as”, rather than “proportional to”.
Assumptions (repeat from main text).
Distribution of jobs is a continuous probability measure, i.e. bounded.
Payment functions are continuous.
We assume that there exists a policy in state 2 that dominates state 1: such that .
constrained to be measurable with respect to , and are open.
7.2 Single-state model theorem and propositions proofs
7.2.1 Proof of Theorem 3.1
We now prove Theorem 3.1, regarding the form of the optimal policy in the single-state model – where the length of a trip does not matter, only the earnings rate. The optimal policy trades off the earnings rate while on a trip with the driver’s utilization rate. At a high level, the proof proceeds as follows: starting from any policy that is not of the appropriate form, we replace trips in the policy with those with a higher earnings rate, while keeping the utilization rate exactly the same. Such replacements result in a policy that is almost of the correct form, except there may be an earnings rate such that only a subset of is in the policy. The remainder of the proof is showing that such a policy can be improved to form a policy of the appropriate form.
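The trade-off between on-trip earnings rate and utilization can be checked by brute force on a small discrete example. The following Python sketch assumes the standard renewal-reward form of the single-state earnings rate (arrival rate times expected accepted payout, divided by one plus arrival rate times expected accepted trip time); the trip types and arrival rate are hypothetical.

```python
from itertools import combinations


def earnings_rate(policy, trips, arrival_rate):
    """Renewal-reward earnings rate when only trip types in `policy` are accepted.

    trips is a list of (probability, length, payout); policy is a set of indices.
    """
    paid = sum(trips[k][0] * trips[k][2] for k in policy)
    busy = sum(trips[k][0] * trips[k][1] for k in policy)
    return arrival_rate * paid / (1.0 + arrival_rate * busy)


def best_policy(trips, arrival_rate):
    """Exhaustively search all non-empty subsets of trip types."""
    indices = range(len(trips))
    candidates = (
        set(c) for r in range(1, len(trips) + 1) for c in combinations(indices, r)
    )
    return max(candidates, key=lambda s: earnings_rate(s, trips, arrival_rate))


if __name__ == "__main__":
    # Hypothetical trip types: (probability, length, payout).
    trips = [(0.25, 1.0, 3.0), (0.25, 2.0, 3.0), (0.25, 4.0, 5.0), (0.25, 8.0, 6.0)]
    best = best_policy(trips, arrival_rate=1.0)
    print("optimal policy:", sorted(best))
    print("earnings rate:", earnings_rate(best, trips, 1.0))
    for k, (_, length, payout) in enumerate(trips):
        print(f"trip {k}: payout per unit time = {payout / length:.2f}")
```

In this example the optimal subset consists of exactly the trip types whose payout per unit time is at least the optimal earnings rate, which is the threshold structure the proof establishes.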
Proof. Let . Assume that . Otherwise any policy is optimal and so the result is trivial.
Start at . We first show that there exists such that or such that . Assume that . (If is either or , we are done, as and is of the desired form.)
First we construct , where and .
For the given , let
is a set of trips that pay more than per unit time but are not in , and is the set of the trips that pay less than per unit time but are not in . is the mean extra utilization that trips in contribute in a renewal cycle. The idea is that if we find sets such that and , then : the denominator of the reward stays the same, and the numerator increases. A few facts that follow from assumptions:
is non-decreasing as increases, and
is non-increasing as increases, and
, both are continuous from the left in .
The above imply that such that .
Thus, there exists
If , then we are done with this part: let .
Otherwise if (which can happen if there is a point mass at ):
By , for all : . Then
let such that . Such exists by continuous.
We now have , where , and , unless already was of the form for some .
Next, we construct or such that .
Suppose . Then
Where the inequality follows from ,
, and .
Similarly, suppose . Then