Counterbalancing Learning and Strategic Incentives in Allocation Markets

by Jamie Kang, et al.
Stanford University

Motivated by the high discard rate of donated organs in the United States, we study an allocation problem in the presence of learning and strategic incentives. We consider a setting where a benevolent social planner decides whether and how to allocate a single indivisible object to a queue of strategic agents. The object has a common true quality, good or bad, which is ex-ante unknown to everyone. Each agent holds an informative, yet noisy, private signal about the quality. To make a correct allocation decision the planner attempts to learn the object quality by truthfully eliciting agents' signals. Under the commonly applied sequential offering mechanism, we show that learning is hampered by the presence of strategic incentives as herding may emerge. This can result in incorrect allocation and welfare loss. To overcome these issues, we propose a novel class of incentive-compatible mechanisms. Our mechanism involves a batch-by-batch, dynamic voting process using a majority rule. We prove that the proposed voting mechanisms improve the probability of correct allocation whenever agents are sufficiently well informed. Particularly, we show that such an improvement can be achieved via a simple greedy algorithm. We quantify the improvement using simulations.




1 Introduction

This paper contributes to the growing literature at the intersection of learning and mechanism design, by considering a fundamental allocation problem in the presence of learning and strategic agents. In our setting, a planner needs to allocate a single indivisible object to at most one among several agents waiting in a queue. The object’s true quality is either high (“good”) or low (“bad”), but is ex-ante unknown to both the planner and the agents. Agents are privately informed, in the sense that each receives a private signal that is informative about the true quality of the object, and strategic, since they choose to accept or reject the object based on the action maximizing their expected utility. In this framework, agents thus engage in a social learning process as they observe the actions of their predecessors but not their private signals. The planner’s goal is to design an incentive-compatible mechanism that optimizes correctness, i.e., maximizes the ex-ante accuracy of allocation.

This problem finds a natural application in the allocation of donated organs to patients on a national waitlist. The median waiting time for a first kidney transplant in the U.S. is around 3.6 years (National Kidney Foundation, 2016), whereas the rate at which deceased donor kidneys are offered to patients but eventually discarded has been growing steadily (Mohan et al., 2018), exceeding in 2020 (U.S. Department of Health and Human Services, 2021). The reasons leading to high discard rates – even of seemingly high-quality organs – remain unclear. Clinical studies attribute this increase to herding behavior and inefficiency in allocation mechanisms (Stewart et al., 2017; Cooper et al., 2019). Improving the allocation outcome of these markets is undoubtedly crucial as it could save thousands of future lives.

Towards understanding the source of high discard rates, our framework highlights the often overlooked role of social learning. In most cases, when offered an organ, patients not only have access to publicly available data about the organ (e.g., donor’s age, organ size) but also consult with their own physicians for private opinions. Each physician may assess the same organ differently based on their prior experience and/or medical expertise. Empirical works (De Mel et al., 2020; Zhang, 2010) have recently proposed an explanation that patients rationally ignore their own private information (e.g., a physician’s medical assessment) about the quality of an organ and follow the preceding patients’ rejections.

Our proposed model captures this phenomenon and serves as a theoretical sandbox for the design and assessment of alternative allocation policies. We theoretically show that sequentially offering the object to agents – which is a policy commonly used in practice – can result in poor correctness, since the rejections by the few initial agents can cause a cascade of rejections by the subsequent agents.

To address this issue, we introduce a novel class of incentive-compatible allocation mechanisms characterized by a batch-by-batch, dynamic voting process under a simple majority rule. Our proposed greedy mechanism dynamically partitions the list into batches of varying sizes. Then, it sequentially offers the object to each batch until the object is allocated to some agent. The planner privately asks each agent in the current batch to either opt in or opt out. If the majority of the current batch vote to opt in, then the planner randomly allocates the object to one of the current agents who opted in. If the mechanism exhausts the list without success, the object is discarded. Interestingly, several Organ Procurement Organizations (OPOs) already offer organs in batches, although no voting is used; the reasons are both to speed up the allocation process and to avoid herding. (Footnote: Based on authors’ personal communication. See, e.g., Mankowski et al. (2019) for allocation in batches.)

In our main result, we show that if private signals are at least as informative as the prior, there always exists an incentive-compatible voting mechanism that strictly improves correctness and thus helps avoid unwanted discards. Furthermore, we suggest a simple, greedy algorithm to implement such a mechanism. For all other values of private signal precision, we establish that there is no incentive-compatible voting mechanism. Moreover, every voting mechanism results in the same correctness as the sequential offering mechanism.

Incentive-compatible voting mechanisms have several interesting properties. First, the batch size increases after each failed allocation attempt. The intuition is simple. Failed allocations make the current belief about the object quality more pessimistic. Therefore, the planner increases the batch size to ensure that the object is allocated only if a sufficient number of agents receive positive signals while each agent’s decision remains pivotal. Second, offering to another batch after a failed attempt strictly improves correctness (compared to sequential offering). As our simulations showcase, this improvement is significant and generally becomes larger as the prior and signal precision grow; in fact, a voting mechanism with just two batches significantly outperforms sequential offering. Finally, we find that the prior and signal precision have opposite effects on correctness: the correctness tends to decrease as the prior increases but the marginal effect of signal precision is positive. An intuitive explanation is as follows. Since the optimal batch size decreases as the prior increases, the probability of incorrectly allocating a bad object increases. However, as the signal precision improves, the magnitude of the error decreases since agents’ signals become more reliable.

We highlight the tension between agents’ strategic incentives and the planner’s learning goal. From the planner’s perspective, agents’ private signals contain valuable information about the unknown quality of the object. By increasing the number of collected data points, the planner can improve her confidence about the true quality of the object. Agents’ strategic incentives, however, impose a constraint on the maximum size of the batch that allows the truthful elicitation of agents’ signals: as the batch size increases, the chance to be pivotal decreases which in turn reduces the incentive for an agent with a negative signal to be truthful. Our proposed voting mechanisms enable the planner to crowdsource agents’ private information in a simple, truthful, and effective manner.

Related literature. Our model draws upon the seminal social learning papers of Banerjee (1992), Bikhchandani et al. (1992), and Smith and Sørensen (2000) and their extensions in various game structures (Acemoglu et al., 2010; Eyster et al., 2013; Mossel et al., 2015; Arieli and Mueller-Frank, 2019). Several empirical studies also examine social learning in organ allocation settings (Zhang, 2010; De Mel et al., 2020). However, none of these works consider the existence of a planner that controls the flow of information. Furthermore, our proposed mechanism is inspired by the voting literature (e.g., De Condorcet, 1785; Austen-Smith and Banks, 1996); our notion of correctness is adopted from Arieli et al. (2018). Finally, our paper is loosely connected to information design (e.g., Kamenica and Gentzkow, 2011; Rayo and Segal, 2010; Papanastasiou et al., 2017; Che and Horner, 2015) and Bayesian exploration (e.g., Kremer et al., 2014; Immorlica et al., 2018; Mansour et al., 2016; Glazer et al., 2021; Immorlica et al., 2019). In contrast to our model, this literature generally does not consider an underlying social learning process. We include an extended overview of the related literature in Appendix A.

2 Model

We consider a model in which a social planner wishes to allocate a single indivisible object of unknown quality to privately informed agents in a queue.

Object and agents. The object is characterized by a fixed quality , where and denote a good and bad quality, respectively. The true quality of the object is ex-ante unknown to both the planner and agents; however, both share a common prior belief

Agents are waiting in a queue; we denote by the agent in position Each agent knows his own position. For simplicity, we assume that the queue consists of an arbitrarily large number of agents. Each agent has a private binary signal that is informative about the true quality of the object Each is identically and independently distributed conditional on Furthermore, each is aligned with with probability where the signal precision is commonly known. (Footnote: The condition is without loss of generality.) The common prior belief and signal precision are standard in the social learning literature (Banerjee, 1992; Bikhchandani et al., 1992; Smith and Sørensen, 2000).

Agents are risk-neutral. The utility of the agent who receives the object is if and if (This symmetry simplifies the exposition and analysis but does not qualitatively change the results.) Any agent who does not receive the object, because he either declines or is never offered, receives a utility of 0. As a tie-breaking rule, we assume that indifferent agents always decline.

Voting mechanisms. The planner designs a mechanism which potentially asks (a batch of) agents to report their private signals, and based on their report, decides whether and how to allocate the object. We consider a class of voting mechanisms denoted by A voting mechanism offers the object sequentially to odd-numbered batches of agents, where each batch size may be set dynamically based on the information from previous batches. In any given batch, if the majority of agents vote to opt in, the object is allocated uniformly at random to one of the current agents who opted in.

Formally, a voting mechanism is defined by a sequence of functions , such that for each batch the corresponding function maps a current belief about the object quality to a batch size (we let ). The mechanism begins with offering to the first batch of agents, who are in positions . If the object is not allocated to batch it is offered to the agents in the next positions.

When batch is offered the object, each agent in batch chooses an action (a vote) , where and correspond to opting in (“yes”) and opting out (“no”), respectively. If the number of opt in votes from batch , constitutes the majority of batch , i.e., , the object is allocated uniformly at random among agents in batch who opt in. To simplify exposition, we sometimes denote by the ex-post size of batch (while omitting the dependency on the current belief ). We assume that the agents in batch as well as the planner observe all the votes of agents in the previous batches , and this is commonly known. After any batch and given a current prior , the belief shared by the planner and agents is updated recursively in terms of the realized batch size and voting outcome as


A commonly applied voting mechanism is a sequential offering mechanism, which offers the object to one agent at a time. Another type of voting mechanism is single-batch voting mechanisms, which offer the object only once; thus, the object is discarded if the majority in the batch choose to opt out.
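The recursive belief update mentioned above can be illustrated with plain Bayes' rule. The following is a minimal sketch, assuming that every vote equals the voter's private signal (truthful voting) and that signals are conditionally independent with precision `q`; the function name `update_belief` and its arguments are hypothetical, and the expression is not claimed to match the exact form of Equation (1).

```python
def update_belief(belief: float, q: float, votes: list) -> float:
    """Posterior probability that the object is good after one batch of votes.

    Assumption: each vote (True = opt in, False = opt out) equals the voter's
    private signal, which matches the true quality with probability q.
    """
    yes = sum(votes)
    no = len(votes) - yes
    like_good = q ** yes * (1 - q) ** no   # P(votes | good object)
    like_bad = (1 - q) ** yes * q ** no    # P(votes | bad object)
    return belief * like_good / (belief * like_good + (1 - belief) * like_bad)


# A single opt-out lowers the belief; an opt-out followed by an opt-in cancels out.
print(update_belief(0.6, 0.7, [False]))
print(update_belief(0.6, 0.7, [False, True]))
```

Only the difference between opt-in and opt-out counts matters here, which is why a failed (majority opt-out) batch always makes the shared belief more pessimistic.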

Let denote the expected utility of an agent who receives signal and takes action . A voting mechanism is incentive-compatible (IC) if and Thus, agent gets higher utility from opting in than opting out when he has received signal , but prefers to opt out when .

Correctness. The planner’s goal is to maximize the probability of a correct allocation outcome, or correctness; the allocation outcome is correct if it allocates a good object or does not allocate a bad object. Formally, under a voting mechanism , the planner’s decision whether to allocate the object or not is denoted by the random variable

A mechanism achieves correctness if

Since we allocate a single object, a cascade can only form on the opt out action. Therefore, maximizing correctness is equivalent to maximizing social welfare.

Upper bound on correctness. A natural benchmark on correctness would be the optimal solution in a setting where agents are not strategic and thus are willing to truthfully reveal their private signal to the planner. In the absence of strategic incentives, the planner’s optimal solution would be to (i) ask all agents in the queue to reveal their private signals, (ii) compute her posterior belief based on the gathered information, and (iii) allocate the object to a random agent if and only if her posterior exceeds 0.5. Equivalently, as we show in Lemma 1 in Appendix B, the planner allocates the object if and only if the number of positive signals is at least where is the number of agents in the queue. In that case, we can show that correctness equals


where is a Binomial random variable (see also Lemma 2). As grows, correctness approaches 1. Importantly, this upper bound is not achievable by an incentive-compatible mechanism.
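To make the non-strategic benchmark concrete, the sketch below evaluates steps (i) to (iii) directly: it enumerates the number of positive signals among `n` agents, allocates exactly when the posterior exceeds 0.5, and sums the probability of a correct decision. The names `nonstrategic_correctness`, `mu`, `q`, and `n` are illustrative assumptions, and the computation works in log-odds rather than using the closed-form threshold from the lemma.

```python
from math import comb, log


def nonstrategic_correctness(mu: float, q: float, n: int) -> float:
    """Correctness when all n agents truthfully reveal their signals.

    mu is the prior that the object is good, q the signal precision.
    """
    log_prior_odds = log(mu / (1 - mu))
    log_signal_odds = log(q / (1 - q))
    correct = 0.0
    for k in range(n + 1):  # k = number of positive signals
        # Posterior exceeds 0.5 exactly when the posterior log-odds are positive.
        allocate = log_prior_odds + (2 * k - n) * log_signal_odds > 0
        p_k_good = comb(n, k) * q ** k * (1 - q) ** (n - k)   # P(k | good)
        p_k_bad = comb(n, k) * (1 - q) ** k * q ** (n - k)    # P(k | bad)
        # Correct outcomes: allocate a good object, or withhold a bad one.
        correct += mu * p_k_good if allocate else (1 - mu) * p_k_bad
    return correct


for n in (1, 11, 101):
    print(n, nonstrategic_correctness(0.5, 0.7, n))
```

Running the loop for growing `n` illustrates the statement that correctness approaches 1 as the queue grows.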

Discussion of modeling assumptions.

The model is based on several modeling choices that we discuss next. First, the assumption that voting outcomes are fully revealed to subsequent batches is made for simplicity. Dropping this knowledge complicates the analysis substantially: one should compute a posterior belief of an agent by taking expectations of possible outcomes that led to them being offered an object. Second, assuming that batches are always odd-numbered is of a technical nature. To attain incentive compatibility, our analysis relies on the monotonic behavior of the upper and lower bounds on the prior. This result depends on a probabilistic argument about binomial distributions (Lemma 2), which fails to hold if batches can be both of odd and even sizes. (Footnote: This issue can be addressed by introducing a random tie-breaking rule with appropriate weights.) Finally, we consider the allocation of just a single object rather than a setting in which objects arrive and are allocated sequentially. In particular, we abstract away from dynamic incentives and restrict attention to the learning aspect. We leave the interaction between the two for future work.

3 Herding in the Sequential Offering Mechanism

In this section we analyze the commonly applied sequential offering mechanism, which will serve as a benchmark for our results. Recall that the sequential offering mechanism, namely , offers the object to each agent one-by-one; the first agent that is offered the object and opts in receives it.

As we establish below, the sequential offering mechanism has two important drawbacks: it is not incentive-compatible and achieves poor correctness. This is due to a cascade of actions (i.e., herding) that takes place after a few decisions. Specifically, a cascade of opt out actions begins after a couple of initial agents opt out, which ultimately leads to the discard of the object. The number of initial opt out actions needed to instigate this cascade-led discard depends on the prior The following result formalizes this. A detailed analysis of the benchmark mechanism can be found in Appendix C together with the omitted proofs of the section.

Lemma 1.

Under , the object is allocated if and only if: (i) , (ii) and either or , (iii) and Otherwise, the object is discarded.

Figure 1: Allocation outcome of the sequential offer mechanism () based on the value of prior with respect to signal precision For low the object is rejected by all agents, leading to discard. For intermediate the allocation outcome depends on the initial two agents’ private signals.

A direct implication of this result is that the outcome of the mechanism is determined in the first two offers. Regardless of the values of , and the object’s true quality, the object is always discarded if it is rejected by the first two agents. Thus the result below follows.

Proposition 1.

Under the sequential offering mechanism, :

  • At most the two initial agents in the queue determine the allocation outcome: if the first two agents decline, then the object will always be discarded.

  • The correctness equals

The result qualitatively suggests that the first few agents in the queue have the power to decide whether the object will be allocated. (Footnote: We have assumed that agents share a constant prior In practice, each agent’s prior might be drawn i.i.d. from some common distribution with support . Our qualitative insights still extend to this case: a larger but finite number of initial agents determines the allocation outcome and herding still occurs.) This implies that these few agents can inadvertently undermine the welfare of the rest of the agents. Most importantly, the planner can never learn from the private information of the remaining agents to make better allocation decisions. In the context of organ transplants, this herding behavior prevents the planner from allocating good-quality organs, leading to a high discard rate. This may in turn lead to patients’ longer waiting times and health deterioration.
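The herding effect in this section can be reproduced with a short simulation of sequential offering. This is a sketch under stated assumptions: utilities are taken to be +1 for a good object and -1 for a bad one (so an agent opts in exactly when his posterior exceeds 0.5, with ties declining), later agents observe only earlier rejections, and the helper names are hypothetical. Correctness is computed exactly by enumerating the first `n` signals.

```python
from itertools import product


def posterior(belief: float, q: float, signal_good: bool) -> float:
    """Bayes update of the belief that the object is good after one signal."""
    like_good = q if signal_good else 1 - q
    like_bad = 1 - q if signal_good else q
    return belief * like_good / (belief * like_good + (1 - belief) * like_bad)


def sequential_offering(mu: float, q: float, signals) -> "int | None":
    """Index of the agent who accepts, or None if the object is discarded."""
    belief = mu
    for i, signal_good in enumerate(signals):
        accepts_if_good = posterior(belief, q, True) > 0.5
        accepts_if_bad = posterior(belief, q, False) > 0.5
        if accepts_if_good if signal_good else accepts_if_bad:
            return i
        if not accepts_if_good:
            # Opt-out cascade: even a positive signal no longer justifies
            # accepting, so rejections stop carrying information.
            return None
        # The rejection reveals a negative signal; the public belief drops.
        belief = posterior(belief, q, False)
    return None


def sequential_correctness(mu: float, q: float, n: int = 6) -> float:
    """Exact correctness, enumerating the signals of the first n agents."""
    correct = 0.0
    for signals in product([True, False], repeat=n):
        g = sum(signals)
        p_good = q ** g * (1 - q) ** (n - g)   # P(signals | good object)
        p_bad = (1 - q) ** g * q ** (n - g)    # P(signals | bad object)
        allocated = sequential_offering(mu, q, signals) is not None
        correct += mu * p_good if allocated else (1 - mu) * p_bad
    return correct


print(sequential_offering(0.6, 0.7, [False, False, True, True]))  # two rejections, then herding
print(sequential_correctness(0.55, 0.7))
```

In this sketch, two initial rejections are enough to discard the object even when every later agent holds a positive signal.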

4 Voting Mechanisms

Recall from Equation (2) that in the absence of strategic incentives, the planner asks as many agents as possible and makes her allocation decision based on these (truthful) private signals. On the other hand, as we have seen with when agents are strategic, the presence of their incentives introduces a constraint for the planner, which in turn sets an upper bound on the number of agents the planner can truthfully learn from. Under the benchmark mechanism for example, the planner can elicit at most two truthful signals, which may prevent her from making a correct allocation decision.

Motivated by these shortcomings, we propose a new class of voting mechanisms and examine the incentive-compatible mechanisms within this class. We show that there always exists an incentive-compatible voting mechanism that improves correctness as long as private signals are more informative than the common prior belief; otherwise, none of the voting mechanisms are incentive-compatible and all achieve correctness equal to that of . Theorem 1 is our main result.

Theorem 1.

For any there exists a voting mechanism that is incentive-compatible and improves correctness compared to the sequential offering mechanism . For there is no incentive-compatible voting mechanism and any achieves the same correctness as .

We prove Theorem 1 in several steps. In Section 4.1, we begin by defining the simplest voting mechanisms, the single-batch voting mechanisms, where the mechanism offers the object to one batch only and terminates afterwards. Specifically, we characterize the optimal mechanisms among all incentive-compatible single-batch voting mechanisms. Then in Section 4.2, we utilize these results to develop a simple greedy algorithm to construct an incentive-compatible, correctness-improving voting mechanism with potentially multiple batches.

4.1 Warm up: Optimal Single-batch Voting Mechanisms

As a warm up, we begin with the study of one of the simplest voting mechanisms: the single-batch voting mechanisms. With a slight abuse of notation, we use the following definition.

Definition 1.

Let be the single-batch voting mechanism where the object is offered to only one batch of the initial agents in the queue.

Unlike , the single-batch voting mechanism as well as any other voting mechanism, ensures that the object is allocated if and only if the majority of agents (in the batch) opt in. As our results in this section illustrate, the extent to which this policy incentivizes agents to vote truthfully depends on the values of the prior and batch size .

In Lemma 1, we identify the necessary and sufficient conditions for to be incentive-compatible. In Lemma 2, we show its existence conditional on the informativeness of signals and characterize the correctness it achieves. In Proposition 2, we establish that among all incentive-compatible correctness is maximized when the incentive compatibility constraint is binding at Finally in Lemma 3, we provide simple comparative statics of this optimal batch size. In the next section, we will extend these results to voting mechanisms possibly with multiple batches.

First, for any batch size let

be some interval of prior where Moreover, let (respectively, ) be the maximum (respectively, minimum) batch size such that Then the incentive compatibility of can be characterized in terms of the interval or equivalently, the batch size bounds and :

Lemma 1.

is incentive-compatible if and only if (i) , or equivalently (ii) and

The proof of the lemma can be found in Appendix D together with the rest of the proofs in this section. A sketch of the proof is as follows. To show condition (i), for a fixed , let (respectively, ) be the probability that the object gets allocated to some agent in the batch conditional on the true quality being good (respectively, bad). We compute that

Next, we use and to rewrite the incentive compatibility constraints as and Given that the left-hand sides of both inequalities monotonically increase with there exist some thresholds of prior, namely and , that solve the indifference conditions: and Based on this transformed system of equations, we are able to characterize the feasible region for the prior which turns out to be precisely . To show the equivalent condition (ii), we utilize some useful properties of the interval : first, its endpoints are weakly decreasing in (Lemma 1), and second, it is overlapping (i.e., ) (Lemma 2).
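The incentive compatibility constraints in this sketch of the proof can be checked numerically. The code below is an illustration under explicit assumptions: utilities are +1 for a good object and -1 for a bad one, the other agents in the batch vote truthfully, opting out yields 0, and the batch size `k` is odd. The names `win_probability`, `opt_in_utility`, and `is_incentive_compatible` are hypothetical and are not the paper's notation.

```python
from math import comb


def win_probability(k: int, p_yes: float) -> float:
    """P(a given agent who opts in receives the object) in a batch of odd size k,
    when each of the other k - 1 agents opts in independently with prob. p_yes."""
    others = k - 1
    total = 0.0
    # With the agent's own vote, a majority needs at least (k - 1) / 2 other opt-ins;
    # the object then goes to one of the j + 1 opt-in voters uniformly at random.
    for j in range(others // 2, others + 1):
        total += comb(others, j) * p_yes ** j * (1 - p_yes) ** (others - j) / (j + 1)
    return total


def opt_in_utility(mu: float, q: float, k: int, signal_good: bool) -> float:
    """Expected utility of opting in, given the agent's own signal."""
    like_good = q if signal_good else 1 - q
    like_bad = 1 - q if signal_good else q
    post_good = mu * like_good / (mu * like_good + (1 - mu) * like_bad)
    # Truthful others opt in w.p. q if the object is good, 1 - q if it is bad.
    return post_good * win_probability(k, q) - (1 - post_good) * win_probability(k, 1 - q)


def is_incentive_compatible(mu: float, q: float, k: int) -> bool:
    """Positive signal: strictly prefer opting in. Negative signal: opt out (ties decline)."""
    return opt_in_utility(mu, q, k, True) > 0 and opt_in_utility(mu, q, k, False) <= 0


for k in (1, 3, 5, 7):
    print(k, is_incentive_compatible(0.5, 0.7, k))
```

For a fixed prior, the check passes only for a window of batch sizes, which mirrors the interval characterization described in the text.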

Figure 2: Interval of incentive-compatible prior for different batch sizes as described in Lemma 1. The intervals are decreasing (Lemma 1) with and the consecutive intervals and overlap (Lemma 2) with each other.

Condition (ii) in terms of batch size provides a straightforward intuition. Larger batch sizes make agents more confident that, if the object is allocated, it is likely that the quality is good. For the same reason, however, a batch size that is too large incentivizes everyone to opt in, including those with negative signals. Borrowing the terminology from the voting literature, is designed to make each agent’s decision pivotal: is the maximum batch size that guarantees incentive compatibility. On the other hand, if the batch size is too small, then the allocation decision depends on learning from a sample size that is too small to offer reliable information, making agents with positive signals reluctant to opt in. Thus, is the minimum batch size that ensures incentive compatibility. Furthermore, Lemma 1 gives rise to the following observation.

Lemma 2.

Suppose Then, an incentive-compatible always exists and achieves correctness where

Lemma 2 shows that if each signal is at least as informative as the prior , there is a single-batch voting mechanism that elicits information in a truthful manner. Another implication of the lemma is that the correctness of an incentive-compatible can be expressed as the tail distribution of a random variable. This tail distribution can be interpreted as the probability that the majority of votes are aligned with the true quality. Given the definition of correctness, this property is thus particularly intuitive.
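The tail probability described here is easy to evaluate. As a minimal sketch, assuming truthful votes and an odd batch size `k`, the probability that the majority of votes matches the true quality is the upper tail of a Binomial(k, q) variable; the function name is hypothetical.

```python
from math import comb


def majority_correct_probability(k: int, q: float) -> float:
    """P(Binomial(k, q) >= (k + 1) / 2) for an odd batch size k."""
    return sum(
        comb(k, j) * q ** j * (1 - q) ** (k - j)
        for j in range((k + 1) // 2, k + 1)
    )


for k in (1, 3, 5, 7):
    print(k, majority_correct_probability(k, 0.7))
```

Under these assumptions the value does not depend on the prior: a good object is allocated when most signals are positive, and a bad object is withheld when most signals are negative, and both events have this same probability.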

An additional observation is that the tail distribution increases with (see Lemma 2 in Appendix E). Consequently, given the upper size limit maximizes correctness among all possible values of that conserve the incentive compatibility of We formalize this property below.

Proposition 2.

Suppose Then the batch size maximizes correctness among all incentive-compatible

Hence the planner will choose the largest batch size that the agents’ incentives allow. As we establish below, the optimal batch size behaves monotonically with respect to the prior

Lemma 3.

weakly decreases with

The logic here is simple. If the sentiment around the object quality is optimistic (i.e., is high), then it is harder to keep the agents with negative signals truthful, requiring a tighter upper bound on the batch size (lower ). We discuss additional comparative statics through simulations in Section 5.
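The monotonicity of the largest incentive-compatible batch size can be explored by brute force. This sketch searches odd batch sizes up to a hypothetical cap `cap` and returns the largest one that passes an incentive compatibility check; it assumes utilities of +1 and -1, truthful votes from the other agents, and the tie-breaking rule that indifferent agents decline. All names are illustrative.

```python
from math import comb


def _win(k: int, p: float) -> float:
    """P(a given opt-in voter receives the object) when the other k - 1
    agents each opt in with probability p (k odd, uniform random winner)."""
    others = k - 1
    return sum(
        comb(others, j) * p ** j * (1 - p) ** (others - j) / (j + 1)
        for j in range(others // 2, others + 1)
    )


def _is_ic(mu: float, q: float, k: int) -> bool:
    def utility(signal_good: bool) -> float:
        lg, lb = (q, 1 - q) if signal_good else (1 - q, q)
        post = mu * lg / (mu * lg + (1 - mu) * lb)
        return post * _win(k, q) - (1 - post) * _win(k, 1 - q)

    return utility(True) > 0 and utility(False) <= 0


def max_ic_batch_size(mu: float, q: float, cap: int = 51) -> "int | None":
    """Largest odd batch size <= cap that is incentive-compatible, or None."""
    best = None
    for k in range(1, cap + 1, 2):
        if _is_ic(mu, q, k):
            best = k
    return best


for mu in (0.5, 0.55, 0.65, 0.75):
    print(mu, max_ic_batch_size(mu, 0.7))
```

With signal precision 0.7, the returned size shrinks as the prior grows, and no size qualifies once the prior exceeds the signal precision.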

Finally, note that the results above focused on regimes where . This is a natural setting which guarantees that the signal is at least as informative as the prior belief. For the complementary case where , we show that there is no incentive-compatible voting mechanism (see Lemma 3 in Appendix D). Intuitively, the lack of informativeness of the signal induces the agents to lose trust in their private signals, and as a result creates no incentives to behave truthfully. In that case, under any all agents will opt in and thus the object will always be allocated. Notice that this outcome is equivalent to the outcome under in terms of correctness. (Footnote: The only difference is that under the object is always allocated to an agent in the batch uniformly at random, whereas under it is always allocated to agent .)

4.2 Improving Correctness via Greedy Voting Mechanisms

Next we utilize the results of Section 4.1 to characterize how to dynamically choose batch sizes of a voting mechanism (potentially with multiple batches) to improve correctness, while balancing the agents’ incentives. Further, we implement this mechanism via a simple greedy algorithm.

The first key technical observation is that, from the perspective of the agents in batch the voting mechanism is equivalent to a single-batch voting mechanism with realized common belief and batch size Naturally, since both the planner and agents observe the voting results from previous batches, no informational asymmetry arises between them.

Lemma 4.

A voting mechanism is incentive-compatible if and only if for any batch current belief and batch size is incentive-compatible.

Hence, to ensure the incentive compatibility of it is sufficient to choose each batch size myopically by solving a single-batch voting mechanism design problem (with updated belief instead of ). This suggests the following greedy scheme.

Definition 2.

For any let be the voting mechanism that offers the object to batches unless either it is allocated or the list is exhausted. Formally, for each batch and realized belief the batch sizes are chosen as

Building upon Lemma 4 and results from Section 4.1, we establish three key properties of .

Proposition 3.

For any and has the following properties:

  • is incentive-compatible;

  • Ex-post batch sizes satisfy for any ;

  • strictly increases with .

To understand part (ii), suppose that the object had been offered to batch but was not allocated. Then, the current belief naturally decreases, that is , as the majority of batch has voted to opt out. By Lemma 3, a more pessimistic belief about the object quality results in a larger optimal batch size Hence, it follows that Part (iii) of the proposition suggests that adding another batch improves correctness. The intuition is that an additional batch allows the planner to collect and learn from more information about the object. Consequently, the planner would like to keep on offering until either the object is allocated or there are not enough agents left in the queue. We formally define this mechanism below and describe a simple algorithm to implement it.

Definition 3.

Let be the greedy voting mechanism such that for each batch and the realized belief the batch sizes are chosen as

Initialize batch belief batch size ;
while  do
        Collect votes from the top remaining agents;
        if  then
               Allocate the object uniformly at random among agents in batch who opted in, and stop;
        else
               Update belief using Equation (1) and next batch size ;
               Set ;
Algorithm 1 Implementation of

Note that a major advantage of is its design simplicity: the maximum number of batches, does not have to be predefined.
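The greedy procedure can be sketched in runnable form as follows. This is an illustration rather than the authors' implementation: it assumes that each batch size is the largest incentive-compatible odd size at the current belief (searched up to a hypothetical `cap`), that votes equal private signals, that utilities are +1 and -1, and that the object is discarded when no feasible batch remains. All function and parameter names are assumptions.

```python
import random
from math import comb


def _win(k, p):
    """P(a given opt-in voter wins) when the other k - 1 agents opt in w.p. p."""
    others = k - 1
    return sum(comb(others, j) * p ** j * (1 - p) ** (others - j) / (j + 1)
               for j in range(others // 2, others + 1))


def _is_ic(mu, q, k):
    def utility(signal_good):
        lg, lb = (q, 1 - q) if signal_good else (1 - q, q)
        post = mu * lg / (mu * lg + (1 - mu) * lb)
        return post * _win(k, q) - (1 - post) * _win(k, 1 - q)
    return utility(True) > 0 and utility(False) <= 0


def _max_ic_batch_size(mu, q, cap):
    sizes = [k for k in range(1, cap + 1, 2) if _is_ic(mu, q, k)]
    return max(sizes) if sizes else None


def _update(belief, q, votes):
    yes = sum(votes)
    no = len(votes) - yes
    lg, lb = q ** yes * (1 - q) ** no, (1 - q) ** yes * q ** no
    return belief * lg / (belief * lg + (1 - belief) * lb)


def greedy_voting(mu, q, signals, cap=51, rng=None):
    """Return (winner index or None, list of realized batch sizes)."""
    rng = rng or random.Random()
    belief, start, sizes = mu, 0, []
    while True:
        k = _max_ic_batch_size(belief, q, cap)
        if k is None or start + k > len(signals):
            return None, sizes                # no feasible batch left: discard
        votes = signals[start:start + k]      # truthful votes = private signals
        sizes.append(k)
        opted_in = [start + i for i, v in enumerate(votes) if v]
        if len(opted_in) > k // 2:            # majority opts in
            return rng.choice(opted_in), sizes
        belief = _update(belief, q, votes)    # failed batch: belief drops
        start += k


queue = [False, False, False, True, True] + [True] * 60
print(greedy_voting(0.5, 0.7, queue, rng=random.Random(0)))
```

In this example the first batch fails, the belief becomes more pessimistic, and the second batch is larger than the first, in line with part (ii) of Proposition 3.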

Finally, we are ready to prove Theorem 1. The main idea is that using the correctness of the sequential offering mechanism from Proposition 1, we can show that for every the planner can achieve that exceeds by appropriately setting (full proof in Appendix D.3). Furthermore, using Proposition 3 (part (iii)), we establish that the improvement described in Theorem 1 can also be achieved by

Corollary 1.

For any is an incentive-compatible (multi-batch) voting mechanism that improves correctness in comparison to the sequential offering mechanism .

5 Simulations

In this section we use simulations to complement our theoretical results.

Optimal batch size. In Figure 3, we study the optimal batch size (Proposition 2) of the single-batch voting mechanism ; by construction, the optimal is equivalent to In our numerical analysis, we examine for all priors and signal precision

We make several observations. First, as guaranteed by Lemma 3, the optimal batch size decreases as the prior increases. As we discussed in greater detail in Section 4.1, the incentive compatibility constraint binds at lower values of For very high values of , that is , Theorem 1 establishes that it is impossible to achieve incentive compatibility for any batch size, which explains the discontinuity at in each curve.

Second, for lower values of (approximately for values ), the marginal effect of on is negative but diminishes as grows. Indeed, as Figure 3 illustrates, the signal precision plays an important role for lower values of . In particular, if is low and is not significantly informative (e.g., ), the planner has a priori low confidence in the object’s quality. Similarly, agents have strong incentives to opt out regardless of their signal. Thus, the planner needs a larger batch size.

Figure 3: Via simulations, we compute the optimal incentive-compatible batch size for all possible priors for three regimes: . In all regimes, decreases as increases. For low values of , higher implies lower .
Figure 4: We compare the correctness of mechanisms in a setting with population size represents the optimal mechanism in the absence of strategic incentives, which achieves near perfect correctness close to (see Equation (2)). The remaining , , and assume strategic agents. We compute for all and three different regimes In all regimes, achieves higher correctness than for all . With one additional batch, further improves the correctness of and outperforms for all .

Correctness evaluation and comparison. In Figure 4, we evaluate the correctness of four different mechanisms for and . For , we study the behavior of , , as a function of the prior . We observe several interesting properties. First, and apply two opposite forces on correctness: for both , tends to decrease as increases, but for higher ’s this effect is smaller. Intuitively, as the prior grows, the batch size becomes smaller (see Figure 3). As the planner learns from fewer signals, the probability of misallocating the object increases. However, as the signal precision improves, the magnitude of a misallocation decreases since signals become more reliable.

Second, the comparison between and confirms that an additional batch has an important positive effect on correctness. The two-batch mechanism outperforms the single-batch mechanism . In fact, the gap between their achieved correctness grows as increases. Intuitively, if the planner fails to allocate a good object in the first batch, she has another chance to allocate it in the second batch. Since a higher translates to a higher probability that the object is of good quality, having two opportunities to allocate the object has a positive impact on correctness. This also explains the non-monotonic (but still generally decreasing) trend of with respect to , which is especially evident for high values of

Finally, we are interested in comparing the correctness of voting mechanisms against two natural benchmarks. The first benchmark is the sequential offering mechanism (see Section 3) in the presence of strategic incentives. As established in Theorem 1, there exists a voting mechanism (in this case, ) which outperforms for all . However, observe that the number of batches is crucial. In contrast to , achieves higher correctness than only for .

Comparing against , we observe that the performance gap widens in the region as grows. This is because the object is always discarded for (see Figure 1) which further explains why (Proposition 1). For , the gap remains positive but decreases as increases, since the signals of the first two agents affect the outcome of and thus decrease the chance of misallocation. Note again that, due to Theorem 1, for , any voting mechanism achieves correctness equal to .

The second benchmark assumes that agents are not strategic (see Equation (2)) and thus achieves the optimal correctness close to 1. The ratio , , serves as a measure of the price of anarchy. We see that incentives put a significant cost on correctness, especially for higher values of . Higher values of help decrease the price of anarchy. For any , the maximum price of anarchy is observed around and equals in a single-batch mechanism.

6 Discussion and Conclusion

Recent empirical evidence suggests that a significant contributing factor to the high discard rate of donated organs is herding. We propose to amend the current allocation policy by incorporating learning and randomness into the allocation process. We support our recommendations by introducing and analyzing a stylized model through theory and simulations. From a theoretical perspective, our work highlights the tension between learning and incentives and contributes to the understanding of optimal allocation when agents are privately informed and thus susceptible to herding behavior.

Limitations and extensions. For simplicity, we make two limiting assumptions: (1) agent utilities are symmetric; (2) agents differ only in the realization of their private signals. We believe that our results are qualitatively robust and can be extended to a model with variations in agent types and asymmetric utilities. Increasing the relative magnitude of the loss from accepting a “bad” object will make agents more cautious, thus leading to larger batches. Introducing heterogeneous agents adds little insight to our qualitative conclusions; therefore, for the sake of brevity, we considered homogeneous agents.

Nevertheless, we acknowledge that our model is stylized with several abstractions that may not fully reflect the real organ allocation markets. In practice, an organ allocation waitlist might have patients of varying medical characteristics, such as age and blood type, which can give rise to heterogeneous utilities from the same organ. Therefore, our current model, which assumes a common utility function, should be interpreted as restricting attention to patients within the same medical characteristics group. It would be an interesting future direction to generalize some of these parts of our model.

Social impact and ethical concerns. Our proposed mechanism is designed to increase the acceptance rate of transplantable donated organs. While insights from theoretical models have been used extensively to design healthcare policies (see e.g., Ashlagi and Roth (2021, 2014) for kidney exchange), we emphasize that one must pursue such a change in a gradual manner. Furthermore, we note that our mechanism does not constitute a Pareto improvement. Patients who are first in line may be skipped due to the randomness in allocation. Thus, although our policy has the potential to improve the overall system efficiency, its implementation must consider any unintended effects on individual patients.

Acknowledgments and Disclosure of Funding

Jamie Kang acknowledges support of the Jerome Kaseberg Doolan Fellowship and Stanford Management Science & Engineering Graduate Fellowship.

Faidra Monachou acknowledges support from the Stanford Data Science Scholarship, Google Women Techmakers Scholarship, Stanford McCoy Family Center for Ethics in Society Graduate Fellowship, Jerome Kaseberg Doolan Fellowship, and A.G. Leventis Foundation Grant.

Moran Koren acknowledges the support of the Israeli Ministry of Science and Technology’s Shamir scholarship and the support of the Fulbright foundation.


  • D. Acemoglu, A. Ozdaglar, and A. Parandehgheibi (2010) Spread of Misinformation in Social Networks. Games and Economic Behavior. External Links: arXiv:0906.5007v1 Cited by: Appendix A, §1.
  • I. Arieli, M. Koren, and R. Smorodinsky (2018) The One-Shot Crowdfunding Game. In EC ’18: Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 213–214. Cited by: Appendix A, §1.
  • I. Arieli and M. Mueller-Frank (2019) Multidimensional Social Learning. The Review of Economic Studies 86 (3), pp. 913–940. External Links: Document, Link Cited by: Appendix A, §1.
  • I. Ashlagi and A. E. Roth (2014) Free Riding and Participation in Large Scale, Multi-hospital Kidney Exchange. Theoretical Economics 9 (3), pp. 817–863. Cited by: §6.
  • I. Ashlagi and A. E. Roth (2021) Kidney Exchange: An Operations Perspective. Technical report National Bureau of Economic Research. Cited by: §6.
  • D. Austen-Smith and J. S. Banks (1996) Information Aggregation, Rationality, and the Condorcet Jury Theorem. The American Political Science Review 90 (1), pp. 34–45. External Links: ISSN 00030554, Link Cited by: Appendix A, §1.
  • A. V. Banerjee (1992) A Simple Model of Herd Behavior. The Quarterly Journal of Economics 107 (3), pp. 797–817. Cited by: Appendix A, §1, footnote 2.
  • S. Bikhchandani, D. Hirshleifer, and I. Welch (1992) A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades. Journal of Political Economy 100 (5), pp. 992–1026. Cited by: Appendix A, §1, footnote 2.
  • A. Butler, G. Chapman, J. N. Johnson, A. Amodeo, J. Böhmer, M. Camino, R. R. Davies, A. I. Dipchand, J. Godown, O. Miera, et al. (2020) Behavioral economics—a framework for donor organ decision-making in pediatric heart transplantation. Pediatric transplantation 24 (3), pp. e13655. Cited by: Appendix A.
  • Y. Che and J. Horner (2015) Optimal Design for Social Learning. Cited by: Appendix A, §1.
  • M. Cooper, R. Formica, J. Friedewald, R. Hirose, K. O’Connor, S. Mohan, J. Schold, D. Axelrod, and S. Pastan (2019) Report of National Kidney Foundation Consensus Conference to Decrease Kidney Discards. Clinical Transplantation 33 (1), pp. e13419. Cited by: Appendix A, §1.
  • N. De Condorcet (1785) Essay sur l’application de l’analyse a la probabilite des decisions rendues a la pluralite des voix. Par m. le marquis De Condorcet... de l’imprimerie royale. Cited by: Appendix A, §1.
  • S. De Mel, K. Munshi, S. Reiche, and H. Sabourian (2020) Herding in Quality Assessment: An Application to Organ Transplantation. Technical report IFS Working Papers. Cited by: Appendix A, §1, §1.
  • E. Eyster, A. Galeotti, N. Kartik, and M. Rabin (2013) Congested Observational Learning. Games and Economic Behavior 87 (283454), pp. 1–32. External Links: Document, Link Cited by: Appendix A, §1.
  • J. Glazer, I. Kremer, and M. Perry (2021) The Wisdom of the Crowd When Acquiring Information Is Costly. Management Science. Cited by: Appendix A, §1.
  • N. Immorlica, J. Mao, A. Slivkins, and Z. S. Wu (2018) Incentivizing Exploration with Unbiased Histories. arXiv preprint arXiv:1811.06026. Cited by: Appendix A, §1.
  • N. Immorlica, J. Mao, A. Slivkins, and Z. S. Wu (2019) Bayesian Exploration with Heterogeneous Agents. In The World Wide Web Conference, pp. 751–761. Cited by: Appendix A, §1.
  • E. Kamenica and M. Gentzkow (2011) Bayesian Persuasion. American Economic Review 101 (6), pp. 2590–2615. Cited by: Appendix A, §1.
  • A. Kolotilin, T. Mylovanov, A. Zapechelnyuk, and M. Li (2017) Persuasion of a Privately Informed Receiver. Econometrica 85 (6), pp. 1949–1964. Cited by: Appendix A.
  • I. Kremer, Y. Mansour, and M. Perry (2014) Implementing the “Wisdom of the Crowd”. Journal of Political Economy 122 (5), pp. 988–1012. Cited by: Appendix A, §1.
  • M. A. Mankowski, M. Kosztowski, S. Raghavan, J. M. Garonzik-Wang, D. Axelrod, D. L. Segev, and S. E. Gentry (2019) Accelerating kidney allocation: simultaneously expiring offers. American Journal of Transplantation 19 (11), pp. 3071–3078. Cited by: footnote 1.
  • Y. Mansour, A. Slivkins, V. Syrgkanis, and Z. S. Wu (2016) Bayesian Exploration: Incentivizing Exploration in Bayesian Games. arXiv preprint arXiv:1602.07570. Cited by: Appendix A, §1.
  • S. Mohan, M. C. Chiles, R. E. Patzer, S. O. Pastan, S. A. Husain, D. J. Carpenter, G. K. Dube, R. J. Crew, L. E. Ratner, and D. J. Cohen (2018) Factors Leading to the Discard of Deceased Donor Kidneys in the United States. Kidney International 94 (1), pp. 187–198. Cited by: §1.
  • E. Mossel, A. Sly, and O. Tamuz (2015) Strategic Learning and the Topology of Social Networks. Econometrica 83 (5), pp. 1755–1794. External Links: Document, Link Cited by: Appendix A, §1.
  • National Kidney Foundation (2016) Organ Donation and Transplantation Statistics. External Links: Link Cited by: §1.
  • Y. Papanastasiou, K. Bimpikis, and N. Savva (2017) Crowdsourcing Exploration. Management Science 64 (4), pp. 1727–1746. Cited by: Appendix A, §1.
  • L. Rayo and I. Segal (2010) Optimal Information Disclosure. Journal of political Economy 118 (5), pp. 949–987. Cited by: Appendix A, §1.
  • L. Smith and P. Sørensen (2000) Pathological Outcomes of Observational Learning. Econometrica 68 (2), pp. 371–398. Cited by: Appendix A, §1, footnote 2.
  • D. E. Stewart, V. C. Garcia, J. D. Rosendale, D. K. Klassen, and B. J. Carrico (2017) Diagnosing the Decades-long Rise in the Deceased Donor Kidney Discard Rate in the United States. Transplantation 101 (3), pp. 575–587. Cited by: §1.
  • U.S. Department of Health and Human Services (2021) Organ Procurement and Transplantation Network: National Data. External Links: Link Cited by: §1.
  • J. Zhang (2010) The Sound of Silence: Observational Learning in the U.S. Kidney Market. Marketing Science 29 (2). External Links: Document Cited by: Appendix A, §1, §1.

Appendix A Extended Related Literature

The information structure of the model is based on the literature on observational learning. The seminal papers by Banerjee [1992] and Bikhchandani et al. [1992] show that if agents receive binary signals over the object’s quality and observe previous agents’ action history, herding (information cascades) will eventually take place and information will not be aggregated. These findings are extended by Smith and Sørensen [2000] to general distributions, and follow-up works examine the robustness of these results in various game structures [Acemoglu et al., 2010, Eyster et al., 2013, Mossel et al., 2015, Arieli and Mueller-Frank, 2019]. Note that none of these works consider the existence of a planner that is able to control the flow of information.

Several empirical studies examine the presence of observational learning in real-world scenarios [Zhang, 2010, De Mel et al., 2020]. Using data from organ transplantation in the United Kingdom, De Mel et al. [2020] develop empirical tests to detect herding behavior and quantify its welfare consequences. Zhang [2010] provides empirical evidence for herding behavior in the United States deceased donor waitlist. Moreover, there has also been a growing clinical literature on behavioral factors in organ allocation, such as Butler et al. [2020] and Cooper et al. [2019].

To improve the performance of the allocation system, we propose utilizing a voting mechanism to elicit agents’ private information. The ability of voting procedures to uncover ground truth by aggregating dispersed information dates back to the seminal Condorcet Jury Theorem [De Condorcet, 1785], which utilizes the law of large numbers to assert that majority voting will reach the correct decision, provided that the population is large enough and agents are not strategic.
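As a brief illustration of the Condorcet Jury Theorem in its textbook form, the probability that a simple majority of sincere voters is correct can be written as below. The notation is an assumption made for this sketch and is not taken from the paper: n is an odd number of voters, and each voter is independently correct with probability q.

```latex
P_n(q) \;=\; \sum_{k=(n+1)/2}^{n} \binom{n}{k}\, q^{k} (1-q)^{n-k},
\qquad
\lim_{n \to \infty} P_n(q) = 1 \quad \text{for } q > \tfrac{1}{2}.
```

For instance, with q = 0.7 this gives P_1 = 0.7 and P_3 = 3(0.7)^2(0.3) + (0.7)^3 = 0.784, so adding voters already helps at very small n.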

Austen-Smith and Banks [1996] show that Condorcet’s result does not hold when agents are strategic and are allowed to deviate from their private information. Our notion of correctness is taken from Arieli et al. [2018]. One novel feature in our model is that the number of votes is determined endogenously to maximize the probability of making the correct choice.

Our work is also loosely connected to the literature on information design (e.g., [Kamenica and Gentzkow, 2011, Kolotilin et al., 2017, Rayo and Segal, 2010, Papanastasiou et al., 2017, Che and Horner, 2015]) and Bayesian exploration (e.g., [Kremer et al., 2014, Glazer et al., 2021, Immorlica et al., 2018, Mansour et al., 2016, Immorlica et al., 2019]). With the exception of Glazer et al. [2021], who consider an online recommendation problem with costly information acquisition, most of these works do not take into account the possibility that agents are privately informed. Instead, several papers such as Kolotilin et al. [2017], Immorlica et al. [2019] consider agents with private information in the form of types or idiosyncratic preferences. In contrast to our model where private signals are informative about the true quality of the object, those types are not correlated with the true state variable. (Footnote 6: That is, in our case, knowing the value of private signals does lead the decision maker to update her belief according to this information.)

Appendix B Optimal Solution in the Absence of Strategic Incentives

Lemma 1.

Suppose that agents are not strategic and voluntarily reveal their private signal value to the planner. Let

Then, the planner achieves optimal correctness when she allocates the object if and only if .


Conditional on , the number of positive signals follows a binomial distribution, i.e., . Conditional on , the number of positive signals follows a binomial distribution, i.e., .

Let denote the minimum number of positive signals required to allocate the object. For any number of positive signals higher than , it is also optimal to allocate the object, since the posterior belief that strictly increases.

The threshold is the smallest integer that satisfies

Recall that . Taking the logarithm of the previous relation and rearranging the terms, we finally get that
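The threshold rule in this appendix can be sketched numerically. The following is a minimal illustration, not the authors' code: the names `mu` (prior that the object is good), `q` (signal precision), and `n` (number of non-strategic agents) are assumptions introduced here, and ties are broken in favor of allocating, which may differ from the paper's convention. The rule allocates exactly when "good" is at least as likely as "bad" given the signal count, which is the decision that maximizes the probability of a correct allocation.

```python
from math import comb

def optimal_threshold(mu, q, n):
    """Smallest number of positive signals k (out of n) for which
    'good' is at least as likely as 'bad' a posteriori.
    Returns n + 1 if the object should never be allocated."""
    for k in range(n + 1):
        weight_good = mu * q**k * (1 - q)**(n - k)
        weight_bad = (1 - mu) * (1 - q)**k * q**(n - k)
        if weight_good >= weight_bad:  # tie-breaking is an assumption
            return k
    return n + 1

def benchmark_correctness(mu, q, n):
    """Probability of a correct decision under the threshold rule:
    allocate a good object, or withhold a bad one."""
    k_star = optimal_threshold(mu, q, n)
    # Positive-signal counts are Binomial(n, q) if good, Binomial(n, 1 - q) if bad.
    p_alloc_good = sum(comb(n, k) * q**k * (1 - q)**(n - k)
                       for k in range(k_star, n + 1))
    p_alloc_bad = sum(comb(n, k) * (1 - q)**k * q**(n - k)
                      for k in range(k_star, n + 1))
    return mu * p_alloc_good + (1 - mu) * (1 - p_alloc_bad)

if __name__ == "__main__":
    for n in (1, 5, 21, 101):
        print(n, optimal_threshold(0.5, 0.7, n), benchmark_correctness(0.5, 0.7, n))
```

With an even prior the threshold reduces to a simple majority, and correctness approaches 1 as `n` grows, in line with the near-perfect benchmark reported in Section 5.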

Appendix C Omitted Analysis of the Sequential Offering Mechanism (Section 3)

We first analyze the optimal strategies of the first three agents in the queue. Before taking any action, each agent observes his position in the queue in addition to his private signal and the common object prior . In particular, by observing his position , agent recognizes that the object is being offered to him because all of the preceding agents have rejected it. This observation contributes to his posterior belief about the object quality.

Agent . Agent updates his posterior belief using Bayes’ law based only on his signal

Since he receives utility if and if his expected utility to opt in is On the other hand, his utility to opt out is We observe that agent is better off opting in if

which corresponds to

Let be agent ’s optimal action in this mechanism. Then the following lemma characterizes .

Lemma 1.

Under the sequential offering mechanism, agent chooses action

Lemma 1 shows that the sequential offering mechanism drives the first agent to follow his signal if it is more informative than the common prior (i.e., when ). In the opposite case, where the prior is more informative (i.e., when either or ), the agent ignores his private signal. In particular, for high priors the object is always allocated to agent regardless of its true quality .
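The first agent's decision can be reproduced with a few lines of Bayes' rule. This is a hedged sketch under assumptions that are not spelled out in this excerpt: payoffs are symmetric (for example +1 for a good object, -1 for a bad one, 0 for opting out), so opting in is optimal exactly when the posterior exceeds 1/2, and indifference is resolved by opting out. The names `mu`, `q`, and `signal_positive` are illustrative.

```python
def posterior_good(mu, q, signal_positive):
    """Posterior probability that the object is good after one private signal.
    mu: common prior that the object is good; q: signal precision (> 1/2)."""
    like_good = q if signal_positive else 1 - q
    like_bad = 1 - q if signal_positive else q
    return mu * like_good / (mu * like_good + (1 - mu) * like_bad)

def agent1_opts_in(mu, q, signal_positive):
    """Agent 1 opts in iff the posterior strictly exceeds 1/2
    (symmetric payoffs and opt-out on ties are assumptions of this sketch)."""
    return posterior_good(mu, q, signal_positive) > 0.5

if __name__ == "__main__":
    q = 0.7
    for mu in (0.2, 0.5, 0.8):
        print(mu, agent1_opts_in(mu, q, True), agent1_opts_in(mu, q, False))
```

For `q = 0.7`, the agent ignores his signal and opts out at `mu = 0.2`, follows his signal at `mu = 0.5`, and opts in regardless at `mu = 0.8`, matching the three regimes described above.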

Agent . When the second agent is offered the object, he knows that the first agent has already opted out. At the same time, agent 2 is aware of agent 1’s optimal strategy as described in Lemma 1. Therefore, if the prior satisfies then agent 2 infers that agent 1 has followed his own signal As such, for , agent 2 updates his posterior belief informed by this observation as follows:

On the other hand, for , it holds that

in which case agent 1’s action is uninformative to agent 2.

Lemma 2.

Under the sequential offering mechanism, agent chooses (Footnote 7: In the sequential offering mechanism, if then the object is never offered to agent because agent 1 always opts in.)


Similarly to agent 1, agent 2 prefers to opt in if


For and Equation (3) holds when For and Equation (3) corresponds to

However, for any we have

which makes Equation (3) infeasible.

For notice that

Thus Equation (3) is again infeasible regardless of

For exactly, by our assumption, he simply rejects the object. Therefore, agent 2 prefers to opt in if and Otherwise, he prefers to opt out. ∎

Lemma 2 shows that agent 2 follows his signal if whereas he ignores the signal and opts out if The characterizations of and offer implications that will be useful to analyze the optimal strategies of subsequent agents. First, for , Lemma 1 and Lemma 2 together suggest that both agent 1 and agent 2 follow their signals. Therefore, the availability of the object after being offered to agent 2 implies to subsequent agents that both and Second, for agent 2’s action is uninformative about . Agent 1’s action, however, can be informative depending on the value of If then the subsequent agents can infer that because agent 1 follows his signal. If however, agent 1’s action is also uninformative.

Notice here that, whenever agent ’s or ’s opt-out action is informative, his private signal (that the subsequent agents can perfectly infer) is We use this implication for our analysis next.

Agent . If the object is offered to agent 3, one of the three events must have occurred: (i) and , or (ii) and , or (iii) Nonetheless, we claim that in any case, agent 3 would always choose to opt out.

Lemma 3.

Under the sequential offering mechanism, agent 3 always chooses to opt out:


Agent 3 will prefer to opt in if


For agent 3’s posterior belief given his signal is

For , we have

The proof is analogous for . As a result, for agent 3 prefers to opt out regardless of his own signal.

For agent 3 infers that Agent 3’s posterior belief equals

Then Equation (4) is infeasible because

Finally, for

where we have

making Equation (4) infeasible.

Now we extend Lemma 3 to any remaining agent in the queue. Let denote the optimal action of agent

Lemma 4.

In the sequential offering mechanism, any agent other than the first two always opts out. That is, for any


We use mathematical induction on to prove the statement. For the base case , Lemma 3 shows that agent 3 always opts out, so his action is always uninformative to the subsequent agents. Next, as the inductive step, consider some agent Suppose that agent updates his posterior belief in some manner such that it is optimal for him to opt out regardless of Because this action is uninformative, the next agent must update his posterior belief the same way as agent Therefore, it is also optimal for agent to opt out. ∎

Another way to interpret Lemma 4 is that if neither agent 1 nor agent 2 opts in, then the object will be discarded. Using this result along with Lemma 1 and Lemma 2 we characterize the outcome of the sequential offering mechanism (Lemma 1 in Section 3):

See 1


The proof follows directly from combining Lemmas 1 to 4. ∎
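The combined effect of the lemmas in this appendix can be checked with a short exact computation. The sketch below is an illustration under stated assumptions rather than the authors' implementation: payoffs are symmetric so that an agent opts in iff his posterior strictly exceeds 1/2, a rejection by an agent who follows his signal reveals a negative signal, and a rejection by an agent who ignores his signal reveals nothing. The function name and the parameters `mu`, `q`, and `n_agents` are hypothetical. Priors that sit exactly on an indifference point are sensitive to floating-point rounding and are best avoided.

```python
def sequential_offering_correctness(mu, q, n_agents=10):
    """Exact probability of a correct outcome under sequential offering
    with Bayesian agents (symmetric payoffs assumed)."""
    reach_good, reach_bad = 1.0, 1.0  # P(still unallocated | quality)
    alloc_good, alloc_bad = 0.0, 0.0  # P(allocated | quality)
    public_odds = mu / (1 - mu)       # odds of good given all rejections so far
    for _ in range(n_agents):
        odds_pos = public_odds * q / (1 - q)  # after a positive private signal
        odds_neg = public_odds * (1 - q) / q  # after a negative private signal
        if odds_neg > 1:
            # Opts in regardless of his signal: the object is always allocated.
            alloc_good += reach_good
            alloc_bad += reach_bad
            break
        if odds_pos > 1:
            # Follows his signal: opts in only after a positive signal.
            alloc_good += reach_good * q
            alloc_bad += reach_bad * (1 - q)
            reach_good *= 1 - q
            reach_bad *= q
            public_odds = odds_neg    # a rejection reveals a negative signal
        else:
            # Opts out regardless; every later agent faces the same beliefs,
            # so the object is discarded.
            break
    return mu * alloc_good + (1 - mu) * (1 - alloc_bad)

if __name__ == "__main__":
    for mu in (0.2, 0.4, 0.6, 0.8):
        print(mu, sequential_offering_correctness(mu, 0.7))
```

With `q = 0.7`, the loop stops after at most two informative agents for every prior tried here, consistent with the statement that any agent other than the first two always opts out.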

See 1