There are many settings in which a decision-maker is faced with a difficult problem that they cannot solve on their own, and so they instead approach it in two steps: they first delegate the search for possible solutions to an agent who is able to invest more time in the process, and then they evaluate the solution(s) that the agent proposes. One concrete example arises in organizations or firms, where the management may delegate the search for solutions to a division that reports to them, ultimately making a decision on the solution that is proposed by the division (Aghion and Tirole, 1997; Armstrong and Vickers, 2010; Li et al., 2017). A second example arises in regulation, where a governmental agency needs to decide whether a proposed corporate merger can be structured in a way that is compatible with regulatory guidelines; the companies seeking to merge study possible ways of structuring the merger, propose one or more to the regulator, and seek the regulator’s approval. In this way, the search over structures for the merger has implicitly been delegated from the regulator to the companies, with the regulator retaining decision-making control over the proposed solution (Farrell and Katz, 2006; Nocke and Whinston, 2010). A similar scenario could be described for regulation in other settings, where a company may be searching over possible solutions that comply with environmental law, employment law, or other guidelines.
The interesting tension in all these situations is that the decision-maker who delegates the task (henceforth referred to as the “principal”) has a particular objective function that they are seeking to optimize; but the agent who actually performs the task might have interests that are not directly aligned with the principal’s. For example, in the regulatory context, the regulator (acting as principal) may reasonably suspect that a merger proposed by a set of companies (acting as the agent) will be structured in a way that strongly benefits the companies, even if other feasible structures would have been better for the market or for society as a whole. Similarly, a group within an organization tasked with solving a problem may well preferentially search for a solution that benefits them in particular. Given this natural set of incentives, how should the principal structure the delegation to the agent so as to ensure that the solution the agent proposes performs well under the principal’s own objective function?
A rich literature has developed in economics around the formalization and analysis of delegation, focusing on this tension between the conflicting objectives of the principal and the agent; see Holmstrom (1977, 1984) for influential early research, and Alonso and Matouschek (2008); Armstrong and Vickers (2010); Amador and Bagwell (2013); Ambrus and Egorov (2017) for recent work. A dominant theme in this line of work is that the principal does not offer monetary compensation to the agent as a way of favoring certain proposed solutions over others (though see Krishna and Morgan (2008)); this is consistent with the motivating applications, in which for example regulators in many contexts can accept or reject proposals from companies, but cannot selectively offer varying amounts of compensation to these companies based on the content of the proposal. This lack of monetary transfers between the parties imparts a fundamental structure to the problem, in which the principal can simply define a mechanism implicitly specifying the subset of all “eligible” proposals that they are willing to accept, and the agent is then incentivized to search for solutions that are good for them but that also lie in this eligible set. A long line of work has gone into determining the structure of eligible sets that produce optimal mechanisms for the principal, yielding constructions that are often quite intricate (Alonso and Matouschek, 2008; Armstrong and Vickers, 2010; Melumad and Shibano, 1991).
The Present Work
Given how broadly delegation is used across a range of contexts, it is interesting to consider how precarious a process it is — the principal is ceding control of their search problem to an agent whose interests might be completely misaligned with their own, and the only leverage the principal has is to accept or reject the solution that is eventually proposed. How much does the principal give up — quantitatively, measured in terms of the objective they are trying to optimize — when they delegate to an agent? Is there some robust, intrinsic reason why things don’t turn out as badly as we might fear? And how do the answers to these questions depend on what the principal is actually able to observe about the agent’s solution — including how much effort the agent put into finding the solution, and how good it is not just for the principal but also for the agent?
In their most natural formulation, these are inherently comparative questions, since they seek to relate the solution quality obtained through delegation to the solution quality in an alternate, ideal setting where delegation was not necessary. As such, they address an issue fundamentally distinct from the primary focus of the existing literature on delegation, which as noted above has centered on characterizing mechanisms that produce optimal delegation for the principal, without this type of comparative evaluation.
There is a natural benchmark to use for our comparison: we could measure the quality of the outcome under delegation versus the quality of the solution that the principal could obtain were they to perform the search task themself, investing the same level of effort in the search that the agent does. Now, there are many settings where it may be too costly or otherwise infeasible for the principal to actually conduct this search, but this benchmark nonetheless provides a conceptual reference point to make clear how much payoff to the principal is lost through delegation. In this sense, it plays the role of an optimal point of comparison, much like the role of the intractable optimum in an approximation algorithm or the societal optimum in a price-of-anarchy analysis.
In this paper we develop a methodology to bound the performance of delegated search, relative to the benchmark in which the principal searches for a solution on their own. Our methodology builds on a set of links that we identify between bounds on delegated search and the analysis of some fundamental models of decision-making under uncertainty; in particular, we find a surprisingly strong connection between delegated search and bounds on prophet inequalities. The connections between these formalisms turn out to be quite natural and useful, but to our knowledge they have not been previously identified in either of the literatures on delegated search or on prophet inequalities. This connection not only provides bounds on the quality of delegated solutions relative to an ideal benchmark; it also shows that strong bounds can be obtained using eligible sets that are structurally very simple (in a number of cases defined simply by a carefully chosen threshold rule), in contrast with the complex constructions associated with optimal mechanisms.
1.1 Overview of Models
A Distributional Model
We begin by describing the models in which we perform our analysis. Our main model, which is essentially the one considered in Armstrong and Vickers (2010), has the principal and the agent agree that the agent will consider $n$ candidate solutions and propose one to the principal; thus $n$ represents the level of effort that the agent commits to the problem. The principal will only see the solution that is proposed, not the other $n-1$ that the agent also considers.
What does it mean for the agent to consider a candidate solution? We assume that the solutions belong to an abstract space $\Omega$ with a probability measure $\mu$ on it, and the agent’s search for a solution consists of performing $n$ independent and identically distributed draws from $\mu$, resulting in a set of candidate solutions $\omega_1, \ldots, \omega_n$. (Footnote 2: Later we will also consider the case in which different draws by the agent can come from different probability measures on $\Omega$; for example, this can model the case in which the agent is a group of $n$ employees in an organization, and the solution $\omega_i$ is drawn by the $i$-th agent, who may have a different distribution over solutions from the $j$-th agent.) Each solution $\omega$ drawn by the agent has a quality for the principal, denoted $x(\omega)$, and a possibly different quality for the agent, denoted $y(\omega)$. The agent selects one of its candidate solutions, say $\omega_i$, to present to the principal. (Below, we will discuss the contrast between the model in which the principal is able to determine both $x(\omega_i)$ and $y(\omega_i)$, the value to themself and to the agent respectively, for the proposed solution, and the model in which the principal is only able to determine $x(\omega_i)$.)
Now, if the principal imposed no constraint on the agent’s behavior, then the agent would simply choose the solution $\omega_i$ that maximizes the agent’s value $y(\omega_i)$, and the principal would receive whatever corresponding value $x(\omega_i)$ resulted from this choice. To improve on this, the principal could specify at the outset that they will only accept solutions $\omega$ that satisfy some predicate on $x(\omega)$ and (in the case that they can determine it) $y(\omega)$; we will refer to the set of all $\omega$ satisfying the principal’s predicate as the eligible set of solutions. It is thus in the agent’s interest to propose a solution belonging to the eligible set; we ask whether one can design eligible sets that provide provable bounds on the expected quality of the solution to the principal, relative to the scenario in which the principal simply were to draw $n$ times from $\mu$ and select the sampled solution with maximum value $x$ to the principal.
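The interaction described above can be made concrete with a small Monte Carlo sketch. It is illustrative only: the uniform, independent distribution of the two values, the function names, and the particular threshold predicate are assumptions made for the example, not part of the model. Each sample is a pair (principal's value, agent's value); the agent proposes his favorite eligible sample, and the benchmark is the best sample from the principal's point of view.

```python
import random

def delegated_value(samples, eligible):
    """Principal's payoff when the agent proposes his favorite eligible sample.

    samples:  list of (x, y) pairs; x is the principal's value, y the agent's.
    eligible: predicate on (x, y) announced by the principal in advance.
    """
    allowed = [s for s in samples if eligible(*s)]
    if not allowed:
        return 0.0  # nothing acceptable was found: status quo, payoff 0
    x, _ = max(allowed, key=lambda s: s[1])  # the agent maximizes his own y
    return x

def simulate(n, theta, trials=20000, seed=0):
    """Average delegated payoff vs. the principal-searches-herself benchmark.

    Assumption for illustration: x and y are independent uniform(0, 1) draws,
    and the eligible set is the weak threshold rule x >= theta.
    """
    rng = random.Random(seed)
    delegated = benchmark = 0.0
    for _ in range(trials):
        samples = [(rng.random(), rng.random()) for _ in range(n)]
        delegated += delegated_value(samples, lambda x, y: x >= theta)
        benchmark += max(x for x, _ in samples)
    return delegated / trials, benchmark / trials

if __name__ == "__main__":
    d, b = simulate(n=5, theta=0.5)
    print(f"delegated: {d:.3f}  benchmark: {b:.3f}  ratio: {d / b:.3f}")
```

Varying `theta` shows the trade-off the principal faces: a low threshold lets the agent's preferences dominate, while a high threshold risks that no sampled solution is eligible.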
A Binary Model
In our first, distributional model, the agent draws a set of candidate solutions that the principal cannot observe, and then must choose one to present to the principal. This models a setting in which the agent explores a design space and cannot fully anticipate which solutions it will encounter until it begins this exploration. But we can also imagine settings where the principal and agent both know that the set of potential options comes from a large discrete set, and the only question is which of these options is actually feasible to implement. For example, there may be standard ways of structuring a merger in a given industry, and the only question is which are possible for the companies in question.
We model this version with publicly known binary options as follows. There is a set of options, and for each option $i$, there is a known probability $p_i$ such that option $i$ is feasible with probability $p_i$, and infeasible otherwise. If option $i$ turns out to be feasible, then it produces a known payoff of $x_i$ for the principal and a known payoff of $y_i$ for the agent; if it turns out to be infeasible, then it produces a payoff of $0$ for both. The only way to evaluate the feasibility of option $i$ is to pay a cost of $c_i$ to investigate it.
The principal delegates to the agent the task of proposing a feasible option, which the principal can either accept or reject. The principal will not be able to see which options the agent decides to pay to evaluate as part of this task, but again the principal can specify a predicate defining the eligible subset of options that they will accept. Subject to this constraint, the agent then must decide how to evaluate options in a way that maximizes its own benefit from the option it proposes, minus the evaluation cost. Here too we evaluate the principal’s payoff relative to the scenario in which they performed the evaluation of options themself. We also consider an extension of this model in which there is a budget on the number of options that the agent can evaluate, a constraint analogous to the bound on the number of samples the agent can evaluate in our first, distributional model.
1.2 Overview of Results
We begin by showing that for an arbitrary instance of the distributional model, there is a mechanism the principal can specify to the agent so that the principal’s expected payoff from delegation is within a factor of $2$ of the expected payoff they’d receive were they able to search for the solution by themself. (If the principal were to search by themself, they would examine $n$ candidate solutions and choose the one that was best for them.) This mechanism only requires knowledge of the principal’s values $x$, not the agent’s values $y$, and it has a very simple structure: depending on the distribution of $x$ values, it can be written as a threshold rule with either a weak threshold, in which the principal only accepts proposals $\omega$ for which $x(\omega) \geq \theta$ for some $\theta$, or a strict threshold, in which the principal only accepts proposals for which $x(\omega) > \theta$ for some $\theta$. In the case when $x$ and $y$ are distributed independently with no point masses, the factor of $2$ in this bound can be improved to $e/(e-1)$.
There are several things worth remarking on about this result. First, the fact that arbitrary instances of the problem have mechanisms providing provable guarantees of this form suggests a qualitative argument for the robustness of delegation: no matter how misaligned the agent’s interests are, the principal can ensure an absolute bound on how much is lost in the quality of the solution. Second, the mechanism that achieves this bound is very simple and detail-free, consisting of just a (weak or strict) threshold on the quality of the solution for the principal. And third, the mechanism requires knowledge of $n$ (the number of samples drawn by the agent) but not the agent’s values $y$. In this sense, it suggests that it is more important for the principal to know how much effort the agent has spent on the search (via $n$) than to know how good the proposed solution is for the agent (via $y$).
A connection to prophet inequalities
These results on threshold mechanisms and their guarantees follow from a general result at the heart of our analysis: a close connection between bounds for delegated search and prophet inequalities. Prophet inequalities are guarantees for the following type of decision under uncertainty: we see a sequence of values $X_1, \ldots, X_n$ in order, with $X_i$ drawn from a distribution $D_i$, and when we see the value $X_i$ we must irrevocably decide whether to stop and accept $X_i$, or continue (in the hope of finding a better value in the future). Research on prophet inequalities has established the non-trivial fact that it is possible to design rules whose expected payoff comes within a constant fraction of the maximum achievable by a decision-maker who could see all the realized values in advance.
Prophet inequalities tend to be established by designing carefully constructed threshold rules, in which the decision-maker accepts $X_i$ if and only if $X_i$ (weakly or strictly) exceeds a specified threshold that can depend on the position $i$. The key component of our analysis is to establish a close, though subtle, technical connection between delegated search and prophet inequalities: roughly speaking, the sequence of values sampled by the agent from the set of possible solutions plays the role of the process generating $X_1, \ldots, X_n$; and the principal and the agent jointly (through the principal’s specification of the threshold and the agent’s incentive to obey it) play the role of the decision-maker who uses a threshold rule for deciding when to stop. Again, the notion of “stopping” here is a bit oblique, since the principal never sees the full sequence that the agent generates; this is the sense in which the stopping rule is jointly constructed by the behavior of the principal and the agent together.
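To make the classical (discrete-time) setting concrete, the following sketch compares a single-threshold stopping rule against the prophet's benchmark. It is a simulation under stated assumptions rather than a proof: the exponential distributions and function names are chosen purely for illustration, the threshold is set to half of the estimated expected maximum (one standard choice of threshold in the prophet inequality literature), and for brevity the same simulated draws are used both to estimate that threshold and to evaluate the rule.

```python
import random

def first_above(values, threshold):
    """Stopping rule: accept the first value that weakly exceeds the threshold."""
    for v in values:
        if v >= threshold:
            return v
    return 0.0  # never stopped: payoff 0

def compare(means, trials=20000, seed=0):
    """Return (average payoff of the threshold rule, average prophet payoff).

    Assumption for illustration: value i is exponential with mean means[i],
    and the values are independent.
    """
    rng = random.Random(seed)
    draws = [[rng.expovariate(1.0 / m) for m in means] for _ in range(trials)]
    prophet = sum(max(row) for row in draws) / trials  # estimate of E[max]
    threshold = prophet / 2
    stopper = sum(first_above(row, threshold) for row in draws) / trials
    return stopper, prophet

if __name__ == "__main__":
    s, p = compare([1.0, 2.0, 3.0])
    print(f"threshold rule: {s:.3f}  prophet: {p:.3f}  ratio: {s / p:.3f}")
```

The stopping rule never looks ahead: it commits to the first value above the threshold, which is exactly the restriction that separates the decision-maker from the prophet.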
Stronger bounds for independent values
Using this connection to threshold rules for prophet inequalities, we can design a much more powerful policy for the principal in the case when the values of $x$ and $y$ on a draw from $\mu$ are distributed independently, and when the principal can see both $x$ and $y$ (rather than only $x$) in the solution proposed by the agent.
To do this, we begin with a stopping rule from the prophet inequality literature achieving an expected payoff that is at least approximately $0.745$ times the optimum when the values are independent and identically distributed (Abolhassani et al., 2017; Correa et al., 2017; Hill and Kertz, 1982; Kertz, 1986). This stopping rule uses a sequence of thresholds that decrease with the position $i$ in the sequence, making the decision-maker naturally more prone to stop and accept a value as the end of the sequence nears; this effectively follows the idea that one should only accept a value early if it’s very good.
In the context of delegated search when the principal can observe both $x(\omega)$ and $y(\omega)$ for a proposed solution $\omega$, a related concept is useful for designing mechanisms: the principal should only accept a solution with very large $y$ if $x$ is very large as well. The analogy between requiring strong incentive to accept a value with large $y$ in delegated search and requiring strong incentive to accept a value early in the sequence in the prophet inequality context can be made precise, and it reveals that the $y$ values (over the set of candidate solutions considered by the agent) can be used as a kind of “continuous time” parameter for deriving a threshold: if we think of the candidate solution $\omega$ as arriving at continuous time $y(\omega)$, then we can derive a threshold function $\theta$ in which the principal only accepts $\omega$ if $x(\omega)$ (weakly or strictly) exceeds $\theta(y(\omega))$. (Footnote 3: We observe that this continuous time defined by the $y$ values runs “in reverse,” in the sense that large values of $y$, like small values of time, place stricter demands on the values that can be accepted.) In this sense, the principal and agent again jointly construct the stopping rule, with the agent’s payoff providing a type of synthetic temporal ordering that is useful in formulating a threshold policy.
Bounds for the binary model
We also use the connection to prophet inequalities to derive bounds for the binary model, where the agent pays to evaluate the feasibility of $(x_i, y_i)$ pairs from a known list of options. Here too the principal can designate a predefined eligible set of proposals so that the mechanism that accepts any eligible proposed option yields an expected payoff that is within a factor of two of the benchmark in which the principal performs the search on their own. However, the eligibility criterion in this case is subtler: it depends not only on the principal’s assessment of the proposal’s quality, $x_i$, but also on the cost $c_i$ and the a priori probability of feasibility, $p_i$.
To establish this bound, we draw on both prophet inequality bounds and on work of Kleinberg et al. (2016) for the box problem (Weitzman, 1979); by considering an ordering of the options by the notion of reservation price (or, equivalently, strike price) defined in those works, we can establish a provable guarantee that correctly handles not only the payoff arising from the $x_i$ and $y_i$ values but also the cost incurred by the agent in evaluating the feasibility of options.
Finally, we derive similar bounds in the more general case where the agent also has a budget on the number of options they can evaluate. The approach using reservation prices does not directly extend to this case, but we show that by combining the approach of Kleinberg et al. (2016) with bounds for stochastic optimization due to Asadpour and Nazerzadeh (2016), we can obtain more general bounds for a budgeted variant of the box problem that contains the case we need for our delegated search guarantee.
We note that it would be a natural open question to consider a variant of the problem combining characteristics of the two main versions we consider here: as in the distributional model, the agent performs independent draws from a space of solutions; but as in the binary model, the agent does not have a fixed bound on the number of allowed draws, instead incurring a cost to perform each draw that must be traded off against the eventual payoff from the sample selected.
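For a binary option the reservation price has a closed form, which the following sketch computes. It uses the standard definition from the box problem (Weitzman, 1979): the reservation price $\sigma$ of a box with random payoff $V$ and inspection cost $c$ solves $\mathbb{E}[(V - \sigma)^+] = c$. The function names, and the choice to pass a single generic payoff `v` (which could stand for either party's payoff), are assumptions made for illustration.

```python
def reservation_price(p, v, c):
    """Reservation price of a binary option.

    The option pays v with probability p and 0 otherwise, and costs c to
    inspect. We solve E[(V - sigma)^+] = c for sigma:
      * if 0 <= sigma <= v, only the feasible outcome contributes, so
        p * (v - sigma) = c, i.e. sigma = v - c / p;
      * if sigma < 0, both outcomes contribute, so
        p * v - sigma = c, i.e. sigma = p * v - c.
    """
    if not (0 < p <= 1) or v < 0 or c < 0:
        raise ValueError("expected 0 < p <= 1, v >= 0, c >= 0")
    if c <= p * v:
        return v - c / p
    return p * v - c

def inspection_order(options):
    """Indices of options sorted by decreasing reservation price.

    options: list of (p, v, c) triples (hypothetical input format).
    """
    prices = [reservation_price(p, v, c) for p, v, c in options]
    return sorted(range(len(options)), key=lambda i: prices[i], reverse=True)
```

For instance, an option that is feasible with probability 0.5, pays 10 when feasible, and costs 1 to inspect has reservation price 10 - 1/0.5 = 8: inspecting it is worthwhile exactly when the best payoff already in hand is below 8.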
1.3 Further Discussion of Related Work on Delegation
The theory of delegation in the economics literature is often viewed as beginning with Bengt Holmstrom’s Ph.D. thesis (Holmstrom, 1977, 1984); this work articulates the basic tension that we see in these models, between allowing an agent to optimize in a large space and restricting the agent’s freedom of action to prevent them from pursuing their own objectives too aggressively. Holmstrom’s model considered delegating an optimization problem over an interval, and a sequence of subsequent papers analyzed the case in which the agent optimizes over a continuum (Alonso and Matouschek, 2008; Melumad and Shibano, 1991). Armstrong and Vickers (2010) propose a model that is very close to what we consider here, where the optimization takes place over a discrete set that the agent samples from an underlying distribution. By way of comparison between our work and that of Armstrong and Vickers (2010), we noted the key contrast earlier in this section: their paper is largely devoted to identifying cases of the delegated search problem for which the structure of the optimal mechanism can be identified, whereas we focus on bounding the inefficiency of delegated search relative to a benchmark in which the principal performs the task themself. It is through our emphasis on these types of bounds that we develop the connection to the analysis of prophet inequalities.
A distinct line of work in delegation relaxes the constraint that the principal may only allow or forbid each proposed solution, and instead allows the principal to add arbitrary amounts of cost to certain subsets of proposed solutions (Athey et al., 2004; Amador and Bagwell, 2013; Ambrus and Egorov, 2017). One of the key motivations for such a condition is to model the strategic role of bureaucracy within an organization: if management wants to dissuade units within the organization from proposing certain types of solutions, they can use bureaucratic measures (requiring more extensive justifications and processes) that make these solutions selectively more costly without explicitly forbidding them, and without engaging in explicit monetary transfers. Ambrus and Egorov (2017) propose a model in which such selective cost increases are in fact part of the optimal delegation scheme.
Finally, a recent working paper by Khodabakhsh et al. (2018) studies algorithmic delegation in a much more general setting, in which a principal must choose an action and she delegates this choice to an agent who is informed of the state of the world. The principal’s and agent’s preferences over actions in every state of the world may differ, leading to a problem of designating a set of eligible actions among which the agent may choose, so as to maximize the principal’s utility when the agent chooses selfishly among the eligible actions. This problem is shown to be computationally hard in general, but a simple threshold policy is shown to achieve a 2-approximation to the optimal mechanism under a “negative bias” condition. Such a result is in the same spirit as our 2-approximation result for threshold rules, though the hypothesis under which their result holds, and the method of proof, are quite different.
2 Model and Preliminaries
We begin by making precise the way in which the principal and the agent interact, resulting in the principal’s selection of (at most) one element from a set $\Omega$ of potential solutions. There are functions $x$ and $y$ from $\Omega$ to the non-negative reals such that if $\omega \in \Omega$ is selected, then the principal’s utility is $x(\omega)$ and the agent’s utility is $y(\omega)$. To formalize the possibility that the principal selects no solution (i.e., perpetuating the status quo) we identify this possibility with a special “null outcome”, denoted $\bot$, and we extend the utility functions from $\Omega$ to $\Omega \cup \{\bot\}$ by setting $x(\bot) = y(\bot) = 0$.
The set $\Omega$ is a probability space, with probability measure $\mu$, and the agent has the power to draw independent samples from $\Omega$ according to $\mu$. The principal, on the other hand, can neither draw samples from $\Omega$ nor directly observe the outcome of the agent’s sampling; she must rely on her interaction with the agent to arrive at a selected element of $\Omega \cup \{\bot\}$. (Footnote 4: Throughout this paper we use feminine pronouns for the principal and masculine pronouns for the agents.)
Before formalizing our model of interaction, it is useful to first note some of the ways in which our basic model can be generalized or specialized.
We will initially consider the case of a single probability measure $\mu$ on $\Omega$, but it is also useful to consider cases in which there are multiple probability measures $\mu_i$ on $\Omega$, and the agent has the power to draw independent samples from any of these distributions.
We will generally assume there is a sampling budget of $n$ on the number of samples that the agent can draw. In some of our models, we will also introduce a sampling cost $c$ for each draw by the agent, or in the case of multiple probability measures, a cost $c_i$ for sampling from $\mu_i$.
We consider both the full-information case, in which the principal knows both the functions $x$ and $y$ and hence can evaluate the utility of a solution to both herself and to the agent, and the limited-information case, in which the principal only knows her own utility function $x$.
The functions $x$ and $y$ define random variables on $\Omega$, and we consider both the case in which they can be arbitrary non-negative functions, and the case of independent utilities, when they are independent random variables.
In a later section, we will specialize the formalism to the binary model discussed in Section 1, in which each distribution $\mu_i$ is supported on a two-element set, one of whose elements has utility $0$ for both the principal and the agent. In this case we will let $(x_i, y_i)$ denote the pair of utilities of the other element. The binary model captures a setting in which the feasibility of the $i$-th solution is unknown until the agent investigates it, but the value of the solution to both parties (if feasible) is known a priori.
2.1 A General Definition of Mechanisms for the Principal and Agent
Let us now formalize how the principal and the agent interact, resulting in the principal’s selection of a solution. Thus far, our discussion in Section 1 has focused on interactions of a very structured form: the agent draws a set of samples from $\Omega$; the agent selects one of these samples to present to the principal; and the principal accepts or rejects it. But it would be useful to be able to consider more general formulations for their allowed interactions, within which the transmission of a single proposal from the agent to the principal is a particular special case.
To do this, we begin by defining a mechanism as follows. A mechanism $M = (\Sigma, f)$ defines a set of signals, $\Sigma$, that the agent may send, and an allocation function $f : \Sigma \to \Omega \cup \{\bot\}$ that specifies which solution the principal will choose given the agent’s signal. In such a mechanism, a strategy for the agent is specified by a mapping $\sigma : \Omega^* \to \Sigma$, where $\Omega^*$ denotes the set of finite sequences over $\Omega$, such that $\sigma(\omega_1, \ldots, \omega_k)$ represents the signal the agent sends if he sampled $k$ solutions and observed the sequence $\omega_1, \ldots, \omega_k$.
Suppose the agent observes sequence $\omega_1, \ldots, \omega_k$ and sends signal $s$, resulting in outcome $\omega = f(s)$. In this case, the principal’s and agent’s utilities are $x(\omega)$ and $y(\omega)$, respectively, if $\omega \in \{\omega_1, \ldots, \omega_k\} \cup \{\bot\}$. Otherwise the principal’s utility is 0 and the agent’s is a negative penalty value. In other words, we assume that if the mechanism results in the principal selecting a solution that was never sampled by the agent, that solution cannot be adopted. Instead the status quo is preserved and the agent suffers a penalty. This assumption is consistent with our assumption that the principal lacks the power to directly search for solutions herself; she can only adopt solutions that the agent has discovered.
In models with costly sampling, the specification of an agent’s strategy must also include a sequential policy $\pi$ for deciding which sample (if any) to observe next, given the set of samples already observed. The principal’s and agent’s utilities are both diminished by the sum of costs for the samples that the agent observed when running policy $\pi$. (We deduct this sum from the principal’s utility because we think of the cost incurred by the agent in searching for a solution as a kind of “waste” that the principal views as detracting from the overall utility.)
Under our definition of mechanisms, the sequence of solutions sampled by the agent leads to a signal (via the agent’s strategy $\sigma$), and this signal leads to an outcome in $\Omega \cup \{\bot\}$ (via the principal’s allocation function $f$). Composing these two functions, we get a mapping from the agent’s sampled solutions to a single solution:
Definition 1 (interim allocation function).
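If $M = (\Sigma, f)$ is a mechanism and $\sigma$ is an agent’s strategy, the interim allocation function of the pair $(M, \sigma)$ is the mapping $f \circ \sigma$ obtained by composing the strategy $\sigma$ with the allocation function $f$. In other words, $(f \circ \sigma)(\omega_1, \ldots, \omega_k)$ is the outcome resulting from mechanism $M$, when the agent draws sample sequence $\omega_1, \ldots, \omega_k$ and plays according to $\sigma$.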
2.2 Single Proposal Mechanisms
We now show that there is a sense in which it is without loss of generality to focus on interactions in which the agent proposes a single solution from among the ones they sampled, and the principal either accepts or rejects it. To do this, we define a simple type of mechanism called a single proposal mechanism, and we show in Lemma 1 below that any other mechanism can be simulated by a single proposal mechanism.
Definition 2 (single-proposal mechanism).
A single-proposal mechanism with eligible set $R \subseteq \Omega$ is a mechanism in which the agent proposes one outcome, and the mechanism accepts this proposal if and only if it belongs to $R$. More formally, $M = (\Sigma, f)$ is a single-proposal mechanism with eligible set $R$ if $\Sigma = \Omega$ and $f$ restricts to the identity function on $R$ and the constant function $\bot$ on $\Omega \setminus R$.
Lemma 1.
If $M$ is any mechanism and $\sigma$ is any strategy constituting a best response to $M$, then there exists a single-proposal mechanism $M'$ and a best response $\sigma'$ to $M'$, such that the interim allocation functions of $(M, \sigma)$ and $(M', \sigma')$ are identical.
Let be the range of the interim allocation function , i.e. the set of all possible outcomes of , other than , when the agent acts according to . Define to be the single-proposal mechanism with eligible set . Let be the strategy in which the agent observes his tuple of samples, , and chooses strategy . By construction the interim allocation functions and are identical. To prove that is a best response to , consider any and any . Let denote the agent’s utility when playing according to ; note that . We wish to show that the agent cannot benefit by playing instead, i.e.
If then which implies (1) since . If then for some . Now (1) follows because strategy is a best response for mechanism , and denotes the agent’s utility when playing strategy in whereas denotes his utility when playing . ∎
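The objects in this subsection can be sketched in a few lines of code. This is a toy illustration under stated assumptions: solutions are represented by hashable Python values, the null outcome by `None`, and all function names are hypothetical. The allocation function of a single-proposal mechanism is the identity on the eligible set and null elsewhere; composing it with an agent strategy yields the interim allocation function.

```python
NULL = None  # stands in for the null outcome (status quo)

def single_proposal_mechanism(eligible):
    """Allocation function: identity on the eligible set, null elsewhere."""
    def allocate(proposal):
        return proposal if proposal in eligible else NULL
    return allocate

def best_response(samples, eligible, agent_utility):
    """Agent's strategy: propose his favorite sampled solution that is eligible."""
    candidates = [w for w in samples if w in eligible]
    if not candidates:
        return NULL  # nothing eligible was sampled, so propose nothing
    return max(candidates, key=agent_utility)

def interim_allocation(allocate, strategy):
    """Compose the agent's strategy with the principal's allocation function."""
    return lambda samples: allocate(strategy(samples))

if __name__ == "__main__":
    eligible = {"b", "c"}
    y = {"a": 3, "b": 1, "c": 2}  # agent's utilities (illustrative numbers)
    allocate = single_proposal_mechanism(eligible)
    g = interim_allocation(
        allocate, lambda s: best_response(s, eligible, y.get)
    )
    print(g(["a", "b", "c"]))  # the agent prefers "a", but it is not eligible
```

In this toy instance the agent's favorite sample is excluded by the eligible set, so the interim allocation function returns his favorite among the eligible samples instead.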
3 Analyzing Delegated Search Via Prophet Inequalities
In this section we develop a formal link between delegated search mechanisms and prophet inequalities. It turns out that the relevant prophet inequalities involve random variables arriving at discrete points in continuous time, rather than the usual assumption that they arrive at time points $1, 2, \ldots, n$. Accordingly, we will begin by explaining the formal model of continuous-time prophet inequalities in Section 3.1 below. Then in Section 3.2 we explain the reduction from delegated search (in the distributional model) to continuous-time prophet inequalities.
3.1 Continuous-time prophet inequalities
In this section we will be concerned with problems which involve designing a selection rule to choose (at most) one element from a random finite set of pairs $(x, y)$, with the goal of maximizing the expected $x$-coordinate of the chosen element. The $y$-coordinate is thought of as a time coordinate, and we will generally (but not exclusively) be concerned with selection rules that make their choice without looking into the future, as is ordinarily the case in the analysis of prophet inequalities.
Definition 3 (selection rules).
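A selection rule is a function $r$ from finite sets $S$ of pairs $(x, y)$ to the set of all such pairs together with the null outcome $\bot$, with the property that $r(S) \in S \cup \{\bot\}$ for every $S$.
A stopping rule is a selection rule that chooses element $(x, y)$ from set $S$ without looking at the set of elements whose time coordinate is greater than $y$. Formally, $r$ is a stopping rule if it satisfies the following property: for any $(x, y)$ and any two sets $S, S'$ that contain exactly the same elements with time coordinate at most $y$, we have $r(S) = (x, y)$ if and only if $r(S') = (x, y)$.
An oblivious stopping rule with eligible set $R$ is a stopping rule $r$ such that for every $S$, $r(S)$ is an earliest element of $S \cap R$ (i.e., an element of that set with minimum $y$ coordinate) or $\bot$ if $S \cap R$ is empty.
A threshold stopping rule with threshold $\theta$ is an oblivious stopping rule whose eligible set is of the form $\{(x, y) : x \geq \theta\}$ or $\{(x, y) : x > \theta\}$.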
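The last two notions in this definition translate directly into code. In the sketch below (an illustration with hypothetical names; elements are represented as Python tuples whose first entry is the value coordinate and whose second entry is the time coordinate, and the null outcome as `None`), an oblivious stopping rule returns an earliest element of the input set that lies in the eligible set, and a threshold stopping rule is the special case in which eligibility is a weak or strict threshold on the value coordinate.

```python
def oblivious_rule(eligible):
    """Oblivious stopping rule: pick an earliest eligible element, else None.

    eligible: predicate on (value, time) pairs defining the eligible set.
    """
    def rule(points):
        ok = [p for p in points if eligible(p)]
        # "Earliest" means minimum time coordinate, i.e. the second entry.
        return min(ok, key=lambda p: p[1]) if ok else None
    return rule

def threshold_rule(theta, strict=False):
    """Threshold stopping rule: eligibility depends only on the value."""
    if strict:
        return oblivious_rule(lambda p: p[0] > theta)
    return oblivious_rule(lambda p: p[0] >= theta)

if __name__ == "__main__":
    points = {(0.9, 3.0), (0.6, 1.0), (0.2, 0.5)}
    print(threshold_rule(0.5)(points))  # earliest element with value >= 0.5
```

Because the rule selects the earliest eligible element, its choice is unaffected by elements that arrive later, which is what makes it a stopping rule rather than a general selection rule.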
Definition 4 (CTSPs and prophet inequalities).
A continuous-time selection problem (CTSP) is an ordered pair where is a set of probability distributions over finite subsets of , and is a set of selection rules.
A CTSP satisfies a prophet inequality with factor if it is the case that for every there exists some such that
Here the random variable is defined by specifying that if then , and if then . The random variable is defined to be .
We now present the prophet inequalities we will use in this work. To state them, we define the following families of stopping rules and distributions on subsets of .
is the family of oblivious stopping rules.
is the family of threshold stopping rules.
is the family of random sets whose elements are obtained by sampling independently from joint distributions. In other words, a distribution is specified by giving a positive number , a tuple of joint distributions on , and defining to be the distribution on -element sets obtained by drawing one sample independently from each of .
is the family of random sets whose elements are i.i.d. samples from an atomless distribution with and independent, i.e. a distribution on which is a product of atomless distributions.
is the union of over all .
Theorem 1 (Samuel-Cahn (1984)).
There is a prophet inequality with factor 1/2 for .
The second is an improved prophet inequality for threshold stopping rules when samples are drawn i.i.d. from atomless product distributions; it can be derived as a corollary of either (Ehsani et al., 2018, Theorem 19) or (Correa et al., 2017, Corollary 2.2).
Our third prophet inequality again pertains to the case when samples are drawn i.i.d. from atomless product distributions, but it allows for oblivious stopping rules rather than threshold stopping rules. The discrete-time counterpart to this prophet inequality can be found in Hill and Kertz (1982); Kertz (1986); Correa et al. (2017).
Theorem 3.
Let be the solution to . There is a prophet inequality with factor for .
Since the distinction between discrete time and continuous time is immaterial from the standpoint of analyzing threshold stopping rules, the first two of these theorems are equivalent to the existing results for discrete-time prophet inequalities that we have cited before the theorem statements. On the other hand, because oblivious stopping rules are less powerful than general stopping rules, the third theorem is not an immediate consequence of the corresponding discrete-time prophet inequality. Proofs of all three theorems are included in Appendix A; since Theorem 3 is the only novel result among the three theorems, the proofs of the other two are included only for the purpose of making our paper self-contained.
To complete this section, we will describe the stopping rules which achieve the bounds stated in the three prophet inequalities above.
When the points are independent but not necessarily identically distributed, choose threshold to be the median of the distribution of . In other words, is defined such that the events and both have probability at most . Consider the threshold stopping rule that selects the first pair with , and consider the one whose selection criterion is . The proof of Theorem 1 shows that at least one of these two stopping rules fulfills a prophet inequality with factor 1/2.
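The median-threshold construction can be checked exactly on a small discrete instance. The sketch below is illustrative only: the two-variable instance, the function names, and the choice to process values in index order are assumptions (Appendix A.1 notes that the argument is insensitive to arrival order). It computes a median T of the maximum, then the expected value of the two threshold rules "first value > T" and "first value ≥ T".

```python
from itertools import product

# Each distribution is a list of (value, probability) pairs; the instance is hypothetical.
dists = [
    [(1.0, 1.0)],               # X1 is always 1
    [(10.0, 0.1), (0.0, 0.9)],  # X2 is 10 with probability 0.1, else 0
]


def median_of_max(dists):
    """Smallest support value T with Pr[max <= T] >= 1/2, so T is a median of the max."""
    support = sorted({v for dist in dists for v, _ in dist})
    for t in support:
        prob_max_le_t = 1.0
        for dist in dists:
            prob_max_le_t *= sum(p for v, p in dist if v <= t)
        if prob_max_le_t >= 0.5:
            return t
    return support[-1]


def evaluate(dists, threshold):
    """Exact E[max] and expected values of the rules 'first v > T' and 'first v >= T'."""
    e_max = e_strict = e_weak = 0.0
    for outcome in product(*dists):
        prob = 1.0
        for _, p in outcome:
            prob *= p
        values = [v for v, _ in outcome]  # processed in index order (an assumption)
        e_max += prob * max(values)
        e_strict += prob * next((v for v in values if v > threshold), 0.0)
        e_weak += prob * next((v for v in values if v >= threshold), 0.0)
    return e_max, e_strict, e_weak


T = median_of_max(dists)
e_max, e_strict, e_weak = evaluate(dists, T)
ratio = max(e_strict, e_weak) / e_max
```

On this instance the better of the two rules recovers slightly more than half of the expected maximum, in line with the factor 1/2 guarantee.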
When the points are i.i.d. and the distributions of and are atomless and independent, with cumulative distribution functions and , respectively, choose threshold such that . The proof of Theorem 2 shows that the threshold stopping rule that selects the first pair such that fulfills a prophet inequality with factor . Now let be the solution to
and let be the solution of the differential equation
with initial condition . The oblivious stopping rule that accepts the first such that
fulfills a prophet inequality with factor .
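As a numeric illustration of the single-threshold rule for i.i.d. atomless samples, the sketch below evaluates it for Uniform[0, 1] values. It assumes that the threshold is calibrated so that the rule stops with probability 1 − 1/e, which is the factor established for single-threshold rules in the cited works; the uniform distribution and the function names are choices made for this example only.

```python
from math import exp


def threshold_rule_value_uniform(n):
    """Expected value of 'accept the first sample above T' for n i.i.d. Uniform[0, 1]
    samples, with T chosen so that the rule stops with probability 1 - 1/e (assumed)."""
    T = exp(-1.0 / n)                       # Pr[no sample exceeds T] = T**n = 1/e
    gain_if_stop_here = (1.0 - T * T) / 2   # E[X * 1{X > T}] for X ~ Uniform[0, 1]
    value, prob_not_stopped = 0.0, 1.0
    for _ in range(n):
        value += prob_not_stopped * gain_if_stop_here
        prob_not_stopped *= T
    return value


def expected_max_uniform(n):
    return n / (n + 1.0)                    # E[max of n i.i.d. Uniform[0, 1] samples]


ratios = {n: threshold_rule_value_uniform(n) / expected_max_uniform(n)
          for n in (1, 2, 5, 20, 100)}
```

For uniform samples the ratio stays above 1 − 1/e for every n tried and approaches that constant as n grows.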
3.2 Reducing delegated search to prophet inequalities
Although delegated search problems and prophet inequalities appear unrelated at first glance, the tight technical connection between them is explained by an observation which is extremely natural in hindsight. Consider a change of variables that maps the agent’s utility to a point , where the function is monotonically decreasing. In a single proposal mechanism with eligible set , the agent submits the eligible proposal with the highest value. Similarly, an oblivious stopping rule with eligible set selects the earliest eligible point . Since the change of variables is monotonically decreasing, the two selection criteria are equivalent! Thus, designing single proposal mechanisms that yield high utility for the principal is equivalent to designing oblivious stopping rules that yield a high expected value.
In more detail, let be any continuous, monotonically decreasing bijection from to , for example . Under the mapping defined by , any distribution on sets of solutions induces a distribution on sets of pairs . In particular, our distributional model in which the agent draws i.i.d. samples from is mapped, under this correspondence, to a member of the family of distributions .
There is also a reverse correspondence from oblivious stopping rules to single proposal mechanisms and their interim allocation functions. The oblivious stopping rule with eligible set corresponds to the single proposal mechanism with eligible set . More precisely, if and is a best response to the single proposal mechanism with eligible set , then for any sequence of samples , we have
In other words, suppose we run the mechanism ; the agent draws a sequence of samples; and we let the agent choose the best one (for the agent) that belongs to . This procedure is equivalent to running the oblivious stopping rule on the sequence obtained by transforming all of the agent’s samples to points , and selecting the earliest such point (ordered by ) that belongs to . Under this correspondence, threshold stopping rules correspond to single proposal mechanisms in which a solution is deemed eligible if the principal’s utility exceeds a specified threshold. Note that this subset of single proposal mechanisms can be implemented even when the agent’s utility is unobservable.
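The correspondence can be sanity-checked with a short simulation. In this sketch a solution is a pair `(x, y)` of principal and agent utilities, the decreasing change of variables is taken to be `y -> -y`, and the eligible set is a simple threshold on `x`; all three are assumptions chosen for illustration, as are the function names.

```python
import random


def decreasing_map(y):
    # Assumed change of variables; any strictly decreasing bijection would serve.
    return -y


def agent_best_response(samples, eligible):
    """Single proposal mechanism: the agent proposes the eligible sample he likes best."""
    candidates = [(x, y) for (x, y) in samples if eligible(x, y)]
    return max(candidates, key=lambda xy: xy[1]) if candidates else None


def oblivious_rule_on_transformed(samples, eligible):
    """Map (x, y) to the point (time, value) = (decreasing_map(y), x), then select
    the earliest point whose originating solution is eligible."""
    points = [(decreasing_map(y), x) for (x, y) in samples]
    # -s recovers y because this particular map is its own inverse.
    candidates = [(s, t) for (s, t) in points if eligible(t, -s)]
    if not candidates:
        return None
    s, t = min(candidates, key=lambda pt: pt[0])
    return (t, -s)  # translate the chosen point back to a solution (x, y)


rng = random.Random(0)
agree = True
for _ in range(200):
    samples = [(rng.random(), rng.random()) for _ in range(5)]
    eligible = lambda x, y: x > 0.5
    agree = agree and (agent_best_response(samples, eligible)
                       == oblivious_rule_on_transformed(samples, eligible))
```

Maximizing `y` over the eligible samples and minimizing `-y` over the same samples pick the same element, which is all the reduction needs.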
Theorem 4.
In the distributional model, suppose the agent draws i.i.d. samples, and let denote the utility the principal would attain if she could directly choose her favorite among these samples.
There is always a set of the form or such that a single proposal mechanism with eligible set ensures that the principal’s expected utility is at least
If the principal and agent have independent utilities, each drawn from an atomless distribution, then a single proposal mechanism that accepts any proposal satisfying , for a suitable choice of , ensures that the principal’s expected utility is at least .
If the principal and agent have independent utilities, each drawn from an atomless distribution, and the principal can observe the agent’s utility, then a single proposal mechanism that accepts any proposal satisfying , for a suitable choice of the function , ensures that the principal’s expected utility is at least , where is the constant defined in Theorem 3.
In Appendix B we show that the bounds in all three parts of the theorem are tight with respect to the assumptions made in their respective statements.
4 Binary outcomes
Recall the binary model from Section 1: The potential solutions come from a large discrete set and the agent’s role is to explore which of these options are feasible to implement. If is feasible, it yields utility for the principal and for the agent — where the pair is commonly known to both parties — and if is infeasible it yields zero utility for both parties. To explore the feasibility of solution the agent must incur a cost of , and the probability of success is , independently of the success of other solutions. These quantities are again commonly known to both parties. We will assume that for each solution , since otherwise it is against the agent’s self-interest to explore , even if it were assured that the solution would be adopted if feasible.
4.1 Optimal search policies: Weitzman’s box problem
If the principal were conducting the search by herself (without delegation to an agent), this model would correspond to a special case of the box problem introduced by Weitzman (1979). The optimal search policy is simple but surprisingly subtle: it assigns to each option a priority satisfying — which in our case entails setting — and then explores options in decreasing order of priority, selecting the first feasible one in this ordering or stopping when all remaining unexplored options have .
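For a binary prize, the priority equation can be solved in closed form. The sketch below uses placeholder names `x` (value if feasible), `c` (exploration cost) and `p` (success probability); the closed form is derived here directly from the defining equation for a two-point prize, and the four-option instance is hypothetical.

```python
def binary_priority(x, c, p):
    """Priority sigma solving E[(X - sigma)^+] = c when X = x with probability p, else 0."""
    if p * x >= c:
        return x - c / p   # case 0 <= sigma <= x: p * (x - sigma) = c
    return p * x - c       # case sigma < 0: p * (x - sigma) + (1 - p) * (0 - sigma) = c


def expected_net_value_of_priority_search(options):
    """options: list of (x, c, p). Explore in decreasing priority, stop at the first
    feasible option, and give up once all remaining priorities are negative."""
    ranked = sorted(options, key=lambda o: binary_priority(*o), reverse=True)
    value, prob_still_searching = 0.0, 1.0
    for x, c, p in ranked:
        if binary_priority(x, c, p) < 0:
            break  # every remaining option has negative priority
        value += prob_still_searching * (p * x - c)
        prob_still_searching *= 1.0 - p
    return value


options = [(4.0, 1.0, 0.5), (10.0, 1.0, 0.2), (3.0, 0.3, 0.6), (1.0, 1.0, 0.5)]
priorities = [binary_priority(*o) for o in options]
search_value = expected_net_value_of_priority_search(options)
```

The last option has expected value below its cost, so its priority is negative and the search never explores it.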
Now suppose that the principal instead delegates the search to an agent who bears the cost of exploration, by running a single-proposal mechanism with eligible set . Then the agent faces a different instance of the box problem, in which the set of options is limited to , and the costs and success probabilities of the options are the same as before, but the value of option (if feasible) is rather than . This means the agent prioritizes boxes in decreasing order of rather than , and recommends the first box in this ordering that is discovered to be feasible.
To summarize, the delegated search problem in the binary model is analogous to Weitzman’s box problem, but with the important distinction that the searcher (the principal) is not allowed to choose the order in which to open the boxes. Instead the problem specifies an exogenous ordering of the boxes — corresponding to the agent’s ranking of options by decreasing — and the searcher is only free to decide which boxes in this sequence should be opened and which ones should be skipped, corresponding to the principal’s problem of choosing the set . Since this problem may be of independent interest, we devote Theorem 5 below to presenting a solution that always achieves at least half of the expected value of running the optimal search procedure that is allowed to inspect the boxes in any order it desires. Interestingly, the analysis is based on prophet inequalities, specifically Theorem 1 and its proof. It implies there is an approximately optimal mechanism with the following structure. For any half-infinite interval of the form or , let and define to be the single-proposal mechanism in which a proposal is eligible if it is feasible and belongs to .
4.2 The box problem with an exogenous ordering
In this section we recapitulate some background material about Weitzman’s (1979) box problem. In this problem there are boxes, each containing an independent random prize. (The following description constitutes a special case of Weitzman’s problem; the general case incorporates geometric time discounting and time delays.) The prize in box is denoted , and the cost of opening the box is . A searcher may open any number of boxes sequentially, or may cease the search at any time and claim a prize from at most one of the open boxes. The problem is to design an optimal sequential search policy. Weitzman proves that if each box is assigned a priority defined by the equation , then the optimal sequential search policy opens boxes in decreasing order of priority, stopping at the first time when the highest prize inside an open box exceeds the highest priority of a closed box, or at the first time when the priority of every remaining closed box is negative, whichever comes sooner.
Kleinberg et al. (2016) provided a proof of optimality of Weitzman’s procedure in which the priority is interpreted as the “strike price” of a real option with fair value . An important quantity in their analysis is the “covered call value”, which is simply the random variable . We restate the following lemma from their work (Lemma 1 of the full version of their paper, http://dx.doi.org/10.2139/ssrn.2753858).
Lemma 2 (Kleinberg et al. (2016)).
For any sequential search procedure and any box , let be the indicator random variables of the event that the procedure selects box and the event that it opens box , respectively. The inequality
is satisfied by every search procedure, and equality holds if and only if the search procedure is non-exposed, meaning that at every sample point where .
Corollary 1.
For any sequential search procedure, the expected net value of running the procedure (i.e., the value of the selected box minus the combined cost of opening boxes) is bounded above by the expectation of the maximum covered call value, i.e.
The corollary is immediate, by summing inequality (2) over boxes .
Now consider the box problem with an exogenous ordering of boxes, where the searcher is limited to considering the boxes one by one in the specified order, and once she decides to leave a box closed or to leave the prize within unclaimed, she cannot later return to the box and open it or claim its prize. We define a type of policy that we call a -thresholding policy; the reason for the name will become apparent in the subsequent Lemma 3, which shows that these policies correspond to a threshold rule applied to the sequence of covered call values .
A -thresholding policy for the box problem with exogenous ordering is a policy that operates as follows. There is a half-infinite interval or called the target interval. The policy declines to open any box with . Otherwise, if , the policy opens the box and claims the prize inside if and only if .
Lemma 3.
Every -thresholding policy is non-exposed. The expected net value of running a -thresholding policy with target interval is exactly the same as the expected value selected by the threshold stopping rule that observes the random sequence and selects the first element of this sequence that belongs to .
The policy is non-exposed because implies , while and imply . Hence the left and right sides of (2) are equal for every box, and the net value of running the policy is , i.e. the expected covered call value of the box the policy selects. By design, the policy perfectly simulates the threshold stopping rule that chooses the first element of the sequence that belongs to ; this is because it selects the first box such that and both belong to , which is also the first box such that belongs to . ∎
Theorem 5.
For every instance of the box problem with exogenous ordering, there is a -thresholding policy whose expected net value is at least half that of Weitzman’s optimal search procedure (which endogenously selects the ordering of the boxes).
Lemma 3 reduces the analysis of -thresholding policies to a question about prophet inequalities. In particular, the expected net value of running a -thresholding policy is equal to the expected covered call value of the random element selected from the sequence by a particular threshold stopping rule. Since Samuel-Cahn’s (1984) prophet inequality (Theorem 1 above) implies that threshold stopping rules can always attain at least half the expectation of the maximum random variable in the sequence, it follows that there is a -thresholding policy whose expected net value is at least half the expectation of the maximum covered call value. Corollary 1 ensures that the latter is an upper bound on the expected net value of Weitzman’s optimal search procedure. ∎
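The guarantee can be checked exactly on a small binary-prize instance. The sketch below is illustrative: boxes are hypothetical triples `(x, c, p)` listed in their exogenous order, the priority formula `x - c / p` assumes `p * x >= c`, and candidate thresholds are restricted to the (positive) priorities that occur, so an empty box's prize of 0 never lies in the target interval.

```python
def priority(x, c, p):
    # Binary-prize priority; assumes p * x >= c so that 0 <= priority <= x.
    return x - c / p


def thresholding_value(boxes, threshold, closed):
    """Exact expected net value of the thresholding policy with target interval
    [threshold, inf) if closed else (threshold, inf), on boxes in the given order."""
    def in_target(v):
        return v >= threshold if closed else v > threshold

    value, prob_reach = 0.0, 1.0
    for x, c, p in boxes:
        if not in_target(priority(x, c, p)):
            continue                        # the policy declines to open this box
        value += prob_reach * (p * x - c)   # pay c; a feasible prize x >= priority is claimed
        prob_reach *= 1.0 - p               # prize 0 is outside the target since threshold > 0
    return value


def expected_max_covered_call(boxes):
    """E[max_i kappa_i], where kappa_i is the priority if box i is feasible and 0 otherwise."""
    ranked = sorted(boxes, key=lambda b: priority(*b), reverse=True)
    total, prob_all_higher_fail = 0.0, 1.0
    for x, c, p in ranked:
        total += prob_all_higher_fail * p * priority(x, c, p)
        prob_all_higher_fail *= 1.0 - p
    return total


boxes = [(4.0, 1.0, 0.5), (10.0, 1.0, 0.2), (3.0, 0.3, 0.6)]  # exogenous order
thresholds = sorted({priority(*b) for b in boxes})
best = max(thresholding_value(boxes, t, closed)
           for t in thresholds for closed in (True, False))
upper_bound = expected_max_covered_call(boxes)
```

Here the best thresholding policy skips the first box and attains well over half of the upper bound, which by Corollary 1 also bounds the value of the unconstrained optimal search.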
4.3 An approximately optimal mechanism
Recall that for a half-infinite interval or , the mechanism is defined to be the single proposal mechanism whose eligible set consists of solutions that are feasible and satisfy .
Theorem 6.
There exists a choice of such that the expected net value of mechanism (i.e., the principal’s value for adopting the agent’s proposal, if adopted, minus the combined cost of all the alternatives explored) is at least half of the expected net value the principal could achieve by performing the optimal search herself (without delegation).
Convert the delegated search problem into a box problem with exogenous order, where the order is defined by sorting the solutions in non-increasing order of the agent’s priority value , and the value inside box is defined to be if turns out to be feasible, 0 otherwise.
According to Theorem 5 there exists a choice of such that the -thresholding policy with target set attains at least half the expected net value of the optimal search procedure. This thresholding policy goes through boxes in the given order, i.e. descending , and opens only those with , selecting the first one such that . Note that among the boxes which the policy opens, the first one with is also the first one corresponding to a feasible . This is because an infeasible has hence , whereas a feasible has , hence .
Recall from Section 4.1 that the agent’s best response to mechanism is to go through the elements of in decreasing order of , stopping and proposing the first one that is discovered to be feasible. This is exactly the behavior of the -thresholding policy with target set , as derived in the preceding paragraph. Hence the mechanism coupled with the agent’s best response behavior emulates the -thresholding policy which attains at least half the expected net value of the optimal search procedure. ∎
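The emulation argument can be traced on a toy instance. In this sketch each solution is a hypothetical tuple `(x, y, c, p)` of principal value, agent value, exploration cost, and success probability; the priority formula assumes `p * x >= c` and `p * y >= c` for every solution, and the function names are illustrative.

```python
def priority(value, cost, prob):
    # Binary-model priority; assumes prob * value >= cost.
    return value - cost / prob


def delegated_value(solutions, threshold, closed):
    """Principal's expected net value when the agent best-responds to the mechanism
    whose eligible solutions are the feasible ones with principal priority in the
    target interval [threshold, inf) if closed else (threshold, inf)."""
    def in_target(v):
        return v >= threshold if closed else v > threshold

    allowed = [s for s in solutions if in_target(priority(s[0], s[2], s[3]))]
    # The agent explores in decreasing order of his own priority and
    # proposes the first solution found to be feasible.
    agent_order = sorted(allowed, key=lambda s: priority(s[1], s[2], s[3]), reverse=True)
    value, prob_reach = 0.0, 1.0
    for x, y, c, p in agent_order:
        value += prob_reach * (p * x - c)
        prob_reach *= 1.0 - p
    return value


def self_search_value(solutions):
    """Principal's expected net value from running the priority-order search herself."""
    order = sorted(solutions, key=lambda s: priority(s[0], s[2], s[3]), reverse=True)
    value, prob_reach = 0.0, 1.0
    for x, y, c, p in order:
        value += prob_reach * (p * x - c)
        prob_reach *= 1.0 - p
    return value


solutions = [(4.0, 9.0, 1.0, 0.5), (10.0, 6.0, 1.0, 0.2), (3.0, 5.0, 0.3, 0.6)]
thresholds = sorted({priority(s[0], s[2], s[3]) for s in solutions})
best_delegated = max(delegated_value(solutions, t, closed)
                     for t in thresholds for closed in (True, False))
optimum = self_search_value(solutions)
```

Even though the agent ranks the solutions almost in reverse of the principal's preferred order, the best interval mechanism keeps the principal well above half of her own optimum on this instance.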
4.4 Limiting the number of samples
In some cases the number of distinct potential solutions, , may be prohibitively large, and the agent may only have the power to explore the feasibility of a limited number of them, . In this case, if the principal were to conduct the search autonomously without delegation — subject to the same costs and the same upper bound, , on the total number of solutions that can be tested for feasibility — it may require a very complex procedure. Nevertheless, we will provide in this section a simple delegated search mechanism such that it is easy for the agent to compute a search procedure that is a best response to the mechanism, and the outcome of running the mechanism with this best response attains at least of the net expected value of the (potentially complex) optimal procedure.
The key observation is the following lemma, which provides a useful upper bound on the value of running the optimal search procedure.
Lemma 4.
In the box problem with boxes, if the searcher is limited to opening at most boxes before claiming a prize, then the expected net value of any search procedure is bounded above by where is the random set of boxes that the procedure opens.
Sum up the inequality (2) over all boxes and note that for , to derive
The lemma follows by noting that because . ∎
Lemma 5.
There exists a (non-random) set of cardinality , such that , where is the random set of solutions explored by the optimal search procedure subject to a constraint of exploring at most solutions.
The problem of adaptively exploring a random set of at most solutions to maximize is a special case of the stochastic monotone submodular function maximization problem studied by Asadpour and Nazerzadeh (2016), in which the role of the monotone submodular function is played by the function , and the role of the matroid constraint is played by the cardinality constraint that at most elements may be probed. Theorem 1 of Asadpour and Nazerzadeh (2016), which asserts that the adaptivity gap of stochastic monotone submodular maximization is , specializes in the present case to the assertion stated in the lemma. ∎
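One simple way to look for a good non-random set is the textbook greedy heuristic for monotone submodular maximization under a cardinality constraint. The sketch below is not claimed to be the algorithm of Asadpour and Nazerzadeh (2016); it is a generic greedy routine applied to the expected maximum covered call value in the binary model, with hypothetical boxes `(x, c, p)` satisfying `p * x >= c`, and it is compared against the best fixed set found by brute force rather than against the adaptive optimum.

```python
from itertools import combinations
from math import e


def priority(x, c, p):
    # Binary-prize priority; assumes p * x >= c.
    return x - c / p


def expected_max(boxes, subset):
    """E[max over the subset of covered call values], each being the box's
    priority with probability p and 0 otherwise."""
    ranked = sorted((boxes[i] for i in subset), key=lambda b: priority(*b), reverse=True)
    total, prob_all_higher_fail = 0.0, 1.0
    for x, c, p in ranked:
        total += prob_all_higher_fail * p * priority(x, c, p)
        prob_all_higher_fail *= 1.0 - p
    return total


def greedy_subset(boxes, k):
    """Repeatedly add the box with the largest marginal gain in expected_max."""
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(len(boxes)) if i not in chosen]
        chosen.append(max(remaining, key=lambda i: expected_max(boxes, chosen + [i])))
    return chosen


boxes = [(4.0, 1.0, 0.5), (10.0, 1.0, 0.2), (3.0, 0.3, 0.6),
         (6.0, 1.0, 0.5), (2.0, 0.1, 0.9)]
k = 2
greedy = greedy_subset(boxes, k)
greedy_value = expected_max(boxes, greedy)
best_fixed_value = max(expected_max(boxes, list(s))
                       for s in combinations(range(len(boxes)), k))
```

For monotone submodular objectives the greedy set is known to be within a factor 1 − 1/e of the best fixed set of the same size, which the comparison above illustrates on this instance.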
Theorem 7.
Consider delegated search in the binary model with a constraint that no more than solutions can be examined for feasibility. There exists a mechanism that attains at least fraction of the expected net value of the optimal search procedure subject to the same limitation of examining at most solutions.
According to Lemmas 5 and 4, there is an -element set such that the optimal search procedure that is limited to explore only solutions in is able to attain at least fraction of the expected net value of the optimal search procedure that is limited to examine at most solutions but can (adaptively) choose any elements of during its search. When the set of solutions is restricted to , the constraint that at most solutions can be examined becomes irrelevant since only has elements. Thus, Theorem 6 guarantees the existence of a delegated search mechanism that is at least half as good as the optimal search procedure limited to , and is consequently at least times as good as the optimal search procedure limited to examine at most solutions. Moreover, by applying the algorithm in Asadpour and Nazerzadeh (2016) used to prove Lemma 5, we can implement this policy in polynomial time with a loss of a further additive in the approximation ratio, thus obtaining a bound of efficiently. ∎
This work was supported in part by NSF grants CCF-1512964, CCF-1740822, and SES-1741441, a grant from the MacArthur Foundation, and a Simons Investigator Award. The authors would like to thank Brendan Lucier, Jens Ludwig, Sendhil Mullainathan, Rad Niazadeh, and Glen Weyl for helpful discussions, and The Nines in Ithaca, NY, for the many deep-dish pizzas that were consumed during the course of this research.
References
- Abolhassani et al. (2017) Abolhassani, M., Ehsani, S., Esfandiari, H., Hajiaghayi, M., Kleinberg, R. D., and Lucier, B. (2017). Beating 1-1/e for ordered prophets. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 61–71.
- Aghion and Tirole (1997) Aghion, P. and Tirole, J. (1997). Formal and real authority in organizations. Journal of Political Economy, 105(1):1–29.
- Alonso and Matouschek (2008) Alonso, R. and Matouschek, N. (2008). Optimal delegation. Review of Economic Studies, 75(1):259–293.
- Amador and Bagwell (2013) Amador, M. and Bagwell, K. (2013). The theory of optimal delegation with an application to tariff caps. Econometrica, 81(4):1541–1599.
- Ambrus and Egorov (2017) Ambrus, A. and Egorov, G. (2017). Delegation and nonmonetary incentives. Journal of Economic Theory, 171:101–135.
- Armstrong and Vickers (2010) Armstrong, M. and Vickers, J. (2010). A model of delegated project choice. Econometrica, 78(1):213–244.
- Asadpour and Nazerzadeh (2016) Asadpour, A. and Nazerzadeh, H. (2016). Maximizing stochastic monotone submodular functions. Management Science, 62(8):2374–2391.
- Athey et al. (2004) Athey, S., Bagwell, K., and Sanchirico, C. (2004). Collusion and price rigidity. Review of Economic Studies, 71(2):317–349.
- Correa et al. (2017) Correa, J. R., Foncea, P., Hoeksma, R., Oosterwijk, T., and Vredeveld, T. (2017). Posted price mechanisms for a random stream of customers. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC ’17, Cambridge, MA, USA, June 26-30, 2017, pages 169–186.
- Ehsani et al. (2018) Ehsani, S., Hajiaghayi, M., Kesselheim, T., and Singla, S. (2018). Prophet secretary for combinatorial auctions and matroids. In Proc. 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2018), pages 700–714.
- Farrell and Katz (2006) Farrell, J. and Katz, M. L. (2006). The economics of welfare standards in antitrust. Competition Policy International, 2:3–28.
- Hill and Kertz (1982) Hill, T. P. and Kertz, R. P. (1982). Comparisons of stop rule and supremum expectations of iid random variables. The Annals of Probability, 10(2):336–345.
- Holmstrom (1977) Holmstrom, B. (1977). On Incentives and Control in Organizations. PhD thesis, Stanford University.
- Holmstrom (1984) Holmstrom, B. (1984). On the theory of delegation. In Boyer, M. and Kihlstrom, R., editors, Bayesian Models in Economic Theory, pages 115–141. Elsevier.
- Kertz (1986) Kertz, R. P. (1986). Stop rule and supremum expectations of iid random variables: a complete comparison by conjugate duality. Journal of Multivariate Analysis, 19(1):88–112.
- Khodabakhsh et al. (2018) Khodabakhsh, A., Pountourakis, M., and Taggart, S. (2018). Algorithmic delegation. working paper.
- Kleinberg et al. (2016) Kleinberg, R. D., Waggoner, B., and Weyl, E. G. (2016). Descending price optimally coordinates search. In Proceedings of the 2016 ACM Conference on Economics and Computation, EC ’16, Maastricht, The Netherlands, July 24-28, 2016, pages 23–24.
- Krishna and Morgan (2008) Krishna, V. and Morgan, J. (2008). Contracting for information under imperfect commitment. RAND Journal of Economics, 39(4):905–925.
- Li et al. (2017) Li, J., Matouschek, N., and Powell, M. (2017). Power dynamics in organizations. American Economic Journal: Microeconomics, 9(1):217–241.
- Melumad and Shibano (1991) Melumad, N. D. and Shibano, T. (1991). Communication in settings with no transfers. RAND Journal of Economics, 22(2):173–198.
- Nocke and Whinston (2010) Nocke, V. and Whinston, M. D. (2010). Dynamic merger review. Journal of Political Economy, 118(6):1200–1251.
- Samuel-Cahn (1984) Samuel-Cahn, E. (1984). Comparison of threshold stop rules and maximum for independent nonnegative random variables. Annals of Probability, 12(4):1213–1216.
- Weitzman (1979) Weitzman, M. L. (1979). Optimal search for the best alternative. Econometrica, 47:641–654.
Appendix A Proofs of Prophet Inequalities
A.1 Threshold stopping rules for independent distributions
In this section we provide a proof of the following theorem from Section 3.1. The theorem (stated in a different form) was originally proven by Samuel-Cahn (1984); we provide a proof here for the sake of making the paper self-contained.
Theorem 1.
There is a prophet inequality with factor 1/2 for .
Consider any distribution in and let denote independent random variables representing the -coordinates of random samples from , respectively. Note that the subscripts on the variables represent the distributions from which they were sampled, not necessarily the order in which they arrive, since their corresponding time coordinates may not be in ascending order. However, this issue will be immaterial in the proof because our argument is insensitive to the arrival order of .
Let and choose a threshold defining two semi-infinite intervals
such that and . Let be the solution to the equation , and let denote the random variable defined by the following sampling process: first choose the interval with probability and with probability . Then apply the threshold stopping rule with eligible set to select an element and let denote the -coordinate of that element. We will prove that . Since is a convex combination of the expected value of applying the threshold stopping rule with eligible set and the one with eligible set , it will follow that the better of those two stopping rules fulfills a prophet inequality with factor .
To compare with we reason as follows. First, we have the following easy upper bound on .
To put a lower bound on , let denote an indicator random variable for the event that . Note that this event, when it happens, implies that .
(because by our construction of the random set)
A.2 Threshold stopping rules for i.i.d. atomless distributions
Lemma 6.
If is a sequence of i.i.d. random variables, each sampled from an atomless distribution, and is a threshold stopping rule such that , then
Let . We have
To compare the integrands at any specified , we consider the cases and separately. (The case is omitted because it contributes zero to both integrals.) When the inequality is satisfied whenever . Since we are assuming are identically distributed, and the threshold stopping rule has the same behavior at every point in time, we have
Hence the probability that the stopping rule does not stop at any time is
and therefore for ,
Meanwhile, for , the probability that the stopping rule stops at time and selects an element of value greater than is . Summing over , we have
Using the fact that , we find that
A.3 Oblivious stopping rules applied to i.i.d. atomless product distributions
Finally, in this subsection we furnish the proof of Theorem 3 from Section 3.1. The oblivious stopping rule that fulfills a prophet inequality with factor is more complicated than the threshold stopping rules analyzed earlier. It is defined by first solving a differential equation
with initial condition to define a function mapping to . The constant is chosen so that . Since the differential equation (9) implies
for all such that is defined, the boundary condition requires the equation
to be satisfied, and we treat this equation as the definition of .
Denoting the cumulative distribution functions of and by and , respectively, we will be analyzing the oblivious stopping rule which selects the earliest satisfying
We will prove that this rule satisfies a prophet inequality with a specific factor to be determined later. (Equation (26) below defines .) Let denote the -coordinate of the element selected by the oblivious stopping rule, and let denote the random variable . As in the proof of Lemma 6, we will make use of the equations
which reduces the task of proving a prophet inequality with factor to the task of proving
At this point a couple of observations will slightly simplify the analysis. If we change coordinates to replace with this has no effect on the behavior of the stopping rule, and of course it has no effect on , so we are free to adopt this reparameterization and assume henceforth that is uniformly distributed in . In particular, this means the stopping rule simplifies to choosing the first such that . The next simplification comes from introducing the variable and writing the event in the form . The probability of this event is