# Eliciting Forecasts from Self-interested Experts: Scoring Rules for Decision Makers

Scoring rules for eliciting expert predictions of random variables are usually developed assuming that experts derive utility only from the quality of their predictions (e.g., score awarded by the rule, or payoff in a prediction market). We study a more realistic setting in which (a) the principal is a decision maker and will take a decision based on the expert's prediction; and (b) the expert has an inherent interest in the decision. For example, in a corporate decision market, the expert may derive different levels of utility from the actions taken by her manager. As a consequence the expert will usually have an incentive to misreport her forecast to influence the choice of the decision maker if typical scoring rules are used. We develop a general model for this setting and introduce the concept of a compensation rule. When combined with the expert's inherent utility for decisions, a compensation rule induces a net scoring rule that behaves like a normal scoring rule. Assuming full knowledge of expert utility, we provide a complete characterization of all (strictly) proper compensation rules. We then analyze the situation where the expert's utility function is not fully known to the decision maker. We show bounds on: (a) expert incentive to misreport; (b) the degree to which an expert will misreport; and (c) decision maker loss in utility due to such uncertainty. These bounds depend in natural ways on the degree of uncertainty, the local degree of convexity of net scoring function, and natural properties of the decision maker's utility function. They also suggest optimization procedures for the design of compensation rules. Finally, we briefly discuss the use of compensation rules as market scoring rules for self-interested experts in a prediction market.


## 1 Introduction

Eliciting predictions of uncertain events from experts or other knowledgeable agents (or relevant information pertaining to events) is a fundamental problem of study in statistics, economics, operations research, artificial intelligence and a variety of other areas [16, 5]. Increasingly, robust mechanisms for prediction are being developed, proposed and/or applied in real-world domains ranging from elections and sporting events, to events of public interest (e.g., disease spread or terrorist action), to corporate decision making. Indeed, the very idea of crowd-sourcing and information (or prediction) markets is predicated on the existence of practical mechanisms for information elicitation and aggregation.

A key element in any prediction mechanism involves providing an expert agent with the appropriate incentives to reveal a forecast they believe to be accurate. Many forms of “outcome-based” scoring rules, either individual or market-based, provide experts with incentives to (a) provide sincere forecasts; (b) invest effort to improve the accuracy of their personal forecasts; and (c) participate in the mechanism if they believe they can improve the quality of the principal’s forecast. However, with just a few exceptions (see, e.g., [15, 13, 3, 7]), most work fails to account for the ultimate use to which the forecast will be put. Furthermore, even these models assume that the experts who provide their forecasts derive no utility from the final forecast, or how it will be used, except insofar as they will be rewarded by the prediction mechanism itself.

In many real-world uses of prediction mechanisms, this assumption is patently false. Setting aside purely informational and entertainment uses of information markets, the principal is often interested in exploiting the elicited forecast in order to make a decision [10, 13, 3]. In corporate prediction markets, the principal may base strategic business decisions on internal predictions of uncertain events. In a hiring committee, the estimated probability of various candidates accepting offers (and being given offers by competitors) will influence the order in which (and whether) offers are made. Of course, other examples abound. Providing appropriate incentives in the form of scoring rules is often difficult in such settings, especially when the outcome distribution is conditional on the decision ultimately taken by the principal [13, 3]. However, just as critically, in these settings, the experts whose forecasts are sought often have their own interests in seeing specific decisions being taken, interests that are not (fully) aligned with those of the principal. For example, in a corporate setting, an expert from a certain division may have an incentive to misreport demand for specific products, thus influencing R&D decisions that favor her division. In a hiring committee, a committee member may misreport the odds that a candidate will accept a competing position in order to bias the “offer strategy” in a way that favors his preferred candidate.

In this work, we develop what we believe to be the first class of scoring rules that incentivizes truthful forecasts even when experts have an interest in the decisions taken by the principal, and hence would like to provide forecasts that manipulate that decision “in their favor.” Other work has studied both decision making and incentive issues in prediction markets, but none that we are aware of addresses the natural question of expert self-interest in the decisions of the principal.

Hanson [10] introduced the term decision markets to refer to the broad notion of prediction markets where experts offer forecasts for events conditional on some policy being adopted or a decision being taken. Othman and Sandholm [13] provide the first explicit, formal treatment of a principal who makes decisions based on expert forecasts. They address several key difficulties that arise due to the conditional nature of forecasts, but assume that the experts themselves have no direct interest in the decision that is taken. Chen and Kash [3] extend this model to a wider class of informational settings and decision policies. Dimitrov and Sami [7] consider the issue of strategic behavior across multiple markets and the possibility that an expert may misreport her beliefs in one market to manipulate prices (and hence gain some advantage) in another. Similarly, Conitzer [6] explores strategic aspects of prediction markets through their connections to mechanism design. While the mechanism design approach could prove very useful for the problems we address (see concluding remarks in Sec. 6), Conitzer assumes an expert’s utility is derived solely from the payoff provided by the prediction mechanism. Also related to the model we develop here is the analysis of Shi et al. [15], who consider experts that, once they report their forecasts, can take action to alter the probabilities of the outcomes in question. Unlike our model, they do not consider expert utility apart from the payoff offered by the mechanism (though, as in our model, the principal does have a utility function that dictates the value of an expert report).

Our basic building block is a scoring rule for a single expert who knows the principal’s policy—i.e., mapping from forecasts to decisions—and where the principal knows the expert’s utility for decisions. We show that the scoring rule must compensate the expert in a very simple and intuitive way based on her utility function. Specifically, the principal uses a compensation function that, when added to the inherent utility the expert derives from the principal’s decision, induces a proper scoring rule. In a finite decision space, an expert’s optimal utility function is piecewise-linear and convex in probability space—we describe one natural scoring rule based on this function that is proper, but not strictly so. We then provide a complete characterization of all proper compensation functions. We also characterize those which, in addition, satisfy weak and strong participation constraints that ensure an expert will be sufficiently compensated to “play the game.”

We then provide a detailed analysis of both expert uncertainty in the principal’s policy, and principal uncertainty in the expert’s utility for decisions. First we observe that the expert need not know the principal’s policy prior to providing her forecast as long as she can verify the decision taken after the fact. Second, we analyze the impact of principal uncertainty regarding the expert’s utility function. In general, the principal cannot ensure truthful reporting. However, we show that, given bounds on this uncertainty, bounds can then be derived on all of the following: (i) the expert’s incentive to misreport; (ii) the deviation of the expert’s misreported forecast from its true beliefs; and (iii) the loss in utility the principal will realize due to this uncertainty. The first two bounds rely on the notion of strong convexity of the net scoring function induced by the compensation rule. The third bound uses natural properties of the principal’s utility function. Apart from bounds derived from global strong convexity, we show that these bounds can be significantly tightened using local strong convexity, specifically, by ensuring merely that the net scoring function is sufficiently (and differentially) strongly convex near the decision boundaries of the principal’s policy. These bounds suggest computational optimization methods for designing compensation rules (e.g., using splines related to the principal’s utility function). We conclude by briefly discussing a market scoring rule (MSR) based on our one-shot compensation rule. Using this MSR, the principal may need to provide more generous compensation to each expert than in the one-shot case, simply to ensure participation; but in some special cases, no additional compensation is needed.

The paper is organized as follows. We begin with a basic background on scoring rules for prediction mechanisms in Sec. 2. In Sec. 3 we define our model for analyzing the behavior of self-interested experts, introduce compensation rules, and show that the resulting net scoring function can be used to analyze expert behavior. We provide a complete characterization of (strictly) proper compensation rules and further characterize two subclasses of compensation rules that satisfy the two participation constraints mentioned above. In Sec. 4 we relax two assumptions in our model. We first show that the expert need not be aware of the principal’s policy for our model to work. We then consider a principal that has imperfect knowledge of the expert’s utility function, and using the notion of (local and global) strong convexity derive bounds on the expert’s incentive to misreport and the impact on the quality of the principal’s decision. After a brief discussion of market scoring rules in Sec. 5, we conclude in Sec. 6 with a discussion of several avenues for future research.

## 2 Background: Scoring Rules

We begin with a very brief review of relevant concepts from the literature on scoring rules and prediction markets. For comprehensive overviews, see the surveys [16, 5].

We assume that an agent, the principal, is interested in assessing the distribution of some discrete random variable $X$ with finite domain $\{x_1, \ldots, x_m\}$. Let $\Delta(X)$ denote the set of distributions over $X$, where $p \in \Delta(X)$ is a nonnegative vector $\langle p_1, \ldots, p_m \rangle$ s.t. $\sum_i p_i = 1$. The principal can engage one or more experts to provide a forecast $p \in \Delta(X)$. We focus first on the case of a single expert $E$. For instance, to consider a simple toy example we use throughout the sequel, the mayor of a small town may ask the local weather forecaster to offer a probabilistic estimate of weather conditions for the following weekend. (More significant examples in the domains of public policy or corporate decision making, as discussed above, can easily be constructed by the reader.)

We assume $E$ has beliefs $p$ about $X$, but a key question is how to incentivize $E$ to report $p$ faithfully (and devote reasonable effort to developing accurate beliefs). A variety of scoring rules have been developed for just this purpose [2, 14, 12, 8]. A scoring rule is a function $S$ that provides a score (or payoff) $S(r, x_i)$ to $E$ if she reports forecast $r$ and the realized outcome of $X$ is $x_i$, essentially rewarding $E$ for her predictive “performance” [2, 12]. If $E$ has beliefs $p$ and reports $r$, her expected score is $S(r, p) = \sum_i S(r, x_i)\, p_i$. We say $S$ is a proper scoring rule iff a truthful report is optimal for $E$:

$$S(p, p) \geq S(r, p), \quad \forall p, r \in \Delta(X) \tag{1}$$

We say that $S$ is strictly proper if inequality (1) is strict for $r \neq p$ (i.e., $E$ has strict disincentive to misreport). A variety of strictly proper scoring rules have been developed, among the more popular being the log scoring rule, where $S(p, x_i) = a \log p_i + b_i$ (for arbitrary constants $a > 0$ and $b_i$) [12, 14]. In what follows, we will restrict attention to regular scoring rules in which payment $S(p, x_i)$ is bounded whenever $p_i > 0$.
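The propriety condition in inequality (1) can be checked numerically for the log scoring rule. The following is a minimal sketch, assuming a two-outcome variable, a single constant offset `b` for all outcomes, and a coarse grid of candidate reports; the function names and the belief vector are illustrative and not taken from the paper.

```python
import math

def log_score(report, i, a=1.0, b=0.0):
    """Log scoring rule: a * log(report[i]) + b, with a > 0 (b is an assumed common offset)."""
    return a * math.log(report[i]) + b

def expected_score(rule, report, beliefs):
    """Expected score of `report` when outcomes are distributed according to `beliefs`."""
    return sum(q * rule(report, i) for i, q in enumerate(beliefs) if q > 0)

beliefs = [0.7, 0.3]  # hypothetical true beliefs of the expert
# Candidate reports on a coarse grid over two-outcome distributions.
grid = [[k / 100, 1 - k / 100] for k in range(1, 100)]

truthful = expected_score(log_score, beliefs, beliefs)
best_report = max(grid, key=lambda r: expected_score(log_score, r, beliefs))
print(best_report, truthful)
```

On this grid the report that maximizes expected score coincides with the true beliefs, which is what strict propriety predicts. A grid search is only a sanity check, not a proof.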

Proper scoring rules can be fully characterized in terms of convex cost functions [12, 14]; here we review the formulation of Gneiting and Raftery [8]. Let $G$ be any convex function over distributions; we refer to $G$ as a cost function. We denote by $G^*$ some subgradient of $G$, i.e., a function satisfying

$$G(q) \geq G(p) + G^*(p) \cdot (q - p)$$

for all $p, q \in \Delta(X)$. (If $G$ is differentiable at $p$ then the subgradient at that point is unique, namely, the gradient $\nabla G(p)$.) Such cost functions and associated subgradients can be used to derive any proper scoring rule.

###### Theorem 1

[12, 14, 8] A regular scoring rule $S$ is proper iff

$$S(p, x_i) = G(p) - G^*(p) \cdot p + G^*_i(p) \tag{2}$$

for some convex $G$ and subgradient $G^*$. $S$ is strictly proper iff $G$ is strictly convex.
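As a worked instance of this characterization, the sketch below builds a scoring rule from a convex cost function using the form $G(p) - G^*(p) \cdot p + G^*_i(p)$. The choice $G(p) = \sum_i p_i^2$ (whose gradient is $2p$) is an assumption made for illustration; it yields the familiar quadratic score $2p_i - \sum_j p_j^2$. All function names are hypothetical.

```python
def G(p):
    """Assumed strictly convex cost function: sum of squared probabilities."""
    return sum(x * x for x in p)

def G_grad(p):
    """Gradient of G, which is its unique subgradient since G is differentiable."""
    return [2 * x for x in p]

def score_from_cost(p, i, cost=G, grad=G_grad):
    """Score for report p and realized outcome index i: G(p) - G*(p).p + G*_i(p)."""
    g = grad(p)
    return cost(p) - sum(gj * pj for gj, pj in zip(g, p)) + g[i]

def expected(report, beliefs):
    """Expected score of `report` under `beliefs`."""
    return sum(q * score_from_cost(report, i) for i, q in enumerate(beliefs))

beliefs = [0.2, 0.5, 0.3]
print(expected(beliefs, beliefs), G(beliefs))        # truthful expected score equals G(beliefs)
print(expected([0.3, 0.4, 0.3], beliefs))            # a misreport scores strictly less
```

The truthful expected score equals the cost function evaluated at the beliefs, which is why the cost function can be read as the expert's value of reporting honestly.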

Intuitively, Eq. 2 defines a hyperplane

$$H_p = \langle S(p, x_1), \ldots, S(p, x_m) \rangle,$$

for each point $p$, that is subtangent to $G$ at $p$. This defines a linear function, for any fixed report $p$, giving the expected score of that report given beliefs $q$: $S(p, q) = H_p \cdot q$. An illustration is given in Fig. 1 for a simple one-dimensional (two-outcome) scenario.

There are a number of prediction market mechanisms that allow the principal to extract information from multiple experts; see [16, 5] for excellent surveys. Here we focus on market scoring rules (MSRs) [9, 11], which allow experts to (sequentially) change the forecast $p$ using any proper scoring rule $S$. Given the current forecast $p$, an expert can change the forecast to $p'$ if she is willing to pay according to $S(p, x_i)$ and receive payment $S(p', x_i)$. If her true beliefs are different from $p$ and the scoring rule is strictly proper, then she has incentive to participate and report truthfully. Under certain conditions, MSRs can be interpreted as automated market makers [4]. Since each expert pays the amount due to the previous expert for her prediction, the net payment of the principal is the score associated with the final prediction.
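The telescoping of payments in a market scoring rule can be illustrated with a short sketch. It assumes the log scoring rule (with no scaling or offset) and a hypothetical sequence of forecasts that starts from a uniform prior; each expert receives the score of her new forecast minus the score of the forecast she replaced.

```python
import math

def log_score(report, i):
    """Log score of `report` when outcome index i is realized."""
    return math.log(report[i])

def msr_payments(forecasts, outcome, rule=log_score):
    """Net payment to each expert who moved the forecast from forecasts[t-1] to forecasts[t]."""
    return [rule(forecasts[t], outcome) - rule(forecasts[t - 1], outcome)
            for t in range(1, len(forecasts))]

# Hypothetical market history: a uniform starting forecast, then two expert updates.
history = [[0.5, 0.5], [0.6, 0.4], [0.8, 0.2]]
payments = msr_payments(history, outcome=0)
print(payments, sum(payments))
```

The sum of individual payments collapses to the score of the final forecast minus the score of the starting forecast, so the principal's outlay depends only on where the market ends up relative to where it began.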

## 3 Scoring Rules for Self-interested Experts

Scoring rules in standard models assume that an expert offering a forecast is uninterested in any aspect of her forecast other than the score she will derive from her prediction. As discussed above, there are many settings where the principal will make a decision based on the received forecast, and the expert has a direct interest in this decision. In this section, we develop a model for this situation and devise a class of scoring rules that incentivize self-interested agents to report their true beliefs.

### 3.1 Model Formulation

We assume the principal, or decision maker (DM), will elicit a forecast of $X$ from expert $E$, and make a decision that is influenced by this forecast. Let $D = \{d_1, \ldots, d_n\}$ be the set of possible decisions, and $u_{ij}$ be DM’s utility should it take decision $d_j$ with $x_i$ being the realization of $X$. Letting $u_j = \langle u_{1j}, \ldots, u_{mj} \rangle$, the expected utility of decision $d_j$ given distribution $p$ is $U_j(p) = u_j \cdot p$. For any beliefs $p$, DM will want to take the decision that maximizes expected utility, giving DM the utility function $U(p) = \max_{j \leq n} U_j(p)$. Since each $U_j$ is a linear function of $p$, $U$ is piecewise linear and convex (PWLC). Furthermore, each $d_j$ is optimal in a (possibly empty) convex region of belief space $D_j = \{p : U_j(p) \geq U_k(p), \forall k\}$. We assume DM acts optimally and that it has a policy $\pi : \Delta(X) \to D$ that selects some optimal decision $\pi(p)$ for any expert forecast $p$. In what follows, we take $D_j = \pi^{-1}(d_j)$. We denote by $D_{jk}$ the (possibly empty) boundary between $D_j$ and $D_k$. Notice that for any $p \in D_{jk}$ we must have $U_j(p) = U_k(p)$. (Assuming “ties” at boundaries are broken consistently, the regions $D_j$ will be convex, but possibly open.)

In our running example, suppose the mayor must decide whether to hold a civic ceremony in the town park or at a private banquet facility. Given a forecast probability $p$ of rain, she will make outdoor arrangements at the park if $p$ falls below some threshold $\tau$, and will rent the banquet facility if $p$ is above $\tau$ (here $\tau$ is the indifference probability, at which the two decisions have equal expected utility).
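The mayor's threshold policy can be sketched as an expected-utility maximization over the two decisions. The utility numbers below are invented for illustration (the paper gives none); with them, the park is worth 10 when dry and 0 when it rains, the banquet hall is worth 6 either way, and the indifference probability of rain works out to 0.4.

```python
# DM utilities per decision, indexed by outcome: 0 = rain, 1 = dry (hypothetical values).
U = {"park": [0.0, 10.0], "banquet": [6.0, 6.0]}

def expected_utility(u_d, p):
    """Expected utility of one decision under forecast p."""
    return sum(u * q for u, q in zip(u_d, p))

def policy(p, utilities=U):
    """Return an expected-utility-maximizing decision; ties go to the first key listed."""
    return max(utilities, key=lambda d: expected_utility(utilities[d], p))

def dm_value(p, utilities=U):
    """DM's piecewise linear and convex value function: the best expected utility at p."""
    return max(expected_utility(u, p) for u in utilities.values())

print(policy([0.2, 0.8]), dm_value([0.2, 0.8]))   # low chance of rain
print(policy([0.7, 0.3]), dm_value([0.7, 0.3]))   # high chance of rain
```

Because each decision's expected utility is linear in the forecast, the maximum over decisions is piecewise linear and convex, and the policy changes only where two of these lines cross.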

We note that this model of DM utility is slightly more restricted than that of [13, 3], who allow the utility of each decision to depend on a different random variable, and assume that the realization of a variable will be observed only if the corresponding decision is taken. This introduces difficulties in offering suitable incentives for participation that do not arise in our setting; indeed, the primary contribution of Othman and Sandholm [13], and the extension by Chen and Kash [3], is a characterization of a form of proper scoring in the face of these complications. We also confine our attention primarily to a principal that maximizes its expected utility given $E$’s report (in the terminology of [13], DM uses the max decision rule), though we remark on the possible use of stochastic policies by DM in Sec. 4.2.

Now suppose a single expert $E$ is asked to provide a forecast of $X$ that permits DM to make a decision. Assume that $E$ knows DM’s policy $\pi$: knowledge of DM’s utility function $U$ is sufficient for knowledge of the policy but is not required (we discuss the possibility of $E$ being uncertain about $\pi$ in Sec. 4.1). Further, assume that $E$ has its own utility function or bias $B$, where $b_{ij}$ is $E$’s utility should DM take decision $d_j$ and $x_i$ is the realization of $X$. Define $b_j = \langle b_{1j}, \ldots, b_{mj} \rangle$; and let $E$’s expected utility for $d_j$ given $p$ be $B_j(p) = b_j \cdot p$. In our small example, the weather forecaster may be related to the owner of the banquet facility, and will get some degree of satisfaction (or a small kickback) if the mayor’s ceremony is held there.

As with DM, $E$’s optimal utility function (if DM were acting on $E$’s behalf) is PWLC:

$$B^*(p) = \max_{j \leq n} b_j \cdot p. \tag{3}$$

Denote by $D^*(p)$ the decision that maximizes Eq. 3, i.e., $E$’s preferred decision given beliefs $p$ (see Fig. 2 for an illustration).

Of course, DM is pursuing its own policy $\pi$, not acting to optimize $E$’s utility. Hence $E$’s actual utility for a specific report $r$ under beliefs $p$ is given by

$$B^\pi(r, p) = b_{\pi(r)} \cdot p; \tag{4}$$

that is, if she reports $r$, DM will take decision $d_k = \pi(r)$ for some $k$, and she will derive benefit $B_k(p)$. We refer to $B^\pi(r, p)$ as $E$’s inherent utility for reporting $r$. Similarly, $B^\pi(r, x_i) = b_{i,\pi(r)}$ is $E$’s inherent utility for report $r$ under realization $x_i$. This is simply the inherent benefit she derives from the decision she induces DM to take. This is illustrated in Fig. 2. Notice that $E$’s utility for reports, given any fixed beliefs $p$, is not generally continuous, with potential (jump) discontinuities at DM’s decision boundaries.

Without some scoring rule, there is a clear incentive for $E$ to misreport its true beliefs to induce DM to take a decision that $E$ prefers, thereby causing DM to take a suboptimal decision. For instance, in Fig. 2, if $E$’s true beliefs $p$ lie in a region where its preferred decision $D^*(p)$ differs from $\pi(p)$, truthful reporting will induce DM to take decision $\pi(p)$, whereas $E$ has greater inherent utility for reporting (any) $r$ with $\pi(r) = D^*(p)$. Indeed, its gain from the misreport is $B^*(p) - B^\pi(p, p)$. Equivalently, $E$ stands to lose $B^*(p) - B^\pi(p, p)$ by reporting truthfully. Intuitively, a proper scoring rule would remove this incentive to misreport.
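The size of the misreporting incentive can be computed directly as the expert's best achievable inherent utility minus the inherent utility of a truthful report. The sketch below reuses the weather example with invented numbers: the forecaster is assumed to receive a kickback worth 2 whenever the banquet facility is chosen, and every decision is assumed reachable by some report.

```python
# Hypothetical utilities, indexed by outcome: 0 = rain, 1 = dry.
U = {"park": [0.0, 10.0], "banquet": [6.0, 6.0]}   # decision maker
B = {"park": [0.0, 0.0], "banquet": [2.0, 2.0]}    # expert bias (kickback for banquet)

def dot(v, p):
    return sum(a * q for a, q in zip(v, p))

def policy(r):
    """DM's decision for reported forecast r (expected-utility maximizing)."""
    return max(U, key=lambda d: dot(U[d], r))

def inherent_utility(r, p):
    """Expert's expected utility, under beliefs p, for the decision induced by report r."""
    return dot(B[policy(r)], p)

def misreport_gain(p):
    """Best inherent utility over all decisions minus that of a truthful report."""
    return max(dot(B[d], p) for d in B) - inherent_utility(p, p)

print(misreport_gain([0.2, 0.8]))   # truthful report leads to the park: gain from lying
print(misreport_gain([0.7, 0.3]))   # truthful report already yields the banquet hall
```

With a low true probability of rain, the forecaster gains by overstating it enough to cross the mayor's threshold; once the truthful forecast already induces the preferred decision, no gain remains.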

### 3.2 Compensation Rules

If DM knows $E$’s utility function, it could reason about $E$’s incentive to misreport and revise its decision policy accordingly. Of course, this would naturally lead to a Bayesian game requiring analysis of its Bayes-Nash equilibria, and generally leaving DM with uncertainty about $E$’s true beliefs. (See Dimitrov and Sami [7] and Conitzer [6] for just such a game-theoretic treatment of prediction markets, without decisions.) Instead, we wish to derive a scoring rule that DM can use to incentivize $E$ to report truthfully.

A compensation function $C$ is a mapping from reports and outcomes into payoffs, exactly like a standard scoring rule. Unlike a scoring rule, however, $C$ does not fully determine $E$’s utility for a report; one must also take into account the inherent utility $E$ derives from the decision it prompts the DM to take. Any compensation function induces a net scoring function:

$$S(p, x_i) = C(p, x_i) + B^\pi(p, x_i) \tag{5}$$

$E$’s expected net score for report $r$ under beliefs $p$ is $S(r, p) = C(r, p) + B^\pi(r, p)$, where $C(r, p) = \sum_i C(r, x_i)\, p_i$ is $E$’s expected compensation.

One natural way to structure the compensation function is to use $C$ to compensate $E$ for the loss in inherent utility incurred by reporting its true beliefs (relative to its best report). This would remove any incentive for $E$ to misreport. We define a particular compensation function $C_1$ that accounts for this loss:

$$C_1(p, x_i) = b_{i,D^*(p)} - b_{i,\pi(p)} \tag{6}$$

$C_1$ is simply the difference between $E$’s realized utility for its optimal decision (relative to its report $p$) and its realized utility for the actual decision she induced.

$C_1$ does not satisfy the usual properties of scoring rules: it is given by a subgradient of the loss function, which is not convex, nor even continuous. However, $E$’s payoff for a report consists of both this compensation and its inherent utility, i.e., her net score:

$$\begin{aligned} S_1(p, x_i) &= C_1(p, x_i) + B^\pi(p, x_i) && (7) \\ &= b_{i,D^*(p)} - b_{i,\pi(p)} + b_{i,\pi(p)} && (8) \\ &= b_{i,D^*(p)} && (9) \end{aligned}$$

$E$’s expected net score for a truthful report under beliefs $p$ is identical to her expected utility $B^*(p)$ for the optimal decision $D^*(p)$. Hence, no other report can induce a decision that gives her greater utility. Informally, this shows that truthful reporting is optimal. It can be seen directly by observing that the net score can be derived from Eq. 2 by letting $G$ be $E$’s optimal utility function $B^*$ (which is PWLC, hence convex), and using the subgradient given by the hyperplane corresponding to the optimal decision at that point. (At interior points of $E$’s decision regions, the hyperplane is the unique subgradient. At $E$’s decision boundaries, an arbitrary subgradient can be used.)

###### Definition 2

A compensation function $C$ is proper iff the expected net score function $S$ satisfies $S(p, p) \geq S(r, p)$ for all $p, r \in \Delta(X)$. $C$ is strictly proper if the inequality is strict for $r \neq p$.

We don’t prove this formally since we prove a more general result below, but the above informal argument shows:

###### Proposition 3

Compensation function $C_1$ is proper.
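A numerical sanity check of this proposition is sketched below. It assumes the compensation pays, for each outcome, the expert's utility under its own preferred decision minus its utility under the decision DM actually takes, so that the net score reduces to the utility of the preferred decision. The utility tables are invented: DM switches to the banquet hall when the probability of rain exceeds 0.4, while the expert prefers it once that probability exceeds 1/3.

```python
# Hypothetical utilities, indexed by outcome: 0 = rain, 1 = dry.
U = {"park": [0.0, 10.0], "banquet": [6.0, 6.0]}   # decision maker
B = {"park": [0.0, 3.0], "banquet": [2.0, 2.0]}    # expert bias

def dot(v, p):
    return sum(a * q for a, q in zip(v, p))

def policy(r):
    """DM's decision for report r."""
    return max(U, key=lambda d: dot(U[d], r))

def preferred(r):
    """Expert's own preferred decision if beliefs were r."""
    return max(B, key=lambda d: dot(B[d], r))

def c1(r, i):
    """Compensation: utility of the preferred decision minus utility of DM's decision."""
    return B[preferred(r)][i] - B[policy(r)][i]

def expected_net(r, p):
    """Expected compensation plus inherent utility for report r under beliefs p."""
    return sum(q * (c1(r, i) + B[policy(r)][i]) for i, q in enumerate(p))

grid = [[k / 20, 1 - k / 20] for k in range(21)]
ok = all(expected_net(p, p) >= expected_net(r, p) - 1e-12 for p in grid for r in grid)
print(ok, c1([0.35, 0.65], 0), c1([0.35, 0.65], 1))
```

At a forecast of 0.35 the two parties disagree, so compensation is nonzero (positive if it rains, negative if it stays dry); wherever they agree, compensation is zero. The grid check confirms that no report beats the truthful one, though many tie with it.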

###### Remark 4

We’ve defined the compensation function $C_1$ using the space of all decisions $D$. However, this may cause DM to compensate $E$ for decisions it will never take. If we restrict attention to those decisions in the range of DM’s policy $\pi$, then the above characterization still applies (and will typically reduce total compensation). In what follows, we assume the set of decisions has been pruned to include only those $d_j$ for which $\pi(p) = d_j$ for some $p$, and that $E$’s utility function $B^*$ is defined relative to that set.

Compensation function $C_1$, while proper, is not strictly proper. The induced net scoring function is characterized by a non-strictly convex cost function $G$, since $G = B^*$. In particular, for any region of belief space where a single decision is optimal for $E$, every report in that region has the same expected net score, hence there is no “positive” incentive for truthtelling.

While $C_1$ gives us one mechanism for proper scoring with self-interested experts, we can generalize the approach to provide a complete characterization of all proper (and strictly proper) compensation functions. We derived $C_1$ by compensating $E$ for its loss due to truthful reporting. This approach is more “generous” than necessary. Rather than compensating $E$ for its loss, we need only remove the potential gain from misreporting. The key component of $C_1$ is not the “compensation term” $b_{i,D^*(p)}$, but rather the penalty term $-b_{i,\pi(p)}$. It is this penalty that prevents $E$ from benefiting by changing DM’s decision. Any such gain is subtracted from its compensation by the inclusion of $-b_{i,\pi(p)}$. We insist only that the positive compensation term is convex: it need bear no connection to $E$’s actual utility function to incentivize truthfulness. (Incentive to participate is discussed below.) Indeed, we can fully characterize the space of proper and strictly proper compensation functions:

###### Theorem 5

A compensation rule $C$ is proper for $E$ iff

$$C(p, x_i) = G(p) - G^*(p) \cdot p + G^*_i(p) - b_{i,\pi(p)} \tag{10}$$

for some convex function $G$, and subgradient $G^*$ of $G$. $C$ is strictly proper iff $G$ is strictly convex.

Suppose $C$ is given by Eq. 10. $E$’s utility for a report $p$ given outcome $x_i$ is given by its net score:

$$\begin{aligned} S(p, x_i) &= C(p, x_i) + B^\pi(p, x_i) \\ &= G(p) - G^*(p) \cdot p + G^*_i(p) \end{aligned}$$

Since $S$ satisfies the conditions of Thm. 1, the standard proof of propriety of $S$ can be used. Similarly, if $G$ is strictly convex, $C$ is strictly proper.

Conversely, suppose $C$ is proper (so that the induced net score satisfies $S(p, p) \geq S(r, p)$ for all $p, r$). Define $G(p) = S(p, p)$. If $C$ is proper in this sense, it is easy to show that $S(r, p)$ is a convex function of $p$ (for any fixed report $r$); and since $G(p) = S(p, p) = \max_r S(r, p)$ is the maximum of a set of convex functions (where the last equality holds because $C$ is proper), $G$ is itself convex. Thm. 1 (or more precisely the method used to prove it) ensures that, for some subgradient $G^*$ of $G$, we have $S(p, x_i) = G(p) - G^*(p) \cdot p + G^*_i(p)$. Hence,

$$\begin{aligned} C(p, x_i) &= S(p, x_i) - B^\pi(p, x_i) \\ &= G(p) - G^*(p) \cdot p + G^*_i(p) - b_{i,\pi(p)} \end{aligned}$$

so $C$ has the required form. If $C$ is strictly proper, then $G$ must be strictly convex and Thm. 1 can again be applied. An illustration of a cost function that gives rise to a proper compensation function is shown in Fig. 3(a).
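The general form in Eq. 10 can be exercised with any convex cost function. The sketch below assumes the strictly convex choice $G(p) = \sum_i p_i^2$ and invented utility tables for the weather example; it shows that the expert strictly prefers the truthful report, while the expected compensation itself can be negative.

```python
# Hypothetical utilities, indexed by outcome: 0 = rain, 1 = dry.
U = {"park": [0.0, 10.0], "banquet": [6.0, 6.0]}   # decision maker
B = {"park": [0.0, 3.0], "banquet": [2.0, 2.0]}    # expert bias

def dot(v, p):
    return sum(a * q for a, q in zip(v, p))

def policy(r):
    """DM's decision for report r."""
    return max(U, key=lambda d: dot(U[d], r))

def G(p):
    """Assumed strictly convex cost function."""
    return sum(x * x for x in p)

def G_grad(p):
    return [2 * x for x in p]

def compensation(r, i):
    """Eq. 10: G(r) - G*(r).r + G*_i(r) - b_{i, pi(r)}."""
    g = G_grad(r)
    return G(r) - dot(g, r) + g[i] - B[policy(r)][i]

def expected_compensation(r, p):
    return sum(q * compensation(r, i) for i, q in enumerate(p))

def expected_net(r, p):
    """Expected compensation plus inherent utility of the induced decision."""
    return expected_compensation(r, p) + dot(B[policy(r)], p)

p, r = [0.35, 0.65], [0.7, 0.3]
print(expected_net(p, p), expected_net(r, p))   # truthful report scores strictly higher
print(expected_compensation(p, p))              # negative: the expert pays in expectation
```

The penalty term cancels the expert's inherent utility exactly, so the net score behaves like an ordinary quadratic score. The negative expected compensation in this example is the participation problem that the weak and strong participation conditions address.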

The characterization of Thm. 5 ensures truthful reporting, but may not provide incentives for participation. Indeed, the expert may be forced to pay the DM in expectation for certain beliefs. Specifically, if $G(p) < B^\pi(p, p)$, $E$’s expected compensation $C(p, p) = G(p) - B^\pi(p, p)$ is negative. Unless the DM can “force” $E$ to participate, this will cause $E$ to avoid providing a forecast if its beliefs are $p$ (e.g., see the marked point in Fig. 3(a)). In general, we’d like to provide $E$ with non-negative expected compensation. We can do this by insisting that the compensation rule weakly incentivizes participation:

###### Definition 6

A compensation function $C$ satisfies weak participation iff for any beliefs $p$, $E$’s expected compensation $C(p, p)$ for truthful reporting is non-negative.

(See Fig. 3(b) for an illustration of a cost function that induces a compensation rule satisfying weak participation.)

###### Theorem 7

A proper compensation rule $C$ satisfies weak participation iff it meets the conditions of Thm. 5 and $G(p) \geq B^\pi(p, p)$ for all $p$.

The proof is straightforward: if $G(p) \geq B^\pi(p, p)$ for all $p$, then for any truthful report $p$, $E$’s expected compensation is $G(p) - B^\pi(p, p) \geq 0$. Conversely, if $G(p) < B^\pi(p, p)$ for some $p$, then if $E$ holds beliefs $p$, a truthful report has negative expected compensation $G(p) - B^\pi(p, p) < 0$.

While weak participation seems desirable, even it is not strong enough to ensure an expert’s participation in the mechanism in general. Suppose we define a compensation function $C$ using some convex cost function $G$. If $E$ participates, she will maximize her net payoff by reporting her true beliefs, say, $p$. But suppose that $G(p) < B^*(p)$. While $E$ may not be certain how DM will act without its input (e.g., she may not know DM’s “default beliefs” precisely), she may nevertheless have beliefs about DM’s default policy. And, if $E$ believes DM will take decision $D^*(p)$ if she provides no forecast, then she will be better off not participating and taking the expected payoff $B^*(p)$ derived solely from her inherent utility, and forgo participation in the mechanism (which limits her expected payoff to $G(p)$). (See the marked point in Fig. 3(b).) To prevent this we can require that $C$ strongly incentivize participation, by insisting that no matter what $E$ believes about DM’s default policy (i.e., its action given no reporting), it will not sacrifice expected utility by participating in the mechanism.

###### Definition 8

A compensation function $C$ satisfies strong participation iff, for any decision $d_j$, for any beliefs $p$, $E$’s net score $S(p, p)$ for truthful reporting is no less than $B_j(p)$.

Strong participation means that $E$ has no incentive to abstain from participation (and need not “take its chances” that DM will make a decision it likes). This definition is equivalent to requiring that $E$’s expected utility for truthful reporting, as a function of $p$, is at least as great as her optimal utility function, i.e., $S(p, p) \geq B^*(p)$ for all $p$. Fig. 3(c) illustrates such a compensation rule.

###### Theorem 9

Proper compensation rule $C$ satisfies strong participation iff it meets the conditions of Thm. 5 and $G(p) \geq B^*(p)$ for all $p$.

The proof is straightforward. Suppose $G(p) \geq B^*(p)$ for all $p$. If $E$ holds beliefs $p$, then a truthful report has expected net score of $G(p) \geq B^*(p)$, and for no beliefs about DM’s default policy can $E$ derive higher utility by not participating. Conversely, suppose $G(p) < B^*(p)$ for some $p$. If $E$ holds beliefs $p$ and also believes that DM will take action $D^*(p)$ if $E$ does not report, then $E$ will derive utility $B^*(p)$ by not participating, better than the optimal expected score $G(p)$ from participating.
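Both participation conditions compare the cost function against a utility curve of the expert: against the inherent utility of a truthful report for weak participation, and against the expert's optimal utility for strong participation. The sketch below tests these inequalities on a grid for three candidate cost functions, using invented utility tables; a grid check can refute a condition but does not prove it.

```python
# Hypothetical utilities, indexed by outcome: 0 = rain, 1 = dry.
U = {"park": [0.0, 10.0], "banquet": [6.0, 6.0]}   # decision maker
B = {"park": [0.0, 3.0], "banquet": [2.0, 2.0]}    # expert bias

def dot(v, p):
    return sum(a * q for a, q in zip(v, p))

def policy(p):
    return max(U, key=lambda d: dot(U[d], p))

def inherent(p):
    """Expert's inherent utility for a truthful report p."""
    return dot(B[policy(p)], p)

def b_star(p):
    """Expert's optimal utility: best expected utility over all decisions."""
    return max(dot(b, p) for b in B.values())

def quadratic(p):
    return sum(x * x for x in p)

GRID = [[k / 100, 1 - k / 100] for k in range(101)]

def weak_participation(G, grid=GRID, tol=1e-12):
    """Check G(p) >= inherent utility of truthful reporting at every grid point."""
    return all(G(p) >= inherent(p) - tol for p in grid)

def strong_participation(G, grid=GRID, tol=1e-12):
    """Check G(p) >= expert's optimal utility at every grid point."""
    return all(G(p) >= b_star(p) - tol for p in grid)

shifted = lambda p: b_star(p) + quadratic(p)   # convex plus strictly convex
print(weak_participation(quadratic), strong_participation(b_star), strong_participation(shifted))
```

The quadratic cost function alone fails even weak participation here, the expert's own optimal utility passes the strong check but is not strictly convex, and adding the two gives a strictly convex cost function that passes the strong check at the price of extra compensation.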

###### Observation 10

Compensation rule $C_1$ is the unique minimal (non-strictly) proper rule satisfying strong participation. That is, no compensation rule offers lower compensation for any report without violating strong participation.

In general, if we insist on strong participation, DM must provide potential compensation up to the level of $E$’s maximum utility gap:

$$g(B) = \max_{i \leq m;\; j,k \leq n} \; b_{ik} - b_{ij}.$$

However, this degree of compensation is needed only if DM and $E$ have “directly conflicting” interests (i.e., DM takes a decision whose realized utility is as far from optimal as possible from $E$’s perspective). In such cases, one would expect $E$’s utility to be significantly less than DM’s. If not, this compensation would not be worthwhile for DM. Conversely, if $E$’s interests are well aligned with those of DM, the total compensation required will be small. The most extreme case of well-aligned utility is one where functions $\pi$ and $D^*$ coincide, i.e., $\pi(p) = D^*(p)$ for all beliefs $p$, in which case, no compensation is required. Specifically, compensation function $C_1(p, x_i) = 0$ for all $p, x_i$; and while $C_1$ is not strictly proper, the only misreports that $E$ will contemplate (i.e., that do not reduce its net score) are those that cannot change DM’s decision (i.e., cannot impact DM’s utility). As a consequence, DM should elicit forecasts from an expert who either (a) has well-aligned interests in the decisions being contemplated; (b) has interest whose magnitude is small (hence requires modest compensation) relative to DM’s own utility; or (c) can be “forced” to make a prediction (possibly at negative net cost). (For instance, managers may require forecasts from expert employees under conditions of negative expected cost.)
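The maximum utility gap is simply the largest spread, within any single outcome, between the expert's best and worst decision. A minimal sketch follows, using an invented bias table for the weather example in which each decision maps to a list of utilities indexed by outcome.

```python
# Hypothetical expert bias, indexed by outcome: 0 = rain, 1 = dry.
B = {"park": [0.0, 3.0], "banquet": [2.0, 2.0]}

def utility_gap(bias):
    """Largest difference b_ik - b_ij over outcomes i and decision pairs (j, k)."""
    n_outcomes = len(next(iter(bias.values())))
    return max(
        max(b[i] for b in bias.values()) - min(b[i] for b in bias.values())
        for i in range(n_outcomes)
    )

print(utility_gap(B))
```

In this table the gap is driven by the rain outcome, where the banquet hall is worth 2 more to the expert than the park. This is the most DM could have to pay for any single realized outcome under the minimal rule satisfying strong participation.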

## 4 Policy and Utility Uncertainty

We now relax two key assumptions underlying our compensation rule from Section 3.1: that knows DM’s policy, and that DM knows ’s utility function.

### 4.1 Policy Uncertainty

We first consider the case where DM does not want to disclose its policy to . For example, suppose DM wanted to forego a truthful compensation rule and simply rely on a proper scoring rule of the usual form that ignores the ’s inherent utility. Thm. 5 shows that DM cannot prevent misreporting in general if it ignores ’s inherent utility; hence it can suffer a loss in its own utility. However, by refusing to disclose its policy , DM could reduce the incentive for to misreport. Without accurate knowledge of , would be forced to rely on uncertain beliefs about to determine the utility of a misreport, generally lowering its incentive. However, this will not remove the misreporting incentive completely. For instance, referring to Fig. 2, suppose DM does not disclose . If believes with sufficient probability that the decision boundary between and is located at the point indicated, it will misreport any forecast in region sufficiently close to that boundary should DM use a scoring rule rather than a compensation rule. As such, refusing to disclose its policy can be used to reduce, but not eliminate, the incentive to misreport if DM does not want to use a proper compensation rule.888A similar argument shows that a stochastic policy can be used to reduce misreporting incentive, e.g., the soft max policy that sees DM take decision with probability proportional to . .

Our analysis in the previous section assumed that the expert used its knowledge of $\pi$ to determine the report that maximizes her net score. However, DM does not need to disclose $\pi$ to make good use of a compensation rule. It can specify a compensation rule implicitly by announcing its net scoring function $S$ (or the cost function $G$ and subgradient $G^*$) and promising to deduct the expert’s inherent utility from this score for whatever decision it ultimately takes. The expert need not know in advance what decision will be taken to be incentivized to offer a truthful forecast. Nor does the expert ever need to know what decisions would have been taken had it reported differently. Thus the only information the expert needs to learn about $\pi$ is the value of $\pi$ at its reported forecast; and even this need not be revealed until after the decision is taken (and its outcome realized). (Footnote 9: Some mechanism to verify the decision post hoc may be needed in some circumstances, but this is no different than requiring verification of the realized outcome in standard models of scoring rules.)

### 4.2 Uncertainty in Expert Utility

We now consider the more interesting issues that arise when DM is uncertain about the parameters of the expert’s utility function. If DM has a distribution over these parameters, one obvious technique is to specify a proper compensation rule using their expectation. This may work reasonably well in practice, depending on the nature of the distribution; but it follows immediately from Thm. 5 that this approach will not induce truthful reporting in general.

Rather than analyzing probabilistic beliefs, we instead suppose that DM has constraints on the expert’s utility that define a bounded feasible region in which the expert’s utility parameters must lie. We will confine our analysis to a simple but natural class of constraints, specifically, upper and lower bounds on each utility parameter. This induces a hyper-rectangular feasible region. If the feasible region is more general (e.g., a polytope defined by more general linear constraints), our analysis below can be applied to the tightest “bounding box” of the feasible region. (Footnote 10: General linear constraints on the expert’s parameters could be inferred, for example, from observed behavior.) Again by Thm. 5 it is clear that DM cannot define a proper compensation rule in general: without certain knowledge of the expert’s utility, any proposed “deduction” of inherent utility from the expert’s compensation could be mistaken, leading to an incentive to misreport. However, this incentive can be bounded.

Under conditions of utility uncertainty, it is natural for DM to restrict its attention to “consistent” compensation rules:

###### Definition 11

Consider the set of feasible expert utility functions. A compensation rule $C$ is consistent with this set iff it has the form, for some (strictly) convex $G$:

$$C(p, x_i) = G(p) - G^*(p)\cdot p + G^*_i(p) - \tilde{b}_{i,\pi(p)} \qquad (11)$$

for some feasible utility function $\tilde{b}$.
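To make Eq. (11) concrete, the following is a minimal sketch of a compensation rule and the net score it induces. It assumes a quadratic cost function $G(p) = p \cdot p$ with gradient $2p$ as the subgradient, a DM policy that maximizes expected utility, and hypothetical array layouts (`U[d]` is DM's utility vector for decision `d`; `b[i, d]` is the expert's utility for outcome `i` under decision `d`). None of these choices come from the paper; they are illustrative only.

```python
import numpy as np

def G(p):
    """Illustrative strictly convex cost function (quadratic); an assumption."""
    return float(np.dot(p, p))

def G_star(p):
    """Gradient of the quadratic G, used here as the subgradient G*(p)."""
    return 2.0 * np.asarray(p, dtype=float)

def policy(p, U):
    """DM's policy pi(p): index of the decision maximizing U[d] . p."""
    return int(np.argmax(U @ p))

def compensation(r, i, b_est, U):
    """C(r, x_i) as in Eq. (11): G(r) - G*(r).r + G*_i(r) - b_est[i, pi(r)]."""
    r = np.asarray(r, dtype=float)
    g = G_star(r)
    return G(r) - g @ r + g[i] - b_est[i, policy(r, U)]

def net_score(r, p, b_true, b_est, U):
    """Expected compensation plus inherent utility for report r under beliefs p."""
    d = policy(r, U)
    return sum(p[i] * (compensation(r, i, b_est, U) + b_true[i, d])
               for i in range(len(p)))
```

When the estimate equals the true utility (`b_est == b_true`), the inherent-utility terms cancel and the net score reduces to $G(r) + G^*(r)\cdot(p - r)$, which is maximized at the truthful report.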

Notice that consistent compensation rules are naturally linear: intuitively, we select a single consistent estimate $\tilde{b}$ of each parameter, treat the expert as if this were her true (linear) utility function, and define $C$ using this estimate. Let’s say DM is $\delta$-certain of the expert’s utility iff the upper and lower bounds on each parameter differ by at most $\delta$. Then we can bound the incentive for the expert to misreport as follows:

###### Theorem 12

If DM is $\delta$-certain of the expert’s utility, then the expert’s incentive to misreport under any consistent compensation rule is bounded by $2\delta$. That is, $S(r,p) - S(p,p) \le 2\delta$.

Let $p$ be the expert’s actual beliefs and $r$ some report.

$$\begin{aligned}
S(r,p) &= [G(r) - \tilde{b}_{\pi(r)} + b_{\pi(r)}]\cdot p \\
&\le G(r)\cdot p + \delta \\
&\le G(p)\cdot p + \delta \\
&\le [G(p) - \tilde{b}_{\pi(p)} + b_{\pi(p)} + \delta]\cdot p + \delta \\
&\le S(p,p) + 2\delta
\end{aligned}$$

Notice that the proof assumes that: (a) the estimated utility for the decision induced by the expert’s report underestimates her true utility by $\delta$; and (b) the estimated utility for the optimal decision overestimates the expert’s true utility by $\delta$. We can limit the misreporting incentive further by using a uniform compensation rule.

###### Definition 13

A consistent compensation rule is uniform if each parameter is estimated by the same fixed convex combination of its lower and upper bounds.

For example, if DM uses the lower bound (or midpoint, or upper bound, etc.) of each parameter interval uniformly, we call its compensation rule uniform.

###### Corollary 14

If DM is $\delta$-certain of the expert’s utility, then the expert’s incentive to misreport under any uniform compensation rule is bounded by $\delta$. That is, $S(r,p) - S(p,p) \le \delta$.
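The two bounds can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it uses two outcomes, the quadratic cost function $G(p) = p\cdot p$, interval widths exactly equal to `delta` (one reading of $\delta$-certainty), and midpoints as the uniform estimate. All names are hypothetical.

```python
import numpy as np

def net_score(r, p, b_true, b_est, U):
    """Net score under G(p) = p.p: Eq. (11) compensation plus inherent utility."""
    d = int(np.argmax(U @ r))                  # decision induced by report r
    base = r @ r + 2.0 * r @ (p - r)           # G(r) + G*(r).(p - r)
    return base + p @ (b_true[:, d] - b_est[:, d])

def max_incentive(p, b_true, b_est, U, grid=201):
    """Largest gain S(r,p) - S(p,p) over a grid of two-outcome reports."""
    truth = net_score(p, p, b_true, b_est, U)
    reports = (np.array([t, 1.0 - t]) for t in np.linspace(0.0, 1.0, grid))
    return max(net_score(r, p, b_true, b_est, U) - truth for r in reports)

rng = np.random.default_rng(0)
delta = 0.1
U = np.array([[1.0, 0.0], [0.0, 1.0]])         # DM utility, rows = decisions
lower = rng.uniform(0.0, 1.0, size=(2, 2))     # lower bounds on expert parameters
upper = lower + delta                          # interval width delta (assumed)
b_true = rng.uniform(lower, upper)             # true utility somewhere in the box
p = np.array([0.48, 0.52])                     # expert's true beliefs

b_any = rng.uniform(lower, upper)              # arbitrary consistent estimate
b_uni = lower + 0.5 * (upper - lower)          # uniform estimate (midpoints)
gain_any = max_incentive(p, b_true, b_any, U)  # should not exceed 2 * delta
gain_uni = max_incentive(p, b_true, b_uni, U)  # should not exceed delta
```

The uniform estimate helps because its errors on different decisions all lie in an interval of width `delta`, so the difference between any two of them cannot exceed `delta`.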

While bounding the incentive to misreport is somewhat useful, it is more important to understand the impact such misreporting can have on DM. Fortunately, this too can be bounded. The (strict) convexity of $G$ means that the greatest incentive to misreport occurs at the decision boundaries of DM’s policy in Thm. 12. Since, by definition, DM is indifferent between the adjacent decisions at any decision boundary, misreports in a bounded region around decision boundaries have limited impact on DM’s utility, as we now show. Specifically, we show that the amount by which the expert will misreport is bounded using the “degree of convexity” of the cost function $G$, which in turn bounds how much loss in utility DM will realize.

###### Definition 15

Let $G$ be a convex cost function with subgradient $G^*$. We say $G$ is robust relative to $G^*$ with factor $m$ iff, for all $p, q$:

$$G(q) \ge G(p) + G^*(p)\cdot(q-p) + m\,\|q-p\|_2 \qquad (12)$$

(Footnote 11: The definition of $m$-robustness can be recast using any reasonable metric, e.g., another norm or KL-divergence; but the $L_2$-norm is most convenient below when we relate robustness to strong convexity.)

It is not hard to see that $m$-robustness of the pair $(G, G^*)$ imposes a minimum “penalty” on any expert misreport, as a function of its distance from her true beliefs:

###### Observation 16

Let $C$ be a proper compensation rule based on an $m$-robust cost function $G$ and subgradient $G^*$. Let $S$ be the induced net scoring function. Then

$$S(p,p) - S(q,p) \ge m\,\|q-p\|_2.$$

Together with Thm. 12, this gives a bound on the degree to which an expert will misreport when an uncertain DM uses a consistent compensation rule.

###### Corollary 17

Let DM be $\delta$-certain of the expert’s utility and use a consistent compensation rule based on an $m$-robust cost function and subgradient. Let $p$ be the expert’s true beliefs. Then the report $r$ that maximizes the expert’s net score satisfies $\|r-p\|_2 \le 2\delta/m$. If the compensation rule is uniform, then $\|r-p\|_2 \le \delta/m$.

In other words, the expert’s utility-maximizing report must be within a bounded distance of her true beliefs if DM uses an $m$-robust cost function to define its compensation rule.

The notion of $m$-robustness is a slight variant of the notion of strong convexity [1] in which we use the specific subgradient $G^*$ to measure the “degree of convexity.” In the specific case of a twice differentiable cost function $G$, we say $G$ is strongly convex with factor $m$ iff $\nabla^2 G(p) \succeq mI$ for all $p$; i.e., if the matrix $\nabla^2 G(p) - mI$ is positive semidefinite [1]. (Footnote 12: Alternative definitions exist for non-differentiable $G$, but we assume a twice differentiable $G$ when discussing strong convexity and use robustness relative to a specific subgradient for non-differentiable $G$.) $m$-convexity is a sufficient condition for the robustness we seek.

###### Corollary 18

Let DM be $\delta$-certain of the expert’s utility and use a consistent compensation rule based on an $m$-convex, twice differentiable cost function $G$. Let $p$ be the expert’s true beliefs. Then the report $r$ that maximizes the expert’s net score satisfies $\|r-p\|_2 \le 2\sqrt{\delta/m}$. If the compensation rule is uniform, then $\|r-p\|_2 \le \sqrt{2\delta/m}$.

$G$’s assumed differentiability ensures its gradient is the unique subgradient. Since $G$ is $m$-convex, we have

$$G(q) \ge G(p) + \nabla G(p)^T (q-p) + \frac{m}{2}\,\|q-p\|_2^2$$

for all $p, q$ (see [1]). Hence the expert’s loss in compensation for a report at distance $\|q-p\|_2$ from her true beliefs is at least $\frac{m}{2}\|q-p\|_2^2$. Since its gain in inherent utility by misreporting is bounded by $2\delta$ (Thm. 12), setting the former to be no greater than $2\delta$ yields the result. Since the gain in inherent utility under a uniform compensation rule is at most $\delta$, the stronger bound follows by substituting $\delta$ for $2\delta$ in the preceding argument.
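The argument above can be illustrated with a small sketch. It assumes the quadratic cost function $G(p) = p\cdot p$, whose Hessian is $2I$ and which is therefore strongly convex with factor $m = 2$; the helper names are hypothetical. The distance bound is obtained by solving $\frac{m}{2} d^2 \le \text{gain}$ for $d$, with the gain taken to be $2\delta$ (or $\delta$ for a uniform rule).

```python
import numpy as np

def G(p):
    """Illustrative cost function G(p) = p.p (strongly convex with m = 2)."""
    return float(p @ p)

def grad_G(p):
    """Gradient of G, its unique subgradient."""
    return 2.0 * p

def first_order_gap(q, p):
    """G(q) - G(p) - grad G(p).(q - p); at least (m/2)||q - p||^2 under m-convexity."""
    return G(q) - G(p) - grad_G(p) @ (q - p)

def max_misreport_distance(delta, m, uniform=False):
    """Largest L2 deviation d satisfying (m/2) d^2 <= gain."""
    gain = delta if uniform else 2.0 * delta   # bound on inherent-utility gain
    return float(np.sqrt(2.0 * gain / m))
```

For instance, `max_misreport_distance(0.1, 2.0)` evaluates $\sqrt{2 \cdot 0.2 / 2} = \sqrt{0.2}$, so a larger convexity factor `m` pulls the expert's best report closer to her true beliefs.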

Robustness (and strong convexity, if we use a differentiable cost function) allows us to globally bound the maximum degree to which the expert will misreport. This allows us to give a simple, global bound on the loss in DM utility that results from its uncertainty about the expert’s utility function. Recall that DM’s utility function $U_i$ for any decision $i$ is linear, hence has a constant gradient $\nabla U_i(p)$. (We abuse notation and simply write $\nabla U_i$ for $\nabla U_i(p)$.) The function $U_i - U_j$ is also linear, given by parameter vector $u_i - u_j$. Let $e_k$ denote the $n$-dimensional unit vector with a 1 in component $k$ and zeros elsewhere.

###### Theorem 19

Let DM be $\delta$-certain of the expert’s utility and use a consistent compensation rule based on an $m$-robust cost function and subgradient. Assume the expert reports to maximize her net score. Then DM’s loss in utility relative to a truthful report by the expert is at most $\max_k\big[e_k^T \max_{i,j} \nabla(U_i - U_j)\big]\sqrt{n}\,\frac{2\delta}{m}$. If the compensation rule is uniform, then the bound is tightened by a factor of two.

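By Cor. 17, the expert’s utility maximizing report $q$ has an $L_2$ distance at most $2\delta/m$ from her true beliefs $p$. By the Cauchy-Schwarz inequality we have $\|p-q\|_1 \le \sqrt{n}\,\|p-q\|_2$, hence bounding its max $L_1$-deviation at $\sqrt{n}\,\frac{2\delta}{m}$. Then DM’s loss for any (utility-maximizing) misreport is:

$$\begin{aligned}
u_{d(p)}\cdot p - u_{d(q)}\cdot p &= u_{d(p)}\cdot q + \nabla U_{d(p)}(p-q) - u_{d(q)}\cdot q - \nabla U_{d(q)}(p-q) \\
&\le \nabla U_{d(p)}(p-q) - \nabla U_{d(q)}(p-q) \\
&\le \nabla[U_{d(p)} - U_{d(q)}](p-q) \\
&\le \max_{i,j} \nabla(U_i - U_j)(p-q) \\
&\le \max_k\big[e_k^T \max_{i,j} \nabla(U_i - U_j)\big]\sqrt{n}\,\frac{2\delta}{m}.
\end{aligned}$$

Here the first inequality holds by virtue of $u_{d(p)}\cdot q \le u_{d(q)}\cdot q$ (since $d(q)$ is DM’s optimal decision at $q$).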
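As a rough illustration, the final expression of the derivation can be evaluated directly for a given set of DM utility vectors. The function below is a sketch under an assumed layout (rows of `U` are decisions, columns are outcomes); the function name and parameters are hypothetical.

```python
import numpy as np

def dm_loss_bound(U, delta, m, uniform=False):
    """Evaluate max_k [e_k^T max_{i,j} grad(U_i - U_j)] * sqrt(n) * 2*delta/m.

    With uniform=True the gain term is delta instead of 2*delta, which
    halves the bound.
    """
    U = np.asarray(U, dtype=float)             # rows = decisions, cols = outcomes
    n = U.shape[1]                             # number of outcomes
    diffs = U[:, None, :] - U[None, :, :]      # u_i - u_j for every ordered pair
    largest = diffs.max()                      # largest component over all pairs
    gain = delta if uniform else 2.0 * delta
    return float(largest * np.sqrt(n) * gain / m)
```

Because the maximum ranges over ordered pairs of decisions, the largest component of $u_i - u_j$ is also the largest in absolute value, which is what the step from the $L_1$ deviation to the final bound requires.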

The same proof can be adapted to strongly convex cost functions.

###### Corollary 20

Let DM be -certain of ’s utility and use a linear compensation rule based on an -convex, twice differentiable cost function . Assume reports to maximize her net score. Then DM’s loss in utility relative to a truthful report by is at most . If the compensation rule is uniform, then the bound is tightened by a factor of two.

The results above all rely on the global robustness or global strong convexity of the cost function . Designing a specific cost function (and if not differentiable, choosing its subgradients) can be challenging if we try to ensure uniform -robustness or -convexity across the entire probability space . But recall that can only impact DM’s utility if its misreport causes DM to change its decision. This means that the cost function need only induce strong penalties for misreporting near decision boundaries. Furthermore, the strength of these penalties should be related to the rate at which DM’s utility is negatively impacted. For example, suppose lies on the decision boundary between region and . If is large, then a misreport in the region around will cause a greater loss in utility than if is small. This suggests that the cost function should be more strongly convex (or more robust) near decision boundaries whose corresponding decisions differ significantly in utility, and can be less strong when the decisions are “similar.” See Fig. 4 for an illustration of this point.

Furthermore, the cost function need only be robust or strongly convex in a local region around these decision boundaries. In particular, suppose is -robust in some local region around the decision boundary between and . The degree of robustness bounds the maximum deviation from truth that will contemplate. If the region of -robustness includes these maximal deviations, that will be sufficient to bound DM’s utility loss for any true beliefs has in that region. Outside of these regions, no misreport by will cause DM to change its decisions (relative to a truthful report).

We can summarize this as follows:

###### Definition 21

$G$ is locally robust relative to $G^*$ in the $\varepsilon$-neighborhood around $p$ with factor $m$ iff, for all $q$ s.t. $\|q-p\|_2 \le \varepsilon$:

$$G(q) \ge G(p) + G^*(p)\cdot(q-p) + m\,\|q-p\|_2 \qquad (13)$$

Local strong convexity is defined similarly.

Now suppose DM wishes to bound its loss due to misreporting by the expert by some factor $\sigma$. This can be accomplished using a locally robust cost function:

###### Theorem 22

Let DM be $\delta$-certain of the expert’s utility and fix $\sigma > 0$. For any pair of decisions $i, j$ with non-empty decision boundary, define

$$m_{ij} = \max_k\big(e_k^T \nabla[U_i - U_j]\big)\sqrt{n}\,\frac{2\delta}{\sigma}; \qquad \varepsilon_{ij} = \frac{\sigma}{\max_k\big(e_k^T \nabla[U_i - U_j]\big)\sqrt{n}}.$$

Let $G$ be a convex cost function with subgradient $G^*$ such that, for all such pairs $i, j$ and any point $p$ on their decision boundary, (a) $G$ is locally robust with factor $m_{ij}$ in the $\varepsilon_{ij}$-neighborhood around $p$; (b) no other decision boundary lies within the $\varepsilon_{ij}$-neighborhood around $p$. Let DM use a consistent compensation rule based on $G$. Assume the expert reports to maximize her net score. Then DM’s loss in utility relative to a truthful report by the expert is at most $\sigma$. If the compensation rule is uniform, the result holds with both and decreased by a factor of two.
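The two quantities in the theorem are simple to compute for a given pair of decisions. The sketch below assumes DM's utilities for decisions $i$ and $j$ are given as vectors over outcomes; the function name and the guard against a non-positive slope are illustrative additions, not part of the paper.

```python
import numpy as np

def local_robustness_targets(u_i, u_j, delta, sigma):
    """Return (m_ij, eps_ij) for a target loss bound sigma.

    m_ij   = max_k(e_k^T grad[U_i - U_j]) * sqrt(n) * 2*delta / sigma
    eps_ij = sigma / (max_k(e_k^T grad[U_i - U_j]) * sqrt(n))
    """
    u_i = np.asarray(u_i, dtype=float)
    u_j = np.asarray(u_j, dtype=float)
    n = u_i.size
    slope = float(np.max(u_i - u_j) * np.sqrt(n))
    if slope <= 0.0:
        # Guard added for this sketch: the formulas need a positive slope.
        raise ValueError("u_i - u_j must have a positive component")
    return slope * 2.0 * delta / sigma, sigma / slope

m_ij, eps_ij = local_robustness_targets([1.0, 0.0], [0.0, 1.0],
                                        delta=0.1, sigma=0.05)
```

By construction the product `m_ij * eps_ij` equals `2 * delta`, the quantity that case (C) of the proof sketch compares against the expert's maximum gain in inherent utility. A smaller target `sigma` shrinks the neighborhood but demands a larger robustness factor inside it.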

(Sketch). The proof proceeds by cases involving the location of ’s true beliefs and the location of possible utility-maximizing misreports . W.l.o.g., assume that is in decision region . We consider four classes of misreports.

(A) Suppose . In this case, DM’s utility loss is zero since the decision is the same as if had reported truthfully.

(B) Now consider the case where is in the -neighborhood of some decision boundary . We show that any report satisfies the condition of the theorem. Let be an arbitrary point s.t.  (this must exist by the assumption that lies in the -neighborhood of ). DM’s utility loss for reporting is then bounded as follows:

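$$\begin{aligned}
u_i\cdot p - u_j\cdot p &= u_i\cdot q + \nabla U_i(p-q) - u_j\cdot q - \nabla U_j(p-q) \\
&\le \nabla[U_i - U_j](p-q) \\
&\le \max_k\big[e_k^T \nabla(U_i - U_j)\big]\,\|p-q\|_1 \\
&\le \max_k\big[e_k^T \nabla(U_i - U_j)\big]\sqrt{n}\,\|p-q\|_2 \\
&\le \max_k\big[e_k^T \nabla(U_i - U_j)\big]\sqrt{n}\,\varepsilon_{ij} \\
&= \sigma.
\end{aligned}$$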

If , then it must be such a (i.e., be within of ), since any report in has the same inherent utility, while those closest to maximize compensation. Hence utility loss for is no greater than . Note that may lie within the neighborhood of multiple decision boundaries adjacent to , but the argument holds for any report in any such region .

(C) Now consider the case where the boundary between decisions $i$ and $j$ exists, but $p$ does not lie within the $\varepsilon_{ij}$-neighborhood of that boundary. We show that the expert’s utility maximizing report cannot be in the decision region of $j$. By way of contradiction, consider a report $r$ in that region. Consider the closed line segment from $p$ to $r$; let $p'$ be the point where this segment intersects the decision boundary, and let $q$ be the point on the segment on the “$j$ side” of the boundary that is distance $\varepsilon_{ij}$ from the boundary. The expert’s loss in net score (ignoring any error due to inherent utility misestimate by DM) is given by:

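$$\begin{aligned}
S(p,p) - S(r,p) &= H_p\cdot p - H_r\cdot p \\
&= (H_p\cdot p - H_{p'}\cdot p) + (H_{p'}\cdot p - H_q\cdot p) + (H_q\cdot p - H_r\cdot p)
\end{aligned}$$

We have $H_p\cdot p - H_{p'}\cdot p \ge 0$ by the propriety of the compensation rule (ignoring error due to inherent utility misestimation). We also have

$$\begin{aligned}
H_{p'}\cdot p - H_q\cdot p &= H_{p'}\cdot(p - p' + p') - H_q\cdot(p - p' + p') \\
&= H_{p'}\cdot p' - H_q\cdot p' + H_{p'}\cdot(p - p') - H_q\cdot(p - p') \\
&\ge m_{ij}\varepsilon_{ij} + H_{p'}\cdot(p - p') - H_q\cdot(p - p') \\
&\ge m_{ij}\varepsilon_{ij} \\
&\ge 2\delta
\end{aligned}$$

where the first inequality holds due to the local robustness of $G$ and the second due to the convexity of $G$ and the collinearity of $p$, $p'$ and $q$. Finally, we must have $H_q\cdot p - H_r\cdot p \ge 0$, again due to the convexity of $G$ and the collinearity of $p$, $q$ and $r$. Thus the expert’s loss in compensation due to misreporting is at least $2\delta$ (and is strictly greater if $G$ is strictly convex). But by Thm. 12 its gain in inherent utility by misreporting can be no greater than $2\delta$. Hence its optimal report cannot lie in the decision region of $j$.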

(D) The preceding argument can be adapted in a straightforward way to the case where the two decisions are not adjacent (i.e., their decision boundary is empty). This result can be generalized to the case where the degree of robustness around one decision boundary is relaxed sufficiently so that the neighborhood within which the expert can profitably misreport crosses more than one decision boundary (i.e., when another decision boundary overlaps the $\varepsilon_{ij}$-neighborhood around the first). Utility loss will increase, but it can be bounded by considering the maximum gradient over decisions that can be swapped. The result can also be adapted to locally strongly convex cost functions in the obvious way.

###### Corollary 23

Let DM be -certain of ’s utility and fix . For any pair of decisions with non-empty decision boundary , define

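$$m_{ij} = \max_k\big(e_k^T \nabla[U_i - U_j]\big)\sqrt{n}\,\frac{2\delta}{\sigma}; \qquad \varepsilon_{ij} = \frac{\sigma}{\max_k\big(e_k^T \nabla[U_i - U_j]\big)\sqrt{n}}.$$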

Let be a convex cost function such that, for all and any , (a) is locally convex with factor in the -neighborhood around ; (b) no other decision boundary lies within the -neighborhood around . Let DM use a consistent compensation rule based on . Assume reports to maximize her net score. Then DM’s loss in utility relative to a truthful report by is at most .

These results quantify the “cost” to the decision maker of its imprecise knowledge of the expert’s utility function, i.e., its worst-case expected utility relative to what it could have achieved if it had full knowledge of the expert’s utility (i.e., with truthful reporting by the expert).

###### Remark 24

If we relax the constraint that DM choose the decision with maximum expected utility, we can exploit local robustness to induce truthful forecasts. Suppose DM uses the softmax decision policy (see footnote 8): this stochastic policy makes the expert’s utility continuous in its report $r$. An analysis similar to that above, using local robustness or strong convexity of the cost function, allows DM to induce truthtelling as long as the degree of convexity compensates for the gradient of the expert’s inherent utility at decision boundaries. Since adding randomness to the policy removes the discontinuities in that utility, this is now possible. Of course, this “incentive-compatibility” comes at a cost: DM is committed to taking suboptimal actions with some probability. We defer a full analysis of the tradeoffs, and the relative benefits of “acting optimally” but risking misleading reports vs. “acting suboptimally” relative to a truthful report, to a longer version of this paper.
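The continuity claim in the remark can be illustrated with a small sketch. The exact form of the softmax policy is not recoverable from the text here, so the version below assumes decision probabilities proportional to $\exp(u_d \cdot r / \tau)$ with a hypothetical temperature parameter $\tau$; the array layouts (`U[d]` for DM, `b[i, d]` for the expert) are also assumptions.

```python
import numpy as np

def softmax_policy(r, U, temperature=0.1):
    """Decision probabilities proportional to exp(U[d] . r / temperature)."""
    z = (U @ r) / temperature
    z = z - z.max()                      # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def expected_inherent_utility(r, p, b, U, temperature=0.1):
    """Expert's expected inherent utility under beliefs p when she reports r."""
    probs = softmax_policy(r, U, temperature)
    per_decision = b.T @ p               # b[:, d] . p for each decision d
    return float(probs @ per_decision)

def argmax_inherent_utility(r, p, b, U):
    """Same quantity under the deterministic expected-utility-maximizing policy."""
    d = int(np.argmax(U @ r))
    return float(b[:, d] @ p)
```

Under the deterministic policy, two reports on opposite sides of a decision boundary can yield inherent utilities that differ by a fixed jump no matter how close the reports are. Under the softmax policy the difference shrinks with the distance between the reports, at a rate governed by the temperature.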

The characterization of DM loss using local robustness or local strong convexity not only offers theoretical guarantees on DM utility; it also has potential operational significance in the design of compensation rules. Specifically, it suggests an optimization procedure for designing a cost function $G$ (from which the induced compensation rule is recovered) so as to minimize DM utility loss. Intuitively, the design of $G$ will attempt to optimize two conflicting objectives: minimizing the bound on utility loss, which generally requires increasing the degree of robustness or convexity of $G$ at decision boundaries; and minimizing expected compensation which, given the requirement of strict convexity of $G$, generally requires decreasing robustness or convexity. This tension can be addressed by either: (a) explicitly trading the loss bound and expected compensation off against each other in the design objective; (b) minimizing expected compensation subject to a target loss bound; or (c) minimizing the loss bound subject to a target compensation level. The optimization itself is defined over the space of convex functions $G$, and could be treated as a multi-dimensional spline problem. The objective is to fit a convex function to a set of points with specific local curvature constraints that enforce a certain degree of local convexity at particular decision boundaries. Specific classes of spline functions (e.g., Catmull-Rom splines) might prove useful for this purpose. We leave to future research the question of the practical design of cost and compensation functions under conditions of utility uncertainty.

## 5 Market Scoring Rules

Space precludes a comprehensive treatment, but we provide a brief sketch of how one might exploit compensation functions in settings where DM aggregates the forecasts of multiple experts. One natural means of doing so is to develop a market scoring rule (MSR) [9, 11] that sequentially applies a standard scoring rule based on how an expert alters the prior forecast (see Sec. 2). The typical means of creating an MSR given a scoring rule is to have the $k$th expert (implicitly) pay the $(k-1)$st expert for its forecast according to that rule, and have the principal pay only the final expert for its forecast using the same rule. In this way, the principal’s total payment is bounded by the maximal possible payment to a single expert [11].

When one attempts this with self-interested experts, difficulties emerge. For instance, Shi et al. [15] show that experts who can alter the outcome distribution after making a forecast each require compensation to prevent them from manipulating the distribution in ways that are detrimental to the principal. (Footnote 13: Shi et al. [15] actually use a one-round variant of an MSR.) A related form of subsidy arises in our decision setting.

Following [15], we assume a collection of experts, each of whom can alter the forecast exactly once. Suppose the experts have an interest in DM’s decision. An “obvious” MSR in our model would simply adopt a proper compensation rule, and have each expert pay either the compensation or the net score due to the expert who provided the incumbent forecast, and receive her payment from the next expert. If we use compensation, we run into strategic issues. With a proper compensation rule, an expert reports truthfully based on her net score (total utility), consisting of both compensation and the inherent utility of the decision she induces. In a market setting, an expert’s proposed decision may be changed by the next expert that provides a forecast. This (depending on her beliefs about other expert opinions) may incentivize the expert to misreport in order to maximize her compensation rather than her net score. Overcoming such strategic issues seems challenging.

Alternatively, each expert might pay the net score due her predecessor. Unfortunately, an arbitrary proper compensation rule may not pay expert enough score to “cover her costs” (e.g., if ’s inherent utility is much higher than ’s). However, if we set aside issues associated with incentive for participation for the moment, the usual MSR approach can be adapted as follows: we fix a single (strictly) convex cost function for all experts, and define the compensation rule for expert using in the usual way:

$$C^k(p, x_i) = G(p) - G^*(p)\cdot p + G^*_i(p) - b^k_{i,\pi(p)},$$

where is ’s utility function (bias). If satisfies strong participation for all experts (i.e., if for all ), then any expert whose beliefs differ from the forecast provided by will have an expected net score (given ) greater than her expected payment to and will maximize her utility by providing a truthful forecast. In particular, let’s denote ’s expected payment to by ; then we have:

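$$\begin{aligned}
\rho(k, k-1) &= \big(H_{p[k-1]} - b^{k-1}_{\pi(p[k-1])}\big)\cdot p[k] + b^{k-1}_{\pi(p[k-1])}\cdot p[k] \\
&= H_{p[k-1]}\cdot p[k] \\
&\le H_{p[k]}\cdot p[k].
\end{aligned}$$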

Hence expert $k$’s expected payment is less than its expected net utility, leaving it with a (positive) net gain of $H_{p[k]}\cdot p[k] - H_{p[k-1]}\cdot p[k]$. However, this gain may be smaller than the inherent utility she derives from the decision induced by $p[k-1]$, namely, $b^k_{\pi(p[k-1])}\cdot p[k]$. Hence this scheme may not incentivize participation. In cases where DM can force participation, such a scheme can be used; but in general, the self-subsidizing nature of standard MSRs cannot be exploited with self-interested experts. (Footnote 14: If expert utility is small relative to overall compensation, we can exploit the strong robustness (or strong convexity) of the cost function to show that experts will abstain from offering predictions only if their beliefs are sufficiently close to the incumbent prediction. Providing the degree of compensation induced by an “extremely convex” cost function can, of course, be interpreted as a form of subsidy.)
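The net gain in this chain of payments can be sketched numerically. The code below assumes the quadratic cost function $G(p) = p\cdot p$ and takes $H_r \cdot p = G(r) + G^*(r)\cdot(p - r)$, the expected net score for report $r$ under beliefs $p$; that reading of $H$, and the function names, are assumptions made for illustration.

```python
import numpy as np

def H_dot(r, p):
    """H_r . p = G(r) + G*(r).(p - r) for the illustrative G(p) = p.p."""
    return float(r @ r + 2.0 * r @ (p - r))

def msr_expected_gains(forecasts):
    """Expected net gain of each expert k >= 1 under her own beliefs p[k].

    Each expert pays rho(k, k-1) = H_{p[k-1]} . p[k] to her predecessor and
    expects a net score of H_{p[k]} . p[k] from her own truthful forecast.
    """
    gains = []
    for prev, cur in zip(forecasts[:-1], forecasts[1:]):
        payment = H_dot(prev, cur)       # expected payment to predecessor
        own = H_dot(cur, cur)            # expected net score, equals G(p[k])
        gains.append(own - payment)
    return gains

forecasts = [np.array([0.5, 0.5]), np.array([0.7, 0.3]), np.array([0.6, 0.4])]
gains = msr_expected_gains(forecasts)
```

For this quadratic cost function the gain equals the squared $L_2$ distance between consecutive forecasts, so it is never negative but becomes small when an expert's beliefs are close to the incumbent forecast. That is the situation in which the gain may fall short of her inherent utility for the incumbent decision.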

To incentivize participation, DM can subsidize these payments. In the most extreme case, DM simply pays each displaced expert its net utility, which removes any incentives to misreport, but at potentially high cost. In certain circumstances, we can reduce the DM subsidy to the market by having it pay only the inherent utility (given the realized outcome) of the displaced expert $k-1$, and requiring the displacing expert $k$ to pay the compensation. Under certain conditions on the relative utility of different experts for different decisions, this is sufficient to induce participation; that is, expert $k$’s net gain for participating exceeds her inherent utility for the incumbent decision.

For instance, suppose all experts have the same utility function (e.g., consider experts in the same division of a company who are asked to predict the outcome of some event, and have different estimates, but have aligned interests in other respects). In this case, expert $k$’s net gain for reporting her true beliefs is:

 (Hp[k] −(Hp[k−1]−bπ(p[k