It seems natural to found a theory of rational decision making on the notion of preference; after all, what is deciding other than choosing between alternatives?
This must have been the idea behind the early works on the subject, starting from von Neumann and Morgenstern’s neumann1947 and continuing with the analytical frameworks of Anscombe and Aumann anscombe1963 and of Savage savage1954 . They show that a rational decision maker, let us call him Thomas (a homage to Bayes),
can be regarded as an agent with beliefs about the world, in the form of probabilities, and values for consequences, in the form of utilities; moreover, Thomas can be regarded as taking decisions by maximising his expected utility. This view has had a tremendous impact on many fields of research, not least on Bayesian statistics, which some regard as drawing its justification from Savage’s work.
Yet many authors, including von Neumann, Morgenstern and Anscombe themselves, soon recognised that it is not realistic to assume that Thomas can always compare alternatives; in some cases he may just not know which one to prefer—if only because he lacks information about them. In this case we talk of incomplete, or partial, preferences. Axiomatisations of rational decision making with incomplete preferences came much later, though, through the works of Bewley bewley1986 , Seidenfeld et al. seidenfeld1995 , Nau nau2006 , and, more recently, Ok et al. ok2012 and Galaabaatar and Karni galaabaatar2013 . These works build upon the analytical framework of Anscombe and Aumann so as to represent rational (or coherent) preferences through sets of expected utilities; the disagreeing decisions these may lead to account for the incomplete nature of preferences.
The picture that comes out of these works is not entirely clear. The axioms employed are not always the same. This is the case for the continuity axiom, called the Archimedean axiom, which is necessary to obtain a representation in terms of expected utilities; it is also the case for the state-independence axioms that enable one to decompose the set of expected utilities into separate sets of probabilities and utilities. The cardinality of the spaces involved also varies across works, and the treatment of infinite spaces turns out to be quite technical and not free of problems.
Moreover, the overall impression is that directly stretching the axioms of Anscombe and Aumann so as to deal with incomplete preferences shows its limits, and that clarity risks getting lost in the process.
In parallel, other researchers were arriving, by another path, at a theory of imprecise probability. It started from de Finetti’s interpretation of expectation as a subject’s fair price for a gamble, that is, a bounded random variable. Let us call this subject Matilda, or Thilda, to stress that she is Thomas’ counterpart. De Finetti’s next bright move was to deduce probability by imposing a single axiom of self-consistency on Thilda’s fair prices for different gambles finetti1937 . Smith suggested that de Finetti’s approach could be extended to account for the case that probabilities are indeterminate or not precisely specified smith1961 . Williams made Smith’s ideas precise, by giving an axiomatisation that is again based on a notion of self-consistency, which has been called coherence ever since williams1975 .
It is important to stop for a moment on this notion, which will also be central to the present work. Williams developed his theory starting from a setup more primitive than de Finetti’s. Rather than asking Thilda to assess her prices for gambles, he only requires Thilda to state whether a gamble is desirable (or acceptable) for her (such gambles are also called favourable gambles in seidenfeld1990 ), in the sense that she would commit herself to accept whatever reward or loss it will eventually lead to. The core notion in Williams’ theory is then a set of so-called desirable gambles. One such set is said to be coherent when it satisfies a few axioms of rationality. Lower and upper expectations, which are called previsions in Williams’ theory, and their properties are derived from the set of gambles, and are shown to be equivalent to sets of probabilities. Eventually, we can also recover de Finetti’s theory as a special case, from sets of gambles called maximal, or complete. But the important point is that coherent sets of desirable gambles can be conditioned, marginalised, extended to bigger spaces, and so on, without ever needing to talk about (sets of) probabilities. This is even more remarkable as coherent sets of desirable gambles are more expressive than probabilities; for instance, we can condition a set on an event of zero probability without any conceptual or technical issues arising. In fact, sets of probabilities are equivalent to a special type of desirability, that made of so-called strictly desirable gambles. We take care to introduce desirability and coherent lower previsions in a self-contained way, and as pedagogically as is possible in a research paper, in Section 2. We do so because desirability is not as well known a formalism as that of preference relations (the latter are briefly introduced in Section 3).
Williams’ fundamental work went largely unnoticed until Walley used it as the basis for his theory of imprecise probability walley1991 . In essence, Walley’s theory can be regarded as Williams’ theory with an additional axiom to account for conglomerability. Conglomerability finetti1931 is a property that a finitely additive model may or may not satisfy, and that makes a difference when the space of possibilities is infinite. In fact, if the possibility space is finite, Williams’s and Walley’s theories essentially coincide.
Walley’s theory has been influential, originating a number of further developments as well as specialisations and applications (see augustin2014 for a recent collection of related works). Most importantly, over the years the core notion of desirability underlying both Williams’ and Walley’s theories has withstood thorough analysis and has proven to be a very solid and general foundation for a behavioural theory of uncertainty. On the other hand, let us note that both Williams’ and Walley’s theories are developed for the case of a linear utility scale, and utility itself is assumed to be precisely specified.
What struck us at first is that the same mathematical structure lies at the basis of both the axiomatisations of incomplete preferences and the theory of desirability in imprecise probability: that of convex cones.
In the former case, cones are made of scaled differences of horse lotteries. If we call Ω the possibility space and X the set of possible outcomes, or prizes, then a horse lottery, or act, is a nested, or compound, lottery that returns a simple lottery on X for each possible realisation of Ω. In this paper we take X to be finite—whereas Ω is unconstrained—so a simple lottery is just a mass function on X. The convex cone associated with a (strict, as we take it) coherent preference relation ≻ between horse lotteries is given by C := {λ(p − q) : p ≻ q, λ > 0}. Note that C is made of objects that are, strictly speaking, neither acts nor preferences.
On the other hand, a coherent set of desirable gambles turns out to be again a convex cone, this time made of gambles. In the traditional formulation of desirability, a gamble is a bounded function f from the possibility space Ω to the real numbers; f(ω) represents the (possibly negative) reward one gets, in a linear utility scale, in case ω occurs.
Obviously, the two representations are very close. Could they not just be the same? The answer is yes: in Section 4 we show that desirability cones and cones arising from incomplete preferences are equivalent representations. Two conditions must be met for this to be the case:
One is very specific: the preference relation must have a worst act, that is, an act z such that p ≻ z for every act p ≠ z. This is almost universally assumed in the literature, together with the presence of a best act. In this paper we need not have the best act; the worst act suffices to develop all the theory. Moreover, we show that we can assume without loss of generality that the worst act is degenerate: it returns the same element z of the set of prizes for every possible realisation of Ω. In other words, the set of prizes has a worst outcome z, whence we denote by z both the worst outcome and the worst act. Accordingly, we shall use X_z to denote a set of prizes with worst outcome z and X a set of prizes without it (or for which this is not specified).
The second condition is that desirability must be extended so as to deal with vector-valued gambles. As we shall see in Remark 4, this is surprisingly simple to do. It is enough to define a gamble as a function from Ω to the real vectors indexed by X. The interpretation is that once ω occurs, the gamble returns a vector of rewards, one for each different type of prize in X. Mathematically, though, we eventually treat the gamble as if the possibility space were the product Ω × X. The desirability axioms are therefore unchanged: they are simply applied to gambles defined on Ω × X, and all the theory carries over to gambles on the product space. The consequences, however, are striking.
They follow in particular because we can then go from a problem of incomplete preferences to an equivalent one of desirability by simply dropping z from X_z, and hence from all acts. In that way, cone C immediately becomes a coherent set of desirable gambles on Ω × X. And we can go the other way around: we start from a coherent set of desirable gambles on Ω × X and, by appending a worst outcome z to X and to all the acts, we create an equivalent cone C, with an associated coherent preference relation ≻. What is striking is that the equivalence holds also for the corresponding notions of Archimedeanity: an Archimedean preference relation originates a coherent set of strictly desirable gambles, and vice versa. Moreover, desirability offers well-consolidated tools to derive, and work with, lower previsions (expectations); these tools now become available to the decision-theoretic investigator.
On the other hand, our standpoint is that the fundamental tool to model incomplete preferences is the cone C itself: it is more expressive than the sets of probabilities we can derive from it, since it allows us to model and work with non-Archimedean problems besides Archimedean ones. And we can do this with well-established tools that apply directly to the cone, without any need to go through a probability–utility representation. This remarkably widens the set of problems we can handle.
A variety of results and insights follow from the mentioned equivalence. We list them in the order they appear in the paper.
We show in Section 4 that the traditional Archimedean axiom conflicts with the possibility of representing a maximally uninformative, or vacuous, relation. Thanks to the worst outcome, we define a weak Archimedean condition, which is like the original one but restricted to a subset of non-trivial preferences. This solves the problem and still suffices to obtain a representation in terms of sets of expected utilities.
Weak Archimedeanity also enables us to address the long-standing controversy about the regularity assumption, that is, whether or not a probabilistic model should be allowed to assign probability zero to events: probabilities can be zero in our model, and yet we can have a meaningful representation of a strict partial order in terms of a set of expected utilities.
Finally, weak Archimedeanity allows the probabilistic models derived in our representation to be meaningfully defined on uncountable sets, unlike in the case of the traditional Archimedean axiom. This is remarked upon in Section 5.
Still in Section 5, we discuss how desirable gambles might be taken as the primitive notion at the basis of preference, under a suitable interpretation.
We illustrate also how our representation in terms of expectations is naturally based on objects that we can interpret as joint probability-utility functions (e.g., joint mass functions in the finite case). This enables us to use tools from probability theory to deal with, and reinterpret, operations with utility, in a uniform way.
Given that the presence of the worst act is important for this paper, we investigate in Section 6 whether one can extend, in a least-committal way, a coherent preference relation that has no worst act into one that has it. It turns out that it is always possible to find such a minimal extension, but there is a catch: the extension is never (weakly) Archimedean, irrespective of the preference relation we start from—apart from the trivial empty relation; moreover, the notion of minimal extension is ill-defined for Archimedean extensions: given any Archimedean extension, it is always possible to find a smaller one.
Then we dive deeper into this problem, and define a strong Archimedean condition, which is indeed stronger than the traditional one. We show that the weak, strong and traditional Archimedean conditions are all essentially equivalent (for the weak one this is only partly true) in case a relation has a worst outcome. When it does not, and restricting attention to the case of finite spaces (in particular finite Ω), we show that the strong Archimedean condition leads to an Archimedean extension whereas the traditional one does not suffice. Moreover, we show that strong Archimedeanity is equivalent to the topological openness of cone C.
Starting from Section 7, we assume that the worst outcome exists and we take advantage of the desirability representation of incomplete preferences to revisit a number of traditional notions.
Initially we discuss the cases of state dependence and independence directly for the case of desirable gambles. We show that there are much weaker (and arguably more intuitive) notions than the traditional ones that we can employ to model state independence.
Something similar happens with the case of complete preferences. The definition in the case of desirability is straightforward and of great generality.
In Section 8, we see what becomes of these notions when we move down to the level of coherent lower previsions, that is, sets of expected utilities. Also in this case, we provide weaker notions than the traditional ones that are very direct.
For the case of complete preferences, we give a number of equivalent conditions for imposing them, thus also simplifying the traditional conditions, and showing that we can use the very same condition both for the case of complete beliefs and incomplete values and for the opposite one (the so-called Knightian uncertainty).
In Section 9 we consider two axioms used in the literature to impose state independence in the case of a multiprior expected multiutility representation. After quite an involved analysis, we show that imposing those axioms is equivalent to modelling state independence with sets of expectations using the strong product. In other words, for complete preferences, state independence simply turns out to be stochastic independence in our setting; and to a set of stochastically independent models in the case of incomplete preferences. In fact, when we say, as above, that there are weaker ways to model state independence, this is because it is well known with sets of probabilities that there are much weaker ways to model irrelevance and independence than the strong product.
In Section 10 we argue that the Archimedean condition is inadequate to capture all the problems that can be tackled using sets of expected utilities. In particular, there are problems that can be modelled using collections of sets of expected utilities that are not Archimedean according to the common definition. We give a new definition of full Archimedeanity that captures all and only the problems that can be expressed with collections of sets of expected utilities, and of which the Archimedean condition is a special case.
In Section 11 we compare our work with some previous ones that have dealt with incomplete preferences.
We consider the work by Nau nau2006 in the light of the connection we make to desirability, showing how some of his notions map into ours and vice versa. Nau’s work is based on weak preference relations; this also gives us the opportunity to discuss how the present approach can be adapted to that case.
The work by Galaabaatar and Karni galaabaatar2013 is particularly interesting to compare as it has been a source of inspiration for ours, and because it is actually quite close in spirit, even though it misses the connection to desirability.
Finally, we consider the work by Seidenfeld, Schervish and Kadane seidenfeld1995 . This work is interesting to compare especially because of the different type of setting it is based on, compared to ours and to the previous ones. In particular, they use a special type of Archimedean condition with a topological structure. The paper is also based on quite technical mathematical tools. Moreover, they work with sets of expected utilities that need not be open or closed, thus gaining generality compared to the former approaches. Interestingly, they also consider the problem of extending a preference relation to a worst (and a best) act and come to conclusions that seem to clash with ours.
We clarify the difference with Seidenfeld et al.’s work by mapping their concepts into our language of desirability. By doing so we show that there is no contradiction between their results and ours, and we argue that the type of generality they attain can be achieved more naturally and simply using convex cones of gambles rather than sets of expected utilities.
In summary, we present a very general approach to axiomatise, and work with, incomplete preferences, which appears to us simpler and with great potential to clarify previous notions and to unify them under a single viewpoint. It is based on a shift of paradigm: regarding desirability as the underlying and fundamental concept at the basis of preference.
There are limitations in our current work, like the finiteness of X, and other challenges left to address. We comment on these and other issues in the Conclusions. The Appendix collects the proofs of our results.
2 Desirability and coherent lower previsions
2.1 Foundations of desirability
Let Ω denote the set of possible outcomes ω of an experiment. In this paper we let the cardinality of Ω be general, so Ω can be infinite. We call Ω the space of possibilities. A gamble f is a bounded, real-valued function of Ω. It is interpreted as an uncertain reward in a linear utility scale: in particular, f(ω) is the amount of utiles a subject receives if ω eventually happens to be the outcome of the experiment. Let us name this subject Thilda.
We can model Thilda’s uncertainty about Ω through the set of gambles she is willing to accept. We also say that those are her acceptable or desirable gambles (we use the two terms interchangeably). Accepting a gamble f means that Thilda commits herself to receive the reward f(ω), whatever ω occurs. Since f(ω) can be negative, Thilda can lose utiles, and hence the desirability of a gamble depends on Thilda’s beliefs about Ω.
Denote by L(Ω) the set of all the gambles on Ω and by L⁺(Ω) its subset of the positive gambles, that is, the non-negative non-zero ones (the set L⁻(Ω) of the negative gambles is similarly made of the non-positive non-zero ones). We denote these sets also by L, L⁺ and L⁻, respectively, when there can be no ambiguity about the space involved. Thilda examines a set of gambles K ⊆ L and comes up with the subset D of the gambles in K that she finds desirable. How can we characterise the rationality of the assessments represented by D?
We can follow the procedure adopted in similar cases in logic, where first of all we need to introduce a notion of deductive closure: that is, we first characterise the set of gambles that Thilda must find desirable as a consequence of having desired those in D in the first place. This is easy to do since Thilda’s utility scale is linear, whence those gambles are the positive linear combinations of gambles in D:
posi(D) := {λ1 f1 + ··· + λn fn : n ≥ 1, fj ∈ D, λj > 0}.
On the other hand, we must consider that any gamble in L⁺ must be desirable as well, given that it may increase Thilda’s utiles without ever decreasing them. Stated differently, the set L⁺ plays the role of the tautologies in logic. This means that the actual deductive closure we are after is given by the following:
Definition 1 (Natural extension for gambles).
Given a set D ⊆ L of desirable gambles, its natural extension is the set of gambles given by E(D) := posi(D ∪ L⁺).
Note that E(D) is the smallest convex cone that includes D ∪ L⁺.
The rationality of the assessments is characterised through the natural extension by the following:
Definition 2 (Avoiding partial loss for gambles).
A set D of desirable gambles is said to avoid partial loss if E(D) ∩ L⁻ = ∅.
This condition is the analog of the notion of consistency in logic. The irrationality of a natural extension that incurs partial loss depends on the fact (as it is possible to show) that it must contain a negative gamble f, that is, one that cannot increase utiles and can possibly decrease them. In contradistinction, a set that avoids partial loss does not contain negative gambles.
There is a final notion that is required to make a full logical theory of desirability. This is the logical notion of a theory, that is, a set of assessments that is consistent and logically closed, in the sense that the consistent assessments coincide with their deductive closure in the examined domain K. This means that Thilda is fully aware of the implications of her assessments on other assessments in K. The logical notion of a theory goes, in desirability, under the name of coherence:
Definition 3 (Coherence for gambles).
Say that D ⊆ K is coherent relative to K if D avoids partial loss and E(D) ∩ K ⊆ D (and hence E(D) ∩ K = D). In case K = L, we simply say that D is coherent.
This definition alone, despite its conceptual simplicity, makes up all the theory of desirable gambles: in principle, every property of the theory can be derived from it. Moreover, the definition gives the theory a solid logical basis and in particular guarantees that the inferences one draws are always coherent with one another. At the same time, the theory is very powerful: as we have seen, it can be defined on any possibility space Ω and any domain K (in this sense, it is not affected by measurability problems); and, as we shall make precise later on in this section, it can handle both precise and imprecise assessments, as well as model both Archimedean and non-Archimedean problems.
Sets of desirable gambles are uncertainty models and as such we can define a notion of conditioning for them. As usual, we consider an event B ⊆ Ω. We adopt de Finetti’s convention of denoting by B both the subset of Ω and its indicator function (which equals one in B and zero elsewhere). Using this convention, we can multiply B and a gamble f, obtaining the new gamble Bf given by (Bf)(ω) := B(ω)f(ω) for all ω ∈ Ω. Recall the interpretation of a gamble as an uncertain reward. Since gamble Bf cannot change Thilda’s wealth outside B, we can as well interpret it as a gamble that is called off unless the outcome of the experiment belongs to B: we say that Bf is a gamble contingent, or conditional, on B. This leads to the following:
Definition 4 (Conditioning for gambles).
Let D be a coherent set of desirable gambles on Ω and B a non-empty subset of Ω. The set of desirable gambles conditional on B derived from D is defined as D|B := {Bf : f ∈ L, Bf ∈ D}.
D|B is a set of desirable gambles coherent relative to the set {Bf : f ∈ L} of gambles contingent on B. Note that there is a natural correspondence between Bf and the restriction of f to B, whence we can also put D|B in relation with a set of gambles defined directly on B, and show that this is a coherent set of desirable gambles in L(B). The point here is that these are equivalent representations of the conditional set; yet, there can be some mathematical convenience in using one over the other depending on the situation at hand. For these reasons, and in an attempt to avoid cumbersome notation, from now on—with few exceptions—we shall use the notation D|B for the conditional set, even though, on occasion, what we shall actually mean and use is its counterpart on B. This abuse should not be problematic, as the specific set we use will be clear from the context. For analogous reasons, in what follows we shall sometimes abuse terminology by simply saying that D|B is coherent, without specifying relative to which set.
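The contingency interpretation is easy to visualise numerically. The following minimal sketch (on a hypothetical four-element possibility space, with invented reward values) builds the gamble Bf and exhibits its correspondence with the restriction of f to B:

```python
import numpy as np

# A four-element possibility space; the event B is encoded, following
# de Finetti's convention, by its indicator function.
B = np.array([1.0, 1.0, 0.0, 0.0])
f = np.array([2.0, -1.0, 3.0, 5.0])

# The contingent gamble Bf: it coincides with f on B and is called off
# (pays zero) outside B.
Bf = B * f

# Natural correspondence between Bf and the restriction of f to B:
# on B the two carry exactly the same rewards.
restriction_to_B = f[B > 0]
print(Bf)                # [ 2. -1.  0.  0.]
print(restriction_to_B)  # [ 2. -1.]
```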
Finally, it is useful to consider the operation of marginalisation for a coherent set of desirable gambles.
Definition 5 (Marginalisation for gambles).
Let D be a coherent set of desirable gambles in the product space X × Y, where X and Y are two logically independent sets. The marginal set of desirable gambles on X induced by D is defined as marg_X(D) := {f ∈ D : f(x, y) = f(x, y′) for all x ∈ X and all y, y′ ∈ Y}. The marginal on Y is defined analogously.
marg_X(D) is a coherent set of desirable gambles relative to the set of gambles that depend only on elements of X; we also say these are the X-measurable gambles. Since marg_X(D) is made of X-measurable gambles, we can establish a correspondence between it and a set of gambles defined directly on X, which is a coherent set of gambles in L(X). Similarly to the discussion made above in the case of conditioning, there is no real difference in representing the marginal information via one set or the other, so from now on we shall stick to the notation marg_X(D) for the marginal set, even when we actually mean its counterpart on X.
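X-measurability is simple to check on a finite product space when a gamble is stored as a matrix indexed by (x, y). A quick sketch (the gamble values are our own illustration); marginalising a set D would then amount to keeping its X-measurable elements:

```python
import numpy as np

def is_X_measurable(g):
    """A gamble g on X x Y (a matrix g[x, y]) is X-measurable when it
    depends only on the X-coordinate, i.e. every row of g is constant."""
    g = np.asarray(g, dtype=float)
    return bool(np.all(g == g[:, [0]]))

g1 = np.array([[1.0, 1.0, 1.0],
               [2.0, 2.0, 2.0]])   # depends on X only
g2 = np.array([[1.0, 0.0, 1.0],
               [2.0, 2.0, 2.0]])   # genuinely depends on Y as well

print(is_X_measurable(g1), is_X_measurable(g2))  # True False
```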
2.2 Coherent sets of desirable gambles and coherent lower previsions
When we restrict the attention to the case K = L, coherence can be characterised by four simple conditions:
D is a coherent set of desirable gambles on Ω if and only if it satisfies the following conditions:
(a) L⁺ ⊆ D [Accepting Partial Gains];
(b) 0 ∉ D [Avoiding Null Gain];
(c) if f ∈ D and λ > 0, then λf ∈ D [Positive Homogeneity];
(d) if f, g ∈ D, then f + g ∈ D [Additivity].
This result goes back to Williams williams1975 and Walley walley1991 (for a recent proof, see miranda2010c , Proposition 2). It shows, somewhat more explicitly than Definition 3, that a coherent set of desirable gambles is a convex cone (conditions c and d) that excludes the origin (b) and contains the positive gambles (a).
A coherent set of desirable gambles implicitly defines a probabilistic model for Ω. The way to see this is to consider gambles of the form f − μ, where μ is a real value, used here as a constant gamble, and f is any gamble. Say that Thilda is willing to accept the gamble f − μ. We can reinterpret this by saying that she is willing to buy gamble f at price μ. Focusing on the supremum price for which this happens leads us to the following:
Definition 6 (Coherent lower and upper previsions).
Let D be a coherent set of desirable gambles on Ω. For all f ∈ L, let
P̲(f) := sup{μ ∈ ℝ : f − μ ∈ D};
it is called the lower prevision of f. The conjugate value given by P̄(f) := −P̲(−f) is called the upper prevision of f. The functionals P̲ and P̄ are respectively called a coherent lower prevision and a coherent upper prevision.
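On a finite possibility space, the lower prevision induced by the natural extension of finitely many desirability assessments can be computed by linear programming: P̲(f) = sup{μ : f − μ ≥ Σ_j λ_j g_j, λ_j ≥ 0}, where the g_j are the assessed gambles. This LP is standard in the imprecise-probability literature rather than spelled out in the text; the sketch below assumes SciPy is available, and the prices are our own illustration.

```python
import numpy as np
from scipy.optimize import linprog

def lower_prevision(assessments, f):
    """Lower prevision of f induced by the natural extension of finitely
    many desirability assessments on a finite space, via the standard LP
    sup{ mu : f - mu >= sum_j lambda_j g_j, lambda_j >= 0 }."""
    G = np.atleast_2d(np.asarray(assessments, dtype=float))  # rows = gambles
    f = np.asarray(f, dtype=float)
    n = G.shape[0]
    # variables: [mu, lambda_1, ..., lambda_n]; maximise mu <=> minimise -mu
    c = np.concatenate(([-1.0], np.zeros(n)))
    # constraints: mu + sum_j lambda_j g_j(w) <= f(w) for every outcome w
    A_ub = np.hstack([np.ones((G.shape[1], 1)), G.T])
    res = linprog(c, A_ub=A_ub, b_ub=f,
                  bounds=[(None, None)] + [(0.0, None)] * n)
    return -res.fun if res.status == 0 else None  # None: assessments incur loss

# Thilda buys the event A = {w1} at price 0.4 and sells it at price 0.7,
# i.e. she accepts the gambles A - 0.4 and 0.7 - A:
A = np.array([1.0, 0.0, 0.0])
D = [A - 0.4, 0.7 - A]
print(lower_prevision(D, A), -lower_prevision(D, -A))  # lower 0.4, upper 0.7
```

The second printed value uses conjugacy: the upper prevision of A is the negative of the lower prevision of −A.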
It is not difficult to see that an upper prevision can also be written as
P̄(f) = inf{μ ∈ ℝ : μ − f ∈ D},
which makes it clear that it is Thilda’s infimum selling price for gamble f. That buying and selling prices for some goods usually do not coincide is a matter of fact in real problems; this shows that the ability to represent such a situation is important if we aim at realistic applications of probability. In case they do coincide, instead, what we get are linear previsions:
Definition 7 (Linear prevision).
Let D be a coherent set of desirable gambles on Ω and P̲, P̄ the induced coherent lower and upper previsions. If P̲(f) = P̄(f) for some f ∈ L, then we call the common value the prevision of f and we denote it by P(f). If this happens for all f ∈ L, then we call the functional P a linear prevision.
Now we would like to give a word of caution and clarification: before proceeding, it should be crystal clear that linear previsions are nothing else but expectations. In particular, they are expectations with respect to the probability that is the restriction of P to events: the probability of an event A is just P(A), the prevision of its indicator. Traditionally, one takes probability as the primitive concept, which is created on top of some structure such as a σ-algebra, and then computes the expectation of a measurable gamble (i.e., a measurable bounded random variable) f. Here, instead, probability is derived from expectation, which is in turn derived from a coherent set of desirable gambles. Among the advantages of this approach, one is that we do not need structures such as σ-algebras to carry out our analysis, since the probability underlying P can be finitely additive; whence one can in principle also compute the prevision (expectation) of a non-measurable gamble. In this sense P is more general and fundamental than a traditional expectation; it is actually an expectation in de Finetti’s sense. It seems worth remarking on this by using de Finetti’s name for it, prevision, and symbol P to denote it. Besides, using symbol P is mathematically accurate once probability is defined as the prevision of indicator functions, and it spares us needless switching between a symbol for probability and one for expectation.
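A one-line numerical illustration of probability arising as the prevision of an indicator gamble (the mass function values are our own):

```python
import numpy as np

p = np.array([0.2, 0.5, 0.3])        # a mass function on a three-element space

def prevision(f):
    """P(f): the expectation of the gamble f under p."""
    return float(p @ np.asarray(f, dtype=float))

A = np.array([0.0, 1.0, 1.0])        # an event, via its indicator gamble
print(prevision(A))                  # the probability of the event (0.8)
```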
In turn, coherent lower and upper previsions are just lower and upper expectation functionals. Consider a coherent lower prevision P̲. We can associate it with a set of probabilities by considering all the linear previsions that dominate it:
M(P̲) := {P linear prevision : P(f) ≥ P̲(f) for all f ∈ L},
which turns out to be closed (in the weak* topology, that is, the smallest topology for which all the evaluation functionals P ↦ P(f), with f ∈ L, are continuous) and convex. Since each linear prevision is in a one-to-one correspondence with a finitely additive probability, we can regard M(P̲) also as a set of probabilities, which is called a credal set. Moreover, P̲ is the lower envelope of the previsions in M(P̲):
P̲(f) = min{P(f) : P ∈ M(P̲)} for all f ∈ L.
In fact, coherent lower previsions are in a one-to-one correspondence with closed and convex sets of probabilities such as M(P̲). The coherent upper prevision P̄ is, not surprisingly, the upper envelope of the same previsions; as a consequence, P̲(f) ≤ P̄(f) for all f ∈ L. In any case, and even if it is sometimes convenient to work with coherent upper previsions, let us remark that in general it is enough to work with coherent lower previsions, thanks to the conjugacy relation between them.
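For a credal set given by finitely many extreme points, the lower envelope is a minimum over those points. A minimal sketch (the mass functions are invented for illustration), with conjugacy checked numerically:

```python
import numpy as np

# Extreme points of a credal set on a three-element space: each row is a
# probability mass function (values invented for illustration).
M = np.array([[0.4, 0.3, 0.3],
              [0.7, 0.2, 0.1],
              [0.5, 0.1, 0.4]])

def lower(f):   # lower envelope: the minimum prevision over M
    return float((M @ np.asarray(f, dtype=float)).min())

def upper(f):   # upper envelope: the maximum prevision over M
    return float((M @ np.asarray(f, dtype=float)).max())

f = np.array([1.0, 0.0, 0.0])        # indicator of the first outcome
print(lower(f), upper(f))            # 0.4 0.7
assert upper(f) == -lower(-f)        # conjugacy between upper and lower
```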
It is well known that a functional P̲ is a coherent lower prevision if and only if it satisfies the following three conditions for all f, g ∈ L and all real λ > 0:
(a) P̲(f) ≥ inf f;
(b) P̲(λf) = λP̲(f);
(c) P̲(f + g) ≥ P̲(f) + P̲(g).
(When condition c holds with equality for all f and g, then P̲ is actually a linear prevision.) Therefore one can also understand these three conditions as the axioms of coherent lower previsions, thus disregarding the more primitive notion of desirability. Still, it is useful to know that coherent lower previsions are in a one-to-one correspondence with a special class of desirable gambles:
Definition 8 (Strict desirability).
A coherent set D of gambles is said to be strictly desirable if it satisfies the following condition: if f ∈ D and f ∉ L⁺, then f − δ ∈ D for some real δ > 0.
Strict desirability (a note of caution to prevent confusion in the reader: the adjective ‘strict’ denotes two unrelated things in desirability and in preferences. In preferences it characterises irreflexive, i.e., non-weak, relations, while in desirability it formalises an Archimedean condition, as will become clear in Section 4. We keep the same adjective in both cases for historical reasons and given that there should be no possibility of ambiguity in doing so) is a condition of openness: it means that the part of cone D outside L⁺ does not contain the topological border. By an abuse of terminology, D is said to be open too.
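On a finite space the correspondence can be made concrete: relative to a coherent lower prevision (represented here by the extreme points of its credal set), a gamble is strictly desirable exactly when it is a positive gamble or its lower prevision is strictly positive. A small self-contained sketch, with an invented credal set:

```python
import numpy as np

# A credal set on a three-element space via its extreme points
# (each row a mass function; the numbers are invented for illustration).
M = np.array([[0.4, 0.3, 0.3],
              [0.7, 0.2, 0.1]])

def strictly_desirable(f):
    """f is strictly desirable iff it is a positive gamble, or its lower
    prevision min over the extreme points of M is strictly positive."""
    f = np.asarray(f, dtype=float)
    positive = f.min() >= 0 and f.max() > 0
    return bool(positive or (M @ f).min() > 0)

print(strictly_desirable([0.0, 0.1, 0.0]))   # True: a positive gamble
print(strictly_desirable([1.0, -1.0, -1.0])) # False: lower prevision < 0
```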
2.3 Conditional lower previsions and non-Archimedeanity
We have just seen that coherent lower previsions and coherent sets of strictly desirable gambles are equivalent models. This means also that coherent sets of desirable gambles are more general than coherent lower previsions, given that the general case of desirability does not impose any constraint (such as openness) on the topological border. We can rephrase this by saying that coherent sets of desirable gambles can also model non-Archimedean problems, that is, problems that cannot be modelled by probabilities (and in particular through a coherent lower prevision).
There are two main avenues through which non-Archimedeanity can show up in desirability, and both are related to gambles with zero prevision. The first has to do with the much debated problem of how to deal with conditioning on zero-probability events. To see this, it is useful first to define a conditional coherent lower prevision.
Definition 9 (Conditional coherent lower and upper previsions).
Let D be a coherent set of desirable gambles on Ω and B a non-empty subset of Ω. For all f ∈ L, let
P̲(f|B) := sup{μ ∈ ℝ : B(f − μ) ∈ D}
be the conditional lower prevision of f given B. The conjugate value given by P̄(f|B) := −P̲(−f|B) is called the conditional upper prevision of f. The functionals are respectively called a conditional coherent lower prevision and a conditional coherent upper prevision.
Note that Eq. (2) is the special case of Eq. (5) obtained when , so for all purposes we can stick to Eq. (5) as the general procedure to obtain (conditional) coherent lower previsions from coherent sets of desirable gambles. Note, on the other hand, that ; here we have denoted by the restriction of to and by its unconditional lower prevision obtained from set . The equality of the two lower previsions implies that is equivalent to a set of probabilities and that it satisfies conditions similar to a–c for all and all real :
in addition to the condition, specific to the conditional case, that (this could be removed by formulating everything using rather than ). As in the unconditional case, one could take these four requirements as axioms of coherent conditional lower previsions, thus disregarding desirability. And also in this case, if we do start from a coherent conditional lower prevision , we can then induce its associated set of strictly desirable gambles through
Note that, according to Definition 4, is made of gambles that are zero outside . For the rest, the above expression simply states what is desirable under : either the positive gambles or the net gains originated by buying a gamble at price , which Thilda regards as convenient since the price is less than her supremum acceptable one.
Eq. (5) tells us how to create coherent conditional lower previsions from a coherent set of desirable gambles. If we apply it in particular to a coherent set of strictly desirable gambles , then, thanks to its equivalence to a coherent lower prevision , we obtain a conditioning rule defined directly for coherent lower previsions:
Definition 10 (Conditional natural extension).
Let be a coherent lower prevision and a non-empty subset of . The conditional natural extension of given is the real-valued functional
defined for every , where is a conditional linear prevision defined by Bayes’ rule.
In other words, is obtained by conditioning all the linear previsions in by Bayes’ rule, when , and then taking their lower envelope. When , is instead vacuous, and the interval is equal to for all , whence it is completely uninformative about Thilda’s beliefs when the conditioning event has zero lower probability. This is just a limitation of an Archimedean model such as a coherent lower prevision.
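The conditioning procedure just described can be sketched in code, for the finite case. The following is a minimal illustration, assuming a credal set given as a finite list of probability mass functions (all names and the dict-based representation are ours, not the paper's):

```python
# A minimal sketch of the conditional natural extension (Definition 10),
# assuming a finite possibility space and a credal set given as a finite
# list of probability mass functions (dicts); all names are illustrative.

def lower_prevision(credal_set, f):
    """Lower envelope of the expectations of gamble f over the credal set."""
    return min(sum(p[w] * f[w] for w in f) for p in credal_set)

def conditional_natural_extension(credal_set, f, B):
    """Condition every pmf on the event B by Bayes' rule and take the
    lower envelope; vacuous when B has zero lower probability."""
    lower_prob_B = min(sum(p[w] for w in B) for p in credal_set)
    if lower_prob_B == 0:
        # vacuous case: the lower prevision collapses to the infimum of f on B
        return min(f[w] for w in B)
    return min(
        sum(p[w] * f[w] for w in B) / sum(p[w] for w in B)
        for p in credal_set
    )
```

When every pmf in the credal set assigns positive probability to the conditioning event, the function returns the lower envelope of the Bayes-conditioned expectations; as soon as the lower probability of the event vanishes, the answer becomes vacuous, which is exactly the limitation of the Archimedean model discussed above.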
In contrast, it is known that the conditional lower prevision obtained from a coherent set of non-strictly desirable gambles can be informative, and actually for every pair with that are coherent with each other in Walley’s sense, we can find a coherent set of desirable gambles that induces them both (see (walley1991, , Appendix F4)). This is to say that conditioning on events of zero probability does not pose any problem in the framework of desirability. This happens thanks to the rich modelling capabilities offered by the border of the cone, which is excluded from consideration in the case of strictly desirable sets. Note that there are many common situations that we would like to model where is assigned zero lower probability and posterior beliefs are not vacuous: just think of a bivariate normal density function over ; it assigns zero probability to each real number, but conditional on a real number it is again Gaussian, whence non-vacuous. These cases fall in the area of general desirability.
The previous question, related to conditioning on an event of probability zero, has illustrated the first type of non-Archimedeanity that a coherent set of desirable gambles can address. Still, it is possible to model the same case through probabilities: the key is to use a collection of coherent lower previsions as the basic modelling tool, such as the pair , rather than a single unconditional one. However, there is another, somewhat purer, type of non-Archimedeanity that cannot be modelled by collections either and that can instead be modelled through desirability. Here is an example (taken from (zaffalon2013a, , Example 13)):
Two people express their beliefs about a fair coin using coherent sets of desirable gambles. The possibility space represents the two possible outcomes of tossing the coin, i.e., heads and tails. For the first person, the desirable gambles are characterised by ; for the second person, a gamble is desirable if either or . Call and the set of desirable gambles for the first and the second person, respectively. It can be verified that both sets are coherent. Moreover, they originate the same unconditional and conditional lower previsions through Eqs. (2) and (5). In the unconditional case we obtain ; this corresponds, correctly, to assigning probability to both heads and tails. In the conditional case, we again correctly obtain that each person would assign probability 1 to either heads or tails assuming that one of them indeed occurs: . This exhausts the conditional and unconditional lower previsions that we can obtain from and , given that has only two elements. It follows that and are indistinguishable as far as probabilistic statements are concerned. But now consider the gamble , which yields a loss of 1 unit of utility if the coin lands heads and a gain of 1 unit otherwise: whereas is not desirable for the first person, it is actually so for the second. This distinction between the two persons’ behaviour cannot be achieved through probabilities—and in fact gamble lies in the border of each of the two sets, given that .
The same example can be rephrased in the language of preferences (see (miranda2010c, , Example 10)). It shows that coherent sets of desirable gambles can determine a preference also when the lower expectation of the related gamble, in the case above , is zero, which is a clear case of a non-Archimedean preference. Again, the extra expressive power of general desirability compared to strict desirability is made possible by the modelling capabilities offered by the border of the involved cones.
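The coin example can be sketched in code. Since the defining conditions of the two sets are elided above, the predicates below are an assumed reconstruction (our reading of the example in zaffalon2013a), and all names are illustrative:

```python
# An assumed reconstruction of the fair-coin example: person 1 accepts the
# non-zero non-negative gambles and those with positive expectation; person 2
# additionally accepts border gambles (zero expectation) that win on tails.
H, T = 0, 1

def expectation(f):  # expectation under the fair coin
    return 0.5 * (f[H] + f[T])

def in_D1(f):
    return expectation(f) > 0 or (min(f) >= 0 and max(f) > 0)

def in_D2(f):
    return in_D1(f) or (expectation(f) == 0 and f[T] > 0)

def lower_prev(in_D, f):
    """Approximate sup of the prices mu such that f - mu is in the set,
    on a coarse grid of candidate prices."""
    prices = [mu / 1000.0 for mu in range(-2000, 2001)]
    return max(mu for mu in prices if in_D((f[H] - mu, f[T] - mu)))

f = (-1.0, 1.0)  # the border gamble: lose 1 on heads, gain 1 on tails
# in_D1(f) is False while in_D2(f) is True, yet both sets induce
# (approximately) the same lower prevision 0 for f.
```

The border gamble separates the two predicates even though the induced lower previsions agree up to the grid resolution, which is the non-Archimedean phenomenon the example is about.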
All the discussion above on non-Archimedeanity shows in particular that with coherent sets of desirable gambles we need not enter the controversy as to whether or not we should use the regularity assumption, which prescribes that probabilities of possible events should be positive. It is instructive to track the origin of this assumption: it goes back to an important article by Shimony shimony1955 . In the language of this paper, Shimony argued—correctly, in our view—that de Finetti’s framework could lead to the questionable non-acceptance of a positive gamble in case zero probabilities were present, given that the prevision (expectation) of such a gamble could be zero. This led Shimony and a number of later authors, among them Carnap and Skyrms, to advocate strengthening de Finetti’s theory by requiring regularity. But it has also generated much controversy, given the very constraining nature of regularity on probabilistic models. Between requiring regularity and dropping the acceptability of positive gambles, a third option eventually emerged: that of using non-Archimedean models, which can keep both desiderata together. Unfortunately, this idea has not been the subject of much development in mainstream probability. But there are signs that something is changing and that there could be a renewed interest in non-Archimedean models (see for instance Pedersen’s very recent interesting work pedersen2014 ). In this light, it is remarkable that Williams elegantly solved these problems through desirability as long ago as 40 years: in fact, by including the positive gambles in the natural extension, no matter what Thilda’s assessments are, as in Eq. (1), we make sure that they, and their implications, are always desirable; and we do this without compromising the presence of zero probabilities, therefore not requiring regularity.
What we get is a theory, very much in the spirit of de Finetti’s, that is so powerful that it can smoothly deal with non-Archimedeanity too.
An important difference between de Finetti’s theory and desirability is that the former is developed for precise probabilistic assessments. We can restrict desirability to precise models by an additional simple axiom of maximality:
Definition 11 (Maximal coherent set of gambles).
A coherent set of desirable gambles is called maximal if
Requiring maximality is tantamount to assuming that Thilda has complete preferences. The logical counterpart of maximality is also called the completeness of a theory. It is interesting to consider that logic discovered the inevitability of incomplete theories long ago, with Gödel’s celebrated theorem, and this eventually led logicians to appreciate their modelling power. Mainstream (Bayesian) probability and statistics, on the other hand, for the most part seem to be still stuck on precise probabilistic models; and yet these are complete logical theories too.
Geometrically, a coherent maximal, or complete, set of desirable gambles
is a cone degenerated into a hyperplane. It induces, via Eq. (1), a linear prevision ; this, in turn, induces through Eq. (4) a coherent set of strictly desirable gambles that corresponds to the interior of .666Remember that by an abuse of terminology we say that a coherent set of strictly desirable gambles is open; for this same reason we refer to the union of with the interior of as ‘the interior’ of . Therefore there is a one-to-one correspondence between the interiors of maximal coherent sets and linear previsions. This connects again to the question of Archimedeanity. A maximal coherent set of gambles is a richer model than the associated coherent lower prevision, given that the former can profit from the border of the cone. Therefore, for example, it can yield a non-vacuous conditional linear prevision even when the conditioning event has precisely zero probability; in contrast, the linear prevision that corresponds to the interior of the set will lead to a vacuous conditional model in that case.
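A maximal set on a two-element space can be sketched as a membership predicate: the open half-space of gambles with positive expectation under a linear prevision, plus one half of the border hyperplane. The prevision (0.5, 0.5) and the tie-breaking rule below are illustrative choices of ours:

```python
# A sketch of a maximal coherent set on a two-element space: gambles with
# positive expectation under the linear prevision (0.5, 0.5), plus the half
# of the border hyperplane with positive second component (an arbitrary,
# illustrative tie-breaking choice).
def in_maximal_set(f):
    e = 0.5 * f[0] + 0.5 * f[1]
    return e > 0 or (e == 0 and f[1] > 0)

# Maximality: for every non-zero gamble f, exactly one of f and -f is in the set.
for f in [(1.0, -1.0), (-2.0, 1.0), (0.3, 0.3), (2.0, -2.0)]:
    assert in_maximal_set(f) != in_maximal_set((-f[0], -f[1]))
```

Note that the zero gamble is excluded, as coherence requires, and that the interior of this set (the gambles with strictly positive expectation) corresponds exactly to the linear prevision, while the chosen half of the border is the extra, non-Archimedean information the maximal set carries.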
2.4 Conglomerability and marginal extension
Finally, it is important for this paper to say something also about conglomerability. In fact, conditions a–d essentially make up Williams’ theory of desirability williams1975 . The competing theory by Walley walley1991 adds to them a fifth condition that depends on the choice of a partition of the possibility space:
[Conglomerability],777Note that we should call it -conglomerability, but we drop given that in this paper we shall always consider only one partition of the space.
This axiom follows from additivity when is finite. The rationale behind a is that if Thilda is willing to accept gamble conditional on , and this holds for all , then she should also be willing to accept unconditionally.
Despite the innocuous-looking nature of a, conglomerability has given rise to an almost century-long controversy since de Finetti discovered it finetti1930 . De Finetti described conglomerability in the case of previsions, and it can be shown that a reduces to such a formulation when we induce a linear prevision from a coherent and conglomerable set of desirable gambles. The controversy concerns whether or not conglomerability should be imposed as a rationality axiom. De Finetti rejects this idea; others, like Walley, support it. In some recent work we have shown that there are cases where conglomerability is necessary for a probabilistic theory to make coherent inferences in time zaffalon2013a . In any case, it is not our intention to enter the controversy in this paper, and actually we shall try to avoid having to deal with questions of conglomerability as much as possible.
Definition 12 (Conglomerable natural extension).
The conglomerable natural extension plays the role of the deductive closure for a conglomerable theory of probability. Although the natural extension is easy to compute walley1991 , this is not necessarily the case for the conglomerable natural extension miranda2012 ; miranda2015 . However, it will be simple in the context of this paper.
In fact, we shall use conglomerability jointly with a special type of hierarchical information: we shall consider a marginal coherent set of gambles and conditional coherent sets for all . It is an interesting fact then that the conglomerable natural extension of the given marginal and conditional information always exists and can easily be represented as follows (for a proof, see (miranda2012, , Proposition 29)):
Proposition 2 (Marginal extension).
Let be a marginal coherent set of gambles in and , for all , be conditional coherent sets. Let
be a set that conglomerates all the conditional information along the partition of that we denote, by an abuse of notation, as too. Then the -conglomerable natural extension of and () is called their marginal extension and is given by
The marginal extension is a generalisation of the law of total expectation to desirable gambles. It was initially defined for lower previsions in (walley1991, , Section 7.7.2) and later extended to deal with more than two spaces in miranda2007 ; in the previous form for desirable gambles it has appeared in (miranda2012, , Section 7.1). Representing marginal extension through desirability allows one to take advantage of the increased expressiveness of the model; we can for instance condition our marginal extension on an event with zero (lower or upper) probability and obtain an informative model.
As we have said, the marginal extension can be defined also for lower previsions. To this end, we first need to introduce a way to conglomerate the conditional lower previsions defined over a partition of :
Definition 13 (Separately coherent conditional lower prevision).
Let be a partition of and a coherent lower prevision conditional on for all . Then we call
a separately coherent conditional lower prevision.
For every gamble , is the gamble on that equals for ; so it is a -measurable gamble.
Secondly, we introduce the notion of marginalisation for coherent lower previsions similarly to the case of desirability:
Definition 14 (Marginal coherent lower prevision).
Let be a coherent lower prevision on . Then the -marginal coherent lower prevision it induces is given by
for all that are -measurable. The definition of is analogous.
In other words, the -marginal is simply the restriction of to the subset of gambles in that only depend on elements of . For this reason, and analogously to the case of desirability, we can represent the -marginal in an equivalent way also through the corresponding lower prevision defined on . In the following we shall not distinguish between and and rather use the former notation in both cases.
We are ready to define the marginal extension:
Definition 15 (Marginal extension for lower previsions).
Consider the possibility space and its partition . We shall denote this partition by and its elements by , with an abuse of notation. Let be a marginal coherent lower prevision and let be a separately coherent conditional lower prevision on . Then the marginal extension of is the lower prevision given by
for all .
The marginal extension is the least-committal coherent lower prevision with marginal that is coherent with , in the sense that there is a coherent and conglomerable set of desirable gambles that induces both and via Eq. (5).
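The two-step computation in Definition 15 (condition within each partition element, then apply the marginal model) can be sketched for the finite case as follows; the `blocks` map, the dict-based credal sets, and all names are illustrative assumptions of ours:

```python
# A sketch of marginal extension (Definition 15) on a finite space, assuming
# the marginal and conditional models are lower envelopes of finitely many
# pmfs; 'blocks' maps each partition element to its set of states.

def lower_exp(pmfs, g):
    """Lower envelope of the expectations of gamble g over a finite set of pmfs."""
    return min(sum(p[x] * g[x] for x in g) for p in pmfs)

def marginal_extension(marginal_pmfs, conditional_pmfs, blocks, f):
    # inner step: the partition-measurable gamble mapping each block b
    # to the conditional lower prevision of f given b
    inner = {b: lower_exp(conditional_pmfs[b], {x: f[x] for x in blocks[b]})
             for b in blocks}
    # outer step: apply the marginal lower prevision to that gamble
    return lower_exp(marginal_pmfs, inner)
```

In the precise case this reduces to the law of total expectation; with genuinely imprecise conditional models, the inner infima are taken independently block by block before the marginal envelope is applied.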
Remark 1 (On dynamic consistency).
It is useful to observe that the marginal extension is tightly related to the question of dynamic consistency in decision problems (analogous considerations hold for marginal extension in the case of sets of desirable gambles). This is a concept originally highlighted by Machina machina1989 and also related to the work of Hammond hammond1988 . Loosely speaking, a decision problem is dynamically consistent if the optimal strategy does not change from the normal to the extensive form. In the context of this paper, an uncertainty model is understood as dynamically consistent if it coincides with the least-committal (that is, weakest) combination of the marginal and conditional information it induces.
For example, assume that we have a joint model on the product space given by a credal set . We derive from a set of marginal previsions on , and a family of sets of conditional previsions , one for each element of . Then dynamic consistency means that we can recover by taking the closed convex hull of set
This seems to be what Epstein and Schneider called ‘rectangularity’ in epstein2003b . Note in particular that both the marginal linear prevision and each conditional linear prevision are free to vary in their respective sets in (8) irrespective of the other linear previsions; in other words, there are no ‘logical’ ties between linear previsions in different credal sets. This is the essential feature that characterises a dynamically consistent model. In terms of lower previsions, if we let be the coherent lower previsions determined by the sets by means of lower envelopes, as in (3), dynamic consistency means that we should have , that is, should correspond to a procedure of marginal extension.
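The ‘rectangular’ combination can be sketched directly for finite credal sets: every marginal pmf is freely paired with every selection of conditional pmfs, one per partition element, exactly because there are no ties between the credal sets. The representation and names below are illustrative:

```python
from itertools import product

# A sketch of the rectangular combination of marginal and conditional credal
# sets, assuming all sets are finite lists of pmfs (dicts); 'blocks' maps
# each partition element to its set of states. Names are illustrative.
def rectangular_set(marginal_pmfs, conditional_pmfs, blocks):
    block_ids = list(blocks)
    joints = []
    for p0 in marginal_pmfs:
        # every free choice of one conditional pmf per partition element
        for choice in product(*(conditional_pmfs[b] for b in block_ids)):
            joint = {}
            for b, q in zip(block_ids, choice):
                for x in blocks[b]:
                    joint[x] = p0[b] * q[x]  # chain rule: p0(b) * q(x | b)
            joints.append(joint)
    return joints
```

Taking the closed convex hull of the returned list would recover the dynamically consistent joint credal set discussed above.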
Note that dynamic consistency, as described above, depends on the notion of ‘weakest combination’ of marginal and conditional information. This notion may vary, thus yielding different dynamically consistent models, even though they induce the same marginal and conditional information. For instance, when independence enters the picture, the form of the weakest combination may depend on the notion of independence adopted. We shall discuss more about this point in Section 8.1 (see Remark 5).
3 Preference relations
Let denote, as before, the space of possibilities. In order to deal with preferences, we now introduce another set of outcomes, or prizes. While the cardinality of is unrestricted, in this paper we take to be a finite set. Moreover, we assume that all the pairs of elements in are possible or, equivalently, that and are logically independent.
The treatment of preferences relies on the basic notion of a horse lottery:
Definition 16 (Horse lottery).
We define a horse lottery as a functional such that is a probability mass function on for all .
Let us denote by the set of all horse lotteries on . Horse lotteries will also be called acts for short. In the following we shall use the notation for the set of all the acts in case there is no possibility of ambiguity. An act for which it holds that for all is called a von Neumann-Morgenstern lottery; moreover, if such a is degenerate on an element , then it is called a constant von Neumann-Morgenstern lottery and is denoted by the symbol : that is, for all .
A horse lottery is usually regarded as a pair of nested lotteries: at the outer level, the outcome of an experiment is employed to select the simple lottery ; this is used at the inner level to determine a reward . Horse lotteries can be related to a behavioural interpretation through a notion of preference. The idea is that a subject, whom this time we shall call Thomas, who aims at receiving a prize from , will prefer some acts over others; this will depend on his knowledge about the experiment originating an , as well as on his attitude towards the elements of . We consider the following well-known axioms of coherent preferences.
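A horse lottery can be represented concretely as a map from states to probability mass functions over the prizes. The following sketch, with illustrative state and prize names, also shows the two special cases just defined:

```python
# A horse lottery (Definition 16) as a map from states to pmfs over prizes,
# together with a quick validity check; all names are illustrative.
def is_horse_lottery(h, states, prizes, tol=1e-9):
    return all(
        all(h[s][x] >= 0 for x in prizes)
        and abs(sum(h[s][x] for x in prizes) - 1.0) <= tol
        for s in states
    )

states, prizes = ['rain', 'sun'], ['a', 'b']
h = {'rain': {'a': 0.3, 'b': 0.7}, 'sun': {'a': 1.0, 'b': 0.0}}
vnm = {s: {'a': 0.3, 'b': 0.7} for s in states}      # state-independent: a vN-M lottery
const_a = {s: {'a': 1.0, 'b': 0.0} for s in states}  # constant vN-M lottery on prize 'a'
```

The outer level is the state drawn from the experiment; the inner level is the pmf `h[s]` that determines the prize.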
Definition 17 (Coherent preference relation).
A preference relation over horse lotteries is a subset of . It is said to be coherent if it satisfies the next two axioms:
[Strict Partial Order];
If also the next axiom is satisfied, then we say that the coherent preference relation is Archimedean.
Next, we recall a few results that we shall use in the paper. We omit their proofs, which are elementary.
Suppose that for given it holds that for some . Then for any coherent preference relation it holds that
Note that using the previous proposition one can also easily show that
Let the set of scaled differences of horse lotteries be defined by
We shall also denote this set simply by in the following, if there is no possibility of confusion. Preference relation is characterised by the following subset of :
as the next proposition remarks:
Let be a coherent preference relation. Then for all and , it holds that
Moreover, has a specific geometrical structure:
Let be a coherent preference relation. Then is a convex cone that excludes the origin; it is empty if and only if so is .
It turns out that cones and coherent preference relations are just two ways of looking at the same thing.
There is a one-to-one correspondence between coherent preference relations and convex cones in .
4 Equivalence of desirability and preference
In this section we shall show how coherent preferences on acts can equivalently be represented through coherent sets of desirable gambles and vice versa. The key to establish the relation is the notion of worst outcome.
4.1 Worst act, worst outcome
It is customary in the literature to assume that a coherent preference relation comes with a best and a worst act, which are defined as follows:
Definition 18 (Best and worst acts).
Let be a coherent preference relation and . The relation has a best act if for all and, similarly, it has a worst act if for all .
In this paper, however, we shall not be concerned with the best act.
A special type of worst act is one that is degenerate on the same element for all (it is therefore a constant von Neumann-Morgenstern lottery). We call it the worst outcome and we denote it by as well:
Definition 19 (Worst outcome).
Let be a coherent preference relation. The relation has a worst outcome if for all .
Now we wonder whether it is restrictive to assume that a coherent relation has a worst outcome compared to assuming that it has a worst act. To this end, we start by characterising the form of the worst act:
Let be a coherent preference relation with worst act . Then for each , there is such that .
In other words, a worst act has to be surprisingly similar to a worst outcome: a worst act tells us that for each there is always an element that is the worst possible; the difference is that such a worst element is not necessarily the same for all , as happens with the worst outcome. But at this point it is clear that we can reformulate the representation in such a way that we can work with relation as if it had a worst outcome (we omit the trivial proof):
Consider a coherent preference relation with a worst act . Let be any element in . Define the bijection that, for all , does nothing if and otherwise swaps the probabilities of outcomes and :
for all . Application induces a relation by:
Then it holds that is a coherent preference relation for which is the worst outcome. Moreover, relation is Archimedean if so is relation .
In other words, if our original relation has a worst act , we can always map it to a new relation, on the same product space , for which is the worst outcome and such that we can recover the original preferences from those of the new relation. This means that there is no loss of generality in assuming right from the start that .
In the rest of the paper (with the exception of Section 6) we shall indeed assume that a coherent preference relation has a worst outcome. This will turn out to be enough to develop all the theory.
Remark 2 (Notation for the worst outcome).
Given the importance of the worst outcome for this paper, it is convenient to define some notation that is tailored to it. In particular, when we want to specify that the set of acts contains , then we shall denote it by ; otherwise we shall simply denote it by . In the latter case, it can either be that the set does not contain it or that the statements we make hold irrespective of that. The distinction will be clear from the context. Note in particular that when the two sets are used together, the relation between them will always be that . Moreover, we let , besides the usual , and we use them accordingly.
Remark 3 (Assumption about the worst outcome).
Note, in addition, that if , then it is necessary that and hence the only possible relation on is the empty one. We skip this trivial case in the paper by assuming throughout that contains at least two elements.
When relation has the worst outcome, we can associate with another, equivalent, set. As it turns out, the new set will be one of desirable gambles. To this end, we first define the projection operator that drops the -components from an act:
Definition 20 (Projection operator).
Consider a set of outcomes that includes . The projection operator is the functional
defined by , for all and all .
In this paper, we are going to use this operator to project horse lotteries in , or scaled differences of them, into gambles on . Although is not injective in general, both these restrictions, which we shall denote by , are. As a consequence, we shall denote by the restriction of the projection operator to , and then define its inverse
Similarly, we shall denote by the restriction of to , and define its inverse as
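The projection operator and its inverse can be sketched concretely: projecting drops the component of the worst outcome from each state's pmf, and inverting restores it by normalisation. The representation and names below are illustrative:

```python
# A sketch of the projection operator (Definition 20): drop the component of
# the worst outcome z from each state's pmf, turning an act into a
# vector-valued gamble on the remaining prizes; names are illustrative.
Z = 'z'  # label of the worst outcome

def project(act):
    return {s: {x: act[s][x] for x in act[s] if x != Z} for s in act}

def unproject(gamble):
    # restore the z-component so that each state's pmf sums to one again
    return {s: dict(g, **{Z: 1.0 - sum(g.values())}) for s, g in gamble.items()}
```

On acts, the restriction of the projection is injective precisely because the dropped z-component is determined by the others, which is what `unproject` exploits.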
It is then an easy step to show the following:
Let be a coherent preference relation on and be defined by
is a coherent set of desirable gambles on .
Remark 4 (On preference through desirability).
Although this is a technically simple result, it has important consequences; it is worth pausing here a moment to consider them. Proposition 9 gives us tools to analyse, and draw inferences about, preference by means of desirability. Nowadays desirability is well understood as a tool for modelling uncertainty that is more primitive than probability. This offers us the opportunity to take a fresh look at preference from the perspective of desirability. We shall exploit such an opportunity in the rest of the paper.
On another note, let us stress that it is important to correctly interpret the set of desirable gambles in (11). Set is made of gambles from ; but only is actually a space of possibilities here, that is, the set made up of all the possible outcomes of an uncertain experiment. This marks a significant departure from the traditional definition of a set of desirable gambles, which would require in this case that , and not just , was the space of possibilities. The way to interpret an element of is then as a vector-valued gamble: for each , is the vector of associated rewards. We shall detail this interpretation in Section 5. In any case, we shall omit the specification ‘vector-valued’ from now on and refer more simply to the elements of as gambles.
The projection operator gives us also the opportunity to define a notion of the preferences on which any rational subject should agree and that for this reason we call objective:
Definition 21 (Objective preference).
Given acts we say that is objectively preferred to if and . We denote objective preference by .
The idea is that is objectively preferred to because the probability assigns to outcomes, except for the outcome (which no subject actually wants), is never smaller than that of , while being strictly greater somewhere. An objective preference is indeed a preference:
Let be a coherent preference relation on and . Then .
This proposition states formally that objective preferences are those that every rational subject expresses. These preferences therefore belong to any coherent preference relation. If they are the only preferences in the relation, then we call the relation vacuous, because Thomas is expressing only a non-informative, or trivial, type of preferences.
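Objective preference can be sketched as a simple dominance check. Since the defining condition is elided above, the predicate below is an assumed reading of Definition 21 (dominance of the projected probabilities, strict somewhere), with illustrative names:

```python
# An assumed reading of objective preference (Definition 21): after setting
# aside the worst outcome z, act q gives each non-worst prize at least as
# much probability as act p in every state, and strictly more somewhere.
Z = 'z'

def objectively_preferred(q, p):
    diffs = [q[s][x] - p[s][x] for s in q for x in q[s] if x != Z]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)
```

The check is irreflexive by construction (comparing an act with itself yields no strictly positive difference), as a strict preference must be.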
Surprising as it may seem, the vacuity of the coherent preference relation turns out to be incompatible with the Archimedean axiom, except in trivial cases:
Let be the vacuous preference relation on , so that . Then is Archimedean if and only if and .
It is important to remark that this incompatibility is not something we have to live with: it is possible to define a weaker version of the Archimedean axiom that does not lead to such an incompatibility and that is based on restricting attention to non-trivial preferences—that is, to non-objective ones. We can also simplify the axiom by focusing only on ternary relations such as , which also enables us to skip the usual symmetrised version of the axiom. The result is the following:
Definition 22 (Weak Archimedean condition).
Let be a coherent preference relation on . We say that the relation is weakly Archimedean if it satisfies
( [weak Archimedeanity].
The relation between the traditional Archimedean condition and the weak one is best seen through the following:
In other words, C retains the overall structure of the traditional condition C, while focusing only on the non-trivial preferences. We shall discuss more about the weak Archimedean condition and its importance for the expressive power of a decision-theoretic representation in Section 5.3.1.
Next we detail the relation between the weak Archimedean condition for preferences and strict desirability for gambles.
The following result completes the picture: it turns out that coherent sets of desirable gambles and coherent preferences are just two equivalent representations, and this is true even when Archimedeanity is taken into consideration.
Let be given by Eq. (9). For any gamble it holds that
There is a one-to-one correspondence between elements of and gambles in .
Let be a coherent set of desirable gambles on . Then defines a convex cone that is equivalent to a coherent preference relation on for which is the worst outcome. Moreover, if is a set of strictly desirable gambles, then the relation is weakly Archimedean.
5 What happened?
It is useful to consider in retrospect what we have achieved so far.
5.1 Desirability foundations of rational decision making
The analysis in the previous sections can be summarised by the following:
There is a one-to-one correspondence between coherent preference relations on for which is the worst outcome and coherent sets of desirable gambles on . Moreover the relation is weakly Archimedean if and only if the set of gambles is strictly desirable. In this case they give rise to the same representation in terms of a set of linear previsions.
In other words, from the mathematical point of view, it is completely equivalent to use coherent sets of desirable gambles on in order to represent coherent preference relations on . Or, to put it differently, we have come to realise that there is no need to distinguish two subjects, Thomas and Thilda, to make our exposition; accordingly, from now on we shall refer to the subject who makes the assessments more simply as ‘you’. That the two theories are just the same is all the more apparent if we consider that we can define a notion of preference directly on top of a coherent set of desirable gambles (walley1991, , Section 3.7.7):
Definition 23 (Coherent preference relation for gambles).
Given a coherent set of desirable gambles on and gambles , we say that is preferred to in if , and denote this by .
In other words, if you are willing to give away in order to have .
It is trivial to show that the notion of preference we have just described is equivalent to the preference relation on horse lotteries that corresponds to : this means that if and only if for all such that for some . This is also the reason why we use the same symbol in both cases.
All this suggests that we can think of establishing the foundations of rational decision making using the desirability of gambles as our primitive conceptual tool, rather than regarding it as derived from preferences over horse lotteries. In this new conceptual setting, you would be asked to evaluate which gambles you desire in and then we would judge the rationality of your assessments by checking whether or not the related set of gambles is coherent according to a–d. This appears to be utterly natural and straightforward, and yet to make this procedure possible, it is essential that the numbers making up a gamble are given a clear interpretation, so as to put you in a position to make meaningful and accurate assessments of desirability.
Remember that any gamble is such that , for some and . Therefore for all and , is a number proportional to the increase (or decrease if it is negative) of the probability to win prize in state as a consequence of exchanging for . We can use this idea to interpret more directly, without having to rely on horse lotteries, in the following way.
Imagine that there is a simple lottery with possible prizes in $\mathcal{X}$, all of them with the same, large, number of available tickets. (It is possible to assume that lottery tickets are infinitely divisible so as to cope with non-integer numbers; we shall neglect these technicalities in the description and simply assume that their number is very large.) Tickets are numbered sequentially, irrespective of their type. Eventually only one number will be drawn, by uniform random sampling; and this you know. Therefore the (objective, or physical) probability that you win prize $x$ is proportional to the number of $x$-tickets you own. This implies that your utility scale is linear in the number of $x$-tickets for all $x\in\mathcal{X}$. You can increase, but also decrease, the number of your lottery tickets by taking gambles on an unrelated experiment, with possible outcomes in $\mathcal{Z}$, about which you are uncertain: by accepting gamble $f$, you commit yourself to accept $f(\cdot,z)$ in case $z$ occurs; this is a vector proportional to the number of lottery tickets you will receive for each $x\in\mathcal{X}$ (which also means having to give away $x$-tickets when $f(x,z)$ is negative). (Thanks to the linearity of your utility scale, we can assume without loss of generality that you initially own a positive number of $x$-tickets for all $x\in\mathcal{X}$; for the same reason, the size of the positive proportionality constant is irrelevant. It follows that we can reduce the number of tickets you receive or give away so as to keep all your probabilities of winning prizes between 0 and 1.) In case you accept more than one gamble (and in any case finitely many of them), the reasoning is analogous; it is enough to take into account that the tickets received or given away will be proportional to the sum of the accepted gambles.
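The ticket mechanism just described can be mimicked numerically. In this sketch the total number of tickets, your holdings, the two states ('rain', 'sun') and the gamble are all made-up illustrative values:

```python
# Probability-currency reading of a gamble (all numbers are illustrative).
T = 1000                      # total tickets in the lottery, all types together
own = {'A': 100, 'B': 50}     # your x-tickets for each prize x

def win_probs(own, total=T):
    """Chance of winning each prize: proportional to your x-tickets."""
    return {x: n / total for x, n in own.items()}

def accept(own, f, z, c=10):
    """Accepting gamble f: once state z occurs, receive c*f[x][z] x-tickets
    for each prize x (negative entries mean giving tickets away)."""
    return {x: own[x] + c * f[x][z] for x in own}

# A gamble on an unrelated experiment with states 'rain' and 'sun'.
f = {'A': {'rain': 2, 'sun': -1},
     'B': {'rain': -3, 'sun': 4}}

print(win_probs(own))                     # before accepting f
print(win_probs(accept(own, f, 'rain')))  # after f, if 'rain' occurs
```

Note how the win probabilities change linearly in the ticket increments, which is exactly what makes the utility scale linear in tickets.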
Stated differently, a gamble is interpreted as an uncertain reward in lottery tickets. Since lottery tickets eventually determine the probability of winning prizes, such a probability is treated here as a currency. The idea of a 'probability currency' appears to go back to Savage savage1954 and Smith smith1961, and has already been used to give an interpretation of gambles in Walley's theory (walley1991, Section 2.2.2). The difference between the traditional idea of probability currency and the one we give here rests on the possible presence of more than one prize, which is allowed in the present setting. That the traditional concept of probability currency can be extended to our generalised setting is ensured by Theorem 15.
With this interpretation in mind, it is possible to state and deal with a decision-theoretic problem purely in terms of gambles on $\mathcal{X}\times\mathcal{Z}$. This has a number of implications:
One is that we can now profit from all the theory already developed for desirable gambles in order to study decision-theoretic problems in considerable generality. We shall actually do so in the rest of the paper.
Another is that we should be careful to interpret gambles properly, as we said already in Remark 4; in particular, although our gambles are defined on $\mathcal{X}\times\mathcal{Z}$, only $\mathcal{Z}$ is the space of possibilities, something that marks a departure from the original theory of desirable gambles. The most important consequence here is that it is meaningless to think of the elements of $\mathcal{X}$ as events that may occur; only elements of $\mathcal{Z}$ can. As a consequence, we can still use conditioning as an updating rule, under the usual interpretations and restrictions; but we can do so only conditional on subsets of $\mathcal{Z}$. Note, on the other hand, that in the current framework we can update both beliefs and values.
Finally, our results so far appear to have considerable implications for Williams’ and Walley’s behavioural theories of probability and statistics: in fact, the current reach of these, otherwise very powerful and well-founded, theories is limited to precise and linear utility. By extending gambles so as to make them deal with multiple prizes, we are de facto laying the foundations for their extension to imprecise non-linear utility. This should allow, with time, a whole new range of problems to be addressed in a principled way by those theories.
5.3 A new way to represent utility (values)
There is a feature of our proposed representation of decision-theoretic problems that is worth discussing, if only to make the issue explicit, as it seems to differ from past approaches.
Let us start by considering the simplest setup, to make things very clear: consider $\mathcal{Z}$ finite (as well as $\mathcal{X}$, as usual) and assume that the assessments eventually yield a coherent set of strictly desirable gambles $\mathcal{D}$ corresponding to the interior of a maximal set. This means that $\mathcal{D}$ is in one-to-one correspondence with a joint mass function $p$ on $\mathcal{X}\times\mathcal{Z}$ that is the restriction of a linear prevision to the elements of the product space. $p$ is a joint mass function in the sense that $p(x,z)\geq 0$ for all $(x,z)\in\mathcal{X}\times\mathcal{Z}$ and $\sum_{(x,z)\in\mathcal{X}\times\mathcal{Z}}p(x,z)=1$. Now, if we assume that $p(x,z)>0$ for all $(x,z)\in\mathcal{X}\times\mathcal{Z}$, then $p$ becomes an equivalent representation of $\mathcal{D}$.
Moreover, we can rewrite $p$ using total probability (marginal extension) as the product of the marginal mass function on $\mathcal{Z}$ and of the conditional mass function on $\mathcal{X}$ given $\mathcal{Z}$:
$p(x,z)=p(z)\,p(x|z)$
for all $(x,z)\in\mathcal{X}\times\mathcal{Z}$. Here the meaning of $p(z)$ is that of the probability of state $z$, whereas $p(x|z)$ represents the utility of outcome $x$ conditional on the occurrence of $z$.
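This factorisation is easy to check numerically; the joint mass function below is a made-up example on two prizes and three states:

```python
import numpy as np

# Made-up joint mass function p[x, z]: rows are prizes x, columns states z.
p = np.array([[0.10, 0.15, 0.05],
              [0.20, 0.25, 0.25]])
assert np.isclose(p.sum(), 1.0)

p_z = p.sum(axis=0)       # marginal mass function on states: p(z)
p_x_given_z = p / p_z     # conditional mass function p(x|z): columns sum to 1

# Total probability (marginal extension): p(x, z) = p(z) * p(x|z)
assert np.allclose(p_z * p_x_given_z, p)
print(p_z)  # marginal on the three states
```

The columns of `p_x_given_z` are exactly the state-conditional utility mass functions discussed next.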
The important, and perhaps unconventional, point in all this discussion is that in our approach the utility function is just another mass function: one such that $p(x|z)\geq 0$ and $\sum_{x\in\mathcal{X}}p(x|z)=1$ for all $z\in\mathcal{Z}$. This means that the utility mass function is formally equivalent to a probability mass function; it is only the interpretation that changes: while the numbers in the probability function represent occurrence probabilities, in the utility function they represent mixture coefficients that allow us to compare vectors of (amounts of) outcomes. The more traditional view of utilities is that of numbers between zero and one that do not need to add up to one.
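To see that nothing is lost in passing from traditional utilities to a utility mass function, one can simply rescale: dividing nonnegative utilities by their sum is a positive linear transformation, and such transformations leave expected-utility comparisons unchanged. A sketch with made-up numbers:

```python
import numpy as np

# Made-up traditional utilities for three prizes: in [0, 1], not summing to 1.
u = np.array([0.9, 0.5, 0.1])

# Rescaling by a positive constant turns u into a utility mass function.
u_mass = u / u.sum()
assert np.isclose(u_mass.sum(), 1.0)

# Two lotteries (probability vectors over the prizes):
p_lot = np.array([0.2, 0.3, 0.5])
q_lot = np.array([0.5, 0.3, 0.2])

# The expected-utility ranking of lotteries is unchanged by the rescaling.
assert (p_lot @ u > q_lot @ u) == (p_lot @ u_mass > q_lot @ u_mass)
print(np.round(u_mass, 3))
```

This rescaling is only a sanity check on the formal equivalence claimed above, not part of the paper's formal development.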
An advantage of our representation is that we can directly exploit all the machinery and concepts already in place for probabilities also for utilities. For example, the notion of state independence trivially becomes probabilistic independence through (12), and this happens whenever for all $(x,z)\in\mathcal{X}\times\mathcal{Z}$ it holds that