Revising Incompletely Specified Convex Probabilistic Belief Bases

We propose a method for an agent to revise its incomplete probabilistic beliefs when a new piece of propositional information is observed. In this work, an agent's beliefs are represented by a set of probabilistic formulae, called a belief base. The method involves determining a representative set of 'boundary' probability distributions consistent with the current belief base, revising each of these probability distributions and then translating the revised information into a new belief base. We use a version of Lewis Imaging as the revision operation. The correctness of the approach is proved. The expressivity of the belief bases under consideration is rather restricted, but the approach has some applications. We also discuss methods of belief base revision employing the notion of optimum entropy, and point out some of the benefits and difficulties of those methods. Both the boundary distribution method and the optimum entropy method are reasonable, yet they yield different results.

Generalized Imaging

It is not yet universally agreed what revision means in a probabilistic setting. One school of thought says that probabilistic expansion is equivalent to Bayesian conditioning. This is evidenced by Bayesian conditioning (BC) being defined only when $b(\alpha) \neq 0$, thus making expansion equivalent to revision. In other words, one could define expansion (restricted revision) to be

$$b^{BC}_\alpha = \{(w,p) \mid w \in W,\ p = b(w \mid \alpha),\ b(\alpha) \neq 0\}.$$

To accommodate cases where $b(\alpha) = 0$, that is, where $\alpha$ contradicts the agent’s current beliefs and its beliefs need to be revised in the stronger sense, we shall make use of imaging. Imaging was introduced by Lewis (1976) as a means of revising a probability function. It has also been discussed in the work of, for instance, Gärdenfors (1988); Dubois and Prade (1993); Chhogyal et al. (2014); Rens and Meyer (2015). Informally, Lewis’s original solution for accommodating contradicting evidence $\alpha$ is to move the probability of each world to its closest $\alpha$-world. Lewis made the strong assumption that every world has a unique closest $\alpha$-world. More general versions of imaging allow worlds to have several, equally proximate, closest worlds.

Gärdenfors (1988) calls one of his generalizations of Lewis’s imaging general imaging. Our method is also a generalization. We thus refer to his as Gärdenfors’s general imaging and to our method as generalized imaging to distinguish them. It should be noted that all three of these imaging methods are general revision methods and can be used in place of Bayesian conditioning for expansion. “Thus imaging is a more general method of describing belief changes than conditionalization,” (Gärdenfors, 1988, p. 112).

Let $Min(\alpha, w, d)$ be the set of $\alpha$-worlds closest to $w$ with respect to pseudo-distance $d$. Formally,

$$Min(\alpha, w, d) := \{w' \in [\alpha] \mid \forall w'' \in [\alpha],\ d(w', w) \leq d(w'', w)\},$$

where $d$ is some pseudo-distance measure between worlds (e.g., Hamming or Dalal distance).

Example 1.

Let the vocabulary be $\{q, r, s\}$. Let $\alpha$ be $(q \wedge r) \vee (q \wedge \neg r \wedge s)$. Suppose $d$ is Hamming distance. Then

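$$\begin{aligned}
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 111, d) &= \{111\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 110, d) &= \{110\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 101, d) &= \{101\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 100, d) &= \{110, 101\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 011, d) &= \{111\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 010, d) &= \{110\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 001, d) &= \{101\} \\
Min((q \wedge r) \vee (q \wedge \neg r \wedge s), 000, d) &= \{110, 101\}
\end{aligned}$$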
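
The closest-world sets of Example 1 can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it assumes worlds are encoded as bit strings over the atoms q, r, s (in that order), and the helper names `hamming`, `closest_worlds` and `alpha` are hypothetical.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length world strings differ."""
    return sum(a != b for a, b in zip(u, v))


def closest_worlds(alpha_worlds, w, dist=hamming):
    """Return the alpha-worlds at minimal distance from world w."""
    best = min(dist(v, w) for v in alpha_worlds)
    return {v for v in alpha_worlds if dist(v, w) == best}


# Assumption: a world is a string of truth values for (q, r, s), e.g. "101".
worlds = ["".join(bits) for bits in product("10", repeat=3)]


def alpha(w):
    """The observation (q and r) or (q and not r and s) from Example 1."""
    q, r, s = (c == "1" for c in w)
    return (q and r) or (q and not r and s)


alpha_worlds = {w for w in worlds if alpha(w)}

for w in worlds:
    print(w, sorted(closest_worlds(alpha_worlds, w)))
```

Representing the observation by its set of models keeps the sketch independent of any particular formula syntax; any other pseudo-distance can be passed in place of `hamming`.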

Definition 2 (GI).

Then generalized imaging (denoted GI) is defined as

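$$b^{GI}_\alpha := \Big\{(w,p) \;\Big|\; w \in W,\ p = 0 \text{ if } w \notin [\alpha], \text{ else } p = \sum_{\substack{w' \in W \\ w \in Min(\alpha, w', d)}} b(w') / |Min(\alpha, w', d)|\Big\}.$$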

In words, $b^{GI}_\alpha$ is the new belief state produced by taking the generalized image of $b$ with respect to $\alpha$. Notice how the probability mass of non-$\alpha$-worlds is shifted to their closest $\alpha$-worlds. If a non-$\alpha$-world $w'$ with probability $p$ has $n$ closest $\alpha$-worlds (equally distant), then each of these closest $\alpha$-worlds gets $p/n$ mass from $w'$.

We define so that we can write , where is a revision operator.

Example 2.

Continuing on Example 1: Let .

is abbreviated as .

.

.

.

And .

Revision via GI and boundary belief states

Perhaps the most obvious way to revise a given belief base (BB) $B$ is to revise every individual belief state in $\Pi^B$ and then induce a new BB from the set of revised belief states. Formally, given observation $\alpha$, first determine a new belief state $b_\alpha$ for every $b \in \Pi^B$ via the defined revision operation:

$$\Pi^B_\alpha = \{b_\alpha \in \Pi \mid b_\alpha = b^{GI}_\alpha,\ b \in \Pi^B\}.$$

If there is more than a single belief state in $\Pi^B$, then $\Pi^B$ contains an infinite number of belief states. Then how can one compute $\Pi^B_\alpha$? And how would one subsequently determine the new BB from $\Pi^B_\alpha$?

In the rest of this section we shall present a finite method of determining $\Pi^B_\alpha$. What makes this method possible is the insight that $\Pi^B$ can be represented by a finite set of ‘boundary’ belief states, that is, those belief states which, in a sense, represent the limits or the convex hull of $\Pi^B$. We shall prove that the set of revised boundary belief states defines $\Pi^B_\alpha$. Inducing the new BB from the revised boundary belief states is then relatively easy, as will be seen.

Let $W^{perm}$ be every permutation on the ordering of worlds in $W$. For instance, if , then , , , , . Given an ordering $W^\# \in W^{perm}$, let be the -th element of $W^\#$; for instance, . Suppose we are given a BB $B$. We now define a function $MaxASAP(B, W^\#)$ which, given a permutation of worlds, returns a belief state where worlds earlier in the ordering are assigned maximal probabilities according to the boundary values enforced by $B$.

Definition 3.

is the such that for , , if , then .

Example 3.

Suppose the vocabulary is and . Then, for instance, , , , , , , , , , .

Definition 4.

We define the boundary belief states of BB as the set

$$\Pi^B_{bnd} := \{b \in \Pi^B \mid W^\# \in W^{perm},\ b = MaxASAP(B, W^\#)\}$$

Note that .

Example 4.

Suppose the vocabulary is and . Then

$$\begin{aligned}
\Pi^{B_1}_{bnd} = \{ &\{(11,1.0),(10,0.0),(01,0.0),(00,0.0)\}, \\
&\{(11,0.0),(10,1.0),(01,0.0),(00,0.0)\}, \\
&\{(11,0.6),(10,0.0),(01,0.4),(00,0.0)\}, \\
&\{(11,0.6),(10,0.0),(01,0.0),(00,0.4)\}, \\
&\{(11,0.0),(10,0.6),(01,0.4),(00,0.0)\}, \\
&\{(11,0.0),(10,0.6),(01,0.0),(00,0.4)\}\}.
\end{aligned}$$

Next, the revision operation is applied to every belief state in $\Pi^B_{bnd}$. Let $(\Pi^B_{bnd})^{GI}_\alpha := \{b^{GI}_\alpha \mid b \in \Pi^B_{bnd}\}$.

Example 5.

Suppose the vocabulary is and . Let be . Then

$$\begin{aligned}
(\Pi^{B_1}_{bnd})^{GI}_\alpha = \{ &\{(11,0.0),(10,0.5),(01,0.5),(00,0.0)\}, \\
&\{(11,0.0),(10,1.0),(01,0.0),(00,0.0)\}, \\
&\{(11,0.0),(10,0.3),(01,0.7),(00,0.0)\}, \\
&\{(11,0.0),(10,0.6),(01,0.4),(00,0.0)\}, \\
&\{(11,0.0),(10,0.8),(01,0.2),(00,0.0)\}\}.
\end{aligned}$$

(Two revision operations produce $\{(11,0.0),(10,0.5),(01,0.5),(00,0.0)\}$.)
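
The revised boundary states listed in this example can be reproduced with a short script. This is a sketch under stated assumptions rather than the authors' code: worlds are two-character bit strings, the pseudo-distance is Hamming distance, and the observation is taken to have exactly the models 10 and 01 (inferred from the zero entries in the revised states, since the observation itself is not legible here). The function name `generalized_image` is hypothetical.

```python
def hamming(u, v):
    """Number of positions at which two equal-length world strings differ."""
    return sum(a != b for a, b in zip(u, v))


def generalized_image(b, alpha_worlds, dist=hamming):
    """Shift the mass of every world to its closest alpha-worlds, split equally."""
    revised = {w: 0.0 for w in b}
    for source, mass in b.items():
        best = min(dist(v, source) for v in alpha_worlds)
        closest = [v for v in alpha_worlds if dist(v, source) == best]
        for target in closest:
            revised[target] += mass / len(closest)
    return revised


worlds = ["11", "10", "01", "00"]
alpha_worlds = ["10", "01"]  # assumption: models of the observation

# The six boundary belief states listed in Example 4.
boundary = [
    {"11": 1.0, "10": 0.0, "01": 0.0, "00": 0.0},
    {"11": 0.0, "10": 1.0, "01": 0.0, "00": 0.0},
    {"11": 0.6, "10": 0.0, "01": 0.4, "00": 0.0},
    {"11": 0.6, "10": 0.0, "01": 0.0, "00": 0.4},
    {"11": 0.0, "10": 0.6, "01": 0.4, "00": 0.0},
    {"11": 0.0, "10": 0.6, "01": 0.0, "00": 0.4},
]

revised = [generalized_image(b, alpha_worlds) for b in boundary]

# Collapse duplicates (rounded to absorb floating-point noise).
distinct = {tuple(round(b[w], 6) for w in worlds) for b in revised}
for state in sorted(distinct):
    print(dict(zip(worlds, state)))
```

An $\alpha$-world is its own unique closest $\alpha$-world (distance 0), so its mass stays in place; only the mass of non-$\alpha$-worlds moves.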

To induce the new BB from , the following procedure is executed. For every possible world, the procedure adds a sentence enforcing the upper (resp., lower) probability limit of the world, with respect to all the revised boundary belief states. Trivial limits are excepted.

For every , , where , except when , and , where , except when .

The intention is that the procedure specifies to represent the upper and lower probability envelopes of the set of revised boundary belief states – thus defines the entire revised belief state space (cf. Theorem 1).
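
The envelope construction can be sketched as follows. This is an illustration only: it assumes that an upper limit of 1 and a lower limit of 0 are the "trivial" limits that get skipped, and it represents each induced sentence as a hypothetical `(world, relation, bound)` triple rather than in the paper's formula syntax. The input is the set of revised boundary belief states listed in Example 5.

```python
worlds = ["11", "10", "01", "00"]

# Revised boundary belief states as listed in Example 5.
revised_boundary = [
    {"11": 0.0, "10": 0.5, "01": 0.5, "00": 0.0},
    {"11": 0.0, "10": 1.0, "01": 0.0, "00": 0.0},
    {"11": 0.0, "10": 0.3, "01": 0.7, "00": 0.0},
    {"11": 0.0, "10": 0.6, "01": 0.4, "00": 0.0},
    {"11": 0.0, "10": 0.8, "01": 0.2, "00": 0.0},
]


def induce_constraints(states, worlds):
    """Per-world upper and lower probability limits, skipping trivial ones."""
    constraints = []
    for w in worlds:
        upper = max(b[w] for b in states)
        lower = min(b[w] for b in states)
        if upper < 1.0:  # assumption: "<= 1" carries no information
            constraints.append((w, "<=", upper))
        if lower > 0.0:  # assumption: ">= 0" carries no information
            constraints.append((w, ">=", lower))
    return constraints


for world, relation, bound in induce_constraints(revised_boundary, worlds):
    print(f"P({world}) {relation} {bound}")
```

Under these assumptions the procedure yields four sentences for this input, one of which is a lower bound and three of which are upper bounds.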

Example 6.

Continuing Example 5, using the translation procedure just above, we see that , , , .

Note that if we let , , then .

Example 7.

Suppose the vocabulary is and . Let be . Then

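$$\begin{aligned}
\Pi^{B_2}_{bnd} = \{ &\{(11,0.9),(10,0),(01,0),(00,0.1)\}, \\
&\{(11,0),(10,0.9),(01,0),(00,0.1)\}, \\
&\{(11,0),(10,0),(01,0.9),(00,0.1)\}\},
\end{aligned}$$

$$\begin{aligned}
(\Pi^{B_2}_{bnd})^{GI}_\alpha = \{ &\{(11,0),(10,0),(01,0.9),(00,0.1)\}, \\
&\{(11,0),(10,0),(01,0),(00,1)\}\}
\end{aligned}$$

and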

, , , .

Note that if we let , , then .

Let be a partition of such that is a block in iff . Denote an element of block as , and the block of which is an element as . Let , in other words, the superscript in indicates the size of . Let .

Observation 1.

Let be positive integers such that iff . Let be values in such that . Associate with every a maximum value it is allowed to take: . For every , we define the assignment value

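$$av(\nu_i) := \begin{cases} most(\nu_i) & \text{if } \sum_{k=1}^{i} most(\nu_k) \leq 1 \\ 1 - \sum_{k=1}^{i-1} av(\nu_k) & \text{otherwise} \end{cases}$$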

Determine first , then and so on. Then

$$\frac{av(\nu_1)}{\delta_1} + \cdots + \frac{av(\nu_m)}{\delta_m} > \frac{\nu'_1}{\delta_1} + \cdots + \frac{\nu'_m}{\delta_m}$$

whenever for some .

For instance, let , , , . Let , , , . Then , , , and

$$\frac{0.5}{1} + \frac{0.3}{2} + \frac{0.2}{3} + \frac{0}{4} \approx 0.7167.$$

But

$$\frac{0.49}{1} + \frac{0.3}{2} + \frac{0.2}{3} + \frac{0.01}{4} \approx 0.7092.$$

And

$$\frac{0.5}{1} + \frac{0.29}{2} + \frac{0.2}{3} + \frac{0.01}{4} \approx 0.7142.$$
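
The three sums in this illustration can be verified numerically. The sketch below assumes the divisors are 1, 2, 3, 4 (as the printed sums suggest) and uses a hypothetical cap vector `most`, since the caps used in the original example are not legible here; `greedy_assign` is likewise a hypothetical name for the "fill earlier values first" rule.

```python
def weighted_sum(values, deltas):
    """Sum of values[i] / deltas[i], the quantity compared in Observation 1."""
    return sum(v / d for v, d in zip(values, deltas))


def greedy_assign(most, total=1.0):
    """Give each value as much as its cap allows until the total mass is used up."""
    remaining = total
    assigned = []
    for cap in most:
        take = min(cap, remaining)
        assigned.append(take)
        remaining -= take
    return assigned


deltas = [1, 2, 3, 4]
most = [0.5, 0.3, 0.3, 0.3]  # hypothetical caps, chosen to yield 0.5, 0.3, 0.2, 0

greedy = greedy_assign(most)
alternative_1 = [0.49, 0.3, 0.2, 0.01]  # moves 0.01 from the first to the last value
alternative_2 = [0.5, 0.29, 0.2, 0.01]  # moves 0.01 from the second to the last value

for name, values in [("greedy", greedy),
                     ("alternative 1", alternative_1),
                     ("alternative 2", alternative_2)]:
    print(name, round(weighted_sum(values, deltas), 4))
```

Because the divisors increase, moving any mass from an earlier value to a later one can only lower the sum, which is the intuition behind the observation.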

Lemma 1 essentially says that the belief state in $\Pi^B$ which causes a revised belief state to have a maximal value at world $w$ (w.r.t. all belief states in $(\Pi^B)^{GI}_\alpha$) will be in $\Pi^B_{bnd}$.

Lemma 1.

For all , is in .

Proof.

Note that

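$$\sum_{\substack{w' \in W \\ w \in Min(\alpha, w', d)}} b(w') / |Min(\alpha, w', d)|$$

can be written in the form

$$\frac{\sum_{\substack{w' \in [w^1] \\ w \in Min(\alpha, w', d)}} b(w')}{1} + \cdots + \frac{\sum_{\substack{w' \in [w^m] \\ w \in Min(\alpha, w', d)}} b(w')}{m}.$$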

Observe that there must be a such that . Then by the definition of the set of boundary belief states (Def. 4), will assign maximal probability mass to , then to and so on.

That is, by Observation 1, for some , for all . Therefore, is in . ∎

Let

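$$\begin{aligned}
\overline{x}_w &:= \max_{b \in \Pi^B_{bnd}} b(w) & \overline{X}_w &:= \max_{b \in \Pi^B} b(w) \\
\overline{y}_w &:= \max_{b \in (\Pi^B_{bnd})^{GI}_\alpha} b(w) & \overline{Y}_w &:= \max_{b \in (\Pi^B)^{GI}_\alpha} b(w) \\
\underline{x}_w &:= \min_{b \in \Pi^B_{bnd}} b(w) & \underline{X}_w &:= \min_{b \in \Pi^B} b(w) \\
\underline{y}_w &:= \min_{b \in (\Pi^B_{bnd})^{GI}_\alpha} b(w) & \underline{Y}_w &:= \min_{b \in (\Pi^B)^{GI}_\alpha} b(w)
\end{aligned}$$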

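Lemma 2 states that for every world, the upper/lower probability of the world with respect to $(\Pi^B_{bnd})^{GI}_\alpha$ is equal to the upper/lower probability of the world with respect to $(\Pi^B)^{GI}_\alpha$. The proof requires Observation 1 and Lemma 1.

Lemma 2.

For all $w \in W$, $\overline{y}_w = \overline{Y}_w$ and $\underline{y}_w = \underline{Y}_w$.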

Proof.

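Note that if $w \notin [\alpha]$, then $\overline{y}_w = \overline{Y}_w = 0$ and $\underline{y}_w = \underline{Y}_w = 0$.

We now consider the cases where $w \in [\alpha]$.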

$$\overline{y}_w = \overline{Y}_w$$

iff

$$\max_{b \in (\Pi^B_{bnd})^{GI}_\alpha} b(w) = \max_{b \in (\Pi^B)^{GI}_\alpha} b(w)$$

iff

if

, where

and

Note that

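$$\sum_{\substack{w' \in W \\ w \in Min(\alpha, w', d)}} b(w') / |Min(\alpha, w', d)|$$

can be written in the form

$$\frac{\sum_{\substack{w' \in [w^1] \\ w \in Min(\alpha, w', d)}} b(w')}{1} + \cdots + \frac{\sum_{\substack{w' \in [w^m] \\ w \in Min(\alpha, w', d)}} b(w')}{m}.$$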

Then by Observation 1, is in . And also by Lemma 1, the belief state in identified by must be the one which maximizes

where . That is, .

With a symmetrical argument, it can be shown that $\underline{y}_w = \underline{Y}_w$. ∎

In intuitive language, the following theorem says that the BB determined through the method of revising boundary belief states captures exactly the same beliefs and ignorance as the belief states in $\Pi^B$ which have been revised. This correspondence relies on the fact that the upper and lower probability envelopes of $(\Pi^B)^{GI}_\alpha$ can be induced from $(\Pi^B_{bnd})^{GI}_\alpha$, which is what Lemma 2 states.

Theorem 1.

Let . Let be the BB induced from . Then .

Proof.

We show that , .

() implies , (by definition of ). Lemma 2 states that for all , and . Hence, , Therefore, .

() implies , . Hence, by Lemma 2, , . Therefore, by definition of , . ∎

Revising via a Representative Belief State

Another approach to the revision of a belief base (BB) is to determine a representative of $\Pi^B$ (call it ), change the representative belief state via the defined revision operation and then induce a new BB from the revised representative belief state. Selecting a representative probability function from a family of such functions is not new (e.g., Goldszmidt, Morris, and Pearl, 1990; Paris, 1994). More formally, given observation $\alpha$, first determine , then compute its revision , and finally induce from .

We shall represent $\Pi^B$ (and thus $B$) by the single ‘least biased’ belief state, that is, the belief state in $\Pi^B$ with highest entropy:

Definition 5 (Shannon Entropy).

$$H(b) := -\sum_{w \in W} b(w) \ln b(w),$$

where $b$ is a belief state.

Definition 6 (Maximum Entropy).

Traditionally, given some set of distributions $\Pi$, the most entropic distribution in $\Pi$ is defined as

$$b^H := \arg\max_{b \in \Pi} H(b).$$

Suppose . Then the belief state satisfying the constraints posed by for which is maximized is , , , .

The above distribution can be found directly by applying the principle of maximum entropy: The true belief state is estimated to be the one consistent with known constraints, but is otherwise as unbiased as possible, or “Given no other knowledge, assume that everything is as random as possible. That is, the probabilities are distributed as uniformly as possible consistent with the available information,” (Poole and Mackworth, 2010). Obviously world 00 must be assigned probability 0.1. And the remaining 0.9 probability mass should be uniformly spread across the other three worlds.
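
The claim that the uniform spread is the most entropic choice can be spot-checked numerically. The sketch below assumes, as the paragraph above states, that the only constraint is that world 00 has probability 0.1; the two comparison distributions are hypothetical alternatives that also satisfy that constraint.

```python
import math


def entropy(b):
    """Shannon entropy with natural logarithm; terms with p = 0 contribute 0."""
    return -sum(p * math.log(p) for p in b.values() if p > 0)


# World 00 is fixed at 0.1; the remaining 0.9 is spread uniformly.
max_entropy_state = {"11": 0.3, "10": 0.3, "01": 0.3, "00": 0.1}

# Hypothetical alternatives that also assign 0.1 to world 00.
alternatives = [
    {"11": 0.9, "10": 0.0, "01": 0.0, "00": 0.1},
    {"11": 0.5, "10": 0.3, "01": 0.1, "00": 0.1},
]

print("uniform spread:", entropy(max_entropy_state))
for b in alternatives:
    print("alternative:   ", entropy(b))
```

A numerical comparison against a handful of alternatives is of course not a proof; it only illustrates why spreading the free mass evenly gives the highest entropy.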

Applying to on evidence results in , , , .

Example 8.

Suppose the vocabulary is , and is . Then , , , . Applying to on results in , , , . can be translated into as .

Still using , notice that . But how different are and , , , ? Perhaps one should ask, how different is from the representative of : The least biased belief state satisfying is . That is, How different are and ?

In the case of , we could compare , , , with , , , . Or if we take the least biased belief state satisfying , we can compare , , , with , , , .

It has been extensively argued (Jaynes, 1978; Shore and Johnson, 1980; Paris and Vencovská, 1997) that maximum entropy is a reasonable inference mechanism, if not the most reasonable one (w.r.t. probability constraints). And in the sense that the boundary belief states method requires no compression / information loss, it also seems like a very reasonable inference mechanism for revising BBs as defined here. Resolving this misalignment in the results of the two methods is an obvious task for future research.

Future Directions

Some important aspects still missing from our framework are the representation of conditional probabilistic information such as is done in the work of Kern-Isberner, and the association of information with its level of entrenchment. On the latter point, when one talks about probabilities or likelihoods, if one were to take a frequentist perspective, information observed more (less) often should become more (less) entrenched. Or, without considering observation frequencies, an agent could be designed to have, say, one or two sets of deeply entrenched background knowledge (e.g., domain constraints) which does not change or is more immune to change than ‘regular’ knowledge.

Given that we have found that the belief base resulting from revising via the boundary-belief-states approach differs from the belief base resulting from revising via the representative-belief-state approach, the question arises: when is it appropriate to use a representative belief state defined as the most entropic belief state of a given set? This is an important question, especially due to the popularity of employing the Maximum Entropy principle in cases of underspecified probabilistic knowledge (Jaynes, 1978; Goldszmidt, Morris, and Pearl, 1990; Hunter, 1991; Voorbraak, 1999; Kern-Isberner, 2001; Kern-Isberner and Rödder, 2004) and the principle’s well-behavedness (Shore and Johnson, 1980; Paris, 1994; Kern-Isberner, 1998).

Katsuno and Mendelzon (1991) modified the eight AGM belief revision postulates (Alchourrón, Gärdenfors, and Makinson, 1985) to the following six (written in the notation of this paper), where is some revision operator. (Footnote 3: In these postulates, it is sometimes necessary to write an observation as a BB, i.e., as ; in the present framework, observations are regarded as certain.)

• .

• If is satisfiable, then .

• If is satisfiable, then is also satisfiable.

• If , then .

• .

• If is satisfiable, then .

Testing the various revision operations against these postulates is left for a sequel paper.

An extended version of maximum entropy is minimum cross-entropy (MCE) (Kullback, 1968; Csiszár, 1975):

Definition 7 (Minimum Cross-Entropy).

The ‘directed divergence’ of distribution $c$ from distribution $b$ is defined as

$$R(c,b) := \sum_{w \in W} c(w) \ln \frac{c(w)}{b(w)}.$$

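$R(c,b)$ is undefined when $b(w) = 0$ while $c(w) > 0$; when $c(w) = 0$, $c(w) \ln \frac{c(w)}{b(w)} = 0$, because $\lim_{x \to 0} x \ln x = 0$. Given new evidence $\phi$, the distribution $c$ satisfying $\phi$ diverging least from current belief state $b$ is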

$$\arg\min_{c \in \Pi,\ c \Vdash \phi} R(c,b).$$

Definition 8 (MCI).

Then MCE inference (denoted MCI) is defined as

$$b^{MCI}_\alpha := \arg\min_{b' \in \Pi,\ b' \Vdash (\alpha) = 1} R(b', b).$$

In the following example, we interpret revision as MCE inference.

Example 9.

Suppose the vocabulary is and . Let be . Then

$$\begin{aligned}
\Pi^{B_1}_{bnd} = \{ &\{(11,1.0),(10,0.0),(01,0.0),(00,0.0)\}, \\
&\{(11,0.0),(10,1.0),(01,0.0),(00,0.0)\}, \\
&\{(11,0.6),(10,0.0),(01,0.4),(00,0.0)\}, \\
&\{(11,0.6),(10,0.0),(01,0.0),(00,0.4)\}, \\
&\{(11,0.0),(10,0.6),(01,0.4),(00,0.0)\}, \\
&\{(11,0.0),(10,0.6),(01,0.0),(00,0.4)\}\},
\end{aligned}$$

$$\begin{aligned}
(\Pi^{B_1}_{bnd})^{MCI}_\alpha = \{ &\{(11,0),(10,0),(01,1),(00,0)\}, \\
&\{(11,0),(10,1),(01,0),(00,0)\}, \\
&\{(11,0),(10,0.6),(01,0.4),(00,0)\}\}
\end{aligned}$$

and

, .

Note that if we let , then .
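
The three MCI-revised states of Example 9 can be reproduced with the sketch below. It relies on the standard fact that, when the evidence is the certain constraint that $\alpha$ has probability 1 and the prior gives $\alpha$ positive probability, the minimum cross-entropy solution coincides with Bayesian conditioning on $\alpha$; when the prior gives $\alpha$ probability 0, every candidate has undefined divergence, so no result is returned. As before, the models of the observation are assumed to be 10 and 01, and the function names are hypothetical.

```python
import math


def directed_divergence(c, b):
    """R(c, b); returns None where it is undefined (c(w) > 0 but b(w) = 0)."""
    total = 0.0
    for w, cw in c.items():
        if cw == 0:
            continue  # the term c(w) ln(c(w)/b(w)) is taken to be 0
        if b[w] == 0:
            return None
        total += cw * math.log(cw / b[w])
    return total


def mce_revise(b, alpha_worlds):
    """Condition b on the alpha-worlds; None if b gives them no mass."""
    mass = sum(b[w] for w in alpha_worlds)
    if mass == 0:
        return None
    return {w: (b[w] / mass if w in alpha_worlds else 0.0) for w in b}


worlds = ["11", "10", "01", "00"]
alpha_worlds = ["10", "01"]  # assumption: models of the observation

boundary = [
    {"11": 1.0, "10": 0.0, "01": 0.0, "00": 0.0},
    {"11": 0.0, "10": 1.0, "01": 0.0, "00": 0.0},
    {"11": 0.6, "10": 0.0, "01": 0.4, "00": 0.0},
    {"11": 0.6, "10": 0.0, "01": 0.0, "00": 0.4},
    {"11": 0.0, "10": 0.6, "01": 0.4, "00": 0.0},
    {"11": 0.0, "10": 0.6, "01": 0.0, "00": 0.4},
]

revised = [mce_revise(b, alpha_worlds) for b in boundary]
distinct = {tuple(round(r[w], 6) for w in worlds) for r in revised if r is not None}
for state in sorted(distinct):
    print(dict(zip(worlds, state)))
```

Two of the six boundary states give the observation zero probability and therefore have no MCI revision, which is one way to see why fewer revised states survive here than under GI.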

Recall from Example 6 that included . Hence, in this particular case, combining the boundary belief states approach with MCI results in a less informative revised belief base than when GI is used. The reason for the loss of information might be due to and