Probabilistic programming languages (PPLs) are programming languages extended with statements to (1) draw values at random from a given probability distribution, and (2) perform conditioning due to observation. Instead of a deterministic outcome, a probabilistic program yields a probability distribution over possible outcomes. PPLs greatly simplify the representation of, and reasoning with, rich probabilistic models. Interest in PPLs has increased in recent years, mainly in the context of Bayesian machine learning. Examples of modern PPLs include Church, Venture and Figaro [DBLP:conf/uai/GoodmanMRBT08, DBLP:journals/corr/MansinghkaSP14, pfeffer2009figaro], while early work goes back to Kozen [DBLP:journals/jcss/Kozen81].
Ranking theory is a qualitative abstraction of probability theory in which events receive discrete degrees of surprise called ranks [DBLP:books/daglib/0035277]. That is, events are ranked 0 (not surprising), 1 (surprising), 2 (very surprising), and so on, or $\infty$ if impossible. Apart from being computationally simpler, ranking theory permits meaningful inference without requiring precise probabilities. Still, it provides analogues to powerful notions known from probability theory, like conditioning and independence. Ranking theory has been applied in logic-based AI (e.g. belief revision and non-monotonic reasoning [DBLP:dblp_journals/ai/DarwicheP97, goldszmidt1996qualitative]) as well as formal epistemology [DBLP:books/daglib/0035277].
In this paper we develop a language called RankPL. Semantically, the language parallels probabilistic programming, with ranking theory taking the place of probability theory. We start with a minimal imperative programming language (if-then-else, while, etc.) and extend it with statements to (1) draw choices at random from a given ranking function and (2) perform ranking-theoretic conditioning due to observation. Analogous to a probabilistic program, a RankPL program yields, instead of a deterministic outcome, a ranking function over possible outcomes.
Broadly speaking, RankPL can be used to represent and reason about processes whose input or behavior exhibits uncertainty expressible by distinguishing normal (rank $0$) from surprising (rank $> 0$) events. Conditioning in RankPL amounts to the (iterated) revision of rankings over alternative program states. This is a form of revision consistent with the well-known AGM and DP postulates for (iterated) revision [DBLP:dblp_journals/ai/DarwicheP97, Gardenfors:1995:BR:216136.216138]. Various types of reasoning can be modeled, including abduction and causal inference. As with PPLs, these reasoning tasks can be modeled without having to write inference-specific code.
This paper is organized as follows. Section 2 deals with the basics of ranking theory. In section 3 we introduce RankPL and present its syntax and formal semantics. We then discuss two generalized conditioning schemes (L-conditioning and J-conditioning) and show how they can be implemented in RankPL. All of the above is demonstrated with practical examples. Finally, we discuss our RankPL implementation and conclude.
2 Ranking Theory
Here we present the necessary basics of ranking theory, all of which are due to Spohn [DBLP:books/daglib/0035277]. The definition of a ranking function presupposes a finite set $\Omega$ of possibilities and a boolean algebra $\mathcal{A}$ over subsets of $\Omega$, which we call events.
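Definition 1. A ranking function is a function $\kappa : \Omega \to \mathbb{N} \cup \{\infty\}$ that associates every possibility with a rank. $\kappa$ is extended to a function over events by defining $\kappa(\emptyset) = \infty$ and $\kappa(A) = \min(\{\kappa(w) \mid w \in A\})$ for each $A \in \mathcal{A} \setminus \{\emptyset\}$. A ranking function must satisfy $\kappa(\Omega) = 0$.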
As mentioned in the introduction, ranks can be understood as degrees of surprise or, alternatively, as inverse degrees of plausibility. The requirement that $\kappa(\Omega) = 0$ is equivalent to the condition that at least one $w \in \Omega$ receives a rank of 0. We sometimes work with functions $\lambda : \Omega \to \mathbb{N} \cup \{\infty\}$ that violate this condition. The normalization of $\lambda$ is a ranking function denoted by $\|\lambda\|$ and defined by $\|\lambda\|(w) = \lambda(w) - \lambda(\Omega)$. Conditional ranks are defined as follows.
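Definition 2. Given a ranking function $\kappa$, the rank of $A$ conditional on $B$ (denoted $\kappa(A \mid B)$) is defined by
$$\kappa(A \mid B) = \begin{cases} \kappa(A \cap B) - \kappa(B) & \text{if } \kappa(B) \neq \infty, \\ \infty & \text{otherwise.} \end{cases}$$
We denote by $\kappa_B$ the ranking function defined by $\kappa_B(A) = \kappa(A \mid B)$.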
In words, the effect of conditioning on $B$ is that the rank of $B$ is shifted down to zero (keeping the relative ranks of the possibilities in $B$ constant) while the rank of its complement is shifted up to $\infty$.
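The definitions above can be tried out on a small example. The following is a minimal sketch, assuming a ranking function is stored as a Python dict from possibilities to ranks; the function names and the three possibilities `w1`, `w2`, `w3` are illustrative assumptions, not part of the paper.

```python
import math

INF = math.inf  # rank of an impossible event

def rank(kappa, event):
    """Rank of an event (a set of possibilities): minimum rank of its members, INF if empty."""
    return min((kappa[w] for w in event), default=INF)

def normalize(lam):
    """Shift all ranks down so that the lowest rank becomes 0."""
    lowest = min(lam.values())
    if lowest == INF:
        raise ValueError("cannot normalize: every possibility has rank infinity")
    return {w: r - lowest for w, r in lam.items()}

def conditional_rank(kappa, a, b):
    """Rank of A given B: rank(A and B) - rank(B), or INF when B itself has rank INF."""
    rank_b = rank(kappa, b)
    if rank_b == INF:
        return INF
    return rank(kappa, a & b) - rank_b

def condition(kappa, b):
    """The conditioned ranking, computed for each single possibility."""
    return {w: conditional_rank(kappa, {w}, b) for w in kappa}

# Hypothetical example: three possibilities with ranks 0, 1 and 2.
kappa = {"w1": 0, "w2": 1, "w3": 2}
posterior = condition(kappa, {"w2", "w3"})
print(posterior)  # {'w1': inf, 'w2': 0, 'w3': 1}
```

Conditioning on the event `{w2, w3}` shifts `w2` and `w3` down by one rank, preserving their relative order, and sends `w1` to infinity.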
How do ranks compare to probabilities? An important difference is that ranks of events do not add up as probabilities do. That is, if $A$ and $B$ are disjoint, then $\kappa(A \cup B) = \min(\kappa(A), \kappa(B))$, while $P(A \cup B) = P(A) + P(B)$. This is, however, consistent with the interpretation of ranks as degrees of surprise (i.e., $A \cup B$ is exactly as surprising as the less surprising of $A$ and $B$). Furthermore, ranks provide deductively closed beliefs, whereas probabilities do not. More precisely, if we say that $A$ is believed with firmness $x$ (for some $x > 0$) with respect to $\kappa$ iff $\kappa(\overline{A}) > x$, then if $A$ and $B$ are believed with firmness $x$ then so is $A \cap B$. A similar notion of belief does not exist for probabilities, as is demonstrated by the Lottery paradox [kyburg1961probability].
Finally, note that $\infty$ and $0$ in ranking theory can be thought of as playing the role of $0$ and $1$ in probability, while $\min$, $-$ and $+$ play the role, respectively, of $+$, $\div$ and $\times$. Recall, for example, the definition of conditional probability, and compare it with Definition 2. This correspondence also underlies notions such as (conditional) independence and ranking nets (the ranking-based counterpart of Bayesian networks) that have been defined in terms of rankings [DBLP:books/daglib/0035277].
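To make the correspondence concrete, the two conditioning rules and the two rules for disjoint unions can be placed side by side. This is only a restatement of the standard definitions in the notation used in this section, not an additional result.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} && \text{if } P(B) \neq 0, \\
\kappa(A \mid B) &= \kappa(A \cap B) - \kappa(B) && \text{if } \kappa(B) \neq \infty, \\[4pt]
P(A \cup B) &= P(A) + P(B) && \text{for disjoint } A, B, \\
\kappa(A \cup B) &= \min(\kappa(A), \kappa(B)) .
\end{align*}
```

Division becomes subtraction and addition becomes $\min$, while the excluded case $P(B) = 0$ corresponds to $\kappa(B) = \infty$.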
We start with a brief overview of the features of RankPL. The basis is a minimal imperative language consisting of integer-typed variables, an if-then-else statement and a while-do construct. We extend it with the two special statements mentioned in the introduction. We call the first one ranked choice. It has the form $\{s_1\}\ \langle e \rangle\ \{s_2\}$. Intuitively, it states that either $s_1$ or $s_2$ is executed, where the former is a normal (rank 0) event and the latter a typically surprising event whose rank is the value of the expression $e$. Put differently, it represents a draw of a statement to be executed, at random, from a ranking function over two choices. Note that we can set $e$ to zero to represent a draw from two equally likely choices, and that larger sets of choices can be represented through nesting.
The second special statement is called the observe statement, written $\mathbf{observe}\ b$. It states that the condition $b$ is observed to hold. Its semantics corresponds to ranking-theoretic conditioning. To illustrate, consider the program
This program has three possible outcomes, ranked 0, 1 and 2, respectively. Now suppose we extend the program as follows:
Here, the observation rules out the rank-0 outcome, and the ranks of the remaining possibilities are shifted down, resulting in two outcomes ranked 0 and 1, respectively.
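The interplay of ranked choice and observation can be sketched in a few lines of Python. The program below is a hypothetical stand-in chosen to give three outcomes ranked 0, 1 and 2; its variables, constants and observed condition are assumptions made for illustration. The helper functions are likewise illustrative and ignore failure inside a ranked choice.

```python
def ranked_choice(normal, surprising, rank):
    """Combine two lists of (rank, value) alternatives; `surprising` is shifted up by `rank`."""
    return normal + [(r + rank, v) for r, v in surprising]

def observe(alternatives, condition):
    """Drop alternatives violating `condition`, then shift ranks so the lowest is 0."""
    kept = [(r, v) for r, v in alternatives if condition(v)]
    if not kept:
        return []  # every alternative was ruled out: the program fails
    lowest = min(r for r, _ in kept)
    return [(r - lowest, v) for r, v in kept]

def program(with_observation):
    """Hypothetical program: y is 1 normally, else 2 (rank 1), else 3 (rank 2); x := 10 * y."""
    ys = ranked_choice([(0, 1)], ranked_choice([(0, 2)], [(0, 3)], 1), 1)
    if with_observation:
        ys = observe(ys, lambda y: y > 1)  # assumed observation: y > 1
    return {10 * y: r for r, y in ys}      # map each outcome for x to its rank

print(program(False))  # {10: 0, 20: 1, 30: 2}
print(program(True))   # {20: 0, 30: 1}
```

Nesting two ranked choices yields three alternatives, and the observation removes the rank-0 alternative and shifts the other two down by one.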
A third special construct is the rank expression $\mathbf{rank}\ b$, which evaluates to the rank of the boolean expression $b$. Its use will be demonstrated later.
We fix a set Vars of variables (ranged over by $x$) and denote by Val the set of integers including $\infty$ (ranged over by $n$). We use $e$, $b$ and $s$ to range over the numerical expressions, boolean expressions, and statements. They are defined by the following BNF rules:
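One grammar that covers the constructs named in this section is sketched below. The concrete notation is an assumption: in particular the angle brackets for ranked choice, the keyword spellings, and the operator sets for $\diamond$ and $\bowtie$ are illustrative choices rather than taken from the text.

```latex
\begin{align*}
e &::= n \mid x \mid \mathbf{rank}\ b \mid (e_1 \diamond e_2)
      && \diamond \in \{-, +, \times, \div\} \\
b &::= \neg b \mid (b_1 \vee b_2) \mid (e_1 \bowtie e_2)
      && \bowtie \in \{<, =\} \\
s &::= \{s_1; s_2\} \mid x := e
      \mid \mathbf{if}\ b\ \mathbf{then}\ \{s_1\}\ \mathbf{else}\ \{s_2\}
      \mid \mathbf{while}\ b\ \mathbf{do}\ \{s\} \\
  &\quad \mid \{s_1\}\ \langle e \rangle\ \{s_2\}
      \mid \mathbf{observe}\ b \mid \mathbf{skip}
\end{align*}
```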
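We omit parentheses and curly brackets when possible and define $\wedge$ in terms of $\vee$ and $\neg$. We write $\mathbf{if}\ b\ \mathbf{then}\ \{s\}$ instead of $\mathbf{if}\ b\ \mathbf{then}\ \{s\}\ \mathbf{else}\ \{\mathbf{skip}\}$, and abbreviate statements of the form $\{x := e_1\}\ \langle e \rangle\ \{x := e_2\}$ to $x := e_1\ \langle e \rangle\ e_2$. Note that the skip statement does nothing and is added for technical convenience.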
The denotational semantics of RankPL defines the meaning of a statement $s$ as a function $D[\![s]\!]$ that maps prior rankings into posterior rankings. The subjects of these rankings are program states represented by valuations, i.e., functions that assign values to all variables. The initial valuation, denoted by $\sigma_0$, sets all variables to 0. The initial ranking, denoted by $\kappa_0$, assigns 0 to $\sigma_0$ and $\infty$ to all others. We denote by $\sigma[x \to n]$ the valuation equivalent to $\sigma$ except for assigning $n$ to $x$.
From now on we associate $\Omega$ with the set of valuations and denote the set of rankings over $\Omega$ by $K$. Intuitively, if $\kappa(\sigma)$ is the degree of surprise that $\sigma$ is the actual valuation before executing $s$, then $D[\![s]\!](\kappa)(\sigma)$ is the degree of surprise that $\sigma$ is the actual valuation after executing $s$. If we refer to the result of running the program $s$, we refer to the ranking $D[\![s]\!](\kappa_0)$. Because $s$ might not execute successfully, $D[\![s]\!]$ is not a total function over $K$. There are two issues to deal with. First, non-termination of a loop leads to an undefined outcome. Therefore $D[\![s]\!]$ is a partial function whose value $D[\![s]\!](\kappa)$ is defined only if $s$ terminates given $\kappa$. Second, observe statements may rule out all possibilities. A program whose outcome is empty because of this is said to fail. We denote failure with a special ranking $\kappa_\infty$ that assigns $\infty$ to all valuations. Since $\kappa_\infty \notin K$, we define the range of $D[\![s]\!]$ by $K^* = K \cup \{\kappa_\infty\}$. Thus, the semantics of a statement $s$ is defined by a partial function $D[\![s]\!]$ from $K^*$ to $K^*$.
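But first, we define the semantics of expressions. A numerical expression is evaluated w.r.t. both a ranking function (to determine values of rank expressions) and a valuation (to determine values of variables). Boolean expressions may also contain rank expressions and therefore also depend on a ranking function. Given a valuation $\sigma$ and ranking $\kappa$, we denote by $\sigma_\kappa(e)$ the value of the numerical expression $e$ w.r.t. $\sigma$ and $\kappa$, and by $[b]_\kappa$ the set of valuations satisfying the boolean expression $b$ w.r.t. $\kappa$. These functions are defined as follows.[1]
$$\sigma_\kappa(n) = n, \qquad \sigma_\kappa(x) = \sigma(x), \qquad \sigma_\kappa(\mathbf{rank}\ b) = \kappa([b]_\kappa), \qquad \sigma_\kappa(e_1 \diamond e_2) = \sigma_\kappa(e_1) \diamond \sigma_\kappa(e_2)$$
$$[\neg b]_\kappa = \Omega \setminus [b]_\kappa, \qquad [b_1 \vee b_2]_\kappa = [b_1]_\kappa \cup [b_2]_\kappa, \qquad [e_1 \bowtie e_2]_\kappa = \{\sigma \in \Omega \mid \sigma_\kappa(e_1) \bowtie \sigma_\kappa(e_2)\}$$
Here $\diamond$ stands for any arithmetic operator and $\bowtie$ for any comparison operator of the language.
[1] We omit explicit treatment of undefined operations (i.e., division by zero and some operations involving $\infty$). They lead to program termination.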
Given a boolean expression $b$ we will write $\kappa(b)$ as shorthand for $\kappa([b]_\kappa)$ and $\kappa_b$ as shorthand for $\kappa_{[b]_\kappa}$. We are now ready to define the semantics of statements. It is captured by seven rules, numbered D1 to D7. The first deals with the skip statement, which does nothing and therefore maps to the identity function.
(D1) $D[\![\mathbf{skip}]\!](\kappa) = \kappa$
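The meaning of $s_1; s_2$ is the composition of $D[\![s_1]\!]$ and $D[\![s_2]\!]$.
(D2) $D[\![s_1; s_2]\!](\kappa) = D[\![s_2]\!](D[\![s_1]\!](\kappa))$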
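The rank of a valuation $\sigma$ after executing an assignment $x := e$ is the minimum of all ranks of valuations that equal $\sigma$ after assigning the value of $e$ to $x$.
(D3) $D[\![x := e]\!](\kappa)(\sigma) = \kappa(\{\sigma' \in \Omega \mid \sigma = \sigma'[x \to \sigma'_\kappa(e)]\})$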
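To execute $\mathbf{if}\ b\ \mathbf{then}\ \{s_1\}\ \mathbf{else}\ \{s_2\}$ we first execute $s_1$ and $s_2$ conditional on $b$ and $\neg b$, yielding the rankings $D[\![s_1]\!](\kappa_b)$ and $D[\![s_2]\!](\kappa_{\neg b})$. These are adjusted by adding the prior ranks of $b$ and $\neg b$ and combined by taking the minimum of the two. The result is normalized to account for the case where one branch fails.
(D4) $D[\![\mathbf{if}\ b\ \mathbf{then}\ \{s_1\}\ \mathbf{else}\ \{s_2\}]\!](\kappa) = \|\lambda\|$, where $\lambda(\sigma) = \min\big(\kappa(b) + D[\![s_1]\!](\kappa_b)(\sigma),\ \kappa(\neg b) + D[\![s_2]\!](\kappa_{\neg b})(\sigma)\big)$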
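Given a prior $\kappa$, the rank of a valuation after executing $\{s_1\}\ \langle e \rangle\ \{s_2\}$ is the minimum of the ranks assigned by $D[\![s_1]\!](\kappa)$ and $D[\![s_2]\!](\kappa)$, where the latter is increased by the value of $e$. The result is normalized to account for the case where one branch fails.
(D5) $D[\![\{s_1\}\ \langle e \rangle\ \{s_2\}]\!](\kappa) = \|\lambda\|$, where $\lambda(\sigma) = \min\big(D[\![s_1]\!](\kappa)(\sigma),\ D[\![s_2]\!](\kappa)(\sigma) + \sigma_\kappa(e)\big)$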
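The semantics of $\mathbf{observe}\ b$ corresponds to conditioning on the set of valuations satisfying $b$, unless the rank of this set is $\infty$ or the prior ranking equals $\kappa_\infty$, in which case the result is failure.
(D6) $D[\![\mathbf{observe}\ b]\!](\kappa) = \kappa_\infty$ if $\kappa = \kappa_\infty$ or $\kappa(b) = \infty$, and $D[\![\mathbf{observe}\ b]\!](\kappa) = \kappa_b$ otherwise.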
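We define the semantics of $\mathbf{while}\ b\ \mathbf{do}\ \{s\}$ as the iterative execution of $\mathbf{if}\ b\ \mathbf{then}\ \{s\}\ \mathbf{else}\ \{\mathbf{skip}\}$ until the rank of $b$ is $\infty$ (the loop terminates normally) or the result is undefined ($s$ does not terminate). If neither of these conditions is ever met (i.e., if the while statement loops endlessly) then the result is undefined.
(D7) $D[\![\mathbf{while}\ b\ \mathbf{do}\ \{s\}]\!](\kappa) = F^n_{b,s}(\kappa)$ for the first $n$ such that $F^n_{b,s}(\kappa)(b) = \infty$, and is undefined if there is no such $n$,
where $F_{b,s}$ is defined by $F_{b,s}(\kappa) = D[\![\mathbf{if}\ b\ \mathbf{then}\ \{s\}\ \mathbf{else}\ \{\mathbf{skip}\}]\!](\kappa)$.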
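The seven rules can be prototyped directly as a function from rankings to rankings. The sketch below is an illustration under stated assumptions, not the paper's implementation: rankings are dicts that list only finitely ranked valuations, the empty dict stands for failure, statements are tuples, expressions are Python callables taking a valuation and a ranking, and all ranks are assumed to be finite integers. The example program at the end is hypothetical.

```python
import math

INF = math.inf  # rank of impossible valuations; these are simply left out of the dicts

def freeze(sigma):
    """Hashable form of a valuation (a dict from variable names to values)."""
    return tuple(sorted(sigma.items()))

def rank_of(kappa, b):
    """Lowest rank of a valuation satisfying b, INF if there is none."""
    return min((r for s, r in kappa.items() if b(dict(s), kappa)), default=INF)

def conditioned(kappa, b):
    """Keep the valuations satisfying b and shift their lowest rank to 0."""
    base = rank_of(kappa, b)
    return {s: r - base for s, r in kappa.items() if b(dict(s), kappa)}

def shifted(kappa, n):
    return {s: r + n for s, r in kappa.items()}

def normalized_min(k1, k2):
    """Pointwise minimum of two rankings, shifted so that the lowest rank is 0."""
    lam = dict(k1)
    for s, r in k2.items():
        lam[s] = min(r, lam.get(s, INF))
    low = min(lam.values(), default=0)
    return {s: r - low for s, r in lam.items()}

def D(stmt, kappa):
    """Map a prior ranking to a posterior ranking; the empty dict stands for failure."""
    op, *args = stmt
    if op == "skip":                                   # D1: identity
        return kappa
    if op == "seq":                                    # D2: composition
        for s in args:
            kappa = D(s, kappa)
        return kappa
    if op == "assign":                                 # D3: minimum over pre-images
        x, e = args
        out = {}
        for s, r in kappa.items():
            sigma = dict(s)
            sigma[x] = e(dict(s), kappa)
            out[freeze(sigma)] = min(r, out.get(freeze(sigma), INF))
        return out
    if op == "if":                                     # D4: condition, run, re-add prior ranks
        b, s1, s2 = args
        not_b = lambda sigma, k: not b(sigma, k)
        then = shifted(D(s1, conditioned(kappa, b)), rank_of(kappa, b))
        other = shifted(D(s2, conditioned(kappa, not_b)), rank_of(kappa, not_b))
        return normalized_min(then, other)
    if op == "choice":                                 # D5: second branch shifted up by e
        s1, e, s2 = args
        surprising = {s: r + e(dict(s), kappa) for s, r in D(s2, kappa).items()}
        return normalized_min(D(s1, kappa), surprising)
    if op == "observe":                                # D6: empty result means failure
        (b,) = args
        return conditioned(kappa, b)
    if op == "while":                                  # D7: may loop forever (undefined)
        b, s = args
        while rank_of(kappa, b) < INF:
            kappa = D(("if", b, s, ("skip",)), kappa)
        return kappa
    raise ValueError(f"unknown statement: {op}")

# Hypothetical program: x is 0 normally and 1 with rank 2; y depends on x; then y < 7 is observed.
const = lambda n: (lambda sigma, k: n)
program = ("seq",
           ("choice", ("assign", "x", const(0)), const(2), ("assign", "x", const(1))),
           ("if", lambda s, k: s["x"] == 1, ("assign", "y", const(5)), ("assign", "y", const(7))),
           ("observe", lambda s, k: s["y"] < 7))
kappa0 = {freeze({"x": 0, "y": 0}): 0}  # initial ranking: all variables 0, at rank 0
result = D(program, kappa0)
print([(dict(s), r) for s, r in result.items()])  # [({'x': 1, 'y': 5}, 0)]
```

Observing `y < 7` rules out the normal case `x = 0`, so the initially surprising valuation with `x = 1` ends up at rank 0. This is a small instance of the abductive reasoning mentioned in the introduction.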
Some remarks. First, the semantics of RankPL can be thought of as a ranking-based variation of Kozen's semantics of probabilistic programs [DBLP:journals/jcss/Kozen81] (i.e., replacing $\times$ with $+$ and $+$ with $\min$). Second, a RankPL implementation does not need to compute complete rankings. Our implementation, discussed later in the paper, follows a most-plausible-first execution strategy: different alternatives are explored in ascending order w.r.t. rank, and higher-ranked alternatives need not be explored if knowing the lowest-ranked outcomes is enough, as is often the case.
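The idea of exploring alternatives in ascending order of rank can be illustrated with a priority queue. The following is a sketch of that general idea only, assuming a fixed list of independent ranked choices; it is not a description of the actual RankPL implementation, and the function names are hypothetical.

```python
import heapq

def most_plausible_first(choices):
    """Yield (rank, values) combinations of independent ranked choices in ascending total rank.

    `choices` is a list of lists of (rank, value) pairs, each sorted by ascending rank.
    """
    start = (0,) * len(choices)
    heap = [(sum(c[0][0] for c in choices), start)]
    seen = {start}
    while heap:
        rank, idx = heapq.heappop(heap)
        yield rank, tuple(c[i][1] for c, i in zip(choices, idx))
        for k in range(len(choices)):  # advance one choice to its next alternative
            if idx[k] + 1 < len(choices[k]):
                nxt = idx[:k] + (idx[k] + 1,) + idx[k + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    step = choices[k][idx[k] + 1][0] - choices[k][idx[k]][0]
                    heapq.heappush(heap, (rank + step, nxt))

def lowest_ranked(choices, condition):
    """Return the lowest rank at which `condition` holds, with all outcomes of that rank."""
    best, found = None, []
    for rank, values in most_plausible_first(choices):
        if best is not None and rank > best:
            break  # higher-ranked alternatives are not explored any further
        if condition(values):
            best = rank
            found.append(values)
    return best, found

# Hypothetical example: three components, each normally fine (False) or faulty (True, rank 1).
components = [[(0, False), (1, True)]] * 3
best, found = lowest_ranked(components, lambda v: v[0] or v[1])
print(best)  # 1
```

Because the search stops as soon as the rank exceeds that of the first satisfying outcome, the combinations with two or three faulty components are never yielded.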
Consider the two-bit full adder circuit shown in the figure. It contains two XOR gates, two AND gates and an OR gate. The function of this circuit is to generate a binary representation of the number of its three inputs that are high. The circuit diagnosis problem is about explaining observed incorrect behavior by finding minimal sets of gates that, if faulty, cause this behavior.[2]
[2] See Halpern [DBLP:books/daglib/0014219, Chapter 9] for a similar treatment of this example.
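The intended behavior of such a circuit can be checked with a short simulation. The wiring below is the textbook full adder and is an assumption, as are the gate and signal names; the circuit in the figure may label or connect its gates differently.

```python
from itertools import product

def full_adder(a, b, c):
    """Textbook full adder (assumed wiring): returns (high bit, low bit) of a + b + c."""
    x1 = a ^ b    # first XOR gate
    x2 = x1 ^ c   # second XOR gate: low-order output bit
    a1 = a & b    # first AND gate
    a2 = x1 & c   # second AND gate
    o1 = a1 | a2  # OR gate: high-order output bit
    return o1, x2

# With all gates working, the outputs encode the number of high inputs in binary.
for a, b, c in product((0, 1), repeat=3):
    high, low = full_adder(a, b, c)
    assert 2 * high + low == a + b + c
```

A diagnosis is then a set of gates whose misbehavior would explain outputs that violate this check.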