1 Introduction
Action languages, such as A [Gelfond and Lifschitz (1993)], B [Gelfond and Lifschitz (1998)], C [Giunchiglia and Lifschitz (1998)], C+ [Giunchiglia et al. (2004)], and BC [Lee and Meng (2013)], are formalisms for describing actions and their effects. Many of these languages can be viewed as high-level notations of answer set programs structured to represent transition systems. Extending the expressive power of action languages, for example with indirect effects, triggered actions, and additive fluents, has been one of the main research topics. Most such extensions are logic-oriented, and less attention has been paid to probabilistic reasoning, with a few exceptions such as [Baral et al. (2002), Eiter and Lukasiewicz (2003)], let alone to automating such probabilistic reasoning and learning the parameters of an action description.
Action language BC+ [Babb and Lee (2015)], one of the most recent additions to the family of action languages, is no exception. While the language is expressive enough to embed other action languages, such as C+ [Giunchiglia et al. (2004)] and BC [Lee et al. (2013)], it does not have a natural way to express the probabilities of histories (i.e., sequences of transitions).
In this paper, we present a probabilistic extension of BC+, which we call pBC+. Just as BC+ is defined as a high-level notation of answer set programs for describing transition systems, pBC+ is defined as a high-level notation of LPMLN programs, a probabilistic extension of answer set programs. Language pBC+ inherits the expressive logical modeling capabilities of BC+ but also allows us to assign a probability to a sequence of transitions so that we may distinguish more probable histories.
We show how probabilistic reasoning about transition systems, such as prediction, postdiction, and planning problems, can be modeled in pBC+ and computed using an implementation of LPMLN. Further, we show that it can be used for probabilistic abductive reasoning about dynamic domains, where the likelihood of the abductive explanation is derived from parameters that are manually specified or automatically learned from data.
The paper is organized as follows. Section 2 reviews language LPMLN and multi-valued probabilistic programs that are defined in terms of LPMLN. Section 3 presents language pBC+, and Section 4 shows how to use pBC+ and system lpmln2asp [Lee et al. (2017)] to perform probabilistic reasoning about transition systems, such as prediction, postdiction, and planning. Section 5 extends pBC+ to handle probabilistic diagnosis.
2 Preliminaries
2.1 Review: Language LPMLN
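We review the definition of LPMLN from [Lee and Wang (2016)], limited to the propositional case. An LPMLN program is a finite set of weighted rules $w : R$, where $R$ is a propositional formula and $w$ is either a real number (in which case the weighted rule is called soft) or $\alpha$, denoting the infinite weight (in which case the weighted rule is called hard).
For any LPMLN program $\Pi$ and any interpretation $I$, $\overline{\Pi}$ denotes the usual (unweighted) ASP program obtained from $\Pi$ by dropping the weights, $\Pi_I$ denotes the set of $w : R$ in $\Pi$ such that $I \models R$, and $SM[\Pi]$ denotes the set $\{I \mid I \text{ is a stable model of } \overline{\Pi_I}\}$. The unnormalized weight of an interpretation $I$ under $\Pi$ is defined as
$$W_\Pi(I) = \begin{cases} \exp\Big(\sum_{w:R \,\in\, \Pi_I} w\Big) & \text{if } I \in SM[\Pi]; \\ 0 & \text{otherwise.} \end{cases}$$
The normalized weight (a.k.a. probability) of an interpretation $I$ under $\Pi$ is defined as
$$P_\Pi(I) = \lim_{\alpha \to \infty} \frac{W_\Pi(I)}{\sum_{J \in SM[\Pi]} W_\Pi(J)}.$$
Interpretation $I$ is called a (probabilistic) stable model of $\Pi$ if $P_\Pi(I) \neq 0$. The most probable stable models of $\Pi$ are the stable models with the highest probability.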
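To make the definitions of unnormalized and normalized weight concrete, the following is a minimal brute-force sketch in Python. It is not the lpmln2asp implementation: it assumes a tiny hypothetical program with two soft rules (`1 : p` and `2 : q <- p`), supports only rules with an atomic head (or no head), and sidesteps the limit over the hard-rule weight by using finite weights only. The names `PROGRAM`, `ATOMS`, and all helper functions are illustrative.

```python
import math
from itertools import chain, combinations

# A weighted rule is (weight, head, positive_body, negative_body);
# head=None would encode a constraint. Assumption: all weights are finite.
PROGRAM = [
    (1.0, "p", [], []),     # 1 : p
    (2.0, "q", ["p"], []),  # 2 : q <- p
]
ATOMS = ["p", "q"]

def satisfies(interp, rule):
    """True if the interpretation satisfies the rule read as an implication."""
    _, head, pos, neg = rule
    body = all(a in interp for a in pos) and all(a not in interp for a in neg)
    return (not body) or (head is not None and head in interp)

def least_model_of_reduct(rules, interp):
    """Least model of the Gelfond-Lifschitz reduct of `rules` w.r.t. `interp`."""
    model, changed = set(), True
    while changed:
        changed = False
        for _, head, pos, neg in rules:
            if head is None or any(a in interp for a in neg):
                continue
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model

def unnormalized_weight(interp, program):
    """exp(sum of weights of satisfied rules) if interp is a stable model
    of the satisfied rules, and 0 otherwise."""
    satisfied = [r for r in program if satisfies(interp, r)]
    if least_model_of_reduct(satisfied, interp) != interp:
        return 0.0
    return math.exp(sum(r[0] for r in satisfied))

def probabilities(program, atoms):
    """Normalized weight of every interpretation over `atoms`."""
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    weights = {frozenset(s): unnormalized_weight(set(s), program)
               for s in subsets}
    z = sum(weights.values())
    return {interp: w / z for interp, w in weights.items()}

probs = probabilities(PROGRAM, ATOMS)
```

In this toy program the interpretation `{q}` gets probability 0, because `q` is not supported once the unsatisfied rule `1 : p` is dropped, while `{p, q}` satisfies both rules and is the most probable stable model.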
2.2 Review: Multi-Valued Probabilistic Programs
Multi-valued probabilistic programs [Lee and Wang (2016)] are a simple fragment of LPMLN that allows us to represent probability more naturally.
We assume that the propositional signature $\sigma$ is constructed from “constants” and their “values.” A constant $c$ is a symbol that is associated with a finite set $Dom(c)$, called the domain. The signature $\sigma$ is constructed from a finite set of constants, consisting of atoms $c = v$ for every constant $c$ and every element $v$ in $Dom(c)$. (Note that here “=” is just a part of the symbol for propositional atoms, and is not equality in first-order logic.) If the domain of $c$ is $\{\mathbf{f}, \mathbf{t}\}$ then we say that $c$ is Boolean, and abbreviate $c = \mathbf{t}$ as $c$ and $c = \mathbf{f}$ as $\sim\! c$.
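We assume that constants are divided into probabilistic constants and non-probabilistic constants. A multi-valued probabilistic program $\mathbf{\Pi}$ is a tuple $\langle PF, \Pi \rangle$, where

$PF$ contains probabilistic constant declarations of the following form:
$$p_1 :: c = v_1 \mid \dots \mid p_n :: c = v_n \qquad (1)$$
one for each probabilistic constant $c$, where $\{v_1, \dots, v_n\} = Dom(c)$, $v_i \neq v_j$, $0 \leq p_1, \dots, p_n \leq 1$, and $\sum_{i=1}^{n} p_i = 1$. We use $M_{\mathbf{\Pi}}(c = v_i)$ to denote $p_i$. In other words, $PF$ describes the probability distribution over each “random variable” $c$.

$\Pi$ is a set of rules of the form $Head \leftarrow Body$ (identified with the formula $Body \rightarrow Head$) such that $Head$ and $Body$ do not contain implications, and $Head$ contains no probabilistic constants.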
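The semantics of such a program $\mathbf{\Pi}$ is defined as a shorthand for the LPMLN program $T(\mathbf{\Pi})$ of the same signature as follows.

For each probabilistic constant declaration (1), $T(\mathbf{\Pi})$ contains, for each $i = 1, \dots, n$, (i) $\ln(p_i) : c = v_i$ if $0 < p_i < 1$; (ii) $\alpha : c = v_i$ if $p_i = 1$; (iii) $\alpha : \bot \leftarrow c = v_i$ if $p_i = 0$.

For each rule $Head \leftarrow Body$ in $\Pi$, $T(\mathbf{\Pi})$ contains $\alpha : Head \leftarrow Body$.

For each constant $c$, $T(\mathbf{\Pi})$ contains the uniqueness of value constraints
$$\alpha : \bot \leftarrow c = v_1 \wedge c = v_2 \qquad (2)$$
for all $v_1, v_2 \in Dom(c)$ such that $v_1 \neq v_2$, and the existence of value constraint
$$\alpha : \bot \leftarrow \neg \bigvee_{v \in Dom(c)} c = v. \qquad (3)$$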
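The translation of a probabilistic constant declaration into weighted rules can be sketched as follows. This is an illustration under stated assumptions, not the lpmln2asp input syntax: it assumes that a value with probability strictly between 0 and 1 becomes a soft rule whose weight is the natural logarithm of that probability, that probability 1 yields a hard fact, and that probability 0 yields a hard constraint. The function name `translate_declaration`, the string `"alpha"` standing for the infinite weight, and the constant `coin` are hypothetical.

```python
import math

def translate_declaration(constant, distribution):
    """Turn one probabilistic constant declaration into (weight, rule) pairs.

    `distribution` maps each value of the constant to its probability.
    """
    assert abs(sum(distribution.values()) - 1.0) < 1e-9, "must sum to 1"
    rules = []
    for value, p in distribution.items():
        atom = f"{constant}={value}"
        if p == 1:
            rules.append(("alpha", atom))              # hard fact
        elif p == 0:
            rules.append(("alpha", f"⊥ ← {atom}"))     # hard constraint
        else:
            rules.append((math.log(p), atom))          # soft rule, weight ln(p)
    return rules

rules = translate_declaration("coin", {"heads": 0.7, "tails": 0.3})
# Exponentiating the soft weights recovers the declared probabilities,
# which is why the log weight is the natural choice here.
recovered = {rule: math.exp(weight) for weight, rule in rules}
```

Together with hard constraints that force each constant to take exactly one value, the exponentiated weights of the surviving interpretations are proportional to the declared distribution.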
3 Probabilistic BC+
3.1 Syntax
We assume a propositional signature $\sigma$ as defined in Section 2.2. We further assume that the signature of an action description is divided into four groups: fluent constants, action constants, pf (probability fact) constants, and initpf (initial probability fact) constants. Fluent constants are further divided into regular and statically determined. The domain of every action constant is Boolean. A fluent formula is a formula such that all constants occurring in it are fluent constants.
The following definition of pBC+ is based on the definition of the BC+ language from [Babb and Lee (2015)].
A static law is an expression of the form
$$\textbf{caused}\ F\ \textbf{if}\ G \qquad (4)$$
where $F$ and $G$ are fluent formulas.
A fluent dynamic law is an expression of the form
$$\textbf{caused}\ F\ \textbf{if}\ G\ \textbf{after}\ H \qquad (5)$$
where $F$ and $G$ are fluent formulas and $H$ is a formula, provided that $F$ does not contain statically determined constants and $H$ does not contain initpf constants.
A pf constant declaration is an expression of the form
$$\textbf{caused}\ c = \{v_1 : p_1, \dots, v_n : p_n\} \qquad (6)$$
where $c$ is a pf constant with domain $\{v_1, \dots, v_n\}$, $0 < p_i < 1$ for each $i \in \{1, \dots, n\}$, and $p_1 + \dots + p_n = 1$. (We require $0 < p_i < 1$ for each $i \in \{1, \dots, n\}$ for the sake of simplicity. On the other hand, if $p_i = 0$ or $p_i = 1$ for some $i$, that means either $v_i$ can be removed from the domain of $c$ or there is not really a need to introduce $c$ as a pf constant. So this assumption does not really sacrifice expressivity.) In other words, (6) describes the probability distribution of $c$.
An initpf constant declaration is an expression of the form (6) where $c$ is an initpf constant.
An initial static law is an expression of the form
$$\textbf{initially}\ F\ \textbf{if}\ G \qquad (7)$$
where $F$ is a fluent constant and $G$ is a formula that contains neither action constants nor pf constants.
A causal law is a static law, a fluent dynamic law, a pf constant declaration, an initpf constant declaration, or an initial static law. An action description is a finite set of causal laws.
We use $\sigma^{fl}$ to denote the set of fluent constants, $\sigma^{act}$ to denote the set of action constants, $\sigma^{pf}$ to denote the set of pf constants, and $\sigma^{initpf}$ to denote the set of initpf constants. For any signature $\sigma'$ and any $i \in \{0, \dots, m\}$, we use $i : \sigma'$ to denote the set $\{i : a \mid a \in \sigma'\}$.
By $i : F$ we denote the result of inserting $i:$ in front of every occurrence of every constant in formula $F$. This notation is straightforwardly extended when $F$ is a set of formulas.
Example 1
The following is an action description in + for the transition system shown in Figure 1, is a Boolean regular fluent constant, and is an action constant. Action toggles the value of with probability . Initially, is true with probability and false with probability . We call this action description PSD. ( is a schematic variable that ranges over .)
( is a choice formula standing for .)
3.2 Semantics
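Given a non-negative integer $m$ denoting the maximum length of histories, the semantics of an action description $D$ in pBC+ is defined by a reduction to the multi-valued probabilistic program $Tr(D, m)$, which is the union of two subprograms $D_m$ and $D_{init}$ as defined below.
For an action description $D$ of a signature $\sigma$, we define a sequence of multi-valued probabilistic programs $D_0, D_1, \dots$ so that the stable models of $D_m$ can be identified with the paths in the transition system described by $D$. The signature $\sigma_m$ of $D_m$ consists of atoms of the form $i : c = v$ such that

for each fluent constant $c$ of $D$, $i \in \{0, \dots, m\}$ and $v \in Dom(c)$,

for each action constant or pf constant $c$ of $D$, $i \in \{0, \dots, m-1\}$ and $v \in Dom(c)$.
For $x \in \{act, fl, pf\}$, we use $\sigma^{x}_m$ to denote the subset $\{i : c = v \mid i : c = v \in \sigma_m \text{ and } c \in \sigma^{x}\}$ of $\sigma_m$.
For $i \in \{0, \dots, m\}$, we use $i : \sigma^{x}$ to denote the subset $\{i : c = v \mid i : c = v \in \sigma^{x}_m\}$ of $\sigma^{x}_m$.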
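We define $D_m$ to be the multi-valued probabilistic program $\langle PF, \Pi \rangle$, where $\Pi$ is the conjunction of
$$i : F \leftarrow i : G \qquad (8)$$
for every static law (4) in $D$ and every $i \in \{0, \dots, m\}$,
$$i{+}1 : F \leftarrow (i{+}1 : G) \wedge (i : H) \qquad (9)$$
for every fluent dynamic law (5) in $D$ and every $i \in \{0, \dots, m-1\}$,
$$\{0 : c = v\}^{ch} \qquad (10)$$
for every regular fluent constant $c$ and every $v \in Dom(c)$,
$$\{i : c = \mathbf{t}\}^{ch}, \quad \{i : c = \mathbf{f}\}^{ch} \qquad (i \in \{0, \dots, m-1\}) \qquad (11)$$
for every action constant $c$, and $PF$ consists of
$$p_1 :: i : pf = v_1 \mid \dots \mid p_n :: i : pf = v_n \qquad (12)$$
($i = 0, \dots, m-1$) for each pf constant declaration (6) in $D$ that describes the probability distribution of $pf$.
Also, we define the program $D_{init}$, whose signature is $0 : \sigma^{initpf} \cup 0 : \sigma^{fl}$. $D_{init}$ is the multi-valued probabilistic program
$$D_{init} = \langle PF^{init}, \Pi^{init} \rangle$$
where $\Pi^{init}$ consists of the rule
$$\bot \leftarrow \neg (0 : F) \wedge 0 : G$$
for each initial static law (7), and $PF^{init}$ consists of
$$p_1 :: 0 : pf = v_1 \mid \dots \mid p_n :: 0 : pf = v_n$$
for each initpf constant declaration (6).
We define $Tr(D, m)$ to be the union of the two multi-valued probabilistic programs, $\langle PF \cup PF^{init}, \Pi \cup \Pi^{init} \rangle$.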
Example 2
For the action description PSD in Example 1, is the following multivalued probabilistic program ():
and is the following multivalued probabilistic program ( is a schematic variable that ranges over ):
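For any LPMLN program $\Pi$ of signature $\sigma$ and a value assignment $I$ to a subset $\sigma'$ of $\sigma$, we say $I$ is a residual (probabilistic) stable model of $\Pi$ if there exists a value assignment $J$ to $\sigma \setminus \sigma'$ such that $I \cup J$ is a (probabilistic) stable model of $\Pi$.
For any value assignment $I$ to constants in $\sigma$, by $i : I$ we denote the value assignment to constants in $i : \sigma$ so that $i : I \models (i : c) = v$ iff $I \models c = v$.
We define a state as an interpretation $I^{fl}$ of $\sigma^{fl}$ such that $0 : I^{fl}$ is a residual (probabilistic) stable model of $D_0$. A transition of $D$ is a triple $\langle s, e, s' \rangle$ where $s$ and $s'$ are interpretations of $\sigma^{fl}$ and $e$ is an interpretation of $\sigma^{act}$ such that $0 : s \cup 0 : e \cup 1 : s'$ is a residual stable model of $D_1$. A pf-transition of $D$ is a pair $\langle \langle s, e, s' \rangle, pf \rangle$, where $pf$ is a value assignment to $\sigma^{pf}$ such that $0 : s \cup 0 : e \cup 1 : s' \cup 0 : pf$ is a stable model of $D_1$.
A probabilistic transition system $T(D)$ represented by a probabilistic action description $D$ is a labeled directed graph such that the vertices are the states of $D$, and the edges are obtained from the transitions of $D$: for every transition $\langle s, e, s' \rangle$ of $D$, an edge labeled $e : p$ goes from $s$ to $s'$, where $p = P_{D_m}(1 : s' \mid 0 : s, 0 : e)$. The number $p$ is called the transition probability of $\langle s, e, s' \rangle$.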
The soundness of the definition of a probabilistic transition system relies on the following proposition.
Proposition 1
For any transition $\langle s, e, s' \rangle$, $s$ and $s'$ are states.
We make the following simplifying assumptions on action descriptions:

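No Concurrency: For all transitions $\langle s, e, s' \rangle$, we have $e(a) = \mathbf{t}$ for at most one $a \in \sigma^{act}$;

Nondeterministic Transitions are Controlled by pf Constants: For any state $s$, any value assignment $e$ of $\sigma^{act}$ such that at most one action is true, and any value assignment $pf$ of $\sigma^{pf}$, there exists exactly one state $s'$ such that $\langle \langle s, e, s' \rangle, pf \rangle$ is a pf-transition;

Nondeterminism on Initial States is Controlled by Initpf Constants: Given any assignment $pf_{init}$ of $\sigma^{initpf}$, there exists exactly one assignment $fl$ of $\sigma^{fl}$ such that $0 : pf_{init} \cup 0 : fl$ is a stable model of $D_{init} \cup D_0$.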
For any state , any value assignment of such that at most one action is true, and any value assignment of , we use to denote the state such that is a pftransition (According to Assumption 2, such must be unique). For any interpretation , and any subset of , we use to denote the value assignment of to atoms in . Given any value assignment of and a value assignment of , we construct an interpretation of that satisfies as follows:

For all atoms in , we have ;

For all atoms in , we have ;

is the assignment such that is a stable model of .

For each ,
By Assumptions 2 and 3, the above construction produces a unique interpretation.
It can be seen that in the multivalued probabilistic program translated from , the probabilistic constants are . We thus call the value assignment of an interpretation on the total choice of . The following theorem asserts that the probability of a stable model under can be computed by simply dividing the probability of the total choice associated with the stable model by the number of choice of actions.
Theorem 1
For any value assignment of and any value assignment of , there exists exactly one stable model of that satisfies , and the probability of is
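The computation described before Theorem 1 can be illustrated with a short sketch. It assumes, as one reading of "the number of choices of actions", that with at most one action per step each of the $m$ steps offers one more choice than there are action constants (some action, or no action). All distributions below are hypothetical and the helper name is illustrative.

```python
from itertools import product
from math import prod, isclose

def stable_model_probability(total_choice_probs, num_actions, m):
    """Probability of the stable model selected by one total choice and one
    action sequence: product of the chosen pf/initpf probabilities divided
    by the number of possible action sequences of length m."""
    return prod(total_choice_probs) / (num_actions + 1) ** m

# Hypothetical domain: one initpf constant, one pf constant per step,
# a single action constant, and histories of length 2.
INIT = [0.6, 0.4]   # hypothetical initpf distribution
PF = [0.8, 0.2]     # hypothetical pf distribution, sampled at every step
M_STEPS, NUM_ACTIONS = 2, 1

example = stable_model_probability([0.6, 0.8, 0.8], NUM_ACTIONS, M_STEPS)

# Sanity check: summing over every total choice and every action sequence
# should give 1 if the formula defines a probability distribution.
action_sequences = list(product(range(NUM_ACTIONS + 1), repeat=M_STEPS))
total = sum(
    stable_model_probability([p_init, *pfs], NUM_ACTIONS, M_STEPS)
    for p_init in INIT
    for pfs in product(PF, repeat=M_STEPS)
    for _ in action_sequences
)
```

The division by the number of action sequences reflects that actions are chosen freely rather than sampled, so every action sequence receives the same share of the probability mass.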
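The following theorem tells us that the conditional probability of transitioning from a state $s$ to another state $s'$ with action $e$ remains the same for all timesteps, i.e., the conditional probability of $i{+}1 : s'$ given $i : s$ and $i : e$ correctly represents the transition probability from $s$ to $s'$ via $e$ in the transition system.
Theorem 2
For any states $s$ and $s'$, and any interpretation $e$ of $\sigma^{act}$, we have
$$P_{Tr(D,m)}(i{+}1 : s' \mid i : s, i : e) = P_{Tr(D,m)}(j{+}1 : s' \mid j : s, j : e)$$
for any $i, j \in \{0, \dots, m-1\}$ such that $P_{Tr(D,m)}(i : s) > 0$ and $P_{Tr(D,m)}(j : s) > 0$.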
For every subset of , let be the triple consisting of

the set consisting of atoms such that belongs to and ;

the set consisting of atoms such that belongs to and ;

the set consisting of atoms such that belongs to and .
Let be the transition probability of , is the interpretation of defined by , and be the interpretations of defined by .
Since the transition probability remains the same, the probability of a path given a sequence of actions can be computed from the probabilities of transitions.
Corollary 1
For every , is a residual (probabilistic) stable model of iff are transitions of and is a residual stable model of . Furthermore,
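Since the probability of a path given a sequence of actions is computed from transition probabilities, the computation can be sketched as a simple product. The sketch below assumes the factorization "probability of the initial state times the product of the transition probabilities along the path"; the two-state toggle system and every number in it are hypothetical.

```python
from itertools import product
from math import isclose

# Hypothetical initial-state distribution and transition probabilities
# for a two-state system in which action "a" toggles the state with
# probability 0.8.
INIT = {"p": 0.6, "not_p": 0.4}
TRANS = {
    ("p", "a", "not_p"): 0.8, ("p", "a", "p"): 0.2,
    ("not_p", "a", "p"): 0.8, ("not_p", "a", "not_p"): 0.2,
}

def path_probability(init, trans, states, actions):
    """P(s_0, ..., s_m | e_0, ..., e_{m-1}) under the assumed factorization."""
    assert len(states) == len(actions) + 1
    prob = init[states[0]]
    for s, e, s_next in zip(states, actions, states[1:]):
        prob *= trans.get((s, e, s_next), 0.0)  # missing triple: no transition
    return prob

one_path = path_probability(INIT, TRANS, ["p", "not_p", "p"], ["a", "a"])

# For a fixed action sequence, the probabilities of all paths sum to 1.
all_paths = sum(
    path_probability(INIT, TRANS, list(states), ["a", "a"])
    for states in product(INIT, repeat=3)
)
```

Triples that are absent from `TRANS` are treated as non-transitions and contribute probability 0, so a path through an impossible step has probability 0.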
4 pBC+ Action Descriptions and Probabilistic Reasoning
In this section, we illustrate how the probabilistic extension of the reasoning tasks discussed in [Giunchiglia et al. (2004)], i.e., prediction, postdiction, and planning, can be represented in pBC+ and automatically computed using lpmln2asp [Lee et al. (2017)]. Consider the following probabilistic variation of the well-known Yale Shooting Problem: There are two (slightly deaf) turkeys: a fat turkey and a slim turkey. Shooting at a turkey may fail to kill the turkey. Normally, shooting at the slim turkey has chance to kill it, and shooting at the fat turkey has chance. However, when a turkey is dead, the other turkey becomes alert, which decreases the success probability of shooting. For the slim turkey, the probability drops to , whereas for the fat turkey, the probability drops to .
The example can be modeled in pBC+ as follows. First, we declare the constants:
Notation: range over . Regular fluent constants: Domains: , Loaded Boolean Statically determined fluent constants: Domains: Boolean Action constants: Domains: Load , Boolean Pf constants: Domains: , Boolean InitPf constants: , Init_Loaded Boolean
Next, we state the causal laws. The effect of loading the gun is described by . To describe the effect of shooting at a turkey, we declare the following probability distributions on the result of shooting at each turkey when it is not alert and when it is alert: , , , . The effect of shooting at a turkey is described as , , . A dead turkey causes the other turkey to be alert: , . ( stands for [Babb and Lee (2015)]).
The fluents Alive and Loaded observe the commonsense law of inertia: , , , . We ensure no concurrent actions are allowed by stating for every pair of action constants such that .
Finally, we state that the initial values of all fluents are uniformly random ( is a schematic variable that ranges over ): , , , .
We translate the action description into an LPMLN program and use lpmln2asp to answer various queries about transition systems, such as prediction, postdiction, and planning queries. (The complete lpmln2asp program and the queries used in this section are given in LABEL:ssec:yalelpmln2asp.)
Prediction For a prediction query, we are given a sequence of actions and observations that occurred in the past, and we are interested in the probability of a certain proposition describing the result of the history, or the most probable result of the history. Formally, we are interested in the conditional probability
$$P_{Tr(D,m)}(Result \mid Act, Obs)$$
or the MAP state
$$\underset{Result}{\arg\max}\ P_{Tr(D,m)}(Result \mid Act, Obs)$$
where $Result$ is a proposition describing a possible outcome, $Obs$ is a set of facts of the form $a$ or $\sim\! a$ for $a \in \sigma^{fl}_m$, and $Act$ is a set of facts of the form $i : a$ for $a \in \sigma^{act}$ and $i \in \{0, \dots, m-1\}$.
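A prediction query can be illustrated on a toy system without any solver by propagating a state distribution forward through the executed actions. The sketch below is not the Yale shooting encoding and does not use lpmln2asp: it assumes a hypothetical Boolean fluent that action `"A"` toggles with probability `TOGGLE`, and it conditions on a fully observed initial state.

```python
from math import isclose

TOGGLE = 0.8  # hypothetical probability that action "A" flips the fluent

def transition(state, action):
    """Distribution over the next state of a Boolean fluent."""
    if action == "A":
        return {not state: TOGGLE, state: 1 - TOGGLE}
    return {state: 1.0}  # any other action leaves the fluent unchanged

def predict(initial_dist, actions):
    """Distribution over the final state after executing `actions`."""
    dist = dict(initial_dist)
    for action in actions:
        nxt = {True: 0.0, False: 0.0}
        for state, p_state in dist.items():
            for new_state, p_trans in transition(state, action).items():
                nxt[new_state] += p_state * p_trans
        dist = nxt
    return dist

# Observation: the fluent is true initially. Actions: "A" executed twice.
observed_initial = {True: 1.0, False: 0.0}
after_one = predict(observed_initial, ["A"])
after_two = predict(observed_initial, ["A", "A"])
map_state = max(after_two, key=after_two.get)
```

Here the conditional probability of the outcome is read off the propagated distribution, and the MAP state is its most probable entry.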
In the Yale shooting example, such a query could be “given that only the fat turkey is alive and the gun is loaded at the beginning, what is the probability that the fat turkey dies after shooting is executed?” To answer this query, we manually translate the action description above into the input language of lpmln2asp and add the following action and observation as constraints:
Executing the command
yields
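Postdiction In the case of postdiction, we infer a condition about the initial state given the history. Formally, we are interested in the conditional probability
$$P_{Tr(D,m)}(Initial\_State \mid Act, Obs)$$
or the MAP state
$$\underset{Initial\_State}{\arg\max}\ P_{Tr(D,m)}(Initial\_State \mid Act, Obs)$$
where $Initial\_State$ is a proposition about the initial state; $Act$ and $Obs$ are defined as above.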
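Postdiction amounts to applying Bayes' rule over histories: paths consistent with the observations are weighted by their probability and grouped by initial state. The following sketch enumerates paths of a hypothetical toggle system; the prior, the toggle probability, and the function names are assumptions made for illustration only.

```python
from itertools import product
from math import isclose

TOGGLE = 0.8                      # hypothetical: "A" flips the fluent w.p. 0.8
PRIOR = {True: 0.6, False: 0.4}   # hypothetical initial-state distribution

def step_prob(state, action, new_state):
    """Transition probability of the toy system."""
    if action == "A":
        return TOGGLE if new_state != state else 1 - TOGGLE
    return 1.0 if new_state == state else 0.0

def postdict(actions, final_observation):
    """Posterior over the initial state given actions and the final state."""
    joint = {True: 0.0, False: 0.0}
    for path in product([True, False], repeat=len(actions) + 1):
        if path[-1] != final_observation:
            continue  # path contradicts the observation
        p = PRIOR[path[0]]
        for s, a, t in zip(path, actions, path[1:]):
            p *= step_prob(s, a, t)
        joint[path[0]] += p
    evidence = sum(joint.values())
    return {s: p / evidence for s, p in joint.items()}

# "A" was executed once and the fluent is observed to be true afterwards.
posterior = postdict(["A"], True)
```

With these numbers the observation makes an initially false fluent more likely than the prior suggests, since a successful toggle is the most probable way to end up in the observed state.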
In the Yale shooting example, such a query could be “given that the slim turkey was alive and the gun was loaded at the beginning, the person shot at the slim turkey and it died, what is the probability that the fat turkey was alive at the beginning?”
Formalizing the query and executing the command
yields
Planning In this case, we are interested in a sequence of actions that would result in the highest probability of a certain goal. Formally, we are interested in
where is a condition for a goal state, and is a sequence of actions specifying actions executed at each timestep.
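For short horizons, the planning query can be answered by brute force: enumerate every action sequence, compute the probability of reaching the goal, and keep the best one. The sketch below does this for a hypothetical toggle system with the action `"A"` and a hypothetical `"noop"` standing for "no action executed"; it is meant to illustrate the argmax, not the way lpmln2asp computes it.

```python
from itertools import product
from math import isclose

TOGGLE = 0.8               # hypothetical: "A" flips the fluent w.p. 0.8
ACTIONS = ["A", "noop"]    # "noop" is a hypothetical name for "no action"

def next_dist(dist, action):
    """Push a distribution over the Boolean fluent through one action."""
    if action == "noop":
        return dict(dist)
    return {
        True: dist[True] * (1 - TOGGLE) + dist[False] * TOGGLE,
        False: dist[False] * (1 - TOGGLE) + dist[True] * TOGGLE,
    }

def goal_probability(initial_dist, plan, goal_state):
    """Probability that the fluent equals `goal_state` after the plan."""
    dist = dict(initial_dist)
    for action in plan:
        dist = next_dist(dist, action)
    return dist[goal_state]

def best_plan(initial_dist, horizon, goal_state):
    """Exhaustive search over all action sequences of the given length."""
    return max(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: goal_probability(initial_dist, plan, goal_state),
    )

start = {True: 0.0, False: 1.0}   # the fluent is known to be false initially
plan = best_plan(start, 2, True)
plan_probability = goal_probability(start, plan, True)
```

In this toy system toggling twice is worse than toggling once, because a second successful toggle undoes the first; exhaustive search finds this without any domain-specific reasoning, at a cost exponential in the horizon.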
In the Yale shooting example, such a query could be “given that both turkeys are alive and the gun is not loaded at the beginning, generate a plan that gives the best chance of killing both turkeys with 4 actions.”
Formalizing the query and executing the command
5 Diagnosis in Probabilistic Action Domain
One interesting type of reasoning task in action domains is diagnosis, where we observe a sequence of actions that fails to achieve some expected outcome and we would like to know possible explanations for the failure. Furthermore, in a probabilistic setting, we could also be interested in the probability of each possible explanation. In this section, we discuss how diagnosis can be automated in pBC+ as probabilistic abduction, and we illustrate the method through an example.
5.1 Extending pBC+ to Allow Diagnosis
We define the following new constructs to allow probabilistic diagnosis in action domains. Note that these constructs are simply syntactic sugar and do not change the actual expressivity of the language.

We introduce a subclass of regular fluent constants called abnormal fluents.

When the action domain contains at least one abnormal fluent, we introduce a special statically determined fluent constant with the Boolean domain, and add

We introduce the expression
where and are fluent formulas and is a formula, provided that does not contain statically determined constants and does not contain initpf constants. This expression is treated as an abbreviation of
Once we have defined abnormalities and how they affect the system, we can use
to enable taking abnormalities into account in reasoning.
5.2 Example: Robot
The following example is modified from [Iwan (2002)]. Consider a robot located in a building with two rooms r1 and r2 and a book that can be picked up. The robot can move to rooms, pick up the book and put down the book. There is a chance that it fails when it tries to enter a room, a chance that the robot drops the book when it has the book, and a chance that the robot fails when it tries to pick up the book. The robot, as well as the book, was initially at r1. It executed the following actions to deliver the book from r1 to r2: pick up the book; go to r2; put down the book. However, after the execution, it observes that the book is not at r2. What is a possible reason?
We answer this query by modeling the action domain in the probabilistic action language as follows. We first introduce the following constants. Notation: range over . Regular fluent constants: Domains: LocRobot, LocBook HasBook Boolean Abnormal fluent constants: Domains: EnterFailed, DropBook, PickupFailed Boolean Action constants: Domains: , PickUpBook, PutdownBook Boolean Pf constants: Domains: Pf_EnterFailed, Pf_PickupFailed, Pf_DropBook Boolean Initpf constants: Domains: Init_LocRobot, Init_LocBook Init_HasBook Boolean
The action causes the location of the robot to be at unless the abnormality EnterFailed occurs: .
Similarly, the following causal laws describe the effect of the actions PickupBook and PutdownBook: .
If the robot has the book, then the book has the same location as the robot: . The abnormality DropBook causes the robot to not have the book: .
The fluents LocBook, LocRobot and HasBook observe the commonsense law of inertia: .
The abnormality EnterFailed has chance to occur when the action is executed: .
Similarly, the following causal laws describe the condition and probabilities for the abnormalities PickupFailed and DropBook to occur: , . We ensure no concurrent actions are allowed by stating for every pair of action constants such that . Initially, it is uniformly random where the robot and the book is and whether the robot has the book:
No abnormalities are possible at the initial state:
We add
to the action description to take abnormalities into account in reasoning, and translate the action description into an LPMLN program, together with the actions that the robot has executed. (We refer the reader to LABEL:ssec:robotlpmln2asp for the complete translation of the action description in the language of lpmln2asp.)
Executing lpmln2asp -i robot.lpmln yields
which suggests that the robot failed to pick up the book.
Suppose that the robot has observed that the book was in its hand after it picked up the book. We expand the action history with
Now the most probable stable model becomes
suggesting that the robot accidentally dropped the book.
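The diagnostic reasoning in this example can be mimicked by brute-force enumeration: sample every combination of abnormality flags, simulate the action history, discard the runs that contradict the observations, and rank the remaining explanations. The sketch below is a simplified simulation, not the pBC+ encoding, and it reports the marginal probability of each explanation rather than the most probable stable model. The probabilities in `P` are hypothetical and not taken from the example; `step` and `explanations` are illustrative names.

```python
from itertools import product
from collections import defaultdict

# Hypothetical abnormality probabilities (NOT the values of the example).
P = {"pickup_failed": 0.3, "enter_failed": 0.1, "drop_book": 0.2}
FLAGS = list(P)
ACTIONS = ["pickup", "goto_r2", "putdown"]

def step(state, action, pf):
    """Apply one action; return the new state and the abnormalities that
    actually occurred. State is (robot location, book location, has book)."""
    robot, book, has = state
    had, occurred = has, []
    if action == "pickup" and robot == book:
        if pf["pickup_failed"]:
            occurred.append("pickup_failed")
        else:
            has = True
    elif action == "goto_r2":
        if pf["enter_failed"]:
            occurred.append("enter_failed")
        else:
            robot = "r2"
    elif action == "putdown":
        has = False
    if had and pf["drop_book"]:   # dropping requires holding the book
        occurred.append("drop_book")
        has = False
    if has:
        book = robot              # a held book moves with the robot
    return (robot, book, has), occurred

def explanations(observe):
    """Posterior over sets of occurred abnormalities, given a predicate on
    the trace (the list of states after each action)."""
    scores = defaultdict(float)
    n = len(FLAGS)
    for bits in product([False, True], repeat=len(ACTIONS) * n):
        state, prob, occurred, trace = ("r1", "r1", False), 1.0, [], []
        for i, action in enumerate(ACTIONS):
            pf = dict(zip(FLAGS, bits[i * n:(i + 1) * n]))
            for name, value in pf.items():
                prob *= P[name] if value else 1 - P[name]
            state, occ = step(state, action, pf)
            occurred += [(i + 1, name) for name in occ]
            trace.append(state)
        if observe(trace):
            scores[tuple(occurred)] += prob
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

# Observation 1: the book is not at r2 after the three actions.
first = explanations(lambda t: t[-1][1] != "r2")
# Observation 2: additionally, the robot held the book after picking it up.
second = explanations(lambda t: t[-1][1] != "r2" and t[0][2])
best_first = max(first, key=first.get)     # ((1, 'pickup_failed'),)
best_second = max(second, key=second.get)  # ((2, 'drop_book'),)
```

Under these hypothetical numbers the ranking shifts in the same way as in the example: a failed pickup is the most probable explanation until the extra observation rules it out, after which dropping the book on the way becomes the most probable one.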
6 Related Work
There exist various formalisms for reasoning in probabilistic action domains. PC+ [Eiter and Lukasiewicz (2003)] is a generalization of the action language C+ that allows for expressing probabilistic information. The syntax of PC+ is similar to that of pBC+, as both languages are extensions of C+. PC+ expresses probabilistic transitions of states through so-called context variables, which are similar to pf constants in pBC+ in that both are exogenous variables associated with predefined probability distributions. In pBC+, in order to obtain meaningful probabilities computed through LPMLN, assumptions have to be made, such as that all actions are always executable and that nondeterminism can only be caused by pf constants. In contrast, PC+ does not impose such semantic restrictions, and allows for expressing qualitative and quantitative uncertainty about actions by referring to sequences of “belief” states, i.e., possible sets of states together with probabilistic information. On the other hand, its semantics is highly complex and there is no implementation of PC+ as far as we know.
[Zhu (2012)] defined a probabilistic action language called NB, which is an extension of the (deterministic) action language B. NB can be translated into P-log [Baral et al. (2004)], and since there exists a system for computing P-log, reasoning in NB action descriptions can be automated. As in pBC+ and PC+, probabilistic transitions are expressed through dynamic causal laws with random variables associated with predefined probability distributions. In NB, however, these random variables are hidden from the action description and are only visible in the translated P-log representation. One difference between NB and pBC+ is that in NB a dynamic causal law must be associated with an action and thus can only be used to represent probabilistic effects of actions, while in pBC+ a fluent dynamic law can have no action constant occurring in it. This means that state transitions without actions or time step changes cannot be expressed directly in NB. As with pBC+, in order to translate NB into executable low-level logic programming languages, some semantic assumptions have to be made in NB. The assumptions made in NB are very similar to the ones made in pBC+.
Probabilistic action domains, especially in terms of probabilistic effects of actions, can be formalized as Markov Decision Processes (MDPs). The language PAL proposed in [Baral et al. (2002)] aims at facilitating elaboration tolerant representations of MDPs. The syntax is similar to NB. The semantics is more complex as it allows preconditions of actions and imposes fewer semantic assumptions. The concept of unknown variables associated with probability distributions is similar to pf constants in our setting. There is, as far as we know, no implementation of the language, and there is no discussion of probabilistic diagnosis in the context of the language. PPDDL [Younes and Littman (2004)] is a probabilistic extension of the planning definition language PDDL. Like NB, the nondeterminism that PPDDL considers is only the probabilistic effect of actions. The semantics of PPDDL is defined in terms of MDPs. There are also probabilistic extensions of the Event Calculus such as [D’Asaro et al. (2017)] and [Skarlatidis et al. (2011)].
Among the above formalisms, the problem of probabilistic diagnosis is only discussed in [Zhu (2012)]. [Balduccini and Gelfond (2003)] and [Baral et al. (2000)] studied the problem of diagnosis. However, they focus on diagnosis in deterministic and static domains. [Iwan (2002)] has proposed a method for diagnosis in action domains with the situation calculus. Again, the diagnosis considered there does not involve any probabilistic measure.
Compared to the formalisms mentioned here, the unique advantages of pBC+ include its executability through LPMLN systems, its support for probabilistic diagnosis, and the possibility of parameter learning in action domains.
LPMLN is closely related to Markov Logic Networks [Richardson and Domingos (2006)], a formalism originating from Statistical Relational Learning. However, Markov Logic Networks have not been applied to modeling dynamic domains due to the limited expressivity of their logical part.
7 Conclusion
pBC+ is a simple extension of BC+. The main idea is to assign a probability to each path of a transition system to distinguish the likelihood of the paths. The extension is a natural composition of two ideas: in the semantics of BC+, the paths are encoded as stable models of the logic program standing for the BC+ description. Since LPMLN is a probabilistic extension of ASP, it comes naturally that by lifting the translation so that it targets LPMLN, we obtain a probabilistic action language.
In the examples above, the action descriptions, including the probabilities, are all handwritten. In practice, the exact values of some probabilities are hard to find. In particular, it is not likely to have a theoretical probability for an abnormality to occur. It is more practical to statistically derive the probability from a collection of action and observation histories. For example, in the robot example in Section 5.2, we can provide a list of action and observation histories, where different abnormalities occurred, as the training data. With this training data, we may learn the weights of the rules that control the probabilities of abnormalities.
Another direction for future work is to build a compiler that automates the translation of a pBC+ description into the input language of lpmln2asp.
Acknowledgements: We are grateful to Zhun Yang and the anonymous referees for their useful comments. This work was partially supported by the National Science Foundation under Grant IIS-1526301.
References
 Babb and Lee (2015) Babb, J. and Lee, J. 2015. Action language BC+. Journal of Logic and Computation, exv062.
 Balduccini and Gelfond (2003) Balduccini, M. and Gelfond, M. 2003. Diagnostic reasoning with A-Prolog. Theory and Practice of Logic Programming 3, 425–461.
 Baral et al. (2004) Baral, C., Gelfond, M., and Rushton, N. 2004. Probabilistic reasoning with answer sets. In Logic Programming and Nonmonotonic Reasoning. Springer Berlin Heidelberg, Berlin, Heidelberg, 21–33.
 Baral et al. (2000) Baral, C., Mcilraith, S., and Son, T. 2000. Formulating diagnostic problem solving using an action language with narratives and sensing.

 Baral et al. (2002) Baral, C., Tran, N., and Tuan, L.C. 2002. Reasoning about actions in a probabilistic setting. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). 507–512.
 D’Asaro et al. (2017) D’Asaro, F. A., Bikakis, A., Dickens, L., and Miller, R. 2017. Foundations for a probabilistic event calculus. CoRR abs/1703.06815.
 Eiter and Lukasiewicz (2003) Eiter, T. and Lukasiewicz, T. 2003. Probabilistic reasoning about actions in nonmonotonic causal theories. In Proceedings Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003). Morgan Kaufmann Publishers, 192–199.
 Ferraris et al. (2009) Ferraris, P., Lee, J., Lifschitz, V., and Palla, R. 2009. Symmetric splitting in the general theory of stable models. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI). 797–803.
 Gelfond and Lifschitz (1993) Gelfond, M. and Lifschitz, V. 1993. Representing action and change by logic programs. Journal of Logic Programming 17, 301–322.
 Gelfond and Lifschitz (1998) Gelfond, M. and Lifschitz, V. 1998. Action languages. Electronic Transactions on Artificial Intelligence 3, 195–210. http://www.ep.liu.se/ea/cis/1998/016/
 Giunchiglia et al. (2004) Giunchiglia, E., Lee, J., Lifschitz, V., McCain, N., and Turner, H. 2004. Nonmonotonic causal theories. Artificial Intelligence 153(1–2), 49–104.
 Giunchiglia and Lifschitz (1998) Giunchiglia, E. and Lifschitz, V. 1998. An action language based on causal explanation: Preliminary report. In Proceedings of National Conference on Artificial Intelligence (AAAI). AAAI Press, 623–630.
 Iwan (2002) Iwan, G. 2002. History-based diagnosis templates in the framework of the situation calculus. AI Communications 15, 1, 31–45.
 Lee et al. (2013) Lee, J., Lifschitz, V., and Yang, F. 2013. Action language BC: Preliminary report. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI).
 Lee and Meng (2013) Lee, J. and Meng, Y. 2013. Answer set programming modulo theories and reasoning about continuous changes. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI).
 Lee et al. (2017) Lee, J., Talsania, S., and Wang, Y. 2017. Computing LPMLN using ASP and MLN solvers. Theory and Practice of Logic Programming.
 Lee and Wang (2016) Lee, J. and Wang, Y. 2016. Weighted rules under the stable model semantics. In Proceedings of International Conference on Principles of Knowledge Representation and Reasoning (KR). 145–154.
 Richardson and Domingos (2006) Richardson, M. and Domingos, P. 2006. Markov logic networks. Machine Learning 62, 1-2, 107–136.
 Skarlatidis et al. (2011) Skarlatidis, A., Paliouras, G., Vouros, G. A., and Artikis, A. 2011. Probabilistic event calculus based on Markov logic networks. In Rule-Based Modeling and Computing on the Semantic Web. Springer, 155–170.
 Younes and Littman (2004) Younes, H. L. and Littman, M. L. 2004. PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects.
 Zhu (2012) Zhu, W. 2012. P-log: Its algorithms and applications. Ph.D. thesis, Texas Tech University.