1 Introduction
A frequent informal way of explaining the effect of default negation in an introductory class on semantics in logic programming (LP) is that a literal of the form ‘’ should be read as “there is no way to derive .” Although this idea seems quite intuitive, it actually uses a concept outside the discourse of any of the existing LP semantics: the ways to derive . To explore this idea, [1] introduced the so-called causal logic programs. The semantics was an extension of stable models [2] relying on the idea of “justification” or “proof”. Any true atom in a standard (non-causal) stable model needs to be justified. In a causal stable model, the truth value of each true atom captures these possible justifications, called causes. Let us see an example to illustrate this.
Example 1
Suppose we have a row boat with two rowers, one at each side of the boat, port and starboard. The boat moves forward if both rowers strike at the same time. On the other hand, if we have a following wind, the boat moves forward anyway.
Suppose now that both rowers did indeed strike at the same time while we additionally had a following wind. A possible encoding for this example could be the set of rules :
In the only causal stable model of this program, atom was justified by two alternative and independent causes. On the one hand, cause representing the joint interaction of and . On the other hand, cause inherited from . We label rules (in the above program only atoms) that we want to be reflected in causes. Unlabelled rules are just ignored when reflecting causal information. For instance, if we decide to keep track of the application of these rules, we would handle instead a program obtained just by labelling these two rules in as follows:
(1)  
(2) 
The two alternative justifications for atom become the pair of causes and . The informal reading of is that “the joint interaction of and , the cause , is necessary to apply rule .” From a graphical point of view, we can represent causes as proof trees.
In this paper, we show that causes can be embedded in an algebra with three internal operations: an addition ‘’ representing alternative justifications for a formula, a commutative product ‘’ representing joint interaction of causes (in a similar spirit to the ‘’ in [3]) and a non-commutative product ‘’ acting as a concatenation or rule application. Using these operations, we can see that the justification for would now correspond to the value , which means that is justified by the two alternative causes, and . The former refers to the application of rule to the joint interaction of and . Similarly, the latter refers to the application of rule to . From a graphical point of view, each cause corresponds to one of the proof trees in Figure 1: the right-hand operand of application corresponds to the head, whereas the left-hand operand corresponds to the product of its children.
The rest of the paper is organised as follows. Section 2 describes the algebra with these three operations and a quite natural ordering relation on causes. The next section studies the semantics for positive logic programs and shows the correspondence between the syntactic proof tree of a standard (non-causal) logic program and the interpretation of each atom in a causal model. Section 4 introduces default negation and stable models. Finally, Section 5 concludes the paper.
2 Algebra of causal values
As introduced above, our set of causal values will constitute an algebra with three internal operations: addition ‘’ representing alternative causes, product ‘’ representing joint interaction between causes and rule application ‘’. We now define causal terms, the syntactic counterpart of (causal) values, just as combinations of these three operations over labels (events).
Definition 1 (Causal term)
A causal term, , over a set of labels is recursively defined as one of the following expressions:
where is a label , are in their turn causal terms and is a (possibly empty or possibly infinite) set of causal terms. The set of causal terms over is denoted by .
As we can see, infinite products and sums are allowed whereas a term may only contain a finite number of concatenation applications. Constants and will be shorthands for the empty sum and the empty product , respectively.
We adopt the following notation. To avoid an excessive use of parentheses, we assume that ‘’ has the highest priority, followed by ‘’ and ‘’ as usual, and we further note that the three operations will be associative. When clear from the context, we will sometimes remove ‘’ so that, for instance, the term stands for . As we will see, two (syntactically) different causal terms may correspond to the same causal value. However, we will impose the Unique Names Assumption (UNA) for labels, that is, for any two (syntactically) different labels , and similarly and for any label .
To fix properties of our algebra we recall that addition ‘’ represents a set of alternative causes and product ‘’ a set of causes that are jointly used. Thus, since both represent sets, they are associative, commutative and idempotent. In contrast, application ‘’ is associative but not commutative. Note that the right-hand operand represents the applied rule and the left-hand one represents a cause that is necessary to apply it; therefore, they are clearly not interchangeable. We can note another interesting property: application ‘’ distributes over both addition ‘’ and product ‘’. To illustrate this idea, consider the following variation of our example. Suppose now that the boat also leaves a wake behind when it moves forward. Let be the set of rules plus the rule reflecting this new assumption. As we saw, is justified by and thus will be justified by applying rule to it, i.e. the value . We can also see that there are two alternative causes justifying , graphically represented in Figure 2.
The term that corresponds to this graphical representation is . Moreover, application ‘’ also distributes over product ‘’ and is equivalent to . Intuitively, if the joint interaction of and is necessary to apply then both and are also necessary to apply it, and conversely. Note that each chain of applications , , and corresponds to a path in one of the trees in Figure 2. Causes can be seen as sets (products) of paths (causal chains).
Definition 2 (Causal Chain)
A causal chain over a set of labels is a sequence , or simply , with length and .
We write to stand for the set of causal chains over and will use letters to denote elements from that set. It suffices to have a non-empty set of labels, say , to get an infinite set of chains , although all of them have a finite length. It is easy to see that, by an exhaustive application of distributivity, we can “shift” inside all occurrences of the application operator so that it only occurs in the scope of other application operators. A causal term obtained in this way is a normal causal term.
Definition 3 (Normal causal term)
A normal causal term, , over a set of labels is recursively defined as one of the following expressions:
where is a causal chain over and is a (possibly empty or possibly infinite) set of normal causal terms. The set of normal causal terms over is denoted by .
Proposition 1
Every causal term can be normalized, i.e. written as an equivalent normal causal term .
In the same way as application ‘’ distributes over addition ‘’ and product ‘’, the latter, in its turn, also distributes over addition ‘’. Consider a new variation of our example to illustrate this fact. Suppose that we now have two port rowers that can strike, encoded as the set of rules :
We can see that, in the only causal stable model of this program, atom is justified by two alternative and independent causes, and , and after applying unlabelled rules to them, the resulting value assigned to is . It is also clear that there are two alternative causes justifying : the result from combining the starboard rower strike with each of the port rower strikes, and . That is, causal terms and are equivalent.
Furthermore, as we introduced above, causes can be ordered by a notion of “strength” of justification. For instance, in our example, is justified by two independent causes, while is only justified by . If we consider the program obtained by removing the fact from then remains justified by but becomes false. That is, is “more strongly justified” than in , written . Similarly, . Note also that, in this program , needs the joint interaction of and to be justified but and only need and , respectively. That is, is “more strongly justified” than , written . Similarly, . We can also see that in program which labels rules for , one of the alternative causes for is and this is “less strongly justified” than , i.e. since, from a similar reasoning, needs the application of to whereas only requires itself. In general, we will see that where can be either or . We formalize this order relation starting with causal chains. Notice that a causal chain can be alternatively characterized as a partial function from naturals to labels where for all and undefined for . Using this characterisation, we can define the following partial order among causal chains:
Definition 4 (Chain subsumption)
Given two causal chains and , we say that subsumes , written , if and only if there exists a strictly increasing function such that for each with defined, .
Proposition 2
Given two finite causal chains , they are equivalent (i.e. both and ) if and only if they are syntactically identical.
Informally speaking, subsumes , when we can embed into , or alternatively when we can form by removing (or skipping) some labels from . For instance, take the causal chains and . Clearly we can form by removing , and from . Formally, because we can take some strictly increasing function with and so that and .
Although at first sight it may seem counterintuitive that implies , recall that, as we mentioned, a fact or formula is “more strongly justified” when we need to apply fewer rules to derive it (and so causal chains contain fewer labels), respecting their ordering. In this way, chain is a “stronger justification” than .
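The embedding reading of chain subsumption can be checked mechanically. The following is a minimal Python sketch, assuming chains are represented as tuples of label strings; the function name `embeds` and the labels used are illustrative and do not come from the paper. It only tests whether one chain can be obtained from another by skipping labels while preserving their order; which of the two chains is then called the subsumer is fixed by Definition 4.

```python
def embeds(short, long):
    """Return True if `short` can be formed from `long` by skipping labels.

    Both arguments are tuples of labels. The order of the remaining labels
    must be preserved, which mirrors the strictly increasing function
    required by chain subsumption.
    """
    remaining = iter(long)
    # `label in remaining` consumes the iterator up to the first match,
    # so successive labels are matched strictly left to right.
    return all(label in remaining for label in short)


# Hypothetical chains: the short one skips three labels of the long one.
long_chain = ("a", "b", "c", "d", "e")
short_chain = ("a", "c")

print(embeds(short_chain, long_chain))   # the labels appear in order
print(embeds(("c", "a"), long_chain))    # the order is not respected
```

Under this reading, the shorter chain is the stronger justification, since fewer rule applications are needed.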
As we saw above, a cause can be seen as a product of causal chains, that from a graphical point of view correspond to the set of paths in a proof tree. We notice now an interesting property relating causes and the “more strongly justified” order relation: a joint interaction of comparable causal chains should collapse to the weakest among them. Take, for instance, a set of rules :
where, in the unique causal stable model, corresponds to the value . Informally we can read this as “we need and apply rule to rule to prove ”. Clearly, we are repeating that we need . Term is redundant and then is simply equivalent to . This idea is closely related to the definition of order filter in order theory. An order filter of a poset is a special subset satisfying that for any and , implies . (An order filter is a weaker notion than a filter, which further requires that any pair has a lower bound in too.) An order filter is furthermore generated by an element iff for all elements ; the order filter generated by is written . Considering causes as the union of filters generated by their causal chains, the joint interaction of causes just corresponds to their union. For instance, if we consider the set of labels and its corresponding set of causal chains , then and respectively correspond to the sets of all chains greater than and in the poset . Those are and . The term corresponds just to the union of both sets . We define a cause as follows:
Definition 5 (Cause)
A cause for a set of labels is any order filter for the poset of chains . We will write (or simply when there is no ambiguity) to denote the set of all causes for .
This definition captures the notion of cause, or syntactically a product of causal chains. To capture possible alternative causes, that is, additions of products of causal chains, we notice that addition obeys a similar behaviour with respect to redundant causes. Take, for instance, a set of rules :
It is clear that the cause is sufficient to justify , but there are also infinitely many other alternative and redundant causes , , that justify , that is . To capture a set of alternative causes we define the idea of causal value, in its turn, as a filter of causes.
Definition 6 (Causal Value)
Given a set of labels , a causal value is any order filter for the poset .
The causal value , the filter generated by the cause , is the set containing and all its supersets. That is, . Furthermore, as we will see later, addition can be interpreted as the union of causal values for its respective operands. Thus, just corresponds to the union of the causal values generated by their addend causes, .
The set of possible causal values formed with labels is denoted as . An element from has the form of a set of sets of causal chains that, intuitively, corresponds to a set of alternative causes (sum of products of chains). From a graphical point of view, it corresponds to a set of alternative proof trees represented as their respective sets of paths. We define now the correspondence between syntactical causal terms and their semantic counterpart, causal values.
Definition 7 (Valuation of normal terms)
The valuation of a normal term is a mapping defined as:
Note that any causal term can be normalized and then this definition trivially extends to any causal term. Furthermore, a causal chain is mapped just to the causal value generated by the cause that is, in its turn, generated by , i.e. the set containing all causes which contain . The aggregate union of an empty set of sets (causal values) corresponds to . Therefore , i.e. just corresponds to the absence of justification. Similarly, as causal values range over parts of , the aggregate intersection of an empty set of causal values corresponds to , and thus , i.e. just corresponds to the “maximal” justification.
Theorem 2.1 (From [4])
is the free completely distributive lattice generated by , and the restriction of to is an injective homomorphism (or embedding).
The above theorem means that causal terms form a complete lattice. The order relation between causal terms just corresponds to set inclusion between their corresponding causal values, i.e. iff . Furthermore, addition ‘’ and product ‘’ respectively correspond to the least upper bound and the greatest lower bound of the associated lattice or where:
for any normal term and . For instance, in our example , was associated with the causal term . Thus, the causal value associated with it corresponds to
Causal values are, in general, infinite sets. For instance, as we saw before, simply with we have the chains and contains all possible causes in that are supersets of , that is, . Obviously, writing causal values in this way is infeasible – it is more convenient to use a representative causal term instead. For this purpose, we define a function that acts as a right inverse morphism for selecting minimal causes, i.e., given a causal value , it defines a normal term such that and does not have redundant subterms. The function is defined as a mapping such that for any causal value , where and respectively stand for minimal causes of and minimal chains of . We will use to represent .
Proposition 3
The mapping is a right inverse morphism of .
Given a term we define its canonical form as . Canonical terms have the form of sums of products of causal chains. As can be imagined, not every term of that form is a canonical term. For instance, going back, we can easily check that terms and respectively correspond to the canonical terms and . Figure 3 summarizes addition and product properties while Figure 4 is analogous for application properties.
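The removal of redundant subterms can be illustrated on finite inputs. The following Python sketch is one possible reading of the construction, not the paper's definition: it assumes a cause is represented by a finite set of generating chains (tuples of labels) and a sum by a collection of such sets. The names `embeds`, `minimal_chains`, `covers` and `minimal_causes` are hypothetical. Within a product, a chain that embeds into another chain of the same product is dropped; within a sum, a cause is dropped when another cause of the sum is already covered by it.

```python
def embeds(short, long):
    """True if `short` is obtained from `long` by skipping labels."""
    remaining = iter(long)
    return all(label in remaining for label in short)


def minimal_chains(chains):
    """Keep only the chains that do not embed into a different chain."""
    chains = set(chains)
    return frozenset(
        c for c in chains
        if not any(c != d and embeds(c, d) for d in chains)
    )


def covers(weak, strong):
    """True if every chain of `strong` embeds into some chain of `weak`."""
    return all(any(embeds(s, w) for w in weak) for s in strong)


def minimal_causes(causes):
    """Drop every cause that covers a different cause of the sum."""
    causes = {minimal_chains(c) for c in causes}
    return {
        c for c in causes
        if not any(c != d and covers(c, d) for d in causes)
    }


# Hypothetical labels "a" and "b".
# Product of chain a with chain a.b: chain a is redundant.
print(minimal_chains({("a",), ("a", "b")}))
# Sum of cause a with cause a.b: cause a.b is redundant.
print(minimal_causes([{("a",)}, {("a", "b")}]))
```

The two calls reproduce, for these hypothetical labels, the two kinds of redundancy discussed above: a product collapses to its weakest comparable chain, whereas a sum collapses to its strongest comparable cause.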
For practical purposes, simplification of causal terms can be done by applying the algebraic properties shown in Figures 3 and 4. For instance, the examples from and containing redundant information can now be derived as follows:
Let us see another example involving distributivity. The term can be derived as follows:
3 Positive programs and minimal models
Let us describe now how to use the causal algebra to evaluate causal logic programs. A signature is a pair of sets that respectively represent the set of atoms (or propositions) and the set of labels. As usual, a literal is defined as an atom (positive literal) or its negation (negative literal). In this paper, we will concentrate on programs without disjunction in the head, leaving the treatment of disjunction for a future study.
Definition 8 (Causal logic program)
Given a signature a (causal) logic program is a set of rules of the form:
where is a causal term over , is a literal or (the head of the rule) and is a conjunction of literals (the body of the rule). An empty body is represented as .
For any rule of the form we define . Most of the following definitions are standard in logic programming. We denote , (resp. ) to represent the conjunction of all positive (resp. negative) literals (resp. ) that occur in . A logic program is positive if is empty for all rules (), that is, if it contains no negations. Unlabelled rules are assumed to be labelled with the element which, as we saw, is the identity for application ‘’. (resp. ) represent truth (resp. falsity). If then can be dropped.
Given a signature a causal interpretation is a mapping assigning a causal value to each atom. Partial order is extended over interpretations so that given two interpretations we define for each atom . There is a bottom interpretation (resp. a top interpretation ) that stands for the interpretation mapping each atom to (resp. ). The set of interpretations with the partial order forms a poset with supremum ‘’ and infimum ‘’ that are respectively the sum and product of atom interpretations. As a result, also forms a complete lattice.
Observation 1
When the set of causal values becomes and interpretations collapse to classical propositional logic interpretations.
Definition 9 (Causal model)
Given a positive causal logic program and a causal interpretation over the signature , is a causal model, written , if and only if
for each rule of the form .
For instance, take rule (1) from Example 1 and let be an interpretation such that and . Then will be a model of (1) when . In particular, this holds when which was the value we expected for program . But it would also hold when, for instance, or . Note that this is important if we had to accommodate other possible additional facts or even in the program. The fact that any greater than is also a model clearly points out the need for selecting minimal models. In fact, as happens in the case of non-causal programs, positive programs have a least model (this time, with respect to relation among causal interpretations) that can be computed by iterating an extension of the well-known direct consequences operator defined by [5].
Definition 10 (Direct consequences)
Given a positive logic program for signature and a causal interpretation , the operator of direct consequences is a function such that, for any atom :
In order to prove some properties of this operator, an important observation should be made: since the set of causal values now forms a lattice, causal logic programs can be translated to Generalized Annotated Logic Programming (GAP). GAP is a general framework for multi-valued logic programming where the set of truth values must form an upper semilattice and rules (annotated clauses) have the following form:
(3) 
where are literals, is an annotation (which may be just a truth value, an annotation variable or a complex annotation) and are values or annotation variables. A complex annotation is the result of applying a total continuous function to a tuple of annotations. For instance can be a complex annotation that applies the function to an m-tuple of annotation variables in the body of (3). Given a positive program , each rule of the form
(4) 
is translated to an annotated clause of the form of (3) where are annotation variables that capture the causal values of each body literal. The head complex annotation corresponds to the function that forces the head to inherit the causal value obtained by applying the rule label to the product of the interpretation of body literals . The translation of a program is simply defined as:
A complete description of GAP restricted semantics, denoted as , is out of the scope of this paper (the reader is referred to [6]). For our purposes, it suffices to observe that the following important property is satisfied.
Theorem 3.1
A positive causal logic program can be translated to a general annotated logic program s.t. a causal interpretation if and only if . Furthermore, for any interpretation .
Corollary 1
Given a positive logic program the following properties hold:
(i) Operator is monotonic.
(ii) Operator is continuous.
(iii) is the least model of .
(iv) The iterative computation reaches the least fixpoint in steps for some positive integer .
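The iteration of the direct consequences operator can be sketched in a few lines of Python. This is a simplified illustration under several assumptions that are not part of the paper: rules are triples `(label, head, body)`, a causal value is a finite set of causes, a cause is a frozenset of chains, and a chain is a tuple of labels. The atom names (`port`, `starb`, `wind`, `fwd`) and labels (`p`, `s`, `w`, `a`, `b`) are hypothetical stand-ins for the rowing example. The sketch performs no removal of redundant causes, so it is only meant for finite programs without recursion through labelled rules.

```python
from itertools import product

# Each rule is (label, head, body). A label of None marks an unlabelled rule.
RULES = [
    ("p", "port", []),
    ("s", "starb", []),
    ("w", "wind", []),
    ("a", "fwd", ["port", "starb"]),
    ("b", "fwd", ["wind"]),
]


def apply_label(cause, label):
    """Append `label` to every chain of `cause` (a frozenset of tuples)."""
    if label is None:        # unlabelled rule: the identity for application
        return cause
    if not cause:            # empty product: the chain is just the label
        return frozenset({(label,)})
    return frozenset(chain + (label,) for chain in cause)


def direct_consequences(rules, interp):
    """One step of the operator: evaluate every rule body under `interp`."""
    new = {atom: set() for atom in interp}
    for label, head, body in rules:
        # Pick one cause per body atom; their joint interaction is the
        # union of the chains involved.
        for combo in product(*(interp[atom] for atom in body)):
            joint = frozenset().union(*combo)
            new[head].add(apply_label(joint, label))
    return new


def least_model(rules, max_steps=1000):
    """Iterate from the bottom interpretation until nothing changes."""
    atoms = {head for _, head, _ in rules}
    atoms |= {atom for _, _, body in rules for atom in body}
    interp = {atom: set() for atom in atoms}
    for _ in range(max_steps):
        nxt = direct_consequences(rules, interp)
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


model = least_model(RULES)
for atom in sorted(model):
    print(atom, model[atom])
```

For this hypothetical encoding, the value computed for `fwd` contains two causes: one made of the chains `p.a` and `s.a`, and one made of the chain `w.b`, matching the two alternative justifications described in the introduction.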
The existence of a least model for a positive program and its computation using is an interesting result, but it does not provide any information on the relation between the causal value it assigns to each atom with respect to its role in the program. As we will see, we can establish a direct relation between this causal value and the idea of proof in the positive program. Let us formalise next the idea of proof tree.
Definition 11 (Proof tree)
Given a causal logic program , a proof tree is a directed acyclic graph , where vertices are rules from the program, and satisfying:
(i) There is exactly one vertex without outgoing edges, denoted as .
(ii) For each rule and for each atom there is exactly one with and this rule satisfies .
Notice that condition (ii) forces us to include an incoming edge for each atom in the positive body of a vertex rule. As a result, source vertices must be rules with empty positive body, or just facts in the case of positive programs. Another interesting observation is that proof trees do not require a unique parent for each vertex. For instance, in Example 1, if both and were obtained as a consequence of some command made by the captain, we could get instead a proof tree, call it , of the form:
Definition 12 (Proof path)
Given a proof tree we define a proof path for as a concatenation of terms satisfying:
(i) There exists a rule with such that is a source, that is, there is no s.t. .
(ii) For each pair of consecutive terms in the sequence, there is some edge s.t. and .
(iii) .
Let us write to stand for the set of all proof paths for a given proof tree . We define the cause associated to any tree as the causal term . As an example, . Also and correspond to each tree in Figure 1.
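The set of proof paths of a small proof tree can be enumerated directly. The following Python sketch assumes a proof tree given as a dictionary `parents` mapping each vertex to the vertices with an edge into it, and a dictionary `label` mapping each vertex to its rule label (`None` for unlabelled rules, which are skipped as the identity for application). The vertex names and labels are hypothetical and chosen to resemble the rowing example; none of these names come from the paper.

```python
def proof_paths(label, parents, root):
    """Return the set of label chains from each source vertex to `root`."""

    def paths_to(vertex):
        # The label of the current rule closes every path reaching it.
        tail = (label[vertex],) if label[vertex] is not None else ()
        if not parents[vertex]:          # source vertex: the path starts here
            return {tail}
        return {
            path + tail
            for parent in parents[vertex]
            for path in paths_to(parent)
        }

    return paths_to(root)


# Hypothetical proof tree: two facts feeding one labelled rule.
label = {"r_port": "p", "r_starb": "s", "r_fwd": "a"}
parents = {"r_port": [], "r_starb": [], "r_fwd": ["r_port", "r_starb"]}

print(proof_paths(label, parents, "r_fwd"))
```

The cause associated with the tree is then the product of these paths, which in the set-based representation is simply the returned set of chains.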
Theorem 3.2
Let be a positive program and be the least model of , then for each atom :
where is a set of proof trees with nodes .
From this result, it may seem that our semantics is just a direct translation of the syntactic idea of proof trees. However, the semantics is actually a more powerful notion that allows detecting redundancies, tautologies and inconsistencies. In fact, the expression may contain redundancies and is not, in the general case, in normal form. As an example, recall the program :
that has only one proof tree for whose cause would correspond to . But, by absorption, this is equivalent to pointing out that the presence of in rule is redundant.
A corollary of Theorem 3.2 is that we can replace a rule label by a different one, or by (the identity for application ‘’) and we get the same least model, modulo the same replacement in the causal values for all atoms.
Corollary 2
Let be a positive program, the least model of , be a label, and (resp. ) be the program (resp. interpretation) obtained after replacing each occurrence of by in (resp. in the interpretation of each atom in ). Then is the least model of .
In particular, replacing a label by has the effect of removing it from the signature. Suppose we make this replacement for all atoms in and call the resulting program and least model and respectively. Then is just the non-causal program resulting from after removing all labels and it is easy to see (Observation 1) that coincides with the least classical model of this program. (Note that is Boolean: it assigns either or to any atom in the signature.) Moreover, this means that for any positive program , if is its least model, then the classical interpretation:
is the least classical model of ignoring its labels.
4 Default negation and stable models
Consider now the addition of negation, so that we deal with arbitrary programs. To illustrate this, we introduce a variation of Example 1 involving the qualification problem from [7]: actions for moving the boat forward can be disqualified if an abnormal situation occurs (for instance, that the boat is anchored, any of the oars is broken, the sail is full of holes, etc.). As usual, this can be represented using default negation as shown in the set of rules :
The causes that justify an atom should not be a list of abnormal situations that did not occur. For instance, in program where no abnormal situation occurs, the causal value that justifies atom should be as in the program where abnormal situations are not included. References to the abnormal situations that did not occur (, …) are not mentioned. Default negation does not affect the causes justifying an atom when the default holds. Of course, when the default does not hold, for instance after adding the fact to the above program, becomes false. Thus, we introduce the following straightforward rephrasing of the traditional program reduct [2].
Definition 13 (Program reduct)
The reduct of a program with respect to an interpretation , written is the result of the following transformations on :
(i) Removing all rules s.t.
(ii) Removing all negative literals from the rest of rules.
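The two transformations above can be sketched in Python. This is an illustration under assumptions, not the paper's formulation: rules are tuples `(label, head, positive_body, negative_body)`, an interpretation maps atoms to causal values represented as sets of causes, and an empty set stands for a false atom. Following the traditional reduct [2], the sketch removes a rule when one of its negated atoms is true in the interpretation. The atom and label names (`ab`, `port`, `starb`, `fwd`, `a`) are hypothetical.

```python
def reduct(rules, interp):
    """Return the reduct of `rules` with respect to `interp`."""
    result = []
    for label, head, positive, negative in rules:
        # A rule is discarded when some negated atom has a non-empty value.
        if any(interp.get(atom) for atom in negative):
            continue
        # Otherwise the rule is kept without its negative literals.
        result.append((label, head, list(positive), []))
    return result


# Hypothetical rule: the boat moves forward unless something is abnormal.
rules = [("a", "fwd", ["port", "starb"], ["ab"])]

no_abnormality = {"ab": set()}
abnormality = {"ab": {frozenset({("broken",)})}}

print(reduct(rules, no_abnormality))   # the rule survives without "not ab"
print(reduct(rules, abnormality))      # the rule is removed
```

Since the surviving rules contain no negative literals, the negated atoms leave no trace in the causal values computed from the reduct.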
A causal interpretation is a causal stable model of a causal program if is the least model of . This definition allows us to extend Theorem 3.2 to normal programs in a direct way:
Theorem 4.1 (Main theorem)
Let be a causal program and be a causal stable model of , then for each atom :
.
That is, the only difference now is that the set of proof trees is formed with rules whose negative body is not false (that is, they would generate rules in the reduct).
Corollary 3
Let be a normal program, a causal stable model of , be a label, and (resp. ) be the program (resp. interpretation) obtained after replacing every occurrence of by in (resp. in the interpretation of each atom in ). Then is a causal stable model of .
As in the case of positive programs, replacing a label by has the effect of removing it from the signature. Then, for any normal program , if is a causal stable model, then the classical interpretation:
is a classical stable model of ignoring its labels. It is easy to see that not only does the above program have a unique causal stable model, which corresponds to:
but the program obtained from it by ignoring the labels also has a unique standard stable model, which corresponds to the atoms whose interpretations differ from in the former.
5 Conclusions
In this paper we have provided a multi-valued semantics for normal logic programs whose truth values form a lattice of causal chains. A causal chain is nothing else but a concatenation of rule labels that reflects some sequence of rule applications. In this way, a model assigns to each true atom a value that contains justifications for its derivation from existing rules. We have further provided three basic operations on the lattice: an addition, that stands for alternative, independent justifications; a product, that represents joint interaction of causes; and a concatenation that acts as a chain constructor. We have shown that this lattice is completely distributive and provided a detailed description of the algebraic properties of its three operations.
A first important result is that, for positive programs, there exists a least model that coincides with the least fixpoint of a direct consequences operator, analogous to [5]. With this, we are able to prove a direct correspondence between the semantic values we obtain and the syntactic idea of proof tree. The main result of the paper generalises this correspondence for the case of stable models for normal programs.
Many open topics remain for future study. For instance, ongoing work is currently focused on implementation, complexity assessment, extension to disjunctive programs or introduction of strong negation. Regarding expressivity, an interesting topic is the introduction of new syntactic operators for inspecting causal information like checking the influence of a particular event or label in a conclusion, expressing necessary or sufficient causes, or even dealing with counterfactuals. Another interesting topic is removing the syntactic reduct definition in favour of some full logical treatment of default negation, as happens for (noncausal) stable models and their characterisation in terms of Equilibrium Logic [8]. This would surely simplify the quest for a necessary and sufficient condition for strong equivalence, following similar steps to [9]. It may also allow extending the definition of causal stable models to an arbitrary syntax and to the first order case, where the use of variables in labels may also introduce new interesting features.
There are also other areas whose relations deserve to be formally studied. For instance, the introduction of a strong negation operator will immediately lead to a connection to Paraconsistency approaches. In particular, one of the main problems in the area of Paraconsistency is deciding which parts of the theory do not propagate or depend on an inconsistency. This decision, we hope, will be easier in the presence of causal justifications for each derived conclusion. A related area for which similar connections can be exploited is Belief Revision. In this case, causal information can help to decide which relevant part of a revised theory must be withdrawn in the presence of new information that would lead to an inconsistency if no changes are made. A third obvious related area is Debugging in Answer Set Programming, where we try to explain discrepancies between an expected result and the obtained stable models. In this field, there exists a pair of relevant approaches [10, 11] with which we plan to compare our work. Finally, as potential applications, our main concern is designing a high level action language on top of causal logic programs with the purpose of modelling some typical scenarios from the literature on causality in Artificial Intelligence.
References
 [1] Cabalar, P.: Causal logic programming. In Erdem, E., Lee, J., Lierler, Y., Pearce, D., eds.: Correct Reasoning. Volume 7265 of Lecture Notes in Computer Science, Springer (2012) 102–116
 [2] Gelfond, M., Lifschitz, V.: The stable model semantics for logic programming. In Kowalski, R.A., Bowen, K.A., eds.: Logic Programming: Proc. of the Fifth International Conference and Symposium (Volume 2). MIT Press, Cambridge, MA (1988) 1070–1080
 [3] Artëmov, S.N.: Explicit provability and constructive semantics. Bulletin of Symbolic Logic 7(1) (2001) 1–36
 [4] Stumme, G.: Free distributive completions of partial complete lattices. Order 14 (1997) 179–189
 [5] van Emden, M.H., Kowalski, R.A.: The semantics of predicate logic as a programming language. J. ACM 23(4) (1976) 733–742
 [6] Kifer, M., Subrahmanian, V.S.: Theory of generalized annotated logic programming and its applications. Journal of Logic Programming 12 (1992)
 [7] McCarthy, J.: Epistemological problems of artificial intelligence. In Reddy, R., ed.: IJCAI, William Kaufmann (1977) 1038–1044
 [8] Pearce, D.: Equilibrium logic. Ann. Math. Artif. Intell. 47(1–2) (2006) 3–41
 [9] Lifschitz, V., Pearce, D., Valverde, A.: Strongly equivalent logic programs. ACM Trans. Comput. Log. 2(4) (2001) 526–541
 [10] Gebser, M., Pührer, J., Schaub, T., Tompits, H.: A meta-programming technique for debugging answer-set programs. In Fox, D., Gomes, C.P., eds.: AAAI, AAAI Press (2008) 448–453
 [11] Pontelli, E., Son, T.C., El-Khatib, O.: Justifications for logic programs under answer set semantics. TPLP 9(1) (2009) 1–56