This paper presents an approach to integration of normal logic programs under the well-founded semantics with external first-order theories. The problem is motivated by the discussion about the rule level of the Semantic Web.
It is often claimed that rule-based applications need non-monotonic reasoning for handling negative information. As the issue of non-monotonic reasoning and negation was thoroughly investigated in the context of logic programming (see e.g. [AB94] for a survey of classical work on this topic), it would be desirable to build on this expertise. Well-established formal semantics, namely the answer set semantics [BG94], the well-founded semantics [vGRS88], and the 3-valued completion semantics of Kunen [Kun87], provide theoretical foundations for existing logic programming systems. On the other hand, applications usually refer to domain-specific knowledge, which is often supported by specific reasoning or computational mechanisms.
Domain-specific variants of logic programs are handled within the constraint logic programming framework CLP [MSW06]. In CLP the concept of constraint domain makes it possible to extend the semantics of pure logic programs and to use domain-specific constraint solvers for sound reasoning. Classical CLP does not support non-monotonic reasoning, but integration of both paradigms is discussed by some researchers (see e.g. [Stu95, DS98, Fag97]). A special kind of domain-specific knowledge is a domain-specific terminology, specified in a formal ontology description language, such as OWL [PSHH04]. This raises the issue of integration of rules and ontologies, which has received considerable attention (see e.g. [Ros05, MSS05, EIST06, ADG05, dBET08, MR07] and references therein). It is commonly assumed that an ontology is specified as a set of axioms in (a subset of) First Order Logic (FOL), usually in a Description Logic. A rule language, typically a variant of Datalog, is then extended by allowing a restricted use of ontology predicates. The extensions considered in the literature are mostly based on disjunctive Datalog with negation under the answer set semantics.
In contrast to that, our focus is on normal logic programs under the well-founded semantics. Our objective is to extend them in such a way that domain-specific knowledge represented by a first order theory can be accessed from the rules. The theory will be called the external theory. Going beyond Datalog makes it possible to use data structures like lists for programming in the extended rule language. We want to define the semantics of the extended language so that the existing reasoners for normal logic programs and for the external theory can be re-used for querying the extended programs. Thus, our objective is to provide a framework for hybrid integration of normal logic programs and external theories. Integration of Datalog rules with ontologies specified in a Description Logic could then be handled within the framework as a special case.
The choice of the well-founded semantics as the basis for our approach is motivated by the existence of top-down query answering algorithms, which facilitate building a query answering method for our framework. Notice also that the well-founded semantics and the answer set semantics are equivalent for a wide class of programs, including stratified normal programs.
We introduce a notion of hybrid program; such a program is a pair where is a set of axioms in a first order language and is a set of hybrid rules. and share function symbols but have disjoint alphabets of predicate symbols. This reflects the intuition that domain-specific knowledge is shared by many applications and is not redefined by the applications. A hybrid rule is a normal clause whose body may include a formula in , called the constraint of the rule. We define declarative semantics of hybrid programs as a natural extension of the well-founded semantics of logic programs. It is 3-valued; the semantics of a program is a set of ground literals over the alphabet of . The semantics is undecidable (as is the well-founded semantics for normal programs). However it is decidable for Datalog hybrid programs, provided the constraints are decidable.
The operational semantics presented in this paper is goal driven, allows non ground goals, and its internal data are possibly non ground goals and constraints. It combines a variant of SLS-resolution (see e.g. [AB94]) with handling constraints. The latter includes checking satisfiability of the constraints w.r.t. , which is assumed to be done by a reasoner of . Thus the operational semantics provides a basis for development of implementations integrating LP (logic programming) reasoners supporting/approximating the well-founded semantics (such as XSB Prolog [SSW07]) with constraint solvers. The operational semantics is sound w.r.t. the declarative one, under rather weak sufficient conditions. It is complete for Datalog hybrid programs under a certain syntactic condition of safeness.
In the special case of hybrid rules without non-monotonic negation the rules can be seen as the usual implications of the FOL, thus as additional axioms extending . In this case every ground atom which is true in the semantics of the hybrid program is a (2-valued) logical consequence of .
The paper is organized as follows. Section 2 gives an (informal) introduction to the well-founded semantics of normal logic programs, and presents the notion of constraint used in this paper. Basic ideas of Description Logics and their use for defining ontologies and ontological constraints are briefly discussed. Section 3 gives a formal presentation of the syntax and declarative semantics of the generic language of hybrid rules, parameterized by the constraint domain. Section 4 introduces the operational semantics; then soundness and completeness results relating the declarative semantics and the operational semantics are stated and proven. The last two sections contain discussion of related work and conclusions. A preliminary, abbreviated version of this work appeared as [DM07].
2.1 Normal logic programs and the well-founded semantics
In this work we use the standard terminology and notation of logic programming (see e.g. [Apt97]).
The language of hybrid rules will be defined as an extension of normal logic programs. We assume that the programs are built over a first-order alphabet including a set of predicates, a set of variables and a set of function symbols with different arities including a non-empty set of symbols of arity 0, called constants.
Atomic formulae (or atoms) and terms are built in a usual way. A literal is an atomic formula (positive literal) or a negated atomic formula (negative literal). A literal (a term) not including variables is called ground.
A normal logic program is a finite set of rules of the form
where is an atomic formula, and are literals. The rules are also called normal clauses. The rules with empty bodies () are called facts or unary clauses; they are usually written without . A normal clause is called definite clause iff all literals of its body are positive. A definite program is a finite set of definite clauses. In this paper, a Datalog program is a normal logic program with being a finite set of constants.
The Herbrand base is the set of all ground atoms built with the predicates, constants, and function symbols of . For a subset , by we denote the set of negations of the elements of , . A ground instance of a rule is a rule obtained by replacing each variable of by a ground term over the alphabet. The set of all ground instances of the rules of a program will be denoted . Notice that in the case of Datalog is a finite set of ground rules.
A 3-valued Herbrand interpretation (shortly – interpretation) of is a subset of such that for no ground atom both and are in . Intuitively, the set assigns the truth value (true) to all its members. Thus is false (has the truth value ) in iff , and is false in iff . If and then the truth value of (and that of ) is (undefined). This is in a natural way generalized to non ground atoms and non atomic formulae (see e.g. [AB94]). An interpretation is a model of a formula (which is denoted by ) iff is true in .
As usual, a 2-valued Herbrand interpretation is a subset of . It assigns the value to all its elements and the value to all remaining elements of the Herbrand base. It is well known that any definite program has a unique least (in the sense of set inclusion) 2-valued Herbrand model. We will denote it . A normal program may not have a least Herbrand model.
The well-founded semantics of logic programs [vGRS88] assigns to every program a unique (three valued) Herbrand model, called the well-founded model of . Intuitively, the facts of a program should be true, and the ground atoms which are not instances of the head of any rule should be false. This information can be used to reason which other atoms must be true and which must be false in any Herbrand model. Such reasoning gives in the limit the well-founded model, where the truth values of some atoms may still be undefined. The well-founded semantics has several equivalent formulations. We briefly sketch here a definition following that of [FD93].
While defining the well-founded model, for every predicate symbol we will treat as a new distinct predicate symbol. A normal program can thus be treated as a definite program over Herbrand base . A 3-valued interpretation over can be treated as a 2-valued interpretation over .
Let be such an interpretation (). We define two ground, possibly infinite, definite programs and . For a given program , is the ground instantiation of together with ground unary clauses that show which negative literals are true in .
is similar but all the negative literals that are true or undefined in are made true here:
Now we define an operator which produces a new Herbrand interpretation of :
It can be proved that the operator is monotonic; whenever . Its least fixed point is called the well-founded model of program . For some countable ordinal we have .
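For a finite ground program the construction above can be illustrated by a short Python sketch. It is not the authors' implementation; it assumes the reading of the definition given in the text (the first definite program adds as facts the negative literals true in the current interpretation, the second one adds those that are true or undefined), and the names `least_model`, `well_founded_model`, `win` and the example graph are hypothetical.

```python
from typing import List, Set, Tuple

# A ground normal rule: (head, positive body atoms, negated body atoms).
Rule = Tuple[str, Tuple[str, ...], Tuple[str, ...]]


def least_model(rules: List[Rule], assumed_false: Set[str]) -> Set[str]:
    """Least model of the definite program obtained by treating 'not a'
    as a fact for every atom a in assumed_false."""
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if (head not in model
                    and all(a in model for a in pos)
                    and all(a in assumed_false for a in neg)):
                model.add(head)
                changed = True
    return model


def well_founded_model(rules: List[Rule],
                       atoms: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Iterate the operator from the empty interpretation up to its least
    fixed point. Returns (true atoms, false atoms); the rest is undefined."""
    true: Set[str] = set()
    false: Set[str] = set()
    while True:
        # First program: only negative literals true in I are facts.
        new_true = least_model(rules, false)
        # Second program: negative literals true or undefined in I are facts.
        possibly_true = least_model(rules, atoms - true)
        new_false = atoms - possibly_true
        if new_true == true and new_false == false:
            return true, false
        true, false = new_true, new_false


# Hypothetical graph (an assumption, not taken from the paper):
# a <-> b, b -> c, c -> d.
# Ground instances of "win(X) <- move(X, Y), not win(Y)" with move/2 resolved.
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]
rules = [(f"win({x})", (), (f"win({y})",)) for x, y in edges]
atoms = {f"win({p})" for p in "abcd"}

true_atoms, false_atoms = well_founded_model(rules, atoms)
print(sorted(true_atoms), sorted(false_atoms))
```

On this hypothetical graph `win(c)` comes out true and `win(d)` false, while `win(a)` and `win(b)` stay undefined, since each of the two positions can only be shown winning by assuming the other is losing.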
The following example shows a simple Datalog program and its well-founded model.
A two-person game consists in moving a token between vertices of a directed graph. Each move consists in traversing one edge from the current position. The players move alternately, one move at a time. The graph is described by a database of facts corresponding to the edges of the graph. A position is said to be a winning position if there exists a move from to a position which is a losing (non-winning) position:
Consider the graph
and assume that it is encoded by the facts of the program. The winning positions are . The losing positions are . Position is not a losing one since the player has the option of moving to , from which the partner can only return to . This intuition is properly reflected by the well-founded model of the program, which contains the following literals with the predicate symbol : .
A non Datalog version of this example with an infinite graph is presented in [Dra93].
2.2 External theories
In this section we discuss logical theories to be integrated with logic programs.
Our objective is to define a general framework for extending normal logic programs, which, among others, can also be used for integration of Datalog rules with ontologies. Syntactically, the clauses of a logic program are extended by adding certain formulae of a certain logical theory. The added formulae will be called constraints. We use this term due to similarities with constraint logic programming [MSW06].
We will consider a 2-valued FOL theory, called external theory or constraint theory. A set of its formulae is chosen as the set of constraints. Our operational semantics imposes certain restrictions on the set of constraints. They are introduced together with the operational semantics. The declarative semantics works for an arbitrary set of constraints. The function symbols and the variables of the language of the external theory are the same as those of the language of rules. On the other hand, the predicate symbols of both languages are distinct. We will call them constraint predicates and rule predicates. We assume that the external theory is given by a set of axioms and the standard consequence relation of the FOL, or equivalently the logical consequence . (Other consequence operations can be used instead; for instance deriving those formulae which are true in a canonical model of , or in a given class of models.) We will sometimes use as the name of the theory.
Sometimes one deals with an external theory whose set of function symbols is a proper subset of the set of function symbols of the rules. For instance the external theory uses only constants, and the rules employ term constructors (i.e. non constant function symbols). In such a case we simply extend the alphabet of the external theory so that its set of function symbols is . The modified external theory is a conservative extension of the original one [Sho67]. A formula without symbols from is a logical consequence of in one of them iff it is a logical consequence in the other. Thus such a modification of the external theory is inessential; this justifies our assumption of a common alphabet of function symbols.
2.2.2 Ontologies and ontological constraints
This section surveys some basic concepts of Description Logics (DLs) [BCM03] and the use of DLs for specifying ontologies. An ontology may be defined as a “specification of a conceptualization” [Gru95]. An ontology should thus provide a formal definition of the terminology to be shared.
Description Logics are specific fragments of the FOL. The syntax of a DL is built over disjoint alphabets of class names, property names and individual names. From the point of view of FOL they are, respectively, one and two argument predicate symbols, and constants. Depending on the kind of DL, different constructors are provided to build class expressions (or briefly classes) and property expressions (or briefly properties). Some DLs also make it possible to represent concrete datatypes, such as strings or integers. In that case one distinguishes between individual-valued properties and data-valued properties.
By an ontology we mean a finite set of axioms in some decidable DL. The axioms describe classes and properties of the ontology and assert facts about individuals and data. An ontology is thus a DL knowledge base consisting of two parts: a TBox (terminology) including class axioms and property axioms and an ABox (assertions) stating facts about individuals and data. The axioms of DLs can be seen as an alternative representation of FOL formulae. Thus, the semantics of DLs is defined by referring to the usual notions of interpretation and model, and an ontology can be considered a FOL theory.
For most decidable DLs there exist well-developed automatic reasoning techniques. Given an ontology in a DL one can use a respective reasoner for checking if a formula is a logical consequence of . If and then is true in some models of the ontology and false in some other models.
Ontologies are often specified in the standard Web ontology language OWL DL, based on the Description Logic . OWL ontologies can be seen as sets of axioms in this DL.
OWL DL class axioms make it possible to state class equivalence and class inclusion , where is a class name and is a class expression. Class expressions are built from class names using constructors, such as (the universal concept), (the bottom concept), intersection, union and complement. Classes can also be described by direct enumeration of members and by restrictions on properties (for more details see [PSHH04]).
Property axioms make it possible to state inclusion and equivalence of properties, specify the domain and the range of a property, state that a property is symmetric, transitive, functional, or inverse functional.
OWL DL assertions indicate members of classes and properties. Individuals are referred to by individual names. It is possible to declare that given individual names represent the same individual or that each of them represents a different individual.
The following example using some expressive constructions of OWL DL will be used in the sequel to discuss how integration of Datalog with OWL DL ontologies is achieved in our framework.
In some research area an author of at least 3 books is considered an expert. An OWL DL ontology referring to this research area has classes and , and a property with domain and range . The class can now be defined using OWL DL cardinality restriction (in the Manchester OWL Syntax, http://www.co-ode.org):
The property has the inverse property . The following class expression defines the class of authors who co-authored a book with a given author (e.g. )
All individuals of class which appear in the ontology are declared as distinct. The ontology states that the individuals and of class are the same. (This may happen e.g. due to a change of the name of a person). There are also authors and ; , and are declared to be distinct. In addition, the ontology asserts that is the author of the books and is the author of the books . Thus an OWL DL reasoner will conclude that () is an expert.
2.3 Datalog with Constraints: Introductory Examples
We now illustrate the idea of adding constraints to rule bodies with two simple examples. The intention is to give an informal introduction to the semantics of hybrid rules. The first example will be used later on to accompany the formal presentation of the declarative and operational semantics of our framework. The second one illustrates some aspects of expressing external theories in OWL DL.
The example describes a variant of the game from Example 2.1 where the rules are subject to additional restrictions. Assume that the positions of the graph represent geographical locations described by an ontology. The ontology provides, among others, the following information
subclass relations (TBox axioms): e.g. (locations in Finland are locations in Europe);
classification of some given locations represented by constants (ABox axioms). For instance, assuming that the positions of Example 2.1 represent locations we may have ( is a location in Finland), ( is a location in Europe).
A note on notation: symbol is used to denote two kinds of negation. Within a constraint it is the classical negation of the external theory. When applied to a rule predicate, denotes nonmonotonic negation. Thus two distinct negation symbols are not needed. We now add some restrictions as ontological constraints added to the facts and :
Intuitively, this would mean that the move from to is allowed only if is in Europe and the move from to only if is not in Finland. These restrictions may influence the outcome of the game: will still be a losing position but if the axioms of the ontology do not allow us to conclude that is in Europe, we cannot conclude that is a winning position. However, we can conclude that if is not in Europe then it cannot be in Finland. Thus, at least one of the conditions holds. If then, as in Example 2.1, is a winning position, is a losing one, hence is a winning position. On the other hand, if then the move from to is allowed, in which case is a winning position. Therefore is always a winning position; is considered to be a consequence of the program.
A committee of reviewers is to be created for evaluation of the applicants for an open position. A reviewer has to be an expert, as defined by the ontology of Example 2.2, and must not have a conflict of interest (coi) with an applicant. Persons who are co-authors of a book have coi. (This implies that an author of a book has coi with himself/herself; this applies in particular to each expert.) Additionally, some conflicts of interest are declared by facts.
The following rules define a potential reviewer for a candidate (relation ). Two constraints are used: and . They refer to the ontology of Ex. 2.2.
The intention is to query the rules and the ontology for checking if a given person may be a reviewer for a given candidate. Consider the individual of Example 2.2 and check if she might be appointed a reviewer for some of the people named in the ontology. An OWL DL reasoner can check that is an expert and that she has a conflict of interest with herself, i.e. with alias . The conflict of interest with is stated explicitly. So cannot be appointed a reviewer for herself or for .
To check if has a conflict of interest with one has to refer to the ontology for checking if they co-authored a book. If this is confirmed by the reasoner (e.g. when the ontology asserts that both and are authors of ) then is true and cannot be a reviewer for . If non-existence of any co-authored book follows from the ontology, then is false and can be a reviewer. Otherwise (i.e. when they are co-authors in some models of the ontology, and are not in some others) may be a reviewer for under the condition that they did not co-author a book. This constraint should be returned in the answer to the query.
An example employing non-nullary function symbols is given in [DHM07a]. The semantics of hybrid programs presented below formalizes the intuitions presented in the examples of this section.
3 Integration of rules and external theories
This section defines the syntax and the (declarative) semantics of hybrid programs, integrating normal rules with first-order theories. The general principles discussed here apply in a special case to integration of Datalog with ontologies specified in Description Logics.
We consider a first-order alphabet including, as usual, disjoint alphabets of predicate symbols , function symbols (including a set of constants) and variables . We assume that consists of two disjoint sets (rule predicates) and (constraint predicates). The atoms and the literals constructed with these predicates will respectively be called rule atoms (rule literals) and constraint atoms (constraint literals). We will combine rules over alphabets with an external theory over , employing constraints (a distinguished set of formulae of ).
A hybrid rule (over ) is an expression of the form:
where, each is a rule literal and is a constraint (over ); is called the constraint of the rule.
A hybrid program is a pair where is a set of hybrid rules and is a set of axioms over .
Hybrid rules are illustrated in Example 2.3. We adopt a convention that a constraint true, which is a logical constant interpreted as , is omitted. Usually we do not distinguish between sequences, like , and conjunctions, like . Notation will be used to denote a sequence of rule literals (similarly a sequence of terms, etc.); will denote a conjunction of equalities .
3.2 Declarative Semantics
The declarative semantics of hybrid programs is defined as a generalization of the well-founded semantics of normal programs; it refers to the models of the external theory of a hybrid program. Given a hybrid program we cannot define a unique well-founded model of since we have to take into consideration the logical values of the constraints in the rules. However, a unique well-founded model can be defined for any given model of . Roughly speaking, the constraints in the rules are replaced by their logical values in the model ( or ); then the well-founded model of the obtained logic program is taken. The well-founded models are over the Herbrand universe, but the models of are arbitrary.
By applying a substitution to a formula we mean applying it to the free variables of . Moreover, if a bound variable of occurs in some () then in is replaced by a new variable.
By a ground instance of a hybrid rule , where is the constraint of the rule, we mean any rule , where is a substitution replacing the free variables of by ground terms (over the alphabet ). So the constraint has no free variables, and are ground literals. By we denote the set of all ground instances of the hybrid rules in .
Let be a hybrid program and let be a model of . Let be the normal program obtained from by
removing each rule constraint which is true in (i.e. ),
removing each rule whose constraint is not true in , (i.e. ).
The well-founded model of is called the well-founded model of based on .
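The two steps of this construction can be sketched in Python for a finite set of ground hybrid rules. This is an illustration under simplifying assumptions, not part of the paper: a model of the external theory is represented merely by the set of ground constraint atoms it makes true (real models are arbitrary structures), a constraint is a Boolean function of such a model, and the rule, predicate and constant names below are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple

# Simplifying assumption: a model of the external theory is given by the
# set of ground constraint atoms that are true in it.
Model = FrozenSet[str]
Constraint = Callable[[Model], bool]
# Ground hybrid rule: (head, body rule literals, constraint).
HybridRule = Tuple[str, Tuple[str, ...], Constraint]
# Ground normal rule: (head, body rule literals).
NormalRule = Tuple[str, Tuple[str, ...]]


def reduct(ground_rules: List[HybridRule], model: Model) -> List[NormalRule]:
    """Normal program obtained for a given model of the external theory:
    a rule whose constraint is true in the model loses its constraint,
    a rule whose constraint is not true in the model is dropped."""
    result: List[NormalRule] = []
    for head, body, constraint in ground_rules:
        if constraint(model):
            result.append((head, body))
    return result


# Hypothetical ground hybrid rules in the spirit of the game example.
rules: List[HybridRule] = [
    ("move(p,q)", (), lambda m: "Europe(q)" in m),        # allowed if q is in Europe
    ("move(q,r)", (), lambda m: "Finland(q)" not in m),   # allowed if q is not in Finland
    ("win(p)", ("move(p,q)", "not win(q)"), lambda m: True),
]

in_europe = frozenset({"Europe(q)"})
in_finland = frozenset({"Europe(q)", "Finland(q)"})
elsewhere: Model = frozenset()

for name, model in [("in_europe", in_europe),
                    ("in_finland", in_finland),
                    ("elsewhere", elsewhere)]:
    print(name, [head for head, _ in reduct(rules, model)])
```

The well-founded model based on a given model is then the ordinary well-founded model of the normal program returned by `reduct`.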
A formula (over ) holds (is true) in the well-founded semantics of a hybrid program (denoted ) iff for each well-founded model of .
Notice that the negation in the rule literals is non-monotonic, and the negation in the constraints is that from the external theory, thus monotonic.
We say that is false in the well-founded semantics of if , and that is undefined if the logical value of in each well-founded model of is . There is a fourth case: has distinct logical values in various well-founded models of . Formally, the semantics of does not assign any truth value to such . We may say that its truth value depends on the considered model of the external theory. Classes of models in which has a specific truth value can be characterized by constraints. Such constraints provide sufficient conditions for to have the specific truth value. They are constructed by the proposed operational semantics.
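The four cases can be made concrete with a small Python sketch. It only illustrates the case distinction: it assumes that the logical value of a formula has already been determined in each of a finite, hypothetical collection of well-founded models (in general the collection of models of the external theory is not finite), and the function name `classify` and the value codes are assumptions of this sketch.

```python
from typing import Iterable


def classify(values: Iterable[str]) -> str:
    """Classify a formula from its logical values in the well-founded models.

    Each value is 't' (true), 'f' (false) or 'u' (undefined), one per
    well-founded model of the hybrid program.
    """
    seen = set(values)
    if not seen:
        raise ValueError("at least one well-founded model is required")
    if not seen <= {"t", "f", "u"}:
        raise ValueError("values must be 't', 'f' or 'u'")
    if seen == {"t"}:
        return "true"
    if seen == {"f"}:
        return "false"
    if seen == {"u"}:
        return "undefined"
    # Fourth case: the value depends on the model of the external theory.
    return "model-dependent"


print(classify(["t", "t"]))   # same value in every model
print(classify(["t", "f"]))   # value differs between models
```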
For the hybrid program of Example 2.3 we have to consider models of the ontology . For every model of such that the program includes the fact . The well-founded model of includes thus the literals (independently of whether ).
On the other hand, for every model of the ontology such that the program includes the fact . The well-founded model of includes thus the literals (independently of whether ).
Notice that each of the models of the ontology falls in one of the above discussed cases. Thus, and hold in the well-founded semantics of the hybrid program, and the logical value of and that of is in each well-founded model of the program. On the other hand and are true in those well-founded models of for which the constraint is true in . Similarly, and are true in those models for which is false. Thus the well-founded semantics assigns unique truth values to and , but not to and . The truth values of and can be characterized by additional constraints.
Consider the case of hybrid rules without negative rule literals, so that the non-monotonic negation does not occur. Such rules can be seen as implications of FOL and treated as axioms added to . In this case the well-founded semantics is compatible with FOL in the following sense: for any ground rule atom if then . (The reverse implication does not hold. As a counterexample take and . but , as there exist models of in which each ground atom is false. We can obtain (something close to) the reverse implication by considering only those well-founded models which are based on Herbrand models of . If then for each well-founded model of based on a Herbrand interpretation of .) We omit a detailed proof.
As the well-founded semantics of normal programs is undecidable, so is the well-founded semantics of hybrid programs. It is however decidable for Datalog hybrid programs with decidable external theories (Section 5). In Section 4 we show that sound reasoning is possible (for arbitrary hybrid programs) by appropriate generalization of SLS-resolution. For the Datalog case the proposed reasoning scheme is complete under a certain safeness condition.
3.3 Treatment of Equality
In this section we discuss how equality is treated by the declarative semantics introduced above. The semantics is based on Herbrand models. Thus it treats distinct ground terms as having different values.
Consider a hybrid program , where . Both and hold in the well-founded semantics of , even if implies that and are equal. This feature of the semantics of hybrid programs may be found undesirable.
We will call this phenomenon the problem of two equalities. Below we first show that the problem is well known in constraint logic programming (CLP) and explain how it is dealt with. Then we discuss two more formal ways of avoiding it: external theories where equality satisfies Clark equality theory (CET), and hybrid rules which are congruent w.r.t. a given external theory.
The problem of two equalities is familiar from CLP [MSW06], and is not found troublesome in practice. Most CLP implementations employ both syntactic equality and equality of the constraint domain (see for instance the comment on an example constraint domain on p. 414 in [MSW06, Section 12.2]). Let us denote the latter by (and use for the syntactic equality of the Herbrand domain). Formally, let us treat as equality, and as an equivalence relation. As an example consider CLP over arithmetic constraints [MSW06]. Terms and are distinct but denote the same number; we have and . Constraint predicates treat and as equal. (Formally, is a congruence of the constraint predicates: iff whenever , for any constraint predicate .) Other predicates may distinguish such terms. This is related to using unification in the operational semantics; unification is related to the syntactic equality.
Apparently the programmers find this feature natural and not confusing. They are aware of dealing both with the Herbrand interpretation and with a non Herbrand one. They know that the latter is employed only by constraint predicates. They take care of distinguishing the two corresponding equalities. For instance to express a fact that should be true for the number , a rule will be used (instead of a fact ).
In what follows we refer to the free equality theory (CET, Clark equality theory) [Cla78]. CET consists of equality axioms
and freeness axioms
If the set of function symbols is finite then CET additionally contains the weak domain closure axiom WDCA. (This axiom is needed for CET to be complete, in the sense that any closed formula (with as its only predicate symbol) has the same logical value in each model of CET. Consider for instance and . This formula is true in some models of CET without WDCA, but false in its (unique) Herbrand model.)
When contains only constants then CET reduces to the unique name assumption (UNA).
Assume that we have an external theory with equality . We say that a set of hybrid rules is congruent for a predicate symbol w.r.t. when implies
for any ground terms . When is congruent w.r.t. for any rule predicate then we say that is congruent w.r.t. (or shortly that is congruent).
Program (from Ex. 3.4) is not congruent w.r.t. any in which .
The hybrid program from Ex. 2.4 with the fact removed is congruent, independently from . (The unchanged program is not congruent, unless the ontology implies that () co-authored a book with . This is because holds and does not hold in the well-founded semantics of the program, but .)
Example 3.6 (Constructing congruent programs)
Consider the program from Examples 2.3, 3.3. The program implies and (formally and ). Assume that implies that . For instance, the equality may be explicitly stated by an owl:sameAs assertion. Informally, equality is incompatible with ; the rules of treat differently the objects , while states that they are equal. Formally, is not congruent.
One can modify to make it treat in the same way. It is sufficient to add rules , , and . (We replace by in the rules of ). Now and hold in the well-founded semantics of the obtained program . The program is congruent, provided that does not imply for any other pair of constants occurring in the program.
We can modify to make it congruent independently from . The idea is to replace (implicit) by explicit . For instance we may replace in the rule by
The obtained program is congruent for , w.r.t. any . Alternatively, the rules for can be modified in a similar way to make the program congruent for . Then the program is also congruent for (without modifying the rule for ).
The program transformations above can be seen as usual CLP programming tricks.
For congruent hybrid programs the problem of two equalities does not exist. (Also, it does not exist for external theories without equality.) The example above informally introduces programming techniques for constructing congruent programs. Now we present two simple criteria assuring that a program is congruent. (Congruency is undecidable, like other nontrivial semantic properties of programs.)
First, if the equality of satisfies CET then each program is congruent. (As then if are ground terms then implies that the terms are identical.) Apparently for this reason some approaches to combining rules and ontologies require that the ontology satisfies the unique name assumption (UNA).
Another sufficient criterion is syntactic. Program is congruent if in each rule of all the arguments of the head are variables, and any variable occurs at most once in . (Thus the remaining occurrences of the variable are in the constraint of the rule.) The proof that such is congruent is based on the fact that for any model of if (for ground terms ) then a rule is in iff is in .
As an example, notice that the rule for in (from Ex. 3.6) satisfies the sufficient condition, and the rules for do not. Notice also that the condition differs from the safeness conditions usually considered: roughly speaking, the former forbids certain variable occurrences, while the latter require them.
It is rather obvious how to construct programs satisfying this syntactic restriction, provided that the set of constraints includes equalities of . Instead of placing a non-variable term as an argument of the head of a rule, use a new variable and add to the constraint of the rule. Instead of writing more than one occurrence of a variable in the rule literals of a rule, replace each (but one) occurrence of by a new distinct variable and add to the constraint of the rule.
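The transformation just described can be sketched in Python. This is a minimal sketch under stated assumptions, not an implementation from the paper: rules are represented by hypothetical tuples, arguments are restricted to variables and constants, variables are recognised by an upper-case or underscore first character (Prolog convention), the fresh names _V0, _V1, ... are assumed not to occur in the program, and each added constraint is taken to be an equality between a fresh variable and the term it replaces.

```python
from itertools import count


def is_var(term):
    # Assumption: Prolog-style naming, variables start with upper case or "_".
    return term[:1].isupper() or term.startswith("_")


def normalize(head, body, constraint):
    """Rewrite a rule so that all head arguments are variables and no
    variable occurs twice in the rule literals (head and body).

    head: (pred, args); body: list of (positive, pred, args);
    constraint: list of (lhs, rhs) equalities, passed through unchanged.
    """
    fresh = (f"_V{i}" for i in count())  # hypothetical fresh-name scheme
    seen = set()
    eqs = list(constraint)

    def rename(args, in_head):
        out = []
        for t in args:
            if is_var(t) and t not in seen:
                seen.add(t)          # first occurrence of a variable stays
                out.append(t)
            elif is_var(t) or in_head:
                v = next(fresh)      # repeated variable, or non-variable head argument
                eqs.append((v, t))   # record the equality v = t in the constraint
                out.append(v)
            else:
                out.append(t)        # constants in body literals are left alone
        return out

    new_head = (head[0], rename(head[1], True))
    new_body = [(pos, pred, rename(args, False)) for pos, pred, args in body]
    return new_head, new_body, eqs
```

For the hypothetical rule p(a, X) ← q(X, X), ¬r(b) the sketch yields the head p(_V0, X), the body q(_V1, _V2), ¬r(b), and the added equalities _V0 = a, _V1 = X, _V2 = X.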
4 Reasoning with hybrid rules
Now we present a way of computing the well-founded semantics of Definition 3.2. As in logic programming, the task is to find instances of a given goal formula which are true in the well-founded semantics of a given program. Similarly to logic programming, our operational semantics is defined in terms of search trees. After introducing the operational semantics we prove its soundness and completeness, the latter for a restricted class of programs.
4.1 Constraints for the operational semantics
To construct the operational semantics we impose certain requirements on the external theory and the set of constraints. We need to deal explicitly with the syntactic equality and its negation. So we require that = is a constraint predicate symbol and the external theory includes the axioms CET (cf. Section 3.3). An external theory which does not satisfy this condition can be easily converted to a which does. ( may be a theory without equality, or contain equality not satisfying CET.) Namely is the union . Reasoning in such can be implemented employing Prolog and a reasoner for [DHM07b]. The former deals with , the latter with the predicates of .
The operational semantics constructs new constraints using conjunction, disjunction, negation, and existential quantification. So we require that the set of constraints is closed under these operations. This imposes restrictions on the constraints. For instance many DLs do not allow negation of roles; for such a DL a formula of the form cannot be a constraint. The actual choice of constraints is outside the scope of this paper. It depends on the chosen external theory and the available reasoner for it. For instance, if a formula is a constraint without then the reasoner should be able to check whether is satisfiable in (where and are as above).
4.2 Operational Semantics
The operational semantics presented below is a generalization of SLS-resolution [Prz89], which is extended by handling constraints originating from the hybrid rules. It is based on the constructive negation approach presented in [Dra93, Dra95]. In logic programming, the term constructive negation stands for generalizations of negation as failure (NAF) (see e.g. [AB94]). NAF provides a way of checking whether a given negative goal is a consequence of the program (under a relevant semantics). Constructive negation, roughly speaking, finds instances of a negative goal which are consequences. The main contribution of the operational semantics presented here is dealing with hybrid programs and arbitrary external theories. The constructive negation method of [Dra93, Dra95] dealt with logic programs only; equality was the only constraint predicate and CET was the constraint theory.
The operational semantics is similar to SLDNF- and SLS-resolution [Llo87, Prz89]. For an input goal a derivation tree is constructed; its nodes are goals. Whenever a negative literal is selected in some node, a subsidiary derivation tree is constructed. So a tree of trees is obtained.
By the restriction of a formula to a set of variables we mean the formula where are those free variables of that are not in . By we mean , where are the free variables of formula .
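As a small worked illustration of the restriction operation, consider the following. The symbols $F$, $V$, $F|_V$ and the concrete formula are chosen here for illustration only and are assumptions, not notation confirmed by the text above.

```latex
% Hypothetical formula with free variables X, Y, Z, and a variable set V
F \;=\; \bigl(X = f(Y) \,\wedge\, Z = a\bigr), \qquad V = \{X\}

% Y and Z are the free variables of F that are not in V,
% so the restriction quantifies them existentially:
F|_V \;=\; \exists Y, Z\, \bigl(X = f(Y) \,\wedge\, Z = a\bigr)
```

Only $X$ remains free in the result, so the restricted formula speaks about the variables in $V$ alone.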
By a goal we mean a conjunction of the form (), where each is a rule literal and is a constraint (the constraint of the goal). Consider a goal and a rule , such that no variable occurs both in and . We say that the goal
is derived from by , with the selected atom , if the constraint is satisfiable.
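A single derivation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the derived goal is assumed to consist of equalities between the arguments of the selected atom and those of the rule head, conjoined with the constraints of the goal and of the rule, with the selected atom replaced by the rule body. The tuple representation is hypothetical, the rule is assumed to be already renamed apart from the goal, and the satisfiability check is a stand-in that handles only equalities between variables and constants under CET, whereas an actual system would consult a reasoner for the external theory.

```python
def is_var(term):
    # Assumption: variables start with an upper-case letter.
    return term[:1].isupper()


def satisfiable(equalities):
    """Stand-in check for a conjunction of equalities between variables
    and constants, where distinct constants denote distinct objects."""
    parent = {}

    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for lhs, rhs in equalities:
        a, b = find(lhs), find(rhs)
        if a == b:
            continue
        if not is_var(a) and not is_var(b):
            return False       # two distinct constants forced to be equal
        if is_var(a):
            parent[a] = b      # a constant, if present, stays the representative
        else:
            parent[b] = a
    return True


def derive(goal, rule, index):
    """goal: (constraint, literals); rule: (head, rule_constraint, body).
    Returns the derived goal, or None if no goal can be derived."""
    constraint, literals = goal
    positive, pred, args = literals[index]
    (head_pred, head_args), rule_constraint, body = rule
    if not positive or pred != head_pred or len(args) != len(head_args):
        return None
    # Equality constraints take the place of explicit unification.
    new_constraint = list(zip(args, head_args)) + constraint + rule_constraint
    if not satisfiable(new_constraint):
        return None
    return new_constraint, literals[:index] + body + literals[index + 1:]
```

With the hypothetical rule p(X1, X1) ← q(X1), the goal p(a, Y) yields the derived goal a = X1, Y = X1, q(X1), while for the goal p(a, b) no goal is derived, since a = X1, b = X1 is unsatisfiable for distinct constants.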
We inductively define two kinds of derivation trees: t-trees and tu-trees. Their role is to find out when a given goal is , or respectively when it is or . Informally, if a constraint is a leaf of a t-tree with the root then implies that is in the well-founded semantics of the program. (More generally, the same holds if is a disjunction of such leaves.) On the other hand, for a tu-tree we define a notion of its cross-section. If are the constraints of the goals of a cross-section of a tu-tree with the root then, roughly speaking, implies that is in the well-founded semantics of the program. A formal explanation is provided by the soundness theorem (4.9) in the next section.
For correctness of the definition (to avoid circularity) we assign ranks to the trees. This is a standard technique employed in similar definitions [Llo87, Prz89, Dra95]. In the general case ranks are countable ordinals, but natural numbers are sufficient for a language where the function symbols are constants. The children of nodes with an atom selected are defined as in the standard SLD-resolution. The only difference is that instead of explicit unification we employ equality constraints. The children of nodes with a negative literal selected are constructed employing the results of tu- (t-) trees of lower rank. A t-tree refers to tu-trees and vice versa. This is basically a reformulation of the corresponding definitions of [Dra93, Dra95].
Definition 4.2 (Operational semantics)
A t-tree (tu-tree) of rank for a goal w.r.t. a program satisfies the following conditions.
1. The nodes of the tree are (labelled by) goals. In each node a rule literal is selected, if such a literal exists. A node containing no rule literal is called successful, a branch of the tree with a successful leaf is also called successful.
2. (a) A constraint () is an answer of the t-tree if are (some of the) successful leaves of the t-tree. (It is not required that all the successful leaves are taken.) (Footnote 9: If then by we mean false, and by we mean true.)
(b) By a cross-section (or frontier) of a tu-tree we mean a set of tree nodes such that each successful branch of the tree has a node in . Let be a cross-section of the tu-tree and the constraints of the nodes in .
If is finite then the constraint (the negation of ) is called a negative answer of the tu-tree.
If is infinite then a constraint which implies for each is called a negative answer of the tu-tree. Moreover it is required that each free variable of is a free variable of .
3. If (in the t-tree or tu-tree) the selected literal in a node is an atom then, for each rule of , a goal derived from with selected by a variant of is a child of , provided such a goal exists. Moreover it is required that no variable in occurs in the tree on the path from the root to .
4. Consider a node of the t-tree (tu-tree), in which the selected literal is negative. The node is a leaf or has one child, under the following conditions.
(a) If the tree is a t-tree then
i. is a leaf, or
ii. has a child , where is a negative answer of a tu-tree for of rank , and is satisfiable.
(b) If the tree is a tu-tree then
i. has a child , or
ii. has a child , where is the negation of an answer of a t-tree for of rank , and is satisfiable, or
iii. is a leaf and there exists an answer of a t-tree for of rank such that is unsatisfiable.
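The notion of a cross-section can be made concrete with a small check for finite trees. This is a sketch under assumptions: the tree is given by a hypothetical dictionary mapping each node to its children, successful leaves are listed explicitly, and infinite trees (which the definition also covers) are not handled.

```python
def is_cross_section(children, successful, root, section):
    """Return True if every successful branch of the finite tree
    passes through some node of `section`.

    children:   dict mapping a node to the list of its children
    successful: set of successful leaves
    section:    candidate set of tree nodes
    """
    def covered(node):
        if node in section:
            return True    # every branch through this node meets the section
        if node in successful:
            return False   # a successful branch ended without meeting the section
        # A failed leaf has no children, so all() over nothing is True.
        return all(covered(child) for child in children.get(node, []))

    return covered(root)
```

In particular, the empty set is a cross-section exactly when the tree has no successful branch, and the set containing only the root is always a cross-section.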
An informal explanation for case 2 is that the constraints of the cross-section include all the cases in which is or , thus their negation implies that is . A useful intuition is that adding a negative answer to the nodes of the tu-tree results in a failed tree – a tree for without any successful leaf. (For the constraint of any node of the cross-section, the constraint is unsatisfiable. The same holds for any node which is a descendant of some node of the cross-section.)
An informal explanation for case 4 is that in a t-tree (case 4(a)ii) implies that is , equivalently is . Hence implies that is . In a tu-tree (4(b)ii) includes all the cases in which is not . Hence – the constraint of the child – includes all the cases in which is not , equivalently in which is or .
Notice that in case 4(a)i the node may unconditionally be a leaf of a t-tree (of any rank). This corresponds to the fact that is a negative answer for any tu-tree for . (Take the cross-section ). Hence in the supposed child of (case 4(a)ii) the constraint is unsatisfiable. Conversely, according to 4(b)i, node in a tu-tree may have as the child. This corresponds to the fact that is an answer of any t-tree. Hence is equivalent to (which is the constraint obtained in 4(b)ii). Thus 4(b)i is a special case of 4(b)ii.
Consider a query for the hybrid program of Example 2.3. It can be answered by the operational semantics by construction of the following trees. (Sometimes we replace a constraint by an equivalent one.)
1. A t-tree for :
The tree refers to negative answers derived in the cases 2, 4 below. The constraint in the second leaf is equivalent to (as is equivalent to ). The answer obtained from the two leaves of the tree is . It is equivalent to . As this constraint is a logical consequence of the ontology, holds in each well-founded model of the program.
2. A tu-tree for , employing an answer from the t-tree from case 3:
The leaf is equivalent to , see the explanation in the previous case. Hence from the cross-section containing the leaf we obtain a negative answer equivalent to and to .
3. A t-tree for employing a negative answer from case 4:
The corresponding answer is (equivalent to) . Informally, the answer implies . From Lemma 4.8 below it follows that if holds in some model of then is true in the corresponding well-founded model of .
Notice that if is a satisfiable constraint then may be added to the nodes of the tree (maybe with renaming of variables ). Hence is an answer for . To construct the tu-tree of case 2 we use .
4. A tu-tree for , with atom selected in the leaf:
From the empty cross-section a negative answer true is obtained. So is false in the well-founded semantics of the program. Similarly, true is a negative answer for , where is an arbitrary constraint.
Various simplifications of t- (tu-) trees are possible. For instance in case 3 of the last example the nodes of the tree may be replaced by . This issue is outside the scope of this paper.
We do not deal here with an actual implementation of the operational semantics. (An implementation is described in [DHM07b].) We only mention that, as in CLP, it is not necessary to check satisfiability of the constraint for each node. The answers (negative answers) of trees obtained in this way are logically equivalent to those of t-trees (tu-trees) from Def. 4.2.
In this section we prove soundness of the operational semantics of hybrid programs (Def. 4.2) with respect to their declarative semantics (Def. 3.2). Before the actual proof we discuss ground instances of goals and trees, and introduce safe programs and goals. These notions are employed in the proof.
For our proofs we use the characterization of the well-founded semantics of logic programs from Section 2.1. So for a given model of the external theory, the well-founded model of the program is for some .
4.3.1 Ground instances of trees
By an extension of a substitution we mean any substitution of the form (where and are substitutions with disjoint domains, ).
By a grounding substitution for the variables of a formula (or just “for ”) we mean a substitution replacing the free variables of by ground terms. (The domain of the substitution may include other variables.)
Let be a goal and a model of . Let be a grounding substitution for the variables of . (Notice that has no free variables.) If is true in then we say that is applicable to (w.r.t. ), and by the result of applying to we mean the ground goal