Extending Description Logic EL++ with Linear Constraints on the Probability of Axioms

08/27/2019 ∙ by Marcelo Finger, et al. ∙ Universidade de São Paulo

One of the main reasons to employ a description logic such as EL or EL++ is the fact that it has efficient, polynomial-time algorithmic properties such as deciding consistency and inferring subsumption. However, simply by adding negation of concepts to it, we obtain the expressivity of description logics whose decision procedure is ExpTime-complete. A similar complexity explosion occurs if we add probability assignments on concepts. To lower the resulting complexity, we instead concentrate on assigning probabilities to axioms (GCIs). We show that the consistency detection problem for such a probabilistic description logic is NP-complete, and present a linear algebraic deterministic algorithm to solve it, using the column generation technique. We also examine and provide algorithms for the probabilistic extension problem, which consists of inferring the minimum and maximum probabilities for a new axiom, given a consistent probabilistic knowledge base.

1 Introduction

The logic EL++ is one of the most expressive description logics in which the complexity of inferential reasoning is tractable [Baader et al. 2005a]. A direct consequence of this expressivity is that, by adding extra features to this language, its complexity easily grows exponentially. By inferential complexity we mean the complexity of decision problems such as consistency detection, finding a model that satisfies a set of constraints, or axiom subsumption. All such problems are tractable in EL++.

In this work we are interested in adding probabilistic reasoning capabilities to EL++; however, depending on how those reasoning capabilities are added to the language, the inferential complexity can explode beyond exponential time. As shown in Section 3.1, by extending EL++ with probabilistic constraints over concepts, inferential reasoning becomes ExpTime-hard. Such an approach was employed many times in the literature, either by enhancing expressive description logics such as ALC [Heinsohn 1994, Lukasiewicz 2008, Gutiérrez-Basulto et al. 2011, Jung et al. 2011], or by adding probabilistic capabilities to the family of EL-like logics [Lutz and Schröder 2010, Gutiérrez-Basulto et al. 2017].

In this work, we study a different way of extending description logics with probabilistic reasoning capabilities, namely by applying probabilities to GCI Axioms. One of our goals is to reduce the complexity of probabilistic reasoning in description logics. Another goal is to deal with the modelling situation in which a GCI Axiom is not always true, but one can assign (subjectively) a probability to its validity. Consider the following example describing one such situation.

Example 1

Consider the following medical situation, in which a patient may have symptoms which are caused by a disease. However, some diseases cause only very nonspecific symptoms, such as high fever, skin rash and joint pain, which may also be caused by several other diseases. Dengue is one such disease with mostly nonspecific symptoms. Dengue is a mosquito-borne viral disease and more than half of the world population lives at risk of contracting it. Among its symptoms are high fever, joint pains and skin eruptions (rash). These symptoms are common but not all patients present all symptoms. Such an uncertain situation allows for probabilistic modelling.

In a certain hospital, joint pains are caused by dengue in 20% of the cases; in the remaining 80% of the cases, there is a patient whose symptoms include joint pains whose cause is not attributable to dengue. Also, a patient having high fever has some probability of having dengue, which increases by 5% if the patient also has a rash. If those probabilistic constraints are satisfiable, one can also ask for the minimum and maximum probability that a given patient is suspected of suffering from dengue.

By adding probability constraints to axioms, we hope to model such a situation. Furthermore we will show that the inferential complexity in this case remains “only” NP-complete. In fact, our approach extends some previous results which considered adding probabilistic capabilities only to ABox statements [Finger et al. 2011]. By using EL++ as the underlying formalism, ABox statements can be formulated as a particular case of GCI axioms, so the approach here has that of [Finger et al. 2011] as a particular case, but with inferential reasoning remaining in the same complexity class.

The rest of the paper proceeds as follows. Section 2 presents the formal EL++ framework and Section 3 introduces probabilities over axioms, and defines the probabilistic satisfiability and probabilistic extension problems. Section 4 presents an algorithm for probabilistic satisfiability that combines EL++-solving with linear algebraic methods, such as column generation. Finally, Section 5 presents an algorithm for the probabilistic extension problem, and then we present our conclusions in Section 6.

2 Preliminaries

We concentrate on the description language EL++ without concrete domains [Baader et al. 2005a]. We start with a signature consisting of a triple of countable sets $\langle N_C, N_R, N_I \rangle$, where $N_C$ is a set of concept names, $N_R$ is a set of role names and $N_I$ is a set of individual names. The basic concept descriptions are recursively defined as follows:

  • $\top$, $\bot$ and concept names in $N_C$ are (simple) concept descriptions;

  • if $C, D$ are concept descriptions, $C \sqcap D$ is a (conjunctive) concept description;

  • if $C$ is a concept description and $r \in N_R$, $\exists r.C$ is an (existential) concept description;

  • if $a \in N_I$, $\{a\}$ is a (nominal) concept description.

If $C, D$ are concept descriptions, an axiom, also called a general concept inclusion (GCI), is an expression of the form $C \sqsubseteq D$. If $r, r_1, \ldots, r_k \in N_R$ then $r_1 \circ \cdots \circ r_k \sqsubseteq r$ is a role inclusion (RI). A finite set of axioms is called a TBox and a finite set of axioms and RIs is called a constraint box (CBox).

A concept assertion is an expression of the form $C(a)$, where $a \in N_I$ and $C$ is a concept description; a role assertion is an expression of the form $r(a, b)$, where $a, b \in N_I$ and $r \in N_R$. A finite set of concept and role assertions forms an assertion box (ABox).

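Semantically, we consider an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$. The domain $\Delta^{\mathcal{I}}$ is a non-empty set of individuals and the interpretation function $\cdot^{\mathcal{I}}$ maps each concept name $A \in N_C$ to a subset $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$, each role name $r \in N_R$ to a binary relation $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ and each individual name $a \in N_I$ to an individual $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$. The extension of $\cdot^{\mathcal{I}}$ to arbitrary concept descriptions is inductively defined as follows.

  • $\top^{\mathcal{I}} = \Delta^{\mathcal{I}}$, $\bot^{\mathcal{I}} = \emptyset$;

  • $(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$;

  • $(\exists r.C)^{\mathcal{I}} = \{ x \in \Delta^{\mathcal{I}} \mid \exists y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \wedge y \in C^{\mathcal{I}} \}$;

  • $(\{a\})^{\mathcal{I}} = \{ a^{\mathcal{I}} \}$.

The interpretation $\mathcal{I}$ satisfies an axiom $C \sqsubseteq D$ if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ (represented as $\mathcal{I} \models C \sqsubseteq D$); the RI $r_1 \circ \cdots \circ r_k \sqsubseteq r$ is satisfied by $\mathcal{I}$ (represented as $\mathcal{I} \models r_1 \circ \cdots \circ r_k \sqsubseteq r$) if $r_1^{\mathcal{I}} \circ \cdots \circ r_k^{\mathcal{I}} \subseteq r^{\mathcal{I}}$. A model $\mathcal{I}$ satisfies the assertion $C(a)$ (represented as $\mathcal{I} \models C(a)$) if $a^{\mathcal{I}} \in C^{\mathcal{I}}$ and satisfies the assertion $r(a, b)$ (represented as $\mathcal{I} \models r(a, b)$) if $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in r^{\mathcal{I}}$. Given a CBox $\mathcal{C}$, we write $\mathcal{I} \models \mathcal{C}$ if $\mathcal{I}$ satisfies every axiom and every role inclusion in $\mathcal{C}$. Similarly, given an ABox $\mathcal{A}$, we write $\mathcal{I} \models \mathcal{A}$ if $\mathcal{I}$ satisfies all its assertions.

Given a CBox $\mathcal{C}$, we say that it logically entails an axiom $C \sqsubseteq D$, represented as $\mathcal{C} \models C \sqsubseteq D$, if for every interpretation $\mathcal{I} \models \mathcal{C}$ we have that $\mathcal{I} \models C \sqsubseteq D$.

Note that in EL++ there is no need for an explicit ABox, for we have that $\mathcal{I} \models C(a)$ iff $\mathcal{I} \models \{a\} \sqsubseteq C$; and $\mathcal{I} \models r(a, b)$ iff $\mathcal{I} \models \{a\} \sqsubseteq \exists r.\{b\}$.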
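The inductive semantics and the rendering of ABox assertions as axioms can be sketched in a few lines of Python. The tuple encoding of concepts, the dictionary layout of an interpretation and the names `ext` and `satisfies_gci` are assumptions made for this illustration only; they are not taken from the paper.

```python
# Illustrative encoding (an assumption of this sketch): concepts are nested tuples
#   ("top",), ("bot",), ("name", "A"), ("and", C, D), ("exists", "r", C), ("nominal", "a")

def ext(concept, interp):
    """Extension of a concept description under a finite interpretation."""
    kind = concept[0]
    if kind == "top":
        return set(interp["domain"])
    if kind == "bot":
        return set()
    if kind == "name":
        return set(interp["concepts"].get(concept[1], set()))
    if kind == "and":
        return ext(concept[1], interp) & ext(concept[2], interp)
    if kind == "exists":
        role = interp["roles"].get(concept[1], set())
        filler = ext(concept[2], interp)
        return {x for (x, y) in role if y in filler}
    if kind == "nominal":
        return {interp["individuals"][concept[1]]}
    raise ValueError(f"unknown concept constructor: {kind}")

def satisfies_gci(interp, lhs, rhs):
    """An interpretation satisfies lhs ⊑ rhs iff ext(lhs) is a subset of ext(rhs)."""
    return ext(lhs, interp) <= ext(rhs, interp)

# A tiny hypothetical interpretation with two domain elements.
interp = {
    "domain": {"d_john", "d_s1"},
    "concepts": {"Patient": {"d_john"}, "High-fever": {"d_s1"}},
    "roles": {"hasSymptom": {("d_john", "d_s1")}},
    "individuals": {"john": "d_john", "s1": "d_s1"},
}

# Patient(john) rendered as {john} ⊑ Patient
patient_john = satisfies_gci(interp, ("nominal", "john"), ("name", "Patient"))
# hasSymptom(john, s1) rendered as {john} ⊑ ∃hasSymptom.{s1}
has_symptom = satisfies_gci(
    interp, ("nominal", "john"), ("exists", "hasSymptom", ("nominal", "s1"))
)
# High-fever(john) does not hold in this interpretation
fever_john = satisfies_gci(interp, ("nominal", "john"), ("name", "High-fever"))
```

Encoding assertions through nominals means a single subset test covers TBox axioms and ABox statements alike, which is the point of the remark above.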

Given a CBox, one of the important problems for EL++ is to determine its consistency, namely the existence of a common model which jointly validates all expressions in the CBox. There is a polynomial algorithm which decides EL++-consistency [Baader et al. 2005b].

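This decision process can be used to provide a PTIME classification of an EL++ CBox. Given a CBox $\mathcal{C}$, the set of basic concept descriptions for $\mathcal{C}$ is given by

$$\mathsf{BC}_{\mathcal{C}} = \{\top\} \cup \{ A \in N_C \mid A \text{ occurs in } \mathcal{C} \} \cup \{ \{a\} \mid a \in N_I \text{ occurs in } \mathcal{C} \}$$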

Example 2

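Consider a CBox representing the situation described in Example 1; this modelling is adapted from [Finger et al. 2011].

The following TBox describes basic knowledge on diseases:

  High-fever $\sqsubseteq$ Symptom
  Joint-pain $\sqsubseteq$ Symptom
  Rash $\sqsubseteq$ Symptom
  Dengue $\sqsubseteq$ Disease
  Symptom $\sqsubseteq$ $\exists$hasCause.Disease
  Patient $\sqsubseteq$ $\exists$suspectOf.Disease
  Patient $\sqsubseteq$ $\exists$hasSymptom.Symptom
  $\exists$hasSymptom.($\exists$hasCause.Dengue) $\sqsubseteq$ $\exists$suspectOf.Dengue

And the following ABox presents John’s symptoms; each assertion is followed by its rendering as an axiom:

  Patient(john)            {john} $\sqsubseteq$ Patient
  High-fever($s_1$)        {$s_1$} $\sqsubseteq$ High-fever
  hasSymptom(john, $s_1$)  {john} $\sqsubseteq$ $\exists$hasSymptom.{$s_1$}
  Joint-pain($s_2$)        {$s_2$} $\sqsubseteq$ Joint-pain
  hasSymptom(john, $s_2$)  {john} $\sqsubseteq$ $\exists$hasSymptom.{$s_2$}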

Note that the uncertain information on dengue and its symptoms is not represented by the CBox above.

3 Extending EL++ with Probabilistic Constraints

One of the main reasons to employ a description logic such as EL++ is the fact that it has polynomial-time algorithmic properties such as deciding consistency and inferring subsumption. However, it is well known that simply by adding negation of concepts to EL++, we obtain the expressivity of description logic ALC, whose decision procedure is ExpTime-complete [Baader et al. 2017]. This complexity blow-up can also be expected when adding probabilistic constraints.

3.1 Why Not Assign Probability to Concepts?

When we are dealing with probabilistic constraints on description logic, one of the first ideas is to apply conditional or unconditional probability constraints to concepts. In fact, such an approach was employed in several enhancements of description logics with probabilistic reasoning capabilities, e.g. [Heinsohn 1994, Lukasiewicz 2008, Lutz and Schröder 2010, Gutiérrez-Basulto et al. 2017].

However, one can see how such an approach would lead to problems if applied to EL++. For each concept $C$ one can define an associated concept $\bar{C}$ subject to the following constraints:

$$P(C) + P(\bar{C}) = 1, \qquad P(C \sqcap \bar{C}) = 0$$

Without going into the (non-trivial) semantic details of concept probabilities, it is intuitively clear that those statements force $\bar{C}$ to be the negation of $C$. In fact, the first statement expresses that $C$ and $\bar{C}$ are complementary and the second statement expresses that they are disjoint; together they mean that the interpretations of $C$ and $\bar{C}$ form a partition of the domain, and thus $\bar{C}$ is the negation of $C$. Hence probabilities over concepts add to EL++ the expressivity of ALC, and the complexity of deciding axiom subsumption becomes ExpTime-hard. Detailed complexity analysis can be found in [Gutiérrez-Basulto et al. 2017].

To lower the resulting complexity, we refrain from assigning probabilities to concepts and instead concentrate on assigning probabilities to axioms.

3.2 Probability Constraints over Axioms

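Assume there is a finite number of interpretations, $\mathcal{I}_1, \ldots, \mathcal{I}_m$; let $P$ be a mapping that attributes to each $\mathcal{I}_i$ a non-negative value $P(\mathcal{I}_i) \geq 0$ such that $\sum_{i=1}^{m} P(\mathcal{I}_i) = 1$.

Then given an axiom $C \sqsubseteq D$, its probability is given by:

$$P(C \sqsubseteq D) = \sum_{\mathcal{I}_i \models C \sqsubseteq D} P(\mathcal{I}_i) \tag{1}$$

Note that this definition contemplates the probability of ABox elements; for example the probability $P(C(a)) = P(\{a\} \sqsubseteq C)$.

Given axioms $C_1 \sqsubseteq D_1, \ldots, C_n \sqsubseteq D_n$ and rational numbers $a_1, \ldots, a_n, b$, a probabilistic constraint consists of the linear combination:

$$a_1 \cdot P(C_1 \sqsubseteq D_1) + \cdots + a_n \cdot P(C_n \sqsubseteq D_n) \bowtie b \tag{2}$$

where $\bowtie \in \{=, \leq, \geq\}$. A PBox $\mathcal{P}$ is a set of probabilistic constraints. A probabilistic knowledge base is a pair $\langle \mathcal{C}, \mathcal{P} \rangle$, where $\mathcal{C}$ is a CBox and $\mathcal{P}$ a PBox. Note that the axioms occurring in the PBox need not occur in the CBox, and in general they do not occur in it.

The intuition behind the probability of a GCI can perhaps be better understood if seen by its complement. So the probability of an axiom is $P(C \sqsubseteq D) = p$ if the probability of its failure is $1 - p$, that is, the probability of finding a model $\mathcal{I}$ in which there exists an individual $a$ that is in concept $C$ but not in concept $D$, $a \in C^{\mathcal{I}}$ and $a \notin D^{\mathcal{I}}$. Under this point of view, $P(C \sqsubseteq D) = p$ if there is a probability $p$ of finding a model in which either no individual instantiates concept $C$ or all individual instances of concept $C$ are also individual instances of concept $D$. This has as a consequence the following, somewhat unintuitive behavior: if $C$ is a “rare” concept in the sense that most models have no instances of $C$, then the probability $P(C \sqsubseteq D)$ tends to be quite high for any $D$, for it has as lower bound the probability of a model not having any instances of $C$.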
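The "rare concept" effect can be checked numerically with a minimal sketch of definition (1). The three weighted interpretations below and the function name `axiom_probability` are hypothetical; each interpretation is reduced to a dictionary from concept names to extensions, which suffices for axioms between concept names.

```python
from fractions import Fraction

def axiom_probability(weighted_interps, lhs, rhs):
    """Definition (1): add up the mass of every interpretation where lhs ⊑ rhs holds."""
    return sum(
        (mass for interp, mass in weighted_interps
         if interp.get(lhs, set()) <= interp.get(rhs, set())),
        Fraction(0),
    )

# Hypothetical distribution over three interpretations (masses sum to 1).
weighted = [
    ({"C": set(), "D": set()}, Fraction(7, 10)),   # C has no instances at all
    ({"C": {"a"}, "D": {"a"}}, Fraction(1, 10)),   # every C instance is a D instance
    ({"C": {"a"}, "D": set()}, Fraction(2, 10)),   # a is in C but not in D
]

p_c_sub_d = axiom_probability(weighted, "C", "D")
# Mass of the interpretations in which C is empty: a lower bound for P(C ⊑ D).
p_c_empty = sum((m for interp, m in weighted if not interp["C"]), Fraction(0))
```

Here `p_c_sub_d` is 4/5 even though only one tenth of the mass comes from interpretations where `C` is inhabited and included in `D`; the remaining 7/10 is contributed by interpretations where `C` is empty.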

Note that this intuitive view also covers ABox statements, which can be expressed as axioms of the form $\{a\} \sqsubseteq C$ and $\{a\} \sqsubseteq \exists r.\{b\}$. But in these cases, all models always satisfy the nominal $\{a\}$, so e.g. $P(\{a\} \sqsubseteq C) = p$ simply means that the probability of finding a model in which $a$ is an instance of concept $C$ is $p$.

3.3 Probabilistic Satisfaction and Extension Problems

A probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ is satisfied by interpretations $\mathcal{I}_1, \ldots, \mathcal{I}_m$ if there exists a probability distribution $P$ over the interpretations such that

  • if $P(\mathcal{I}_i) > 0$ then $\mathcal{I}_i \models \mathcal{C}$;

  • all probabilistic constraints in $\mathcal{P}$ hold.

This means that an interpretation can have a positive probability mass only if it satisfies CBox $\mathcal{C}$, and the composition of all those interpretations must verify the probability constraints in $\mathcal{P}$. A knowledge base is satisfiable if there exists a set of interpretations and a probability distribution over them that satisfy it.

Definition 1

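The probabilistic satisfiability problem for the logic EL++ consists of, given a probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$, deciding if it is satisfiable.

Definition 2

The probabilistic extension problem for the logic EL++ consists of, given a satisfiable probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ and an axiom $C \sqsubseteq D$, finding the minimum and maximum values of $P(C \sqsubseteq D)$ that are satisfiable with $\langle \mathcal{C}, \mathcal{P} \rangle$.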

Example 3

We create a probabilistic knowledge base by extending the CBox presented in Example 2 with the uncertain information described in Example 1.

Dengue symptoms are nonspecific, so in some cases the high fever is actually caused by dengue, represented by Ax1 := High-fever $\sqsubseteq$ $\exists$hasCause.Dengue, and in some other cases we may have a combination of high fever and rash being caused by dengue, represented by Ax2 := High-fever $\sqcap$ Rash $\sqsubseteq$ $\exists$hasCause.Dengue. And the fact that joint pains are caused by dengue is represented by Ax3 := Joint-pain $\sqsubseteq$ $\exists$hasCause.Dengue. None of the axioms Ax1, Ax2 or Ax3 is always the case, but there is a probability that dengue is, in fact, the cause. The following probabilistic statements represent uncertain knowledge on the relationship between dengue and its symptoms, as observed in a hospital.

  • $P(\text{Ax2}) - P(\text{Ax1}) = 0.05$: the probability of dengue being the cause is 5% higher when both high fever and rash are symptoms, over just having high fever;
  • $P(\text{Ax3}) = 0.20$: 20% of cases of joint pain are caused by dengue.

We want to know if this probabilistic knowledge base is consistent and, in case it is, we want to find upper and lower bounds for the probability that John is a suspect of having dengue, $P(\exists\text{suspectOf.Dengue}(\text{john}))$.

In order to provide algorithms that tackle both the decision and the extension problems, we provide a linear algebra formulation of those problems.

3.4 A Linear Algebraic View of Probabilistic Satisfaction and Extension Problems

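Initially, let us consider only restricted probabilistic constraints of the form $P(C_i \sqsubseteq D_i) = p_i$. Consider a restricted probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ in which the number of probabilistic constraints is $k$. Let $p$ be a vector of size $k$ of probabilistic constraint values. Consider a finite number of interpretations, $\mathcal{I}_1, \ldots, \mathcal{I}_m$, and let us build a $k \times m$ matrix $M$ of $\{0,1\}$ elements $M_{ij}$ such that

$$M_{ij} = \begin{cases} 1 & \text{if } \mathcal{I}_j \models C_i \sqsubseteq D_i, \\ 0 & \text{otherwise.} \end{cases}$$

Note that column $M^j$ contains the evaluations by interpretation $\mathcal{I}_j$ of the axioms submitted to probabilistic constraints. Given a CBox $\mathcal{C}$ and a sequence of axioms $C_1 \sqsubseteq D_1, \ldots, C_k \sqsubseteq D_k$, a $\{0,1\}$-vector $y$ of size $k$ represents a $\mathcal{C}$-satisfiable interpretation if there is an interpretation $\mathcal{I}$ such that $\mathcal{I} \models \mathcal{C}$, and $y_i = 1$ iff $\mathcal{I} \models C_i \sqsubseteq D_i$ for $1 \leq i \leq k$. The idea is to assign positive probability mass $\pi_j$ only if $M^j$ represents a $\mathcal{C}$-satisfiable interpretation.

Let $\pi$ be a vector of size $m$ representing a probability distribution. Consider the following set of constraints associated to $\langle \mathcal{C}, \mathcal{P} \rangle$, expressing the fact that $\pi$ is a probability distribution that respects the constraints given by matrix $M$:

$$M \cdot \pi = p, \qquad \sum_{j=1}^{m} \pi_j = 1, \qquad \pi \geq 0 \tag{3}$$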

The fact that constraints (3) actually represent satisfiability is given by the following.

Lemma 1

A probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ with restricted probabilistic constraints is satisfiable iff there is a vector $\pi$ that satisfies its associated constraints (3).

When the probabilistic knowledge base is satisfiable, the number $m$ of interpretations associated to the columns of matrix $M$ may be exponentially large with respect to the number $k$ of constraints in $\mathcal{P}$. However, Carathéodory’s Theorem [Eckhoff 1993] guarantees that if there is a solution to (3) then there is also a small solution, namely one with at most $k+1$ positive values.

Lemma 2

If constraints (3) have a solution then there exists a solution $\pi$ with at most $k+1$ values such that $\pi_j > 0$.

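Now instead of considering only a restricted form of probability constraints, let us consider constraints of the form (2) as defined in Section 3, namely

$$a_{i1} \cdot P(C_1 \sqsubseteq D_1) + \cdots + a_{in} \cdot P(C_n \sqsubseteq D_n) \bowtie_i b_i$$

where $\bowtie_i \in \{=, \leq, \geq\}$, and $1 \leq i \leq k$.

We assume there are at most $n$ axioms mentioned in $\mathcal{P}$, such that $a_{ij} = 0$ if $C_j \sqsubseteq D_j$ does not occur at constraint $i$. Consider a matrix $A_{k \times n} = [a_{ij}]$ and a vector $x$ of size $n$. We now have the following set of associated constraints to the probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$, extending (3):

$$A \cdot x \bowtie b, \qquad M \cdot \pi = x, \qquad \sum_{j=1}^{m} \pi_j = 1, \qquad \pi \geq 0 \tag{4}$$

As before, the columns of the $\{0,1\}$-matrix $M_{n \times m}$ are representations of the validity of the axioms occurring in $\mathcal{P}$ under the interpretation $\mathcal{I}_j$, and $\pi$ is a probability distribution over the $m$ interpretations. Constraints (4) are solvable if there are vectors $x$ and $\pi$ that verify all conditions. Analogously, the solvability of constraints (4) characterizes the satisfiability of probabilistic knowledge bases with unrestricted constraints.

Lemma 3

A probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ is satisfiable if and only if its associated set of constraints (4) is solvable.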

Example 4

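Consider four interpretations $\mathcal{I}_1, \ldots, \mathcal{I}_4$ for the knowledge base described in Example 3. Interpretation $\mathcal{I}_1$ satisfies CBox $\mathcal{C}$ of Example 2 and also axioms Ax1, Ax2, Ax3. Interpretation $\mathcal{I}_2$ satisfies $\mathcal{C}$ and axioms Ax2, Ax3 but not Ax1. Interpretation $\mathcal{I}_3$ satisfies $\mathcal{C}$ and only axiom Ax3. Interpretation $\mathcal{I}_4$ satisfies only $\mathcal{C}$ but none of the axioms. We then consider a probability distribution $\pi$ over these interpretations; the two constraints of Example 3 force $\pi_2 = 0.05$ and $\pi_1 + \pi_2 + \pi_3 = 0.2$, hence $\pi_4 = 0.8$. The following shows that all probabilistic restrictions are satisfied.

$$A \cdot M \cdot \pi = \begin{bmatrix} -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \\ \pi_4 \end{bmatrix} = \begin{bmatrix} \pi_2 \\ \pi_1 + \pi_2 + \pi_3 \end{bmatrix} = \begin{bmatrix} 0.05 \\ 0.20 \end{bmatrix}$$

So $P(\text{Ax2}) - P(\text{Ax1}) = 0.05$ and $P(\text{Ax3}) = 0.20$.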
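The check in Example 4 can be replayed with exact arithmetic. The concrete distribution `pi` below is an assumption chosen for this sketch (any split of 0.15 between the first and third interpretations would do); the matrices follow from the description of the four interpretations and the two constraints of Example 3.

```python
from fractions import Fraction as F

# Rows: Ax1, Ax2, Ax3; columns: the four interpretations of Example 4.
M = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
# Rows: P(Ax2) - P(Ax1) = 0.05 and P(Ax3) = 0.20.
A = [
    [-1, 1, 0],
    [0, 0, 1],
]
b = [F(5, 100), F(20, 100)]

# Hypothetical distribution; only pi[1] = 0.05 and pi[3] = 0.8 are forced.
pi = [F(10, 100), F(5, 100), F(5, 100), F(80, 100)]

def mat_vec(matrix, vector):
    """Plain matrix-vector product over exact rationals."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

x = mat_vec(M, pi)      # axiom probabilities P(Ax1), P(Ax2), P(Ax3)
lhs = mat_vec(A, x)     # left-hand sides of the two probabilistic constraints
```

With this choice `x` is (1/10, 3/20, 1/5) and `lhs` coincides with `b`, so the distribution satisfies constraints of the form (4).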

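When constraints (4) are solvable, vector $x$ has size $n$, but vector $\pi$ can be exponentially large in $n$. By a simple linear algebraic trick, constraints of the form (4) can be presented in the following form:

$$N \cdot z \bowtie d, \qquad z \geq 0 \tag{5}$$

In fact, it suffices to make:

$$N = \begin{bmatrix} 0_{k \times m} & A_{k \times n} \\ M_{n \times m} & -I_n \\ \mathbf{1}'_m & 0'_n \end{bmatrix}, \qquad z = \begin{bmatrix} \pi \\ x \end{bmatrix}, \qquad d = \begin{bmatrix} b \\ 0_n \\ 1 \end{bmatrix}$$

where $I_n$ is the $n$-dimensional identity matrix, and $\mathbf{1}'_m$ is a row of $m$ 1’s; the relation is $\bowtie_i$ on the first $k$ rows and equality on the remaining $n+1$ rows. When we say that the column $N^j$ represents a $\mathcal{C}$-satisfiable interpretation, we actually mean that the part of $N^j$ that corresponds to some column $M^j$ represents a $\mathcal{C}$-satisfiable interpretation, its $k$ initial positions are $0$ and its last element is $1$. Note that $N$ has $k+n+1$ rows and $m+n$ columns. Again, Carathéodory’s Theorem guarantees small solutions.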
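The block layout of the combined system can be sketched as follows. The block ordering (interpretation columns first, then one column per axiom probability) is inferred from the surrounding description of the first positions and the last element of a column; the function name `build_combined` and the sample distribution are assumptions of this sketch.

```python
from fractions import Fraction as F

def build_combined(A, M, b):
    """Stack constraints of form (4) into one matrix and right-hand side.

    Unknowns are ordered as (pi_1..pi_m, x_1..x_n).
    """
    k, n, m = len(A), len(M), len(M[0])
    rows = []
    for i in range(k):                       # A x  (relation given by each constraint)
        rows.append([0] * m + list(A[i]))
    for i in range(n):                       # M pi - x = 0
        rows.append(list(M[i]) + [-1 if j == i else 0 for j in range(n)])
    rows.append([1] * m + [0] * n)           # sum of pi = 1
    d = list(b) + [0] * n + [1]
    return rows, d

# Dengue data: three axioms, four interpretations, two equality constraints.
M = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0]]
A = [[-1, 1, 0], [0, 0, 1]]
b = [F(1, 20), F(1, 5)]
N, d = build_combined(A, M, b)

# Hypothetical solution: a distribution pi followed by the axiom probabilities x.
pi = [F(1, 10), F(1, 20), F(1, 20), F(4, 5)]
x = [F(1, 10), F(3, 20), F(1, 5)]
z = pi + x
product = [sum(row[j] * z[j] for j in range(len(z))) for row in N]
```

The resulting matrix has k + n + 1 = 6 rows and m + n = 7 columns, and `product` equals `d` for this choice of `z`.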

Lemma 4

If constraints (4) have a solution then there exists a solution $z$ with at most $k+n+1$ values such that $z_j > 0$.

We now show that probabilistic satisfiability is NP-hard.

Lemma 5

The satisfiability problem for probabilistic knowledge bases is NP-hard.

Proof

We reduce SAT to probabilistic satisfiability over EL++; unlike PSAT (Probabilistic SATisfiability, which consists of determining the satisfiability of a set of probabilistic assertions on classical propositional formulas [Finger and Bona 2011, Finger and De Bona 2015, Bona et al. 2014]), it does not suffice to set all probabilities to 1, as EL++ is decidable in polynomial time. Instead, we show how to represent 3-SAT clauses (i.e. disjunctions of three literals) as a set of probabilistic axioms, basically probabilistic ABox statements. For that, consider a set of propositional variables $q_1, \ldots, q_r$ upon which the set of clauses $\Gamma$ of the SAT problem is built. On the probabilistic knowledge base side, consider a single individual $a$ and basic concepts $Q_i$ and $\bar{Q}_i$, $1 \leq i \leq r$, subject to the following restrictions:

(6)

The idea is to represent the propositional atomic information $q_i$ by the axiom $Q_i(a)$, its negation by $\bar{Q}_i(a)$, and the fact that a clause holds is represented by the probabilistic statement

(7)

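Given $\Gamma$, we build a probabilistic knowledge base $\mathcal{K}_\Gamma$ by the representation (7) of the clauses in $\Gamma$ plus assertions of the form (6). We claim that $\Gamma$ is satisfiable iff $\mathcal{K}_\Gamma$ is. In fact, suppose $\Gamma$ is satisfiable by valuation $v$; make an EL++ model $\mathcal{I}_v$ such that $\mathcal{I}_v \models Q_i(a)$ iff $v(q_i) = 1$ and assign probability 1 to $\mathcal{I}_v$; clearly $\mathcal{K}_\Gamma$ is satisfiable. Now suppose $\mathcal{K}_\Gamma$ is satisfiable, so there exists an EL++ model $\mathcal{I}$ which is assigned probability strictly bigger than 0. Construct a valuation $v$ such that $v(q_i) = 1$ iff $\mathcal{I} \models Q_i(a)$. Clearly $v$ satisfies $\Gamma$; otherwise there is a clause $\gamma$ in $\Gamma$ such that $v(\gamma) = 0$ and thus $v(\ell_j) = 0$ for each of its literals $\ell_j$, $1 \leq j \leq 3$; then $\mathcal{I}$, which has positive probability, falsifies the representation of $\gamma$, contradicting (7).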

Theorem 1

The satisfiability problem for probabilistic knowledge bases is NP-complete.

Proof

Lemma 4 provides a small witness for every problem, such that by guessing that witness we can show in polynomial time that the constraints are solvable; so the problem is in NP. Lemma 5 provides NP-hardness.

4 Column Generation Algorithm for Probabilistic Knowledge Base Satisfiability

An algorithm for deciding probabilistic knowledge base satisfiability has to provide a means to find a solution for restrictions (4) if one exists; otherwise it has to determine that no solution is possible. Furthermore, we will assume that the constraints are presented in format (5).

We now provide a method similar to PSAT-solving to decide the satisfiability of probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$. We construct a vector of costs $c$ whose size is the same as the size of $z$ such that $c_j \in \{0,1\}$, with $c_j = 1$ if column $N^j$ satisfies the following condition: either the first $k$ positions are not 0, or the next $n$ cells correspond to an interpretation that does not satisfy the CBox $\mathcal{C}$, or the last position of $N^j$ is not $1$; if $N^j$ is one of the last $n$ columns, or its first $k$ elements are 0 and the next $n$ elements are a representation of an interpretation that is $\mathcal{C}$-satisfiable and its last element is $1$, then $c_j = 0$. Then we generate the following optimization problem associated to (5).

$$\min\ c' \cdot z \quad \text{subject to} \quad N \cdot z \bowtie d, \quad z \geq 0 \tag{8}$$

Lemma 6

Given a probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ and its associated linear algebraic restrictions (4), $\langle \mathcal{C}, \mathcal{P} \rangle$ is satisfiable if, and only if, minimization problem (8) has a minimum such that $c' \cdot z = 0$.

Condition $c' \cdot z = 0$ means that only the columns of $N$ corresponding to $\mathcal{C}$-satisfiable interpretations can be attributed probability $\pi_j > 0$, which immediately leads to solution of (8). Minimization problem (8) can be solved by an adaptation of the simplex method with column generation such that the columns of $N$ corresponding to columns of $M$ are generated on the fly. The simplex method is a stepwise method which at each step considers a basis $B$ consisting of columns of matrix $N$ and computes its associated cost [Bertsimas and Tsitsiklis 1997]. The process proceeds by finding a column of $N$ outside the basis and creating a new basis by substituting one of the basis columns by this new column, such that the associated cost never increases. To guarantee the cost never increases, the new column $N^j$ to be inserted in the basis has to obey a restriction called reduced cost, given by $\tilde{c}_j = c_j - c_B' B^{-1} N^j \leq 0$, where $c_j$ is the cost of column $N^j$, $B$ is the basis and $c_B$ is the cost associated to the basis. Note that in our case, we are only inserting columns that represent $\mathcal{C}$-satisfiable interpretations, so that we only insert columns of matrix $M$ and their associated cost $c_j = 0$. Therefore, every new column $y$ to be inserted in the basis has to obey the inequality

$$c_B' B^{-1} y \geq 0 \tag{9}$$

Note that the first $k$ positions in $y$ are 0 and the last one is always 1.

A column $y$ representing a $\mathcal{C}$-satisfying interpretation may or may not satisfy condition (9). We call an interpretation that does satisfy (9) a cost reducing interpretation. Our strategy for column generation is given by finding cost reducing interpretations for a given basis.
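The cost reduction test can be sketched with exact rational arithmetic. The sketch assumes the standard simplex reading of the condition, namely that a zero-cost entering column is acceptable when the product of the basis costs with the column expressed in the basis is non-negative; the helper names `solve`, `basis_value` and `is_cost_reducing` and both sample bases are hypothetical.

```python
from fractions import Fraction

def solve(B, rhs):
    """Solve B u = rhs by Gauss-Jordan elimination (B must be non-singular)."""
    n = len(B)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])] for i, row in enumerate(B)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def basis_value(c_B, B, y):
    """Compute c_B' B^{-1} y without forming the inverse explicitly."""
    u = solve(B, y)
    return sum(c * ui for c, ui in zip(c_B, u))

def is_cost_reducing(c_B, B, y):
    """A zero-cost column y may enter the basis when c_B' B^{-1} y >= 0."""
    return basis_value(c_B, B, y) >= 0

# Identity basis with all costs 1, and a column for an interpretation that
# satisfies three axioms (two leading zeros, three truth values, trailing 1).
identity = [[1 if i == j else 0 for j in range(6)] for i in range(6)]
value_identity = basis_value([1] * 6, identity, [0, 0, 1, 1, 1, 1])

# A small hypothetical basis where the test fails.
value_small = basis_value([1, 0], [[1, 1], [0, 1]], [0, 1])
```

Solving a linear system instead of inverting the basis keeps the sketch short; a production simplex implementation would maintain a factorization of the basis across iterations.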

Lemma 7

There exists an algorithm that decides the existence of cost reducing interpretations whose complexity is in NP.

Proof

Since we are dealing with a CBox in EL++, deciding the existence of satisfying interpretations can be done in polynomial time and is thus in NP; we can guess one such interpretation and in polynomial time verify both that it is a $\mathcal{C}$-satisfying interpretation and that it satisfies (9).

We can actually build a deterministic algorithm for Lemma 7 by reducing it to a SAT problem. In fact, computing EL++ satisfiability can be encoded in a 3-SAT formula $\varphi_{\mathcal{EL}}$; the condition (9) can also be encoded by a 3-SAT formula $\varphi_{\text{cost}}$ in linear time, e.g. by Warners' algorithm [Warners 1998], such that the SAT problem consisting of deciding $\varphi_{\mathcal{EL}} \wedge \varphi_{\text{cost}}$ is satisfiable if, and only if, there exists a cost reducing interpretation. Furthermore its valuation provides the desired column $y$, after prefixing it with $k$ 0’s and appending a 1 at its end. This SAT-based algorithm we call the EL++-Column Generation Method. In practice, column generation tries first to output one of the last $n$ columns in $N$; if the insertion of one such column causes or , or if all the last $n$ columns are in the basis, the proper EL++-Column Generation Method is invoked.

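Input: A probabilistic knowledge base $\langle \mathcal{C}, \mathcal{P} \rangle$ and its associated set of restrictions in format (5).

Output: No, if $\langle \mathcal{C}, \mathcal{P} \rangle$ is unsatisfiable. Or a solution $\langle B, z \rangle$ that minimizes (8).

1:  $B^{(0)} := I$;
2:  $s := 0$, $z^{(s)} := (B^{(0)})^{-1} \cdot d$ and $c^{(s)} := [c_1 \cdots c_{k+n+1}]'$;
3:  while $c^{(s)\prime} \cdot z^{(s)} \neq 0$ do
4:     $y^{(s)} := \mathit{GenerateColumn}(B^{(s)}, \mathcal{C}, c^{(s)})$;
5:     if Column generation failed then
6:        return  No;   {probabilistic knowledge base is unsatisfiable}
7:     else
8:        $B^{(s+1)} := \mathit{merge}(B^{(s)}, y^{(s)})$;
9:        $s := s + 1$, recompute $z^{(s)} := (B^{(s)})^{-1} \cdot d$; $c^{(s)}$ the costs of $B^{(s)}$ columns;
10:     end if
11:  end while
12:  return  $B^{(s)}$, $z^{(s)}$;   {probabilistic knowledge base is satisfiable}

Algorithm 1 PKBSAT-CG: a probabilistic knowledge base solver via Column Generation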

Algorithm 1 presents the top level probabilistic knowledge base decision procedure. Lines 1-2 present the initialization of the algorithm. We assume the vector of constraint values is in descending order. At the initial step we make $B^{(0)} = I$; this forces $z^{(0)} = d$, $z^{(0)} \geq 0$; and $c^{(0)} = [c_1 \cdots c_{k+n+1}]'$, where $c_j = 0$ if column $j$ in $B^{(0)}$ is an interpretation; otherwise $c_j = 1$. Thus the initial state is a feasible solution.

The main loop of Algorithm 1 covers lines 3-11, with the column generation strategy at the beginning of the loop (line 4). If column generation fails, the process ends with failure in line 6; the correctness of unsatisfiability by failure is guaranteed by Lemma 6. Otherwise a column is removed and the generated column is inserted in a process we call merge, at line 8. The loop ends successfully when the objective function (total cost) $c^{(s)\prime} \cdot z^{(s)}$ reaches zero and the algorithm outputs a probability distribution $z^{(s)}$ and the set of interpretation columns in $B^{(s)}$, at line 12.

The procedure merge is part of the simplex method, which guarantees that given a column $y$ and a feasible solution $\langle B, z \rangle$ there always exists a column $j$ in $B$ such that if $B^*$ is obtained from $B$ by replacing column $j$ with $y$, then there is $z^* \geq 0$ such that $\langle B^*, z^* \rangle$ is a feasible solution.
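One common way to realize such a merge step is the minimum ratio test of the simplex method. The sketch below is an illustration under that assumption, not necessarily the author's procedure: the leaving column is the one minimizing the ratio of the current solution entry to the corresponding entry of the entering column expressed in the basis. The names `solve` and `merge` and the 3x3 data are hypothetical.

```python
from fractions import Fraction

def solve(B, rhs):
    """Solve B u = rhs by Gauss-Jordan elimination (B must be non-singular)."""
    n = len(B)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])] for i, row in enumerate(B)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def merge(B, d, y):
    """Replace one basis column by y so that the new basic solution stays >= 0."""
    z = solve(B, d)          # current basic solution
    u = solve(B, y)          # entering column expressed in the current basis
    candidates = [(z[i] / u[i], i) for i in range(len(B)) if u[i] > 0]
    if not candidates:
        raise ValueError("no leaving column: entering direction is unbounded")
    _, leave = min(candidates)           # minimum ratio test
    new_B = [row[:] for row in B]
    for r in range(len(B)):
        new_B[r][leave] = y[r]
    return new_B, solve(new_B, d), leave

B0 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
d = [Fraction(1, 5), Fraction(3, 10), Fraction(1)]
B1, z1, left = merge(B0, d, [1, 1, 1])
```

In this run the first column leaves the basis and the new basic solution is (1/5, 1/10, 4/5), which remains non-negative as the merge property requires.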

4.1 Column Generation Procedure

Column generation is based on the cost reduction condition (9), which we repeat here:

$$c_B' B^{-1} y \geq 0 \tag{10}$$

Recall that matrix $N$ is of the form

$$N = \begin{bmatrix} 0_{k \times m} & A_{k \times n} \\ M_{n \times m} & -I_n \\ \mathbf{1}'_m & 0'_n \end{bmatrix}$$

So, column generation first tries to insert a cost decreasing column from the last $n$ columns in $N$; this involves verifying if condition (10) holds for any of the $n$ rightmost columns, which are known from the start and do not need to be generated. If one such column is found, it is returned.

If no such column is found, however, the EL++-Column Generation Method described next is invoked. The number of columns of matrix $M$ is potentially exponentially large, so they are not stored. Note that the first $k$ positions in a generated column $y$ are all 0 and the last entry is always 1; the remaining positions are a column of matrix $M$ representing an EL++-interpretation $\mathcal{I}$; those positions are 0-1 values, where 1 represents $\mathcal{I} \models C_i \sqsubseteq D_i$ and 0 represents the existence of some domain element $a$ such that $a \in C_i^{\mathcal{I}}$ but $a \notin D_i^{\mathcal{I}}$, $1 \leq i \leq n$. Thus the elements of a generated $y$ are all 0-1, and we identify them with valuations of a satisfying assignment of a SAT formula obtained as follows:

  1. $\varphi_{\text{cost}}$ is obtained by translating the inequality (10) into a set of clauses; this can be done, for instance, using the procedure described by [Warners 1998].

  2. $\varphi_{\mathcal{EL}}$ is a rendering of the EL++-decision procedure as a SAT formula for the $\mathcal{C}$-satisfiability by some interpretation of the given set of axioms on which linear conditions are imposed, $C_1 \sqsubseteq D_1, \ldots, C_n \sqsubseteq D_n$.

Formulas $\varphi_{\text{cost}}$ and $\varphi_{\mathcal{EL}}$ share variables indicating whether $\mathcal{I} \models C_i \sqsubseteq D_i$, $1 \leq i \leq n$. We take $\varphi_{\text{cost}} \wedge \varphi_{\mathcal{EL}}$ and send it to a SAT solver. If it is satisfiable, we obtain from the satisfying valuation a column $y$ that is cost reducing, due to $\varphi_{\text{cost}}$, and that represents an EL++-model, due to $\varphi_{\mathcal{EL}}$.
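For very small instances the SAT call can be mimicked by exhaustive enumeration, which makes the two conditions on a generated column explicit. This is a stand-in for illustration only: it is exponential in the number of axioms, and the oracle `toy_oracle`, the function `generate_column` and the weight vector `w` (playing the role of the basis costs multiplied by the inverse basis) are assumptions of the sketch rather than the paper's encoding.

```python
from itertools import product

def generate_column(w, k, n, is_c_satisfiable):
    """Return a column (k zeros, n truth values, trailing 1) with w . column >= 0.

    Enumerates all 0-1 patterns instead of calling a SAT solver; returns None
    when no admissible pattern passes the cost test.
    """
    for bits in product((1, 0), repeat=n):
        if not is_c_satisfiable(bits):       # role of the CBox encoding
            continue
        column = [0] * k + list(bits) + [1]
        if sum(wi * ci for wi, ci in zip(w, column)) >= 0:   # role of the cost encoding
            return column
    return None

def toy_oracle(bits):
    """Hypothetical oracle: Ax1 entails Ax2, so the pattern Ax1=1, Ax2=0 is ruled out."""
    ax1, ax2, _ax3 = bits
    return not (ax1 == 1 and ax2 == 0)

# Weights of an identity basis with unit costs: the first admissible pattern is accepted.
found = generate_column([1, 1, 1, 1, 1, 1], k=2, n=3, is_c_satisfiable=toy_oracle)
# Weights under which every candidate column evaluates to -1: generation fails.
missing = generate_column([0, 0, 0, 0, 0, -1], k=2, n=3, is_c_satisfiable=toy_oracle)
```

A SAT solver replaces the loop by a single satisfiability query over shared variables, which is what keeps the approach viable when the number of axioms grows.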

As the constraints of the simplex method are thus respected, and it is an always terminating procedure, we have the following result.

Theorem 2

Algorithm 1 decides probabilistic knowledge base satisfiability using column generation.

A detailed example illustrating the procedure follows.

Example 5

We now show a step-by-step solution of the satisfiability of the dengue example using Algorithm 1 and the column generation procedure described above. At each step we are going to show the basis $B^{(s)}$, the basis cost vector $c^{(s)}$, the partial solution $z^{(s)}$, the current cost $c^{(s)\prime} \cdot z^{(s)}$ and the generated column $y^{(s)}$.

The columns generated correspond to EL++-models that have to satisfy the restrictions

(11)

Each row of the basis corresponds to some restriction. Initially, the basis is the identity matrix and the basis cost vector is all 1’s, indicating that none of the columns corresponds to a model satisfying (11).

As described above, column generation first tries to insert a cost decreasing column from the last $n$ columns in $N$, which are known a priori. In our case we have the following equalities and the corresponding columns:

where each column corresponds to axioms Ax1, Ax2 and Ax3, respectively. It turns out that those columns satisfy the cost reduction inequality (10), and are inserted in the basis in turn; also note that the rightmost column does correspond to a model satisfying restriction (11), so after 4 column generation steps we have the following state:

Note that the inserted columns now correspond to positions of the basis cost vector with value 0. The choice of which columns leave the basis is performed by the merge procedure, which is a linear algebraic method that ensures that $z \geq 0$. Note that the total cost has not decreased so far, which is always a possibility as condition (10) only ensures that the cost is non-increasing. As all the rightmost $n$ columns have already been inserted in the basis, we have to proceed to a proper column generation process in which restriction (11) needs to be respected as well as the following inequality:

We transform the inequality above to a SAT formula, together with a transformation of restriction (11) into another SAT formula, and submit both to a SAT solver that generates a satisfying valuation indicating that there is an EL++-model that satisfies axioms 2 and 3 but not axiom 1, thus generating the column which the merge procedure inserts as the fourth column, thus generating the state:

Note that the total cost has decreased for the first time. The merge process chooses a column to leave the basis so as to guarantee that the partial solution $z \geq 0$, but it does not ensure that the leaving column is one with non-zero cost. In fact, it is a coincidence that in this example all columns that left the basis had non-zero cost; on the other hand, it is by construction that all entering columns have zero cost.

Finally, we proceed with column generation. As before, we obtain the inequality

which together with restriction (11) allows for a model in which the three axioms in focus are all true; as before, such a state is obtained by submitting a SAT-encoded formula to a SAT solver. We obtain the sixth step in the column generation process: