Exploiting Answer Set Programming with External Sources for Meta-Interpretive Learning

by Tobias Kaminski, et al.
TU Wien

Meta-Interpretive Learning (MIL) learns logic programs from examples by instantiating meta-rules, and it is implemented by the Metagol system based on Prolog. Since MIL-problems can be viewed as combinatorial search problems, they can alternatively be solved by employing Answer Set Programming (ASP), which may yield performance gains due to efficient conflict propagation. However, a straightforward ASP-encoding of MIL results in a huge search space due to a lack of procedural bias and the need for grounding. To address these issues, we encode MIL in the HEX-formalism, an extension of ASP that allows us to outsource the background knowledge, and we restrict the search space to compensate for the lack of a procedural bias in ASP. This way, for a given type of meta-rules, the import of constants from the background knowledge can be limited to relevant ones. Moreover, by abstracting from term manipulations in the encoding and by exploiting the HEX interface mechanism, the import of such constants can be entirely avoided in order to mitigate the grounding bottleneck. An experimental evaluation shows promising results.




1 Introduction

Recently, Meta-Interpretive Learning (MIL) [Muggleton et al. (2015)] has attracted a lot of attention in the area of Inductive Logic Programming (ILP). The approach learns definite logic programs from positive and negative examples given some background knowledge by instantiating so-called meta-rules. The latter can be viewed as templates specifying the shapes of rules that may be used in the induced program. The formalism is very powerful as it enables predicate invention, i.e. to use new predicate symbols in the induced program, and supports learning of recursive programs, while the hypothesis space can be constrained effectively by using meta-rules.

MIL has been implemented in the Metagol system [Cropper and Muggleton (2016b)], which is based on a classical Prolog meta-interpreter. The system owes its efficiency to the query-driven procedure of Prolog, which guides the instantiation of meta-rules in a specific order. In contrast (and complementary) to a common declarative bias in ILP, which constrains the hypothesis space, this constitutes a procedural bias that may affect efficiency (or even termination).

While traditionally most ILP systems are based on Prolog, the advantages of Answer Set Programming (ASP) [Gelfond and Lifschitz (1991)] for ILP have been recognized and several ASP-based systems have been developed, e.g. [Otero (2001), Ray (2009), Law et al. (2014)]. Two benign features of ASP are its pure declarativity, which allows the search space to be restricted modularly by adding rules and constraints to an encoding without risking non-termination, and the ease of enumerating solutions. Furthermore, the efficiency and optimization techniques of modern ASP solvers such as Clasp [Gebser et al. (2012)], which supports conflict propagation and learning, can be exploited. Muggleton et al. (2014) already considered an ASP-version of Metagol, which used only one specific meta-rule and was tailored to inducing grammars. The authors observed that ASP can have an advantage for MIL over Prolog due to effective pruning, but that it performs worse when the background knowledge is more extensive or only few constraints are present.

Implementing general MIL by ASP comes with its own challenges, and solving MIL-problems efficiently by utilizing a straightforward ASP encoding turns out to be infeasible in many cases. The first challenge is the large search space that results from an unguided search due to a lack of procedural bias. Consequently, the search space must be carefully restricted in an encoding in order to avoid many irrelevant instantiations of meta-rules. The second and more severe challenge concerns the grounding bottleneck of ASP: in contrast to Prolog, where only relevant terms are taken into account by unification, all terms that possibly occur in a derivation from the background knowledge must be considered in a grounding step. Finally, a third challenge is posed by recursive manipulations of structured objects, such as strings or lists, which are common for defining background knowledge in Metagol and easy to realize in Prolog, but are less well supported in ASP.

In this paper, we meet the mentioned challenges for a class of MIL-problems that is widely encountered in practice, by developing a MIL-encoding in ASP with external sources, specifically in the hex-formalism [Eiter et al. (2016)]. hex-programs extend ASP with a bidirectional information exchange between a program and arbitrary external sources of computation via special external atoms, which may introduce new constants into a program by so-called value invention.

After introducing the necessary background on MIL and hex-programs in Section 2, we proceed to present our contributions as follows:

  • We introduce in Section 3 our novel MIL approach based on hex-programs for general MIL-problems. In the first encoding, we restrict the search space by interleaving derivations at the object level and the meta level such that new instantiations of meta-rules can be generated based on pieces of information that are already derived wrt. partial hypotheses of rules. Furthermore, we outsource the background knowledge and access it by means of external atoms, which enables the manipulation of complex objects such as strings or lists.

  • We then define the class of forward-chained MIL-problems, for which the grounding can be restricted. Informally, in such problems the two arguments of the binary head of a rule must be connected via a path of binary atoms in the body, where the path starts at the first head argument, ends at the second head argument, and the second argument of each atom is the first argument of the next one. This allows us to guard the import of new terms from the background knowledge in a second encoding by using already imported terms in an inductive manner.

  • A large number of constants may still be imported from the background knowledge. We thus develop in Section 4 a technique to abstract from object-level terms in a third encoding, by externally computing sequences of background knowledge atoms that derive all positive examples, and by checking non-derivability of negative examples with an external constraint.

  • In Section 5, we present results of an empirical evaluation based on known benchmark problems; they provide evidence for the potential of using a hex-based approach for MIL.

While our encoding is inspired by the implementation presented in [Muggleton et al. (2014)], to the best of our knowledge, a general implementation of MIL using ASP has not been considered in the literature so far, and neither strategies to compensate for the missing procedural bias nor to mitigate grounding issues have been investigated. Despite the use of the hex formalism, our results may be applied to other ASP formalisms and approaches as well. Proof sketches and details on the benchmark encodings used in Section 5 can be found in the appendix.

2 Background

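We assume a finite set $\mathcal{P}$ of predicate symbols, a finite set $\mathcal{C}$ of constant symbols, and disjoint sets $\mathcal{V}_1$ and $\mathcal{V}_2$ of first-order and higher-order variables, resp., not overlapping with $\mathcal{P}$ and $\mathcal{C}$. An atom of the form $p(t_1, \ldots, t_n)$, where $t_i \in \mathcal{C} \cup \mathcal{V}_1$ for $1 \le i \le n$, is first-order if $p \in \mathcal{P}$ and higher-order if $p \in \mathcal{V}_2$; its arity is $n$. A ground atom is a first-order atom where $t_i \in \mathcal{C}$ for all $1 \le i \le n$. We represent interpretations over the Herbrand base $HB$ by sets $I \subseteq HB$ of ground atoms, and an interpretation $I$ models a ground atom $a$, denoted $I \models a$, if $a \in I$.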

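A (disjunctive) logic program $P$ is a set of rules of the form

$$a_1 \vee \cdots \vee a_k \leftarrow b_1, \ldots, b_m, \mathrm{not}\ b_{m+1}, \ldots, \mathrm{not}\ b_n,$$

where each $a_i$, $1 \le i \le k$, and each $b_j$, $1 \le j \le n$, is a first-order atom. Given a rule $r$, we call $H(r) = \{a_1, \ldots, a_k\}$ the head of $r$, $B^+(r) = \{b_1, \ldots, b_m\}$ the positive body of $r$, and $B^-(r) = \{b_{m+1}, \ldots, b_n\}$ the negative body of $r$. A rule is a fact if $n = 0$, a constraint if $k = 0$, and definite if $k = 1$ and $n = m$. A definite program is a logic program that contains only definite rules.

The grounding $grd(r)$ of a rule $r$ is obtained as usual; the grounding of a program $P$ is $grd(P) = \bigcup_{r \in P} grd(r)$. An interpretation $I$ models a ground rule $r$, denoted $I \models r$, if $I \models a$ for some $a \in H(r)$, whenever $I \models b$ for all $b \in B^+(r)$ and $I \not\models b$ for all $b \in B^-(r)$. An interpretation $I$ models a logic program $P$, denoted $I \models P$, if $I \models r$ for all $r \in grd(P)$; and a definite program $P$ entails a ground atom $a$, denoted $P \models a$, if for every $I$ s.t. $I \models P$ it holds that $I \models a$.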

Meta-Interpretive Learning. The Meta-Interpretive Learning (MIL) approach by Muggleton et al. (2015) learns definite logic programs from examples by instantiating so-called meta-rules. Here, we focus on meta-rules of the form

$$P(x, y) \leftarrow Q_1(x_1, y_1), \ldots, Q_k(x_k, y_k), R_1(z_1), \ldots, R_n(z_n),$$

where $P$, $Q_i$, $1 \le i \le k$, and $R_j$, $1 \le j \le n$, are higher-order variables, and $x$, $y$, $x_i$, $y_i$, $1 \le i \le k$, and $z_j$, $1 \le j \le n$, are first-order variables s.t. $x$ and $y$ also occur in the body. That is, we consider meta-rules with binary atoms in the head and with binary and/or unary atoms in the body. Meta-rules with unary head atoms can be simulated by using atoms of the form $P(x, x)$, and we allow meta-rules of arbitrary (finite) length, such that the program class $H^2_m$ is covered (cf. Cropper and Muggleton (2014)). A meta-substitution of a meta-rule $R$ is an instantiation of $R$ where all higher-order variables are substituted by predicate symbols. (Footnote 1: Even though we do not consider constants in meta-substitutions, they can easily be simulated by using e.g. a dedicated atom $c(x)$ in the body, where $c$ is defined in the background knowledge and binds $x$ to a specific constant.) Examples of concrete meta-rules with names as used by Cropper and Muggleton (2016) are shown in Figure 1.

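Precon: $P(x, y) \leftarrow Q(x), R(x, y)$    Postcon: $P(x, y) \leftarrow Q(x, y), R(y)$
Chain: $P(x, y) \leftarrow Q(x, z), R(z, y)$    Tailrec: $P(x, y) \leftarrow Q(x, z), P(z, y)$
Figure 1: Examples of Meta-Rules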

We are now ready to formally introduce the setting of MIL, adapted to our approach.

Definition 1

A Meta-Interpretive Learning (MIL-) problem is a quadruple $\mathcal{M} = (B, E^+, E^-, \mathcal{R})$, where

  • $B$ is a definite program, called background knowledge (BK);

  • $E^+$ and $E^-$ are finite sets of binary ground atoms called positive resp. negative examples;

  • $\mathcal{R}$ is a finite set of meta-rules.

We say that $B$ is extensional if it contains only ground atoms. A solution for $\mathcal{M}$ is a hypothesis $H$ consisting of a set of meta-substitutions of meta-rules in $\mathcal{R}$ s.t. $B \cup H \models e$ for each $e \in E^+$ and $B \cup H \not\models e$ for each $e \in E^-$.
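The solution condition of Definition 1 can be made concrete with a minimal Python sketch. It assumes an extensional BK given as ground facts, represents atoms as tuples, and treats arguments starting with an uppercase letter as first-order variables; the function names and the family data are hypothetical and serve only as an illustration.

```python
def is_var(term):
    # Assumption of this sketch: variables start with an uppercase letter.
    return term[:1].isupper()

def unify(pattern, fact, theta):
    """Extend substitution theta so that pattern equals fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for term, const in zip(pattern[1:], fact[1:]):
        if is_var(term):
            if theta.setdefault(term, const) != const:
                return None
        elif term != const:
            return None
    return theta

def matches(body, model, theta):
    """Yield every substitution that maps all body atoms into the model."""
    if not body:
        yield theta
        return
    for fact in model:
        extended = unify(body[0], fact, theta)
        if extended is not None:
            yield from matches(body[1:], model, extended)

def least_model(facts, rules):
    """Naive bottom-up evaluation of ground facts plus definite rules."""
    model = set(facts)
    while True:
        derived = {
            (head[0],) + tuple(theta[t] if is_var(t) else t for t in head[1:])
            for head, body in rules
            for theta in matches(body, model, {})
        }
        if derived <= model:
            return model
        model |= derived

def is_solution(bk, hypothesis, pos, neg):
    """All positive examples are entailed and no negative example is."""
    model = least_model(bk, hypothesis)
    return all(e in model for e in pos) and not any(e in model for e in neg)

# Hypothetical family data, not taken from the paper.
bk = {("mother", "eve", "ann"), ("mother", "ann", "bob"), ("father", "bob", "carl")}
hypothesis = [
    (("p1", "X", "Y"), [("mother", "X", "Y")]),
    (("p1", "X", "Y"), [("father", "X", "Y")]),
    (("grandparent", "X", "Y"), [("p1", "X", "Z"), ("p1", "Z", "Y")]),
]
pos = {("grandparent", "eve", "bob"), ("grandparent", "ann", "carl")}
neg = {("grandparent", "bob", "ann")}
print(is_solution(bk, hypothesis, pos, neg))
```

Here `p1` plays the role of an invented predicate. Dropping the second rule of `hypothesis` makes one positive example underivable, so the check fails for that smaller hypothesis.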

In order to obtain solutions that generalize well to new examples, by Occam’s Razor simple solutions to MIL-problems are desired; thus Metagol computes a minimal solution containing a minimal number of meta-substitutions (i.e. rules).

Example 1

Consider the MIL-problem , with , , , abbreviating , and , and meta-rules . A minimal solution for is , where is an invented predicate intuitively representing the concept .

Muggleton et al. (2015) showed that MIL-problems as in Definition 1 are decidable if no proper function symbols (i.e., only constants) are used and the sets of predicate and constant symbols are finite, but are undecidable in general. Yet, in practice, complex terms such as lists are often used for MIL. Hence, we assume some suitable restriction, e.g. to consider only a finite set of flat lists, s.t. in slight abuse of notation, complex ground terms (e.g., Prolog-style lists) are technically regarded as constant symbols.

hex-Programs. For solving MIL-problems, we exploit the hex formalism [Eiter et al. (2016)] in our approach. hex-programs extend disjunctive logic programs by external atoms, which can occur in rule bodies. External atoms are of the form $\&g[p_1, \ldots, p_k](t_1, \ldots, t_l)$, where $p_i$, $1 \le i \le k$, are input parameters, and $t_j$, $1 \le j \le l$, are output parameters. The semantics of a ground external atom $\&g[\vec{p}](\vec{c})$ with $k$ input and $l$ output parameters wrt. an interpretation $I$ is determined by a $(1+k+l)$-ary (Boolean) oracle function $f_{\&g}$ such that $I \models \&g[\vec{p}](\vec{c})$ iff $f_{\&g}(I, \vec{p}, \vec{c}) = 1$. In practice, oracle functions are usually realized as plugins provided to a solver, implemented in Python or C++ code.

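hex-programs are interpreted under the answer set semantics [Gelfond and Lifschitz (1991)] based on the FLP-reduct by Faber et al. (2011) (a variant of the well-known GL-reduct), which for an interpretation $I$ and a hex-program $P$ is $fP^I = \{ r \in grd(P) \mid I \models b \text{ for all } b \in B^+(r) \text{ and } I \not\models b \text{ for all } b \in B^-(r) \}$, i.e. the set of ground rules of $P$ whose body is satisfied by $I$. An interpretation $I$ is an answer set of a hex-program $P$ if $I$ is a subset-minimal model of $fP^I$.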
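To illustrate the answer set definition, the following brute-force Python sketch checks whether an interpretation is a subset-minimal model of the FLP-reduct. It covers only ground programs without external atoms, represents a rule as a triple (head, positive body, negative body), and uses hypothetical helper names; it is meant for tiny programs and says nothing about how an actual solver proceeds.

```python
from itertools import combinations

def make_rule(head, pos=(), neg=()):
    return (frozenset(head), frozenset(pos), frozenset(neg))

def body_holds(interp, rule):
    _, pos, neg = rule
    return pos <= interp and not (neg & interp)

def is_model(interp, rules):
    # A rule with a satisfied body needs a true head atom (a constraint has none).
    return all(r[0] & interp for r in rules if body_holds(interp, r))

def flp_reduct(rules, interp):
    # Keep exactly the rules whose body is satisfied by the interpretation.
    return [r for r in rules if body_holds(interp, r)]

def is_answer_set(interp, rules):
    interp = frozenset(interp)
    reduct = flp_reduct(rules, interp)
    if not is_model(interp, reduct):
        return False
    # Subset-minimality: no proper subset may be a model of the reduct.
    for size in range(len(interp)):
        for subset in combinations(interp, size):
            if is_model(frozenset(subset), reduct):
                return False
    return True

def answer_sets(rules, atoms):
    return [set(c) for n in range(len(atoms) + 1)
            for c in combinations(atoms, n) if is_answer_set(c, rules)]

program = [
    make_rule(["p"], neg=["q"]),  # p <- not q.
    make_rule(["q"], neg=["p"]),  # q <- not p.
    make_rule([], pos=["q"]),     # <- q.
]
print(answer_sets(program[:2], ["p", "q"]))  # the two guesses
print(answer_sets(program, ["p", "q"]))      # the constraint removes one
```

The first two rules alone admit two answer sets, and adding the constraint eliminates the one containing `q`.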

Example 2

Consider the hex-program , and suppose that the oracle function evaluates to iff and are ground lists and can be obtained from by removing the first list element. The single answer set of is .

Note that in Example 2, the output of the external atom contains constants that do not occur in the program and are thus introduced by the external source by so-called value invention. By employing suitable safety conditions, it can be ensured that only finitely many new constants must be considered. For more details we refer to [Eiter et al. (2016)]. As hex allows for predicate input to external atoms, their semantics may depend on the extension of predicates in an answer set.
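The interplay of an oracle function and value invention can be sketched in plain Python. The code below does not use the plugin interface of any solver; the rule `p(Y) <- q(X), &tail[X](Y)`, the atom name `tail`, and all function names are assumptions made for illustration, with lists represented as tuples.

```python
def tail_oracle(inp, out):
    """Boolean oracle: true iff out is inp without its first element."""
    return len(inp) > 0 and tuple(inp[1:]) == tuple(out)

def tail_outputs(inp):
    """Enumerate the output terms for which the oracle is true."""
    return [tuple(inp[1:])] if inp else []

def evaluate(facts):
    """Evaluate the hypothetical rule  p(Y) <- q(X), &tail[X](Y)."""
    derived = set()
    for pred, arg in facts:
        if pred == "q":
            for out in tail_outputs(arg):
                derived.add(("p", out))
    return set(facts) | derived

facts = {("q", ("a", "b", "c"))}
print(evaluate(facts))
```

The derived atom contains the list `("b", "c")`, which is not a term of the input program; it enters the result only through the external computation.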

Example 3

Consider the hex-program , and suppose that the oracle function evaluates to iff . Without the constraint, has the two answer sets and , whereby the constraint eliminates the second one.

3 hex-Encoding of Meta-Interpretive Learning

In this section, we introduce our main encoding for solving general MIL-problems, where the BK is stored externally and interfaced by means of external atoms. Subsequently, we present a modification of the encoding which reduces the number of constants that need to be considered during grounding in case only a certain type of meta-rules is used.

A major motivation for developing an ASP-based approach to solve MIL-problems is that constraints given by negative examples can be efficiently propagated by an ASP solver, while Metagol checks them only at the end. This can be shown by simple synthetic examples; e.g. consider the BK of facts , and , for . For the positive examples , …, and the negative example , Metagol finds no solution within one hour using meta-rule . In contrast, the problem can be solved by a simple ASP encoding instantly. The reason is that e.g.  can only be derived by the rule given the negative example , and Metagol explores a huge number of rule combinations before this is detected.

While the issue of negative examples can be tackled by using ordinary ASP, we employ here hex-programs as they enable us to outsource the BK from the encoding. This allows us to conveniently specify intensional BK using, e.g. string or list manipulations, which are usually not available in ASP. Another advantage of outsourcing the BK is that the approach becomes parametric wrt. the formalization of the BK, as it is in principle possible to plug in arbitrary (monotonic) external theories (e.g. a description logic ontology). Beyond this flexibility provided by hex, external atoms are essential to limit the BK that is imported as described in the latter part of this section, and for realizing our state abstraction technique in Section 4.

As we consider meta-rules using unary and binary atoms, we introduce external atoms for importing the relevant unary and binary atoms that are entailed by the BK in an encoding.

Definition 2

Given a MIL-problem , we call the external atom unary BK-atom and binary BK-atom, where the associated oracle functions fulfill iff , respectively iff .

The BK-atoms receive as input the extension of the predicate , which represents the set of all atoms that can be deduced from the program that results from the meta-substitutions of the current hypothesis. Their output constants represent unary, resp., binary atoms that are entailed by the BK augmented with the atoms described by .

In theory, MIL can be encoded by applying the well-known guess-and-check methodology, i.e. by generating all combinations of meta-substitutions from the given meta-rules and available predicate symbols, deriving all entailed atoms, and checking compatibility with examples using constraints. However, this results in a huge search space due to the many possible combinations of meta-substitutions, on top of many meta-substitutions that can be generated by different combinations of predicate symbols. At the same time, a large fraction of meta-substitutions is irrelevant for inducing a hypothesis as the resulting rule bodies can never be satisfied based on atoms that are deduced using other rules from the hypothesis and the BK.

For this reason, we interleave guesses on the meta level and derivations on the object level, i.e. deductions using meta-substitutions already guessed to be part of the hypothesis, and we model a procedural bias ensuring that meta-substitutions can only be added if their body is already satisfied by atoms deducible on the object level. Note that while Metagol’s top-down mechanism effects that only meta-substitutions necessary for deriving a goal atom are generated, our approach works bottom-up such that the procedural bias is inverted. Guarding the guesses of meta-substitutions in this way has not been considered by Muggleton et al. (2014); this constitutes the basis for techniques that restrict the size of the grounding discussed later on.
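The effect of guarding guesses can be sketched for a single step and a single meta-rule. The Python code below compares all instantiations of the Chain meta-rule over a signature with those whose body is matched by atoms that are already deduced. It ignores ordering constraints and the interleaving with further deductions, and the signature and atoms are hypothetical.

```python
from itertools import product

def all_chain_substitutions(signature):
    """Unguarded guess: every triple (P, Q, R) for P(x,y) <- Q(x,z), R(z,y)."""
    return set(product(signature, repeat=3))

def guarded_chain_substitutions(signature, deduced):
    """Keep only triples whose body Q(x,z), R(z,y) is matched by deduced atoms."""
    guarded = set()
    for q, _, z in deduced:
        for r, z2, _ in deduced:
            if z == z2:
                guarded.update((p, q, r) for p in signature)
    return guarded

signature = ("p1", "succ", "pred")
deduced = {("succ", "1", "2"), ("succ", "2", "3")}
print(len(all_chain_substitutions(signature)))
print(sorted(guarded_chain_substitutions(signature, deduced)))
```

In this toy setting only instantiations with `succ` in both body positions survive, since no deduced atom uses `pred` or `p1`.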

As in the Metagol implementation of MIL [Muggleton et al. (2015)], given a MIL-problem , we associate each meta-rule with a unique identifier and a set of ordering constraints ; and we assume a predefined total ordering over the predicate symbols in . The ordering constraints can be utilized to constrain the search space, and are necessary in Metagol in order to ensure termination. A meta-substitution of a meta-rule with head predicate instantiated for the higher-order variable satisfies the ordering constraints in case for every binary body predicate instantiated for a higher-order variable s.t. . Here, we apply ordering constraints only to pairs of head and body predicates, but in general this can be extended to arbitrary pairs of predicates in a meta-substitution. Moreover, we assume that a set of Skolem predicates can be used for predicate invention, where no element in occurs in .

We are now ready to present our main encoding for solving MIL-problems using hex.

Definition 3

Given a MIL-problem and a finite set of Skolem predicates , let be the set that contains each and each predicate symbol that occurs either in or in a rule head in . The hex-MIL-encoding for is the hex-program containing

  • a fact for each , and a fact for all s.t. 

  • the rules  and
    the rules 

  • for each meta-rule
    and ,

    • a rule

    • and a rule

  • a constraint   for each , and
    a constraint   for each

In the encoding, the predicate contains meta-substitutions added to an induced hypothesis, and captures all atoms that can be deduced from a guessed hypothesis together with the BK. As we consider examples to be binary atoms and only binary atoms can be derived from meta-substitutions, those binary atoms entailed by the BK are directly derived to be in the extension of , while unary atoms can only be derived from the BK s.t. they do not need to be added to the extension of and are imported via the predicate in item (2).

Item (3) constitutes the core of the encoding, which contains the meta-level guessing part (a) and the object-level deduction part (b). A meta-substitution can be guessed to be part of the hypothesis only if first-order instantiations of its body atoms can already be deduced, i.e. only if it is potentially useful for deriving a positive example. Here, predicate names must be from the signature and the ordering constraints must be satisfied, as stated by the facts in item (1). Finally, item (4) adds the constraints imposed by the positive and negative examples.

For a given MIL-problem, solutions constituted by induced logic programs can directly be obtained from the answer sets of the respective hex-MIL-encoding. The induced logic program represented by the -atoms in an interpretation is extracted as follows:

Definition 4

For a set of meta-rules , the logic program induced by a given interpretation consists of all rules obtained from an atom of the form in such that the meta-rule is in , by substituting by , by for , and by for .

In the following, we assume that is given by the respective MIL-problem at hand.

Every answer set of a hex-MIL-encoding encodes a solution for the respective MIL-problem, and all solutions that only contain productive rules, i.e. rules such that all atoms in the body of some ground instance are entailed by the BK together with the hypothesis, can be generated in this way.

Theorem 1

Given a MIL-problem , (i) if is an answer set of , the logic program induced by is a solution for ; and (ii) if is a solution for s.t. all rules in satisfy and are productive, then there is an answer set of s.t.  is the logic program induced by .

Although the general hex-MIL-encoding in Definition 3 works well when only a small number of constants is introduced by the BK-atoms, the grounding can quickly become prohibitively large when many constants are generated (e.g. due to list operations). This results from the fact that constants produced by item (2) in Definition 3 are also relevant for instantiating the rules defined in items (3a) and (3b), which contain many variables, causing a combinatorial explosion.

Example 4

Consider a MIL-problem , containing BK , and the positive examples . Here, the definition of the BK should be read as an abbreviation for a set of facts, e.g. containing , , etc., exploiting the list notation of Prolog. Accordingly, the predicate drops the first element from a list, and a corresponding hypothesis intuitively needs to remove the first two elements from the list in the first argument of an example to yield the second one.

Now, assume that contains lists with letters from the set up to some length . Then, the BK contains, e.g., , , etc., up to length , which are imported via the BK-atoms. However, lists containing the letter are irrelevant wrt.  because they cannot be obtained from lists appearing in the examples using the operations in the BK.

Next, we introduce a class of meta-rules that allows us to restrict the number of constants imported from the BK, based on the observation from the previous example.

Definition 5

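A forward-chained meta-rule is of the form

$$P(x, y) \leftarrow Q_1(x, u_1), Q_2(u_1, u_2), \ldots, Q_k(u_{k-1}, y), R_1(z_1), \ldots, R_n(z_n),$$

where $k \ge 1$, $n \ge 0$, and $z_j \in \{x, u_1, \ldots, u_{k-1}, y\}$ for all $1 \le j \le n$. A MIL-problem is forward-chained if its set of meta-rules only contains forward-chained meta-rules.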

Intuitively, all first-order variables in the body of a forward-chained meta-rule are part of a chain between the first and second argument of the head atom. Viewing binary predicates in the BK as mappings from their first to their second argument, only atoms from an extensional BK are relevant that occur in a chain between the first and the second argument of examples. Hence, atoms from the BK only need to be imported when their first argument occurs in the examples or in a deduction wrt. BK that has already been imported. However, when the derivable BK depends on guessed meta-substitutions, additional atoms might be relevant, and thus, we only consider extensional BK in the following.

For restricting the import of BK, we introduce a modification of the external atoms from Definition 2, where the output is guarded by an input constant.

Definition 6

Given a forward-chained MIL-problem where is extensional, we call the external atoms and unary and binary forward-chained BK-atom, resp., where iff , resp., iff .

As we assume the BK to be extensional, the input parameter is not needed for forward-chained BK-atoms. Based on the previous definition, we can modify our hex-MIL-encoding such that only relevant atoms from the BK are imported, where forward-chained BK-atoms receive as input all constants that already occur in a deduction or the examples.

Definition 7

Given a forward-chained MIL-problem where is extensional, the forward-chained hex-MIL-encoding for is the hex-program containing items (1), (3) and (4) from Definition 3, and the rules

  • for each

The main difference between and is that the import of BK is guarded by the predicate in items (f1) and (f2), whose extension contains all constants appearing as first argument of an example, due to item (f3), and all constants that appear in deductions based on the already imported BK, due to item (f4).
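The guarded import can be pictured as a reachability computation over an extensional BK. The following Python sketch starts from the first arguments of the examples and imports only those BK atoms whose first argument is an already known constant. It is a simplification that runs outside of any solver; the function name and the data are hypothetical, with lists abbreviated as strings.

```python
def guarded_import(binary_bk, unary_bk, examples):
    """Import only BK atoms reachable from the first arguments of examples."""
    states = {a for _, a, _ in examples}
    frontier = list(states)
    imported_binary = set()
    while frontier:
        x = frontier.pop()
        for pred, first, second in binary_bk:
            if first == x:
                imported_binary.add((pred, first, second))
                if second not in states:
                    states.add(second)
                    frontier.append(second)
    imported_unary = {(pred, x) for pred, x in unary_bk if x in states}
    return states, imported_binary, imported_unary

# Hypothetical BK in which "remove" drops the first letter of a string.
binary_bk = {("remove", "abc", "bc"), ("remove", "bc", "c"),
             ("remove", "xyz", "yz"), ("remove", "yz", "z")}
unary_bk = {("single", "c"), ("single", "z")}
examples = [("p", "abc", "c")]
states, imported_binary, imported_unary = guarded_import(binary_bk, unary_bk, examples)
print(sorted(states), sorted(imported_binary), sorted(imported_unary))
```

Atoms over `xyz`, `yz` and `z` are never imported because none of these constants can be reached from `abc`, which mirrors the observation about irrelevant lists in Example 4.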

Every answer set of the forward-chained hex-MIL-encoding still corresponds to a solution of the respective MIL-problem, but not all solutions may be obtained. Nonetheless, it is ensured that a minimal solution (i.e., with fewest meta-substitutions) is encoded by some answer set:

Theorem 2

Let be a forward-chained MIL-problem with extensional . Then, (i) for every answer set of , the logic program induced by is a solution for ; and (ii) there is an answer set of s.t. the logic program induced by is a minimal solution for if one exists.

Since, in practice, we employ iterative deepening search for computing a minimal solution, any minimal solution encoded by an answer set of is guaranteed to be found. Thus, we can obtain minimal solutions while grounding issues are mitigated by steering the import of BK. An additional search space reduction results from the pruning of the grounding.

4 State Abstraction

Based on the observation that operations represented by binary BK predicates can be applied sequentially when only forward-chained meta-rules are used, we introduce in this section a further technique that eliminates object-level constants from the encoding entirely. While the -encoding focuses the import of constants to those obtainable from constants that already occur in deductions, the number of relevant constants can still be large if many binary BK atoms share the first argument; and all of them must be considered during grounding. However, only one BK atom is needed for each element in a chain that derives a positive example by connecting and . In fact, the -encoding solves two problems at the same time: (1) finding sequences of binary BK predicates that derive positive examples; and (2) inducing a (minimal) program that calls the predicates in the respective sequences, and prevents the derivation of negative examples.

Example 5

Consider the MIL-problem where contains the extension of from Example 4, and extensional BK represented by and . Furthermore, let , , and . Intuitively, a solution program needs to memorize and delete the rest; this requires to repeatedly switch the first two elements and remove the first element. For success, the input list must have at position 2. This is captured by the hypothesis , where is an invented predicate; this is in fact a minimal solution for . In addition, any program which enables derivations that alternate between calling and and prevents to derive the negative example using as a guard would be a solution. Notably, the search space of Metagol also contains hypotheses that have no alternation between and and thus cannot be solutions.

The previous example illustrates that the derivability of positive examples depends on the sequences by which binary BK predicates are called in the induced program. Here, finding a correct sequence for a given example can be viewed as a planning problem, where object-level constants represent states, binary BK predicates are viewed as actions, and unary BK predicates constitute fluents. The state abstraction technique described in the sequel exploits the insight that the tasks of (1) solving the planning problem and (2) finding a matching hypothesis can be separated, where the hex-program encodes task (2), and computations involving states are performed externally. The advantage of task separation and state abstraction increases with the number of actions that are applicable in a state, as usually more actions not occurring in a derivation of a positive example can be ignored; this reduces the search space and the size of the grounding.

We represent possible plans to derive positive examples by sequences of binary BK atoms. Here, cyclic sequences (or plans) have to be excluded by requiring that constants (states) occur only once, because otherwise we may obtain infinitely many sequences for a positive example:

Definition 8

Given a forward-chained MIL-problem where is extensional, the function maps each positive example to the set containing all sequences , where for all , and if .
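A minimal Python sketch of such a sequence computation is given below. It enumerates, by depth-first search over an extensional BK, all sequences of binary atoms that lead from the first to the second argument of an example without visiting a state twice. The function name and the BK are hypothetical, and lists are abbreviated as strings.

```python
def sequences(binary_bk, example):
    """All acyclic sequences of binary BK atoms leading from a to b in p(a, b)."""
    _, start, goal = example
    result = []

    def extend(state, visited, path):
        if state == goal and path:
            result.append(tuple(path))
            return
        for atom in sorted(binary_bk):
            _, x, y = atom
            # Each state may occur only once, which rules out cycles.
            if x == state and y not in visited:
                extend(y, visited | {y}, path + [atom])

    extend(start, {start}, [])
    return result

binary_bk = {("remove", "abc", "bc"), ("remove", "bc", "c"),
             ("switch", "abc", "bac"), ("switch", "bac", "abc"),
             ("remove", "bac", "ac"), ("remove", "ac", "c")}
for seq in sequences(binary_bk, ("p", "abc", "c")):
    print(seq)
```

The pair of `switch` atoms forms a cycle between `abc` and `bac`; without the visited-set check the enumeration would not terminate.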

Example 6 (cont’d)

Reconsider from Example 5. Then , ( = , = ),

In order to make information about action sequences that derive positive examples and fluents that hold in states available to the hex-encoding, we next introduce two external atoms that import such information. States are simply represented by integers in the output as their structure is irrelevant for combining sequences into a hypothesis that generalizes the plans.

Definition 9

For a forward-chained MIL-problem where is extensional, let and be unique identifiers, resp., for each and . The external atoms and are called unary and binary state abstraction (sa-)atoms, resp., where

  • iff , , and ; resp.

  • iff , , and ,

with , , and .

For instance, for from Example 5, is true, where is the identifier of the positive example, is the identifier of the sequence shown in Example 6, and the integers and represent the states and , respectively, where the second state can be reached from the first state by applying the action .
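The abstraction of states to integers can be sketched as follows. The Python code numbers the states of one sequence in order of first occurrence and emits abstracted transitions and fluents. The numbering scheme, the tuple layout, and the names are assumptions of this sketch and need not match the identifiers used by the sa-atoms.

```python
def abstract_sequence(seq_id, sequence, unary_bk):
    """Replace the states of one sequence by integers."""
    ids = {}

    def state_id(state):
        # A new state receives the next free integer.
        return ids.setdefault(state, len(ids))

    transitions = []
    for action, x, y in sequence:
        transitions.append((seq_id, action, state_id(x), state_id(y)))
    fluents = sorted((seq_id, fluent, ids[s]) for fluent, s in unary_bk if s in ids)
    return transitions, fluents

sequence = [("remove", "abc", "bc"), ("remove", "bc", "c")]
unary_bk = {("single", "c"), ("single", "z")}
transitions, fluents = abstract_sequence(0, sequence, unary_bk)
print(transitions)
print(fluents)
```

After this step the concrete lists no longer appear in the output, so a program that combines the sequences into a hypothesis only ever sees small integers.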

In our encoding with state abstractions we also need information about the start and end states of sequences associated with positive examples, as a hypothesis needs to encode a plan for each positive example. This information is accessed via an external atom as well.

Definition 10

For a forward-chained MIL-problem , the external atom fulfills that iff , , , and for some and .

Finally, it can only be determined wrt. the BK whether a candidate hypothesis derives a negative example, s.t. the corresponding check cannot be performed in an encoding without importing relevant atoms from the BK. As our goal is to abstract from explicit states in the BK, we also need to outsource the check for non-derivability of negative examples by an external constraint.

Definition 11

Given a MIL-problem , the oracle function associated with the external atom evaluates to iff for some , where is the logic program induced by .

In the implementation, the external atom receives information about meta-substitutions already guessed by the solver to be in the respective hypothesis. It can be evaluated to true as soon as a negative example is derivable wrt. its input, as definite logic programs are monotonic; as this may violate a constraint, backtracking in a solver can be triggered.

Example 7

Consider MIL-problem with , , , and . For , we obtain that as the negative example can be derived from ; a solver can exploit the information that cannot belong to any solution.

Utilizing the external atoms introduced in this section, we define an encoding which separates the planning from the generalization problem and contains no object-level constants.

Definition 12

Given a forward-chained MIL-problem where is extensional, its state abstraction (sa-) hex-MIL-encoding is the hex-program that contains all rules in items (1) and (3) of Definition 3, where additionally contains for each , and the rules

  • , for each

Items (s1) and (s2) in import the fluents for all relevant states and state transitions wrt. sequences that derive positive examples, where states are abstracted. The external atom in item (s4) imports all tuples representing the start and end state of each sequence for each positive example. The disjunctive head of (s4) enables each tuple representing a sequence to be guessed to be in the extension of the predicate , which represents all sequences that are modeled by the induced program. While a minimal hypothesis is guaranteed when the guess is over all possible sequences, in practice, we can preselect sequences returned by the atom . Moreover, the guess can be omitted if the planning problem is deterministic, i.e. if for each positive example there is exactly one sequence of binary atoms from the BK that derives its second argument from its first argument. Items (s3) and (s5) ensure that at least one sequence for each positive example is selected s.t. the corresponding end state can be derived from the start state by the induced program. Finally, (s6) and (s7) state the constraints regarding positive resp. negative examples.

As can be shown, only yields correct solutions, and a minimal one if all sequences that derive positive examples are acyclic. More formally:

Theorem 3

Let be a forward-chained MIL-problem with extensional BK . Then, (i) for every answer set of , the logic program induced by is a solution for ; and (ii) there is an answer set of s.t. the logic program induced by is a minimal solution for if one exists and every sequence of binary BK atoms that derives a positive example in is acyclic.

Hence, we have an alternative to find solutions for forward-chained MIL-problems where planning and generalization are separated in a way such that the BK can be outsourced completely.

5 Empirical Evaluation

In this section, we evaluate our approach by comparing it to Metagol in terms of efficiency.

Experimental Setup. For experimentation, we utilized an iterative deepening strategy which incrementally increases a limit for the maximal number of guessed meta-substitutions imposed via a constraint to obtain minimal solutions. In addition, we incrementally increased the number of invented predicates wrt. each limit, which proved to be beneficial for performance.
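The control loop of this strategy can be sketched as follows. The sketch is illustrative only: `solve` is a hypothetical callback standing for one solver invocation under a given limit, and the bounds `max_limit` and `max_invented` as well as the choice to start with zero invented predicates are assumptions, not details of our setup.

```python
from typing import Callable, Optional, TypeVar

S = TypeVar("S")


def iterative_deepening(
    solve: Callable[[int, int], Optional[S]],
    max_limit: int,
    max_invented: int,
) -> Optional[S]:
    """Return the first solution found, trying small limits first.

    solve(limit, invented) is assumed to return a solution that uses at most
    `limit` meta-substitutions and `invented` invented predicates, or None.
    """
    for limit in range(1, max_limit + 1):
        # For each limit, increase the number of invented predicates.
        for invented in range(0, max_invented + 1):
            solution = solve(limit, invented)
            if solution is not None:
                return solution
    return None


# Toy stand-in for a solver call: succeeds once both bounds are large enough.
def toy_solve(limit, invented):
    return (limit, invented) if limit >= 3 and invented >= 1 else None


print(iterative_deepening(toy_solve, max_limit=5, max_invented=2))  # (3, 1)
```

Because the limit on meta-substitutions only grows after all smaller limits have failed, the first solution returned uses a minimal number of meta-substitutions within the explored bounds.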

We computed answer sets of our encodings with hexlite 0.3.20 (https://github.com/hexhex/hexlite/), which is based on clingo 5.1.0. For comparison, we used SWI-Prolog 7.2.3 to run Metagol 2.2.0 [Cropper and Muggleton (2016b)]. Experiments were run on a Linux machine with a 2.5 GHz dual-core Intel Core i5 processor and 8 GB RAM; the timeout was 600 seconds per instance. The results wrt. the average running times in seconds are shown in Figure 2, where error bars indicate the standard error (= $\mathit{sd}/\sqrt{n}$ for $n$ instances) per instance size. In addition, the average running times required for the grounding step are shown in Figure 3. We compared the encodings and (conditions hexmil and stateab in Figure 2, resp.) to Metagol for the first two benchmarks, and only used for the third benchmark as discussed below.
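For reference, the standard error used for the error bars is the standard deviation of the running times divided by the square root of the number of instances. The following sketch shows the computation; the function name `standard_error` is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population standard deviation is an assumption.

```python
import math
import statistics


def standard_error(times):
    """Standard error of the mean: sd / sqrt(n) for n running times."""
    n = len(times)
    if n < 2:
        raise ValueError("at least two measurements are required")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(times) / math.sqrt(n)


running_times = [1.0, 2.0, 3.0, 4.0]  # made-up values, in seconds
print(statistics.mean(running_times), standard_error(running_times))
```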

For each MIL-problem in this section, we used the meta-rules shown in Figure 1, and we implemented it in Metagol and used our hex-MIL-encodings. External atoms are realized as Python-plugins in our implementation. For operations defined by the BK, we utilized custom list manipulations. The external atoms and in employ breadth-first search for computing all sequences wrt. positive examples and for checking the derivability of negative examples, respectively. In the further development of our implementation, our goal is to employ more sophisticated planning algorithms for computing the sequences, and to interface a Prolog-interpreter for processing the BK and for checking negative examples.
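The breadth-first computation of sequences can be sketched as follows. This is not the plugin code: states are modeled as tuples of letters, the action names `remove` and `switch` are hypothetical stand-ins for the binary BK predicates of benchmark B1, and the restriction to cycle-free sequences up to a length bound `max_len` is an assumption made to keep the enumeration finite.

```python
from collections import deque


def remove(state):
    """Drop the first element; not applicable to the empty state."""
    return state[1:] if state else None


def switch(state):
    """Swap the first two elements; needs at least two elements."""
    return (state[1], state[0]) + state[2:] if len(state) >= 2 else None


ACTIONS = {"remove": remove, "switch": switch}


def all_sequences(start, goal, actions=ACTIONS, max_len=10):
    """Enumerate all cycle-free action sequences leading from start to goal."""
    results = []
    # Each queue entry: (current state, actions so far, states visited so far).
    queue = deque([(start, (), frozenset([start]))])
    while queue:
        state, path, seen = queue.popleft()
        if state == goal:
            results.append(path)
            continue  # extending further would revisit the goal state
        if len(path) >= max_len:
            continue
        for name, act in actions.items():
            nxt = act(state)
            if nxt is None or nxt in seen:
                continue  # action not applicable, or it would close a cycle
            queue.append((nxt, path + (name,), seen | {nxt}))
    return results


print(all_sequences(("a", "b", "c"), ("c",)))
# [('remove', 'remove'), ('switch', 'remove', 'remove')]
```

Shorter sequences are returned first because the queue is processed level by level, which also makes it easy to preselect only a few sequences per positive example.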

The encodings for the benchmark problems and all instances used in the experiments are available at http://www.kr.tuwien.ac.at/research/projects/inthex/hexmil/.

String Transformation (B1). Our first benchmark is based on Example 5, and akin to inducing regular grammars as considered by MuggletonLPT14. Learning grammars is a suitable use case for MIL as it enables recursive string processing and predicate invention to represent substrings. In contrast to MuggletonLPT14, we also allow switching the first two letters in a string in addition to removing elements, which increases the search space and makes conflict propagation and state abstraction more relevant. For the instances used by MuggletonLPT14, Metagol performs much better due to limited branching in the search space. We used positive and negative examples of the form , where is a random sequence of letters and . The predicates in the BK are , , , and (cf. Example 5). We used problems containing one positive and one negative example of the same length, and tested lengths . We report average runtimes of 20 randomly generated instances per .

Figure 2: Benchmark results for String Transformation, East-West Trains and Robot Strategies (left to right). Average overall running times in seconds are shown on the y-axis, and instance size is shown on the x-axis.
Figure 3: Grounding times. Dashed lines indicate the average running times required for grounding in seconds, the solid line in the rightmost diagram shows the overall running times of benchmark B3 for comparison. Overall running times are not shown for benchmarks B1 and B2 as they are much larger than the grounding times. Grounding in condition was infeasible for benchmark B3.

East-West Trains (B2). The East-West train challenge by LarsonM77 is a popular ILP-benchmark. The task is to learn a theory that classifies trains based on features (e.g. shapes of cars and types of loads) to be either east- or westbound. In our benchmark, eastbound trains are positive and westbound trains negative examples, where trains are represented by lists. The BK defines the operation which removes the first car from a train; and we declare 50 different unary predicates, e.g.  or , for checking properties of the remaining part of a train. We used a data set of 10 eastbound and 10 westbound trains proposed by michie1994international that was also considered by MuggletonLT15. We generated instances of size by randomly selecting from the 20 trains, s.t.  were eastbound, and averaged the running times of 10 instances for each problem size.

Robot Waiter Strategies (B3). For our final experiment, we used a problem by CropperM16 that consists in learning robot strategies: customers sit at a table in a row, and a waiter robot serves each customer her desired drink, which is either tea or coffee. Initially, the robot is at the left end of the table and each customer has an empty cup. In the goal state, each cup contains the desired drink and the robot is at the right end of the table. States are represented using lists, and positive examples map an initial state to a goal state considering different numbers of customers and preferences for drinks. The actions are defined by binary BK predicates , and , and the fluents by unary BK predicates , and . (Footnote 3: In contrast to CropperM16, we omitted the action , as otherwise we obtained timeouts for the majority of instances and all conditions, as is also the case for Metagol in [Cropper and Muggleton (2016a)].) A solution constitutes a planning strategy by generalizing a plan for each positive example.

For this benchmark, solutions are constrained to be functional, i.e. to map an initial state only to the unique respective goal state and not to any non-goal state. Accordingly, negative examples are implicitly given by all binary atoms that map an initial state to a non-goal state. In Metagol, solutions can be restricted to functional theories by means of a property declaration, and we also integrated a corresponding check in the implementation for the external atom .
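The functionality condition can be stated compactly in code. The sketch below is illustrative and not taken from our plugin: the function name `is_functional` is hypothetical, states are represented by plain strings, and `derived_pairs` is assumed to be the set of (initial state, end state) pairs derivable by a candidate hypothesis.

```python
def is_functional(derived_pairs, examples):
    """Check that each initial state is mapped only to its own goal state.

    derived_pairs: iterable of (start, end) pairs derivable by the hypothesis.
    examples:      iterable of (initial_state, goal_state) positive examples.
    """
    goal_of = dict(examples)
    # Pairs whose start is not an initial state of an example are ignored.
    return all(goal_of.get(start, end) == end for start, end in derived_pairs)


examples = [("s1", "g1"), ("s2", "g2")]
print(is_functional({("s1", "g1"), ("s2", "g2")}, examples))  # True
print(is_functional({("s1", "g1"), ("s1", "x")}, examples))   # False
```

Every derivable pair that violates the check corresponds to one of the implicit negative examples described above.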

We generated random instances similar to CropperM16, where each positive example has a random number of customers with random drink preferences, and the instance size is measured in terms of the number of positive examples ranging from 1 to 8. For each instance size we averaged the running times of 20 problem instances.

Findings. Regarding (B1) and (B2), we found that instances can be solved significantly faster by employing than by Metagol due to conflict propagation in ASP. The encoding performed similarly to Metagol since only two binary predicates, resp. one, are defined by the BK s.t. solving the planning problem externally does not yield a significant advantage, and the advantage of efficient conflict propagation in ASP is outweighed by the overhead that goes along with outsourcing constraints for negative examples in . performs slightly better in (B1), where two actions are available instead of only one in (B2).

For (B3), we did not obtain results by using for many instances as the grounding was too large due to the imported BK. For instance size 5, the import from the BK already consumed around 100 MB of memory due to the high number of states, and the grounding of the encoding exceeded the available memory. However, the grounding problem could effectively be avoided by using state abstractions with , which yielded a significant speed-up compared to Metagol. This is due to the fact that by using , the planning problem is split from the generalization problem such that only one precomputed plan per positive example is considered, which greatly reduced the search space. Overall, the performance could be improved by one of our encodings wrt. Metagol in all benchmarks, whereby state abstraction was crucial when many different actions are defined by the BK, but may decrease efficiency otherwise.

With respect to the grounding step, we found that grounding required significantly more resources in terms of running time as well as the size of the grounding than grounding , both in the case of (B1) and (B2). The reason is that only states need to be considered which occur in a sequence of binary BK atoms that derives a positive example for grounding the encoding , while also imports all constants that are potentially relevant for deriving some negative example. However, the advantage of