1 Introduction
While declarative systems may not be inherently more difficult to learn than, e.g., an object-oriented programming language, they are still often perceived as such, because many programmers are unfamiliar with their syntax, semantics or paradigm. This leads in turn to a chicken-and-egg problem: programmers do not learn this technology because companies are reluctant to adopt it, and companies are reluctant to adopt it because too few programmers who know the technology are readily available.
In this paper, we propose to tackle both problems by means of an API that allows a declarative knowledge base (KB) to be used from within a well-known imperative host language. Our first goal is to integrate KB functionality into the host language as seamlessly as possible. In this way, it should be possible to use the knowledge base to prototype a single component of a large system, without affecting the rest of the code base. The second goal is to have a very low learning curve for the API. We achieve this by using the syntax of the host language as much as possible, and by requiring no more background in declarative languages than an introductory course on classical logic. We therefore expect our API to be immediately usable by, e.g., bachelor students in a typical CS curriculum.
In more detail, we postulate these guidelines for our development of the API:

The interaction between the KB and host language should be done through standard objects of the host language.

The need to learn KB-specific terminology should be kept to a minimum.

It should be as easy as possible to replace the KB by a piece of host language code (or, vice versa, to replace a piece of host language code by a KB).
A typical use for our API will be to offload specific computational problems (e.g., detect connected components in a graph, find a permissible allocation of resources to jobs) to the KB, thereby avoiding the need to implement a specific algorithm and thus arriving at a working prototype more quickly. In such an early prototype, the modular, declarative nature of the KB will be particularly useful, because of its ability to easily cope with additional changes to the specification. Once the program has reached a certain level of maturity, it can of course be profiled to see whether all of the KB components meet the performance requirements. Whenever this is not the case, the KB can be replaced by a dedicated algorithm with minimal impact on the rest of the code.
As a host language, we use Python (in particular, version 2.7). Given our stated goals, this is the most obvious choice: “[a]t the time of writing (July 2014), Python is currently the most popular language for teaching introductory computer science courses at top-ranked U.S. departments.” (footnote 1: http://cacm.acm.org/blogs/blog-cacm/176450-python-is-now-the-most-popular-introductory-teaching-language-at-top-us-universities/fulltext) We assume familiarity with the basics of Python throughout this paper. The KB system that we use will be discussed in Section 2. Section 3 then discusses our interface between host language and KB, which we validate by means of some examples in Section 4. In Section 5, we give some brief notes on the implementation of our API. Finally, Section 6 discusses some related work, in particular, other approaches that integrate a declarative knowledge base into an imperative language.
2 KB system
As an underlying KB system, we will use IDP (Imperative Declarative Programming) (footnote 2: https://dtai.cs.kuleuven.be/software/idp) [bruynooghe14], which combines techniques from SAT solving, Logic Programming and Answer Set Programming (ASP). It has performed well in previous ASP competitions, e.g., narrowly finishing second after Clasp in the System Track of 2011 (footnote 3: https://www.mat.unical.it/aspcomp2011/). IDP has a number of properties that fit well with our goals of achieving both a tight integration with the host language and a low learning curve.
Input language. IDP uses a language that is a conservative extension of classical first-order logic (FO). Because most students of computer science are familiar with FO, this means that the learning curve for a large part of IDP’s input language consists only of learning a particular ASCII syntax for FO. One of the ways in which IDP extends FO is by adding inductive definitions [denecker08]. Because such definitions cannot, in general, be expressed in FO, this is a real extension of the language. Moreover, since inductive datatypes (lists, trees, …) are very common in computer programs, this feature will prove useful in our API.
Inference. In addition to its input language, a second useful property of IDP is that it aims to support a variety of different inference tasks. Particularly useful in the context of this article is the task of finite model expansion. As pointed out in [MitchellT05], model expansion for FO captures the complexity class NP, thereby covering the kind of tasks that we would like to offload to a declarative KB. Moreover, [Tasharrofi] have further demonstrated that model expansion is a key task when using declarative methods to build modular software systems.
2.1 FO: syntax and semantics
We briefly recall the standard syntax and semantics of FO. A vocabulary $\Sigma$ consists of a set of function symbols, each with an associated arity $n \geq 0$, and a set of predicate symbols, also each with an arity $n \geq 0$. A function with arity 0 is called a constant. A term is either a constant, a variable or an expression $f(t_1, \ldots, t_n)$, where $f$ is an $n$-ary function symbol and the $t_i$ are terms. An atom is an expression $P(t_1, \ldots, t_n)$, with $P$ an $n$-ary predicate and the $t_i$ again terms. A formula is either an atom or an expression $\neg\varphi$, $\varphi \land \psi$, $\varphi \lor \psi$, $\forall x\, \varphi$, or $\exists x\, \varphi$, where $\varphi, \psi$ are formulas and $x$ is a variable. As usual, $\varphi \Rightarrow \psi$ abbreviates $\neg\varphi \lor \psi$ and $\varphi \Leftrightarrow \psi$ stands for $(\varphi \Rightarrow \psi) \land (\psi \Rightarrow \varphi)$. A sentence is a formula without free variables and a theory is a finite set of sentences.
The standard semantics of FO is defined in terms of structures for a vocabulary $\Sigma$. Each structure $S$ consists of a domain $D$ and a mapping of:

Each $n$-ary predicate symbol $P$ in $\Sigma$ to an $n$-ary relation $P^S \subseteq D^n$

Each $n$-ary function symbol $f$ in $\Sigma$ to an $n$-ary function $f^S : D^n \to D$
The satisfaction relation $\models$ is defined between structures $S$ for a vocabulary $\Sigma$ and theories $T$ of this vocabulary (or a subvocabulary thereof) by the usual induction. When $S \models T$, we also say that the structure $S$ is a model of $T$.
2.2 FO in the IDP system
Fig. 1: Logical symbols in FO and their counterparts in IDP and Python.

FO    IDP    Python
∧     &      and
∨     |      or
¬     ~      not
⇒     =>     not present
=     =      ==
≠     ~=     !=
∀     !      all
∃     ?      any
IDP uses the standard concepts of vocabularies, theories and structures. Each of these has a specific syntactic representation. For example, a map coloring problem can be described in the following vocabulary V.
vocabulary V {
    type Color
    type Area
    Border(Area,Area)
    Coloring(Area): Color
}
As can be seen here, IDP in fact uses a typed variant of first-order logic. The first two statements define two types (which can be seen as unary predicates), whereas the last two statements define, respectively, a predicate and a function.
The following theory T in vocabulary V consists of a single sentence, expressing that neighboring areas must have a different color. Fig. 1 shows the ASCII symbols that are used in IDP to represent the logical connectives.
theory T : V {
    !a b: Border(a,b) => Coloring(a) ~= Coloring(b).
}
IDP is usually able to automatically derive the types of variables, based on the type declarations in the definition of the vocabulary. This information is important because most of its inference tasks require IDP to first ground (part of) the theory. Instead of depending on IDP’s automatic type derivation, it is also possible to explicitly declare the type of a variable, e.g.:
!a [Area] b [Area]: Border(a,b) => Coloring(a) ~= Coloring(b).
A structure for a vocabulary is represented by an enumeration (1) of the values that belong to each type, (2) of the tuples that belong to each predicate, and (3) of the mapping of tuples to values that is made by each function.
structure S : V {
    Area = { Belgium; Holland; Germany; }
    Color = { Blue; Red; Green; }
    Border = {(Belgium,Holland); (Belgium,Germany); (Holland,Germany)}
}
This structure S for the vocabulary V interprets only part of V. In particular, the function Coloring is not interpreted. One of the inference tasks supported by the IDP system is that of model expansion: given a structure $S$ for a subvocabulary of the vocabulary of a theory $T$, compute a structure $S'$ for the remaining symbols such that $S \cup S' \models T$.
IDP exposes its functionality by means of an API in the Lua scripting language. The command printmodels(modelexpand(T,S)) performs the model expansion task for the above structure S and theory T, resulting in the output:
Coloring = {"Belgium"->"Red";"Germany"->"Blue";"Holland"->"Green"}
By default, a single model expansion is computed, but it is also possible to compute several or all of them. A special case of model expansion occurs when the initial structure $S$ already interprets the entire vocabulary of the theory $T$. In this case, it reduces to checking whether $S \models T$.
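To make the model expansion task concrete, the following stand-alone sketch enumerates candidate interpretations of Coloring in plain Python and keeps those that satisfy the theory. It does not use IDP at all: the names `satisfies_theory` and `model_expand` and the dictionary encoding are assumptions made purely for illustration, and IDP itself relies on grounding and search rather than on exhaustive enumeration.

```python
from itertools import product

# Structure S from the text, encoded with ordinary Python containers.
area = ["Belgium", "Holland", "Germany"]
color = ["Blue", "Red", "Green"]
border = {("Belgium", "Holland"), ("Belgium", "Germany"), ("Holland", "Germany")}


def satisfies_theory(coloring):
    """Theory T: neighbouring areas must have a different color."""
    return all(coloring[a] != coloring[b] for (a, b) in border)


def model_expand(max_models=1):
    """Yield interpretations of Coloring that extend S to a model of T.

    At most max_models interpretations are produced; None means all of them.
    """
    found = 0
    for values in product(color, repeat=len(area)):
        coloring = dict(zip(area, values))
        if satisfies_theory(coloring):
            yield coloring
            found += 1
            if found == max_models:
                return


first_model = next(model_expand())
all_models = list(model_expand(max_models=None))
```

When a candidate interpretation of Coloring is already available, calling `satisfies_theory` on it corresponds to the special case in which model expansion reduces to model checking.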
3 Interfacing with the KB System
This section presents our API for using the IDP KB system from within Python.
3.1 Vocabularies and structures
All interaction with the IDP system is done through objects of the IDP class. Each such object represents a knowledge base consisting of a triple of a vocabulary V, a structure S and a theory T. It can be created as follows:
The following methods add symbols to the vocabulary of the KB:
As in IDP itself, the typed_name of a predicate is a string of the form Foo(Type1, ..., Type2) and that of a function is Foo(Type1, ..., Type2): Return_type. Because constants are identical to 0-ary functions, their typed_name has the form Foo : Type. Once a symbol Foo has been added to the vocabulary of a knowledge base kb, it can thereafter be referred to as kb.Foo.
In addition to declaring a function/predicate symbol (i.e., adding it to the vocabulary of the KB), it is also possible to immediately extend the structure with a particular interpretation for this symbol. This is done by adding this interpretation as a second argument. The interpretation of a type must be a set (or list) of values; that of a constant must be a single value; that of a function with arity greater than 0 must be a mapping (e.g., a dictionary); and that of a predicate must be a set/list of tuples of the correct arity (for predicates with arity 1, a set of simple values is also allowed). Obviously, the typing of the symbols must be respected.
Instead of initialising the interpretation of a symbol upon construction, it is also possible to first declare a symbol and then later use the assignment operator to provide an interpretation for it. We illustrate using (part of) the graph coloring example of Sec. 2.2.
The “logical” objects that are thus created implement a number of common Python interfaces, allowing them to act as Python programmers would expect. A relation is, in mathematical terms, a set of tuples. Its natural counterpart is a Python MutableSet object (footnote 4: https://docs.python.org/2/library/collections.html#collections.Set), i.e., a set which allows adding/removing of elements. The following interactive session demonstrates some standard usages.
In addition to the standard MutableSet functionality, relations are also callable (footnote 5: https://docs.python.org/2/library/functions.html#callable), so that we may also use the standard FO notation for checking membership.
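The combination of set behaviour and callability can be sketched in a few lines of ordinary Python. The class below is a hypothetical stand-in written for this illustration, not the class used by the API; it only shows how a MutableSet subclass can additionally support FO-style membership tests.

```python
try:
    from collections.abc import MutableSet  # Python 3
except ImportError:
    from collections import MutableSet  # Python 2.7


class Relation(MutableSet):
    """Hypothetical relation object: a mutable set of tuples that is callable."""

    def __init__(self, tuples=()):
        self._tuples = set(tuples)

    # The five abstract methods required by MutableSet.
    def __contains__(self, item):
        return item in self._tuples

    def __iter__(self):
        return iter(self._tuples)

    def __len__(self):
        return len(self._tuples)

    def add(self, item):
        self._tuples.add(item)

    def discard(self, item):
        self._tuples.discard(item)

    def __call__(self, *args):
        # Border(a, b) is shorthand for (a, b) in Border.
        return args in self._tuples


border = Relation([("Belgium", "Holland"), ("Belgium", "Germany")])
border.add(("Holland", "Germany"))
```

With this definition, `("Belgium", "Holland") in border` and `border("Belgium", "Holland")` are two spellings of the same membership test.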
A function is a Mapping (footnote 6: https://docs.python.org/2/library/collections.html#collections.Mapping), that is, an object that maps each tuple of values in its domain to a value in its range. Some standard usages are:
As with predicates, function objects are callable to allow for the more FO-like:
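A comparable sketch for functions, again using a hypothetical class that is not part of the API, shows how a read-only Mapping can accept both the dictionary-style and the FO-style notation. Keying unary functions by the bare value (rather than by a 1-tuple) is an assumption of this sketch.

```python
try:
    from collections.abc import Mapping  # Python 3
except ImportError:
    from collections import Mapping  # Python 2.7


class Function(Mapping):
    """Hypothetical function object: a read-only mapping that is callable."""

    def __init__(self, table):
        self._table = dict(table)

    # The three abstract methods required by Mapping.
    def __getitem__(self, key):
        return self._table[key]

    def __iter__(self):
        return iter(self._table)

    def __len__(self):
        return len(self._table)

    def __call__(self, *args):
        # Unary functions are keyed by the bare value, others by a tuple.
        key = args[0] if len(args) == 1 else args
        return self._table[key]


coloring = Function({"Belgium": "Red", "Germany": "Blue", "Holland": "Green"})
```

Here `coloring["Belgium"]` and `coloring("Belgium")` return the same value, and the usual Mapping methods such as `keys()` and `items()` are inherited.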
3.2 Formulas and definitions
In keeping with our goal of achieving a low learning curve, formulas are written in Python syntax. An overview is shown in Fig. 1. The Python language has the standard boolean operators “and”, “or” and “not”. In addition, it also has the functions all and any, which may be applied to lists of boolean values to return the conjunction/disjunction of these values. The latter two functions, together with Python’s list comprehension syntax, can be used as universal/existential quantification. The list comprehension syntax also has an optional if part, which may be used to represent the common pattern of a universally quantified implication:
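all($\varphi$ for x in Type)    corresponds to    $\forall x \in Type: \varphi$

any($\varphi$ for x in Type)    corresponds to    $\exists x \in Type: \varphi$

all($\varphi$ for x in Type if $\psi$)    corresponds to    $\forall x \in Type: \psi \Rightarrow \varphi$
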
In the graph coloring problem, we need to express the following property:
$\forall a, b \in Area: Border(a,b) \Rightarrow Coloring(a) \neq Coloring(b)$    (1)
Section 2.2 already presented the IDP syntax for this. In Python, we can write the same property as:
Note that this is just a normal Python expression, which we can, e.g., just type into the interactive terminal. This expression evaluates to True precisely when property (1) is satisfied. We can make the KB aware of such a constraint by means of its Constraint method, which takes a string as its argument.
So, the following code adds the above constraint to our graph coloring KB:
The string argument is completely identical to the Python expression we saw above, with one exception: the predicates/functions are simply called Coloring, Area and Border, instead of color.Coloring, color.Area and color.Border. This is because, just as a theory can only contain symbols that appear in its vocabulary, the constraints that are added to a KB always refer to the symbols of that KB.
An alternative formula, which is equivalent to the one above, quantifies over pairs of areas and uses an if expression to check for membership in Border:
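As a stand-alone illustration of such expressions, the sketch below evaluates the coloring property in three equivalent ways. It uses plain Python sets and dictionaries in place of the API's objects, which is an assumption of this sketch, as are the variable names.

```python
area = {"Belgium", "Holland", "Germany"}
border = {("Belgium", "Holland"), ("Belgium", "Germany"), ("Holland", "Germany")}
coloring = {"Belgium": "Red", "Germany": "Blue", "Holland": "Green"}

# Variant 1: quantify directly over the tuples in the Border relation.
over_border = all(coloring[a] != coloring[b] for (a, b) in border)

# Variant 2: quantify over all pairs of areas and filter with `if`.
over_area = all(coloring[a] != coloring[b]
                for a in area for b in area if (a, b) in border)

# The `if` clause plays the role of the antecedent of an implication,
# so variant 2 can also be written with an explicit "not ... or ...".
as_implication = all((a, b) not in border or coloring[a] != coloring[b]
                     for a in area for b in area)
```

All three expressions evaluate to True for this coloring, and all three become False as soon as two bordering areas receive the same color.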
3.3 Functional interfacing
In keeping with our goal of making the API easy to use, the programmer does not need to explicitly invoke the IDP system. This avoids the need to learn new functions or new terminology, and reduces the possibility of bugs. Instead, invocation of the IDP system happens “automagically” in the following circumstances:

Symbols that have been declared, but for which no interpretation has been provided, are automatically assigned a valid interpretation (in accordance with FO semantics) when their content is inspected. In other words, IDP is used as an oracle to lazily fill in the interpretation of any declared symbols for which the user does not provide one herself. This is done in such a way that the interpretations of all symbols together constitute an FO model of the constraints, i.e., a model expansion task is performed. If the constraints admit multiple models, one is chosen arbitrarily.

The KB object has an attribute satisfiable, which is automatically set to True/False (depending on whether the KB is satisfiable) when the user converts it to a boolean (either explicitly with bool(.) or by use in an if statement).
In the previous section, we declared a function color.Coloring without adding an interpretation for this function. Therefore, the IDP system will be invoked to compute a coloring of our graph if we execute the following code:
Note that if we had added an interpretation for the function Coloring before executing this code, then this for loop would still continue to work as expected. This is an important property, because it allows us to change whether a relation/function is computed by the KB or by native Python code, without having to adjust the code that makes use of it. In the latter case, the call color.Coloring[x] will just retrieve the precomputed coloring stored within the color object, without invoking IDP.
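The lazy behaviour described here can be imitated in plain Python. In the sketch below, the class `LazyColoring`, its attributes and the brute-force `_solve` method are all hypothetical and stand in for the call to the KB system; the point is only that client code is identical whether the interpretation is computed or supplied.

```python
from itertools import product


class LazyColoring(object):
    """Hypothetical stand-in for a KB symbol that is filled in on first use."""

    def __init__(self, area, color, border, given=None):
        self.area, self.color, self.border = list(area), list(color), set(border)
        self._interpretation = given      # None: not provided by the user
        self._computed = given is not None
        self.solver_calls = 0

    def _valid(self, coloring):
        return all(coloring[a] != coloring[b] for (a, b) in self.border)

    def _solve(self):
        # Brute-force search standing in for the call to the KB system.
        self.solver_calls += 1
        for values in product(self.color, repeat=len(self.area)):
            candidate = dict(zip(self.area, values))
            if self._valid(candidate):
                return candidate
        return None

    def _ensure(self):
        if not self._computed:
            self._interpretation = self._solve()
            self._computed = True
        return self._interpretation

    def __getitem__(self, x):
        interpretation = self._ensure()
        if interpretation is None:
            raise KeyError(x)
        return interpretation[x]

    def __bool__(self):
        # Computed case: satisfiability test; provided case: validity check.
        interpretation = self._ensure()
        return interpretation is not None and self._valid(interpretation)

    __nonzero__ = __bool__  # Python 2.7 spelling


areas = ["Belgium", "Holland", "Germany"]
borders = [("Belgium", "Holland"), ("Belgium", "Germany"), ("Holland", "Germany")]
colors = ["Blue", "Red", "Green"]

computed = LazyColoring(areas, colors, borders)
provided = LazyColoring(areas, colors, borders,
                        given={"Belgium": "Red", "Germany": "Blue", "Holland": "Green"})


def report(coloring):
    # Client code does not know where the interpretation comes from.
    return [(a, coloring[a]) for a in areas]
```

Calling `report(computed)` triggers the search once and reuses the result afterwards, whereas `report(provided)` never triggers it.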
Similarly, if a coloring is not provided, then the following code will test whether a given graph can be colored, whereas if it is provided, the same code checks whether it is indeed valid.
3.4 Inductive definitions
An important feature of the IDP system is its ability to handle inductive definitions. It uses a rule-based syntax for representing such definitions, in which, e.g., the transitive closure of a graph can be defined as follows:
Note that the arrow symbol here is not material implication, but a special symbol that denotes a “case” in an inductive definition. Such an inductive definition is interpreted under the well-founded semantics [vrs91], which in the case of a positive definition (such as the one above) boils down to a least-fixpoint construction. Each rule of such a definition represents a single case in which the defined predicate holds. In our Python API, we use a lambda-expression to represent such a case.
This both declares the predicate TC and defines it in terms of the “parameter” G. For definitions consisting of a single rule, a simpler syntax is also allowed:
Similar to how an argument of kb.Constraint(.) can also be used as a simple boolean Python expression, the above lambda-expression can be used to compute the transitive closure of G by an explicit least-fixpoint computation:
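A minimal sketch of such a least-fixpoint computation in plain Python is given here. The two rules used (TC(x, y) holds if G(x, y), or if G(x, z) and TC(z, y) for some z) are one common formulation of transitive closure and are an assumption of this sketch, as is the function name.

```python
def transitive_closure(edges):
    """Least fixpoint of: TC(x, y) if G(x, y), or G(x, z) and TC(z, y) for some z."""
    edges = set(edges)
    tc = set()
    while True:
        # Apply both rules once to the current approximation of TC.
        derived = edges | {(x, y) for (x, z) in edges for (w, y) in tc if z == w}
        if derived == tc:
            return tc  # nothing new can be derived: fixpoint reached
        tc = derived


G = {(1, 2), (2, 3), (3, 4)}
TC = transitive_closure(G)
```

Because both rules are positive, every iteration can only add tuples, so the loop is guaranteed to stop on a finite graph.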
An advantage of using IDP is that the definition of TC can then not only be used to compute the transitive closure of a given graph, but also to, e.g., compute a graph that would have a given relation as its transitive closure. In addition, IDP not only supports positive inductive definitions, but also non-monotone inductive definitions (such as the standard definition of the satisfaction relation “$\models$” in FO), for which a simple least-fixpoint construction does not work. Non-recursive definitions (which are equivalent to a standard FO equivalence) are also allowed in IDP. In the latter case, we can of course choose whether to use the Define or Predicate method of our API.
IDP can be configured to use XSB Prolog (footnote 7: http://xsb.sourceforge.net/) to speed up certain computations with definitions. We always use this option in the experiments below.
4 Experiments
This section presents two examples of our API, with a particular focus on demonstrating that the integration into the surrounding Python code can be done in a natural way.
4.1 Sudoku
The first example is a Sudoku solver. A Sudoku grid consists of $9 \times 9$ cells.
The grid is divided into rows, columns and nine small squares.
Here, the Python list comprehensions compute an enumeration of these relations, by iterating over all tuples in the Cartesian product of the argument types and checking a certain condition for each tuple. Alternatively, we can make IDP do this work, by Defining the predicates with the appropriate lambda-expressions.
The cells must be filled with integers from 1 to 9. It will be convenient to represent an empty cell by the number 0, leading to the following type Number.
We make use of two functions that map cells to numbers: one records the problem statement and the other its solution. The problem statement comes from a list of numbers, which we convert to a dictionary by means of the zip operation.
The function Sol will be computed by IDP, in accordance with the rules of sudoku.
First, we state the difference constraint on the appropriate cells.
Next, we state that the solution must match the problem statement on all non-empty cells (i.e., those whose value is not 0), and that it should fill in all cells.
With this, the sudoku problem is completely specified. The following code passes the sud.Sol object to a function that pretty-prints the sudoku. Only at the start of the for loop in this function is the solution actually computed.
We remark that, because we use valid Python expressions to assert constraints, we can use the same expressions to check that the output indeed satisfies the constraints. For instance, at the interactive Python terminal:
Peter Norvig has published a sudoku solver written entirely in Python using constraint solving techniques (footnote 8: http://norvig.com/sudoku.html). Not counting whitespace and comments, his code to compute solutions is about 40 lines, whereas the code we have presented in this section is 12 lines. Moreover, it is easy to replace the existing solve function in his code by a call to our knowledge base. This requires two small transformations: first, in Norvig’s code, input grids are given in the format of a single string, where an empty cell is represented by a dot; second, he produces output in the form of a dictionary in which the keys are strings of the form $rc$, with $r$ representing the row and $c$ the column.
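The two transformations can be sketched as follows. This is a hypothetical adapter, not the code used in the experiment: it assumes that cells are identified by 0-based (row, column) pairs on our side, and that the Norvig-style dictionary uses keys such as 'A1' (row letter followed by column digit) with digit strings as values.

```python
ROWS = "ABCDEFGHI"
COLS = "123456789"


def grid_to_numbers(grid):
    """Turn an 81-cell grid string into a list of ints, using 0 for an empty cell."""
    cells = [ch for ch in grid if ch in "0123456789."]
    if len(cells) != 81:
        raise ValueError("expected 81 cells, got %d" % len(cells))
    return [0 if ch == "." else int(ch) for ch in cells]


def solution_to_norvig(solution):
    """Turn {(row, col): number} with 0-based indices into {'A1': '4', ...}."""
    return {ROWS[r] + COLS[c]: str(n) for (r, c), n in solution.items()}


numbers = grid_to_numbers("4" + "." * 79 + "9")
as_dict = solution_to_norvig({(0, 0): 4, (8, 8): 9})
```

Both conversions are linear in the size of the grid, so the adapter adds only a small constant cost around the call to the knowledge base.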
When it comes to runtime, our version is significantly slower than the original on Norvig’s test set, averaging per sudoku versus .
At the other end of the spectrum, we can also compare to a naive generate-and-test approach. Using Python’s powerful itertools library, this can also be implemented in about 10 lines of code. (The code used to test whether a solution is correct can of course use the same syntactical expressions as those which we passed to our API.) However, the runtime of such a program is very poor: for a sudoku with just 5 empty cells (for comparison, a typical sudoku has around 50 to 60), it already takes over a minute to find a solution.
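For reference, a generate-and-test solver along these lines might look as follows. This is a sketch written for illustration, not the program that was timed in the experiment; the row-major list encoding and the function names are assumptions. The number of candidates grows as 9 to the power of the number of empty cells, which is why the approach does not scale.

```python
from itertools import product


def is_valid(grid):
    """grid: list of 81 ints in row-major order; every row, column and box must hold 1..9."""
    digits = set(range(1, 10))
    rows = [grid[9 * r: 9 * r + 9] for r in range(9)]
    cols = [grid[c::9] for c in range(9)]
    boxes = [[grid[9 * (3 * br + r) + 3 * bc + c] for r in range(3) for c in range(3)]
             for br in range(3) for bc in range(3)]
    return all(set(unit) == digits for unit in rows + cols + boxes)


def generate_and_test(puzzle):
    """Try every way of filling the empty (0) cells; return the first valid grid."""
    empty = [i for i, n in enumerate(puzzle) if n == 0]
    for guess in product(range(1, 10), repeat=len(empty)):
        candidate = list(puzzle)
        for i, n in zip(empty, guess):
            candidate[i] = n
        if is_valid(candidate):
            return candidate
    return None


# A valid solved grid built from a standard shifting pattern, with three cells blanked.
solved = [(3 * (r % 3) + r // 3 + c) % 9 + 1 for r in range(9) for c in range(9)]
puzzle = list(solved)
for i in (0, 40, 80):
    puzzle[i] = 0
solution = generate_and_test(puzzle)
```

Note that `is_valid` is an ordinary boolean function built from `all`, in the same style as the constraints passed to the API.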
Our main conclusions from this experiment are that, at least in this case:

Our API can handle, with a limited amount of overhead, the input/output format that a typical Python programmer would use;

Our API can be used to develop useful functionality in significantly fewer lines of code (12 versus 40) than a clever Python implementation. In fact, it takes only as many lines of code as a naive generate-and-test algorithm.

Even though our API is significantly slower than a clever Python algorithm, it still vastly outperforms the naive generate-and-test approach.
4.2 Working with graphs
The following class GraphKB extends the generic IDP Knowledge Base class with some specific functionality for working with undirected graphs. When constructing such a GraphKB, the nodes of the graph can be initialised by means of a given set and the edges by means of an adjacency list. The predicate Edge is Defined as the symmetric closure of the adjacency list. This class also offers a convenience method to define the transitive closure of relations over this graph.
We can now check whether a given adjacency list describes a connected graph:
We can use a similar KB to count the number of connected components in the graph. We do this by selecting a single representative from each component (its “Root”) and then counting the number of these representatives.
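The idea of one representative per component can also be expressed as a dedicated plain-Python algorithm, of the kind that might eventually replace the KB. The sketch below is such a hand-written alternative, not the KB-based program; representing the adjacency list as a dictionary from a node to a list of neighbours, and picking the smallest node as representative, are assumptions of this sketch.

```python
def count_components(nodes, adjacency):
    """Return (number of components, set of representatives) of an undirected graph."""
    # Symmetric closure of the adjacency list, mirroring the Edge predicate.
    neighbours = {n: set() for n in nodes}
    for a, bs in adjacency.items():
        for b in bs:
            neighbours[a].add(b)
            neighbours[b].add(a)

    roots = set()
    seen = set()
    for n in sorted(nodes):
        if n in seen:
            continue
        roots.add(n)  # the first unseen node represents its component
        stack = [n]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(neighbours[current] - seen)
    return len(roots), roots


count, roots = count_components({1, 2, 3, 4, 5, 6}, {1: [2], 2: [3], 4: [5]})
```

Each node and edge is visited a bounded number of times, so the running time is linear in the size of the graph once the nodes have been sorted.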
For a graph with 1000 nodes in 86 connected components, this program takes 23s to count the components. By comparison, the popular NetworkX Python library (footnote 9: https://networkx.github.io/) is two orders of magnitude faster, taking only 0.2s.
In graph theory, an undirected graph is called a tree if it is connected and does not contain cycles. When checking for a cycle in an undirected graph, we of course have to exclude the trivial two-node cycles that would result from traversing the same undirected edge in both directions. This in fact makes it easier to use IDP to check that there is a cycle, than to check that there is not one. The following knowledge base tries to guess the direction in which to traverse each edge in order to produce a cycle. If it is unsatisfiable, there are no cycles.
We can now combine the two knowledge bases to check whether a given adjacency list indeed describes a tree.
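The same combination of two checks can be written directly in Python, which gives a reference against which the KB-based version could be tested. The sketch below is a hand-written alternative with hypothetical function names; it uses a depth-first traversal that ignores the edge back to the parent, which is how the trivial two-node cycles are excluded here.

```python
def symmetric_neighbours(nodes, adjacency):
    """Build node -> set of neighbours from a dict-based adjacency list."""
    neighbours = {n: set() for n in nodes}
    for a, bs in adjacency.items():
        for b in bs:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def is_connected(nodes, neighbours):
    nodes = list(nodes)
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        current = stack.pop()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(nodes)


def has_cycle(nodes, neighbours):
    seen = set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, None)]
        while stack:
            current, parent = stack.pop()
            for nxt in neighbours[current]:
                if nxt == parent:
                    continue  # same undirected edge, traversed backwards
                if nxt in seen:
                    return True  # a second path to an already visited node
                seen.add(nxt)
                stack.append((nxt, current))
    return False


def is_tree(nodes, adjacency):
    neighbours = symmetric_neighbours(nodes, adjacency)
    return is_connected(nodes, neighbours) and not has_cycle(nodes, neighbours)
```

For example, `is_tree({1, 2, 3}, {1: [2], 2: [3]})` holds for a simple path, whereas adding the edge from 3 back to 1 creates a cycle and makes the check fail.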
This example illustrates how additional functionality can be built on top of the KB objects of our API. In addition, the ability to combine the results of calls to different KBs also allows us to implement functionality that would be harder to implement in a single IDP KB.
5 Implementation
The implementation of our API and the examples are available for download (footnote 10: https://bitbucket.org/joostv/pyidp/admin). Interfacing with the IDP system is currently done in a decoupled way: when the API detects that the IDP system needs to be called, it prepares a text file with the appropriate vocabulary, structure and theory, then calls the IDP system as an external process and parses its output. The results of this call are cached, so that the IDP system will not be invoked again until the KB changes.
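The caching and parsing steps can be sketched as follows. This is not the actual implementation: the class `CachedSolver`, the function `parse_function` and the injected `run` callable (which stands in for writing the text file and starting the external process) are hypothetical, and the parser assumes output lines of the form Name = {"key"->"value"; ...}.

```python
class CachedSolver(object):
    """Re-runs the (expensive) external call only when the KB text has changed."""

    def __init__(self, run):
        self._run = run  # callable: KB text -> raw solver output
        self._last_text = None
        self._last_result = None

    def solve(self, vocabulary, structure, theory):
        text = "\n".join([vocabulary, structure, theory])
        if text != self._last_text:
            self._last_result = self._run(text)
            self._last_text = text
        return self._last_result


def parse_function(output_line):
    """Parse one output line into (name, {key: value}); assumes unary functions."""
    name, body = output_line.split("=", 1)
    table = {}
    for pair in body.strip().strip("{}").split(";"):
        if pair.strip():
            key, value = pair.split("->")
            table[key.strip().strip('"')] = value.strip().strip('"')
    return name.strip(), table


calls = []


def fake_idp(text):
    # Stand-in for the external process; records how often it is invoked.
    calls.append(text)
    return 'Coloring = {"Belgium"->"Red";"Germany"->"Blue";"Holland"->"Green"}'


solver = CachedSolver(fake_idp)
kb_parts = ("vocabulary V { }", "structure S : V { }", "theory T : V { }")
first = solver.solve(*kb_parts)
second = solver.solve(*kb_parts)  # served from the cache
```

Comparing the full KB text is the simplest possible invalidation rule: any change to the vocabulary, structure or theory produces a different text and therefore a fresh call.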
6 Related work
There is already a long history of work attempting to close the gap between imperative and declarative programming [apt98]. We briefly compare our approach to some recent work in this area.
In [torlak13], an approach is presented in which a constraint solver is not added to a single host language, but can be used in the development of a domain-specific language in Racket. Like ours, the motivation behind this work is to allow the power of declarative systems to be more widely used. However, their approach differs, because they count on an intermediary (the designer of the domain-specific language) to hide the complexity of the declarative system, whereas our approach focuses on creating an interface that is natural enough to offer KB functionality directly.
In [koksal12], a constraint solver is integrated into the Scala language. As ours does, their approach reuses the syntax of the host language to interface with the declarative system. A key difference is that, in their approach, the programmer is explicitly manipulating, combining and solving constraints, which makes the constraint solver more present in the eventual source code. A second difference is of course that Scala currently appears to be less widely known than Python.
In [Milicevic11], a reasoner for FO extended with transitive closure is integrated into Java. Their KB language is therefore very similar to (but more restricted than) that of IDP. When it comes to the integration in Java, there are two main differences to our approach. First, the declarative knowledge is not written in expressions in the host language, but in a separate language (the Alloy-like JFSL [yessenov09]). Second, the integration into Java is done in an object-oriented way: the programmer defines classes in which formulas are added as, among others, class invariants, method pre/postconditions and frame conditions. In comparison, our Python API seems more lightweight, since it does not require an object-oriented approach. When it comes to computational performance, [Milicevic11] reports good results, which our implementation is not able to match.
In summary, our approach fills the niche of an easy-to-learn quick prototyping API that, due to Python’s current popularity, may speak to a large audience.
7 Conclusions and future work
When prototyping an application, a programmer may encounter a computational subproblem for which it would be cumbersome to develop a specific algorithm. The aim of our API is to allow such gaps to be declaratively stopped with as little effort as possible. As we have seen, our API might allow a feasible solution to be produced in only as many lines of code as an (infeasible) naive generate-and-test algorithm. Our use of standard Python objects such as sets and mappings means that no elaborate setup code is required to plug the KB into an existing code base, while our use of standard Python expressions for constraints and definitions leads to a low learning curve. In addition, both these properties also make it easier to eventually remove the KB if a more efficient solution is required: the same KB that first generated the solution can later be used to check its correctness, or its constraints may simply be recuperated in the form of Python assert statements.
To prevent changing/removing the KB from leading to code changes elsewhere, our API makes all calls to the IDP system automatically, whenever they are needed. This has the additional benefit of simplifying the API and not forcing the programmer to learn new terminology. A downside is that it is harder for the programmer to keep track of what is happening when in the program.
Our current implementation of the API is naive in its interfacing with the IDP system, which happens by passing text files (built each time from scratch) to an external process. A better integration, which exploits the Lua interface of IDP, might offer a significant reduction in runtimes. However, since we mainly intend our API to be used in prototyping, this issue might not be pressing. Another consequence, which may be more severe, is that programs written in our API are currently hard to debug: it may be necessary to manually inspect the text file that was passed to the IDP system (in debug mode, the API always sends this to standard output). However, this requires the user to be at least somewhat familiar with IDP input syntax, which is something we aimed to avoid.
Our validation of the API currently consists only of examples that we have implemented ourselves. A better test would involve Python programmers who have no knowledge of IDP or indeed any declarative system. However, better debugging facilities seem necessary for such a trial to be successful.