## 1 Introduction

Answering conjunctive queries (CQs) over OWL 2 DL ontologies is
computationally hard [GHLS08a, DBLP:conf/cade/Lutz08], yet it is a key problem in
many applications. Thus, considerable effort has been devoted to the
development of OWL 2 DL fragments for which query answering is tractable in
*data complexity*, which is measured in the size of the data only. Most
languages obtained in this way are *Horn*: ontologies in such languages
can always be translated into first-order Horn clauses. This includes the
families of ‘lightweight’ languages such as DL-Lite [CDLLR07b],
EL [DBLP:conf/ijcai/BaaderBL05], and DLP
[DBLP:conf/www/GrosofHVD03] that underpin the QL, EL, and RL profiles of
OWL 2, respectively, as well as more expressive languages, such as
Horn- [DBLP:conf/ijcai/HustadtMS05] and
Horn- [DBLP:conf/ijcai/OrtizRS11].

Query answering can sometimes be implemented via query rewriting: a rewriting of a query w.r.t. an ontology is another query that captures all information from the ontology necessary to answer the original query over an arbitrary data set. Unions of conjunctive queries (UCQs) and datalog are common target languages for query rewriting. They ensure tractability w.r.t. data complexity, while enabling the reuse of optimised data management systems: UCQs can be answered using relational databases [CDLLR07b], and datalog queries can be answered using rule-based systems such as OWLim [bishop2011owlim] and Oracle’s Semantic Data Store [Wu08]. Query rewriting algorithms have so far been developed mainly for Horn fragments of OWL 2 DL, and they have been implemented in systems such as QuOnto [DBLP:conf/aaai/AcciarriCGLLPR05], Rapid [Chortaras11], Presto [DBLP:conf/kr/RosatiA10], Quest [DBLP:conf/kr/Rodriguez-MuroC12], Clipper [DBLP:conf/aaai/EiterOSTX12], Owlgres [DBLP:conf/owled/StockerS08], and Requiem [Hector10a].

Horn fragments of OWL 2 DL cannot capture *disjunctive knowledge*, such
as ‘every student is either an undergraduate or a graduate’. Such knowledge
occurs in practice in ontologies such as the NCI Thesaurus and the
Foundational Model of Anatomy, so these ontologies cannot be processed using
known rewriting techniques; furthermore, no query answering technique we are
aware of is tractable w.r.t. data complexity when applied to such ontologies.
These limitations cannot be easily overcome: query answering in even the basic
non-Horn language is coNP-hard w.r.t. data complexity
[DBLP:conf/lpar/KrisnadhiL07], and since answering datalog queries is
PTIME-complete, it may not be possible to rewrite an arbitrary
ontology into datalog unless PTIME = NP. Furthermore,
[Lutz:2012ug] showed that tractability w.r.t. data complexity cannot be
achieved for an arbitrary non-Horn ontology with ‘real’ disjunctions: for
each such ontology, a query exists such that answering it w.r.t. the ontology is
coNP-hard.
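
To make the notion of a datalog rewriting concrete, the following minimal sketch evaluates a small datalog program bottom-up and then answers a ground query over the result. Everything in it is an illustrative assumption rather than material from this paper: the rule `Enrolled(?x) :- Student(?x)` stands for a hypothetical rewriting of an ontology stating that every student is an undergraduate or a graduate and that both kinds of student are enrolled, and the helper names `match`, `evaluate`, and `answer` are made up for the example. The evaluation strategy is plain naive fixpoint iteration, not the strategy of any particular system.

```python
# Atoms are tuples: (predicate, term, ...). Terms starting with "?" are variables.

def is_var(term):
    return term.startswith("?")

def match(atom, fact, subst):
    """Extend subst so that atom equals fact, or return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if subst.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return subst

def all_matches(atoms, facts):
    """All substitutions mapping every atom in the conjunction to some fact."""
    substs = [{}]
    for atom in atoms:
        substs = [s2 for s in substs for f in facts
                  if (s2 := match(atom, f, s)) is not None]
    return substs

def evaluate(rules, facts):
    """Naive bottom-up evaluation: apply all rules until nothing new is derived."""
    facts = set(facts)
    while True:
        new = set()
        for head, body in rules:
            for s in all_matches(body, facts):
                derived = (head[0],) + tuple(s.get(t, t) for t in head[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

def answer(query, answer_vars, facts):
    """Answers to a ground query: every variable is mapped to a constant."""
    return {tuple(s[v] for v in answer_vars) for s in all_matches(query, facts)}

# Hypothetical rewriting and data set (assumptions for illustration only).
RULES = [(("Enrolled", "?x"), [("Student", "?x")])]
DATA = {("Student", "alice"), ("advises", "bob", "alice")}
QUERY = [("Enrolled", "?x"), ("advises", "?y", "?x")]

if __name__ == "__main__":
    print(answer(QUERY, ("?x", "?y"), evaluate(RULES, DATA)))
```

The point of the sketch is that, once a datalog rewriting is available, answering a ground query reduces to a fixpoint computation followed by matching against the derived facts, with no case analysis over the disjunction at query time.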

The result by [Lutz:2012ug], however, depends on an interaction between
existentially quantified variables in the query and disjunctions in the ontology. Motivated
by this observation, we consider the problem of computing datalog rewritings
of *ground* queries (i.e., queries whose answers must map all the
variables in the query to constants) over non-Horn ontologies. Apart from allowing
us to overcome the negative result by [Lutz:2012ug], this also allows us
to compute a rewriting of the ontology that can be used to answer an arbitrary ground
query. Such queries form the basis of SPARQL, which makes our results
practically relevant. We summarise our results as follows.

In Section LABEL:sec:NegativeResults, we revisit the limits of datalog rewritability for a language as a whole and show that non-rewritability of ontologies does not depend on any complexity-theoretic assumptions. More precisely, we present an ontology for which query answering cannot be decided by a family of monotone circuits of polynomial size; a datalog rewriting of this ontology would therefore contradict the result by [Afrati:1995un], who proved that fact entailment in a fixed datalog program can be decided using monotone circuits of polynomial size. Thus, instead of relying on complexity arguments, we compare the lengths of proofs in the ontology language and in datalog and show that the former may be considerably longer than the latter.

In Section LABEL:sec:DatalogRewritings, we present a three-step procedure that
takes a -ontology and attempts to rewrite it into a datalog
program. First, we use a novel technique to rewrite the ontology into a TBox
without transitivity axioms while preserving entailment of
*all* ground atoms; this is in contrast to the standard techniques (see,
e.g., [hms07reasoning]), which preserve entailments only of unary facts
and binary facts with roles not having transitive subroles. Second, we use the
algorithm by [hms07reasoning] to rewrite the resulting TBox into a
*disjunctive datalog program*. Third, we adapt the
knowledge compilation technique by [DBLP:journals/ai/Val05] and
[selman1996knowledge] to transform this disjunctive program into a datalog
program. The final step is not guaranteed to terminate in general; however, if
it terminates, the resulting program is a rewriting of the original ontology.
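
The idea behind the third step can be sketched in a heavily simplified, propositional setting. The code below is not the procedure of this paper and not the exact technique of the cited works: it is a minimal illustration, under the assumption that every rule mentions a single variable so that rules can be treated as propositional clauses. It saturates a set of clauses under binary resolution and keeps only the clauses with exactly one positive literal, which correspond to datalog rules. The clause set and the names `resolve`, `saturate`, and `horn_rules` are hypothetical. In this propositional setting saturation always terminates because only finitely many clauses exist; no such guarantee holds for the first-order case discussed above.

```python
from itertools import combinations

# A literal is (is_positive, atom_name); a clause is a frozenset of literals.
# The rule  Student -> UG v Grad  becomes {not Student, UG, Grad}.

def resolve(c1, c2):
    """Yield every non-tautological binary resolvent of two clauses."""
    for sign, atom in c1:
        if (not sign, atom) in c2:
            resolvent = (c1 - {(sign, atom)}) | (c2 - {(not sign, atom)})
            # Skip tautologies, i.e. clauses containing both A and not A.
            if not any((not s, a) in resolvent for s, a in resolvent):
                yield frozenset(resolvent)

def saturate(clauses):
    """Close a clause set under binary resolution (finite in this setting)."""
    derived = set(clauses)
    while True:
        new = {r for c1, c2 in combinations(derived, 2)
               for r in resolve(c1, c2) if r not in derived}
        if not new:
            return derived
        derived |= new

def horn_rules(clauses):
    """Keep clauses with exactly one positive literal (datalog-style rules)."""
    return {c for c in clauses if sum(1 for sign, _ in c if sign) == 1}

# Hypothetical disjunctive program (an assumption for illustration only).
CLAUSES = {
    frozenset({(False, "Student"), (True, "UG"), (True, "Grad")}),
    frozenset({(False, "UG"), (True, "Enrolled")}),
    frozenset({(False, "Grad"), (True, "Enrolled")}),
}

if __name__ == "__main__":
    for rule in horn_rules(saturate(CLAUSES)):
        body = sorted(a for sign, a in rule if not sign)
        head = next(a for sign, a in rule if sign)
        print(f"{head} :- {', '.join(body)}")
```

Among the rules retained here is `Enrolled :- Student`, which is derived from the disjunctive rule by resolving away both disjuncts; the disjunctive rule itself is discarded because it has two positive literals.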

In Section LABEL:sec:Termination, we show that our procedure always terminates if the input is a -ontology, that is, an ontology in a practically relevant language that extends OWL 2 QL with transitive roles and Boolean connectives. [Artale09thedl-lite] proved that the data complexity of concept queries in this language is tractable (i.e., -complete). We extend this result to all ground queries and thus obtain a goal-oriented rewriting algorithm that may be suitable for practical use.

Our technique, as well as most rewriting techniques known in the literature,
is based on a sound inference system and thus produces only *strong
rewritings*, that is, rewritings entailed by the original ontology. In
Section LABEL:sec:LimitsStrong we show that non-Horn ontologies exist that can
be rewritten into datalog, but that have no strong rewritings. This highlights
the limits of techniques based on sound inferences. It is also surprising
since all rewriting techniques for Horn fragments of OWL 2 DL known to
us produce only strong rewritings.

The proofs of all of our technical results are given in