1 Introduction
Answer Set Programming (ASP) [Brewka et al. (2011)] is a logic-based declarative modeling language and problem solving framework [Gebser et al. (2012)]
for hard computational problems and an active research area in artificial intelligence (AI) and knowledge representation and reasoning. It has been applied both in academia
[Balduccini et al. (2006), Gebser et al. (2010), Gebser et al. (2011)] and industry [Gebser et al. (2011), Guziolowski et al. (2013), Ricca et al. (2012)]. In propositional ASP, questions are encoded by atoms combined into rules and constraints, which form a logic program. Solutions to the program consist of sets of atoms called answer sets; if no solutions exist, then the program is inconsistent.
Knowledge representation languages like ASP are usually considered explainable AI, as they are based on deduction, which is an explainable procedure. For example, we can easily explain answer sets of a normal logic program in terms of program reducts and fixpoint operators [Liu et al. (2010)]. In this case, we may argue that answer sets are self-explanatory, and therefore ASP systems providing answer sets are explainable AI systems. However, modern ASP systems do not provide any explanation for inconsistent programs; there is no witness that can be checked or evidence of the correctness of the refutation of the input program. Hence, even though inconsistency of logic programs can in principle be explained with mathematical rigour, ASP systems are essentially black boxes in this case and just report the absence of answer sets. We believe that adding inconsistency proofs to ASP systems is important to make them explainable for inconsistent programs, and that it also provides auditability for consistent programs. Thanks to a duality result in the literature [Pearce (1999)], such inconsistency proofs for ASP can also be used as a certificate for the validity of some formulas of intuitionistic logic and other intermediate logics. A further application of these inconsistency proofs is query answering in ASP, which is usually achieved by inconsistency checks. There, the goal is to provide a witness for cautiously true answers of a given query.
Modern ASP solvers have been highly influenced by SAT solvers, which solve the Boolean satisfiability problem and are often based on conflict-driven clause learning [Silva and Sakallah (1999)]. Typically, ASP solvers aim at computing an answer set of a given program, and therefore solve the consistency problem, which asks whether a given program has an answer set. This problem is on the second level of the polynomial hierarchy when allowing arbitrary propositional disjunctive programs as input and on the first level when restricting to disjunction-free programs [Truszczyński (2011)]. As already stated, while consistency of a program can be easily verified given such a computed answer set, verifying whether a solver correctly reported that a program is inconsistent is not immediate. Given that ASP solvers are also used for critical applications [Gebser et al. (2018), Haubelt et al. (2018)], their correctness is of utmost importance.
When looking at SAT solvers, various techniques have been developed to ensure correctness of unsatisfiability results, such as clausal proof variants [Gelder (2008), Goldberg and Novikov (2003)] based on clauses that have the RUP (reverse unit propagation) and RAT (resolution asymmetric tautology) properties. These proof formats share verifiability in polynomial time in the size of the proof and input formula and can be tightly coupled with modern solving techniques. A solver outputs such a proof during solving. Thereby, the correctness of solving can be verified by a relatively simple method for every input instance. While there are variants of these proofs for various problems, such as extensions to verify the validity of quantified Boolean formulas [Heule et al. (2013), Wetzler et al. (2014)] (QRAT [Heule et al. (2014)] and QRAT+ [Lonsing and Egly (2018)]), to our knowledge such a format is not yet available for verifying ASP solvers. One approach to certify inconsistency of a given normal program is to translate the program into a SAT formula in polynomial time [Lin and Zhao (2003), Janhunen (2006)] and obtain a proof from a SAT solver, e.g., via a RAT-based proof format, to verify that the program is indeed inconsistent. Unfortunately, this approach does not take into account the techniques that are employed by state-of-the-art ASP solvers and therefore seems to lack efficiency and scalability. Further, it is still not a suitable technique for disjunctive programs, nor for verifying whether the ASP solver is internally able to correctly explain the obtained result.
We follow this line of research and establish the following novel results for ASP:

We present the proof format ASPDRUPE based on RUP for logic programs given in SModels [Syrjänen (2000)] or ASPIF [Gebser et al. (2016)] (restricted to ASP without theory reasoning) input format including disjunctive programs and show its correctness.

We provide an algorithm for verifying that a given solving trace in the ASPDRUPE format is indeed a valid proof for inconsistency of the input program. This algorithm works in polynomial time in the size of the given solving trace.

We illustrate on an abstract ASP solving algorithm how one can integrate ASPDRUPE into state-of-the-art ASP solvers like clasp [Gebser et al. (2012)] and wasp [Alviano et al. (2015)].

We provide an implementation in a variant of wasp, where ASPDRUPE is integrated for normal ASP. This variant of wasp not only explains inconsistency for inconsistent logic programs, but also provides auditability in case of consistency, allowing one to verify whether the provided answer set was indeed correctly obtained.
Related Work.
Heule et al. [Heule et al. (2013)] presented a proof format based on the RAT property and subsequently a program to validate solving traces in this format [Wetzler et al. (2014)]. Extended resolution can polynomially simulate the DRAT format [Kiesl et al. (2018)] and vice versa [Wetzler et al. (2014)]. Many advanced techniques, such as XOR reasoning [Philipp and Rebola-Pardo (2016)] as well as symmetry breaking [Heule et al. (2015)], can be expressed in DRAT, and efficient, verified checkers based on RAT have been developed [Cruz-Filipe et al. (2017)]. Further, RAT is also available for QBF [Heule et al. (2014)] and has been extended to cover a more powerful redundancy property [Lonsing and Egly (2018)].
2 Preliminaries
2.1 Answer Set Programming (ASP)
We follow standard definitions of propositional ASP [Brewka et al. (2011)] and use rules defined by the SModels [Syrjänen (2000)] or ASPIF [Gebser et al. (2016)] (restricted to ASP without theory reasoning) input format, which is widely supported by modern ASP solvers. In particular, let , , be nonnegative integers such that , , , propositional atoms, , , , nonnegative integers. A choice rule is an expression of the form , a disjunctive rule is of the form , and a weight rule is of the form , where . A rule is either a disjunctive, a choice, or a weight rule. A (disjunctive logic) program is a finite set of rules. For a rule , we let , , , and is a set of literals, i.e., an atom or the negation thereof. We denote the sets of atoms occurring in a rule or in a program by and . For a weight rule , let map literal to its weight in rule if for , or if for , and to otherwise. Further, let for a set of literals and let be its bound. A normal (logic) program is a disjunctive program with for every . The positive dependency digraph of is the digraph defined on the set of atoms, where for every rule two atoms and are joined by an edge . We denote the set of all cycles (loops) in by . A program is called tight, if . While we allow programs with loops that might also involve atoms of weight rules, we consider weight rules only as a compact representation of a set of normal rules, similar to the definition of stable models in related work [Bomanson et al. (2016)]. In other words, we do not consider advanced semantics concerning recursive weight rules (recursive aggregates). In case of solvers with different semantics, one can restrict the input language to disregard recursive weight rules, which is also in accordance with the latest ASP-Core-2.03c standard [Calimeri et al. (2015)]. This restriction is motivated by a lack of consensus on the interpretation of recursive weight rules [Ferraris (2011), Faber et al. (2011), Gelfond and Zhang (2014), Pelov et al. (2007), Son and Pontelli (2007)].
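The notions of positive dependency digraph and tightness can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: rules are given as hypothetical triples `(head, positive_body, negative_body)` of atom sets, weight rules are ignored, and edges point from head atoms to positive body atoms. The choice of edge direction does not affect the result, since reversing all edges of a digraph preserves its cycles. The names `positive_dependency_graph` and `is_tight` are illustrative only.

```python
from collections import defaultdict


def positive_dependency_graph(rules):
    """Build a digraph with an edge head_atom -> positive_body_atom per rule.

    `rules` is an iterable of (head, positive_body, negative_body) triples,
    each component being a set of atom names (an assumed representation).
    """
    graph = defaultdict(set)
    for head, pos_body, _neg_body in rules:
        for a in head:
            graph[a]  # make sure the head atom is a node
            for b in pos_body:
                graph[a].add(b)
                graph[b]  # make sure the body atom is a node
    return graph


def is_tight(rules):
    """Return True if the positive dependency digraph has no cycle."""
    graph = positive_dependency_graph(rules)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def has_cycle(v):
        # Depth-first search; a GREY successor closes a cycle.
        color[v] = GREY
        for w in graph[v]:
            if color[w] == GREY:
                return True
            if color[w] == WHITE and has_cycle(w):
                return True
        color[v] = BLACK
        return False

    return not any(color[v] == WHITE and has_cycle(v) for v in list(graph))
```

For instance, the two rules "a if b" and "b if a" form a positive loop and are not tight, whereas "a if not b" and "b if not a" depend on each other only negatively and are tight.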
2.2 Solving Logic Programs
Let be a given program, be a rule, and . We define the set of induced bodies with in the head by the singleton if is a choice rule, by if is a disjunctive rule, and by the union over for every (subset-minimal) set of literals such that , if is a weight rule. This allows us to define , and . A variable assignment is either or , where variable is either an atom, or an induced body, or a fresh atom that does not occur in . For a variable assignment , is the complementary variable assignment of , i.e., if and if . An assignment is a set of variable assignments, where , , and such that . For a set of literals, we define the induced assignment . A nogood is an assignment that is not allowed, where refers to the empty nogood. Given a set of nogoods, we define the least fixpoint of unit propagated nogoods by the fixpoint computation , and for . Nogood is a consequence using unit propagation (UP) of set , denoted by , if . An assignment satisfies a set of nogoods (written ) if for every , we have . Set of nogoods is a consequence of a set of nogoods (denoted by ) if every assignment that contains a variable assignment for all variables in and satisfies also satisfies . The set of completion nogoods [Clark (1977), Gebser et al. (2012)] is defined by , where
Note that in practice, current ASP solvers do not fully compute . Instead, these solvers partially compute and add relevant nogoods lazily during solving [Alviano et al. (2018)].
Then, if is tight the set is an answer set if and only if there is a satisfying assignment for [Fages (1994), Gebser et al. (2012)]. The set of external bodies of program and set of atoms are given by [Gebser et al. (2012)]. We define the loop nogood for an atom on a loop by . For a logic program , the set is an answer set if and only if there is a satisfying assignment for , where [Lin and Zhao (2003), Faber (2005)].
3 ASPDRUPE: RUP-like Format for Proof Logging
Inspired by RUP-style unsatisfiability proofs in the field of Boolean satisfiability solving [Goldberg and Novikov (2003)], we aim for a proof of inconsistency of a program. Since modern ASP solvers use Clark’s completion [Clark (1977)] to transform a program into a set of nogoods, we do so as well. The proof we aim for then has the following features:

Existence of a simple verification algorithm. In order to increase confidence in the correctness of results, the algorithm that verifies the proof has to be fairly easy to understand and to implement.

Low complexity. The proof is verifiable in polynomial time in its length and the size of the completion nogoods.

Integrability into solvers that employ Conflict-Driven Nogood Learning (CDNL). The proof can be output stepwise during solving with minimal impact on the solving algorithm and hence the solver.
The method works as follows: We run the solver on the set of completion nogoods for given input program . The solver outputs either an answer set or that has no answer set and a proof . We pass together with to the verifier in order to validate whether the solver’s assessment is in accordance with its outputted proof.
3.1 The Proof Format for Logic Programs
The basic idea of clausal proofs for SAT is the following: One starts with the input formula in CNF (given as a set of clauses). Every step of the proof denotes an addition or deletion of a clause to/from the set of clauses. For additions, the condition is that only clauses are added that are a logical consequence of the current set of clauses and that this can be checked easily, e.g., by using only unit propagation.
For our format ASPDRUPE, we consider Clark’s completion as the initial set of nogoods corresponding to the input program . Besides addition and deletion of nogoods, we need proof steps that model how the solver excludes unfounded sets (loops).
Example 1
Consider program , which is inconsistent. contains only the positive loop , whose external support is given by the set of rules, and thus . Set induces two possible loop nogoods, and .
We describe the proof format ASPDRUPE for logic programs and adapt the RUP property [Goldberg and Novikov (2003)] to nogoods as follows.
Definition 1 (nogood RUP)
Let be a set of nogoods. Then, a nogood is RUP (reverse unit propagable) for if , i.e., we can derive using only unit propagation.
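The RUP condition for nogoods can be made concrete with a small Python sketch. It uses an assignment-based formulation of unit propagation, which is an assumption of this sketch rather than the fixpoint definition used in the text: a variable assignment is a hypothetical pair `(variable, truth_value)`, a nogood is a set of such pairs, and a nogood is violated once all of its members are assigned. To test whether a nogood is RUP, all of its variable assignments are assumed and unit propagation must run into a violated nogood. The function names are illustrative.

```python
def complement(lit):
    """Return the complementary variable assignment of (variable, value)."""
    var, value = lit
    return (var, not value)


def propagate(nogoods, assignment):
    """Run unit propagation; return (conflict_found, resulting_assignment)."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for nogood in nogoods:
            unassigned = [lit for lit in nogood if lit not in assignment]
            if not unassigned:
                # Every member of the nogood holds: the nogood is violated.
                return True, assignment
            if len(unassigned) == 1:
                # All but one member hold, so the last one must not hold.
                forced = complement(unassigned[0])
                if forced not in assignment:
                    assignment.add(forced)
                    changed = True
    return False, assignment


def is_rup(nogoods, candidate):
    """Check whether `candidate` follows from `nogoods` by unit propagation."""
    conflict, _ = propagate(nogoods, candidate)
    return conflict
```

As a usage example, with the nogoods `{("a", True), ("b", False)}` and `{("b", True), ("c", False)}`, the candidate `{("a", True), ("c", False)}` passes the check, whereas `{("a", True)}` does not.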
A proof step is a triple , where denotes the type of the step, is an assignment, and is an atom or . The type indicates whether the step is an addition (), a completion rule addition (), a completion support addition (), an extension (), a deletion (), or a loop addition (). A proof sequence for a logic program is a finite sequence of proof steps. Initially, a proof sequence is associated with a set of nogoods. Note that although the set might be exponential in the size of the program , body definitions for body variables that do not occur in the proof are never materialized. Then, each proof step for subsequently transforms into the induced set of nogoods, formally defined below. An ASPDRUPE derivation is a proof sequence that allows for RUP-like rules for ASP and includes both deletion and extension. In an ASPDRUPE derivation each step has to satisfy a condition depending on its type as follows:

An addition inserts a nogood that is RUP for .

A completion rule addition inserts a nogood .

A completion support addition inserts a nogood if .

An extension introduces a definition that renders nogood equivalent to a fresh atom , i.e., does not appear in . Formally, this rule represents the set of extension nogoods.

A deletion represents the deletion of from .

A loop addition inserts a loop nogood for a loop . (Footnote 1: There could be an exponential number of external bodies involving weight rules. However, both clasp and wasp treat weight rules differently [Alviano et al. (2015)]. Alternatively, one could easily modify the loop addition type to list also involved external bodies (as in the completion support addition type), which we did not for the sake of readability.)
Given an ASPDRUPE derivation , we define the set of nogoods induced by step as the result of applying proof steps to the initial completion body definitions for . For our inductive definition in the following, we use multiset semantics for additions and deletions of nogoods, and write for the multiset sum.
Then, we say that an ASPDRUPE derivation is an ASPDRUPE proof for the inconsistency of if it actually derives inconsistency for , formally, . Note that might be exponential in the input program size, in the worst case. However, there is no need to materialize the set , as, intuitively, this set of body definitions only ensures that every induced body has a reserved auxiliary atom that can be used to “address” the body in a compact way. In an actual implementation of a solver that uses ASPDRUPE, one needs to specify these used auxiliary atoms anyway, cf. Section 5, where implementational specifications of ASPDRUPE are described.
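The multiset semantics for additions and deletions can be sketched in Python as follows. This is a simplified illustration under stated assumptions: only two step kinds are modelled, with the hypothetical tags `"add"` and `"delete"`; the side conditions of the individual step types are not checked; and deleting a nogood that is not present is treated as a no-op, which the text does not specify. A derivation is taken to show inconsistency when the empty nogood is contained in the final multiset.

```python
from collections import Counter

EMPTY_NOGOOD = frozenset()


def induced_nogoods(initial, steps):
    """Apply addition and deletion steps to a multiset of nogoods.

    `initial` is an iterable of nogoods (sets of variable assignments) and
    `steps` is a sequence of (kind, nogood) pairs with kind in
    {"add", "delete"}; both tags are assumptions of this sketch.
    """
    current = Counter(frozenset(nogood) for nogood in initial)
    for kind, nogood in steps:
        key = frozenset(nogood)
        if kind == "add":
            current[key] += 1  # multiset sum with a single nogood
        elif kind == "delete":
            if current[key] > 0:  # assumed: deleting an absent nogood is a no-op
                current[key] -= 1
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return +current  # drop entries whose count fell to zero


def derives_inconsistency(nogoods):
    """Return True if the empty nogood is contained in the multiset."""
    return nogoods[EMPTY_NOGOOD] > 0
```

Because counts are tracked per nogood, adding a nogood twice and deleting it once leaves one copy in place, which a plain set representation would not capture.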
Example 2
Consider program from Example 1 and loop , which induces loop nogood . Then, the proof sequence is an ASPDRUPE proof for the inconsistency of with
We show that the proof step is correct, i.e., that is RUP for . To this end, we need to derive from by unit propagation. With the nogood , we derive the unit nogood . With we now get . With these unit nogoods, reduces to .
3.2 Correctness of ASPDRUPE
Next, we establish soundness and completeness of the ASPDRUPE format.
Lemma 1 (Invariants)
Let be a logic program and be a finite ASPDRUPE derivation for program . Moreover, let be the accumulated set of nogoods introduced by the extension rules in for all . Then, the following holds: .
[Proof (Sketch).] We proceed by induction over the length of the derivation. For the base case, we have . Hence, and the claim holds trivially. For the induction step, we assume that the statement holds for length and consider step . It remains to do a case distinction for the type:

Deletion with : Immediately, we have . Thus, transitivity of and the induction hypothesis establishes this case.

Addition with : Since is RUP for , we know that is a logical consequence of . The remaining steps to draw the conclusion are similar to the deletion step case.

Completion Rule Addition with : Since the resulting nogood is contained in , the result follows.

Completion Support Addition with : Since the resulting nogood is contained in , the result follows.

Extension with : According to the induction hypothesis we have . Then, , since is a fresh variable and . As is monotone, and , we know that . It then follows that .

Loop addition with : By definition nogood is already contained in , which immediately establishes this case.
Theorem 1 (Soundness and Completeness)
Let be a logic program. Then, is inconsistent if and only if there is an ASPDRUPE proof for .
[Proof (Sketch).] Let be a logic program. “”: Assume there is an ASPDRUPE proof of . By definition, there is a finite sequence of proof steps such that and is inconsistent. From Lemma 1, we obtain that is inconsistent. As consists of extension nogoods with disjoint variables, we know that is inconsistent. We conclude from an earlier result [Gebser et al. (2012), Theorem 5.4] that is inconsistent. “”: Suppose is inconsistent. According to earlier work [Gebser et al. (2012), Theorem 5.4], we know that is inconsistent. RUP is complete [Gelder (2008), Goldberg and Novikov (2003)], which means that for every propositional, unsatisfiable formula there is a RUP proof for . Hence, we can construct an ASPDRUPE proof for as follows: (i) Output all completion rule additions for and completion support additions for . (ii) Generate loop addition steps for all loops . (iii) Transform into a propositional formula by inverting all nogoods. (iv) Construct and use a RUP proof for . Then, output addition rules accordingly, where again all clauses need to be inverted to obtain addition proof steps using nogoods.
Note that in the only-if direction of the proof, one can also use RAT [Wetzler et al. (2014)] proofs without deletion information and afterwards translate RAT steps into extended resolution steps [Kiesl et al. (2018)].
Listing 1 presents the ASPDRUPE checker, which decides whether a given ASPDRUPE proof is correct. The input to the checker is both the original program and the proof . To check the proof, we encode into nogoods and then check each statement sequentially.
Lemma 2
For a given logic program and an ASPDRUPE derivation , the ASPDRUPEChecker runs in time at most .
Corollary 1
Given a logic program and an ASPDRUPE derivation , the ASPDRUPEChecker is correct, i.e., it outputs Success if and only if is an ASPDRUPE proof for the inconsistency of .
3.3 Extension to Optimization
Next, we briefly mention how to verify cost optimization. To this end, an optimization rule is an expression of the form , where is a literal. Intuitively, this indicates that if an assignment satisfies , then this results in costs . Overall, one aims to minimize the total costs, i.e., the goal is to deliver an answer set of minimal total costs. Therefore, if one wants to verify whether a given answer set candidate is indeed an answer set of minimal costs, we foresee the following extension to ASPDRUPE, where such a proof consists of the following two parts. (i) An answer set that shows a solution with costs exists. (ii) An ASPDRUPE proof that shows that the program restricted to costs is inconsistent. Note that for disjunctive programs, the first part already needs to contain a second proof showing that there cannot be an unfounded set for the provided answer set. Further, it is not immediate how this extends to unsatisfiable cores. Hence, so far it only applies to progression-based approaches.
4 Integrating ASPDRUPE Proofs into a Solver
In the following, we describe the CDNL-ASP algorithm for logic programs that we use as a basis for our theoretical model. Afterwards, we describe how proof logging can be integrated. In other words, during the run of an ASP solver, we immediately print the corresponding ASPDRUPE rules that are needed later for verification in case the ASP solver concludes that the program is inconsistent. A typical CDNL-based ASP solver (cf. Listing 2) relies on unit propagation, since this is a rather simple and efficient way of deriving consequences. It maintains a set of nogoods, a current assignment , and a decision level . In a loop, it applies NogoodPropagation [Gebser et al. (2012), page 101], consisting of unit propagation and loop propagation (using UnfoundedSet [Gebser et al. (2012), page 104]), whenever applicable. Then, if there is some nogood that is not satisfied, either the program is inconsistent (at decision level 0) or ConflictAnalysis [Gebser et al. (2012), page 108] triggers backtracking to an earlier decision level, followed by the learning of a conflict nogood . Otherwise, if all nogoods in are satisfied and all the variables are assigned, an answer set is found; if some variable is still unassigned, a free variable is selected (Select).
Listings 2 and 3 contain a prototypical CDNL-based ASP solver that is extended by proof logging, where the changes for proof logging are highlighted in red. We use the element operator () to determine whether an element is in a sequence, and denote the concatenation of two proofs by the operator as follows: . The idea is to start with an empty ASPDRUPE derivation. Whenever a new nogood or loop nogood is learned and added to accordingly, an addition or loop addition proof step, respectively, is appended to the proof. Note that in Listing 3 we add completion rule addition steps and completion support addition steps whenever unit propagation (or conflicts) involve rules in or , respectively. In particular, Lines 3 and 3 take care of adding involved parts of the completion to the proof (if needed) accordingly. At the end, when the ASP solver concludes inconsistency, the proof is returned including the empty nogood as last nogood. Note that advanced techniques (see, e.g., [Gebser et al. (2012)]) like forgetting of learned clauses and restarting of the ASP solver can also be implemented using deletion rules with ASPDRUPE. As it turns out, preprocessing in ASP is less sophisticated than for SAT. In the literature, CDNL-based ASP solvers often use preprocessing techniques [Gebser et al. (2008)] similar to SAT solvers, i.e., SatElite-like [Eén and Biere (2005)] operations such as variable and nogood elimination. For simple preprocessing operations restricted to variable and nogood elimination, ASPDRUPE suffices. Note that if Clark’s completion is exponential in the program size due to weight rules, propagators [Alviano et al. (2018)] are supported as well. For details we refer to the implementation in Section 4.1.
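The logging discipline described above can be sketched as a small Python helper that a solver loop might call. This is a hypothetical illustration and not the code of Listings 2 and 3: the class `ProofLog`, its method names, and the step tags are assumptions. The sketch records completion nogoods only the first time they take part in propagation or a conflict, appends learned nogoods and loop nogoods as they arise, and closes the proof with the empty nogood once inconsistency is concluded.

```python
class ProofLog:
    """Collects proof steps as (tag, nogood) pairs while a solver runs."""

    def __init__(self):
        self.steps = []
        self._completion_seen = set()

    def use_completion(self, tag, nogood):
        """Log a completion nogood lazily, at most once per (tag, nogood).

        `tag` is an assumed label such as "completion_rule" or
        "completion_support".
        """
        key = (tag, frozenset(nogood))
        if key not in self._completion_seen:
            self._completion_seen.add(key)
            self.steps.append(key)

    def learn(self, nogood):
        """Log a learned conflict nogood as an addition step."""
        self.steps.append(("addition", frozenset(nogood)))

    def loop(self, nogood):
        """Log a loop nogood produced by unfounded-set detection."""
        self.steps.append(("loop_addition", frozenset(nogood)))

    def forget(self, nogood):
        """Log the deletion of a previously added nogood."""
        self.steps.append(("deletion", frozenset(nogood)))

    def conclude_inconsistent(self):
        """Append the empty nogood and return the finished step sequence."""
        self.steps.append(("addition", frozenset()))
        return list(self.steps)
```

Keeping a set of already logged completion nogoods mirrors the membership test on the proof sequence mentioned above and avoids repeating the same completion step each time a rule takes part in propagation.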
Example 3 (CdnlAspDrupe)
We continue the previous Example 2 and indicate a possible CDNLASPDRUPE run on that leads to the exemplary ASPDRUPE proof given above. We use the notation ( to indicate that was assigned true (false) at decision level .

Initially, nothing can be propagated.

After the decision , unit propagation derives only