1 Introduction
Consider the counting version of the graph reachability problem: given a directed graph, count the number of subgraphs in which node t is reachable from node s [Valiant 1979]. This problem can be naturally modeled as a logic program under the stable model semantics [Gelfond and Lifschitz 1988]. Let us say that the input is given by two predicates: node and edge. For each node v, we can introduce a decision variable in(v) that models whether the node is in the subgraph. Furthermore, we can model reachability (reach(v)) from s as an inductive definition using the following two rules: reach(s) ← in(s) and reach(v) ← in(v) ∧ edge(u, v) ∧ reach(u). The first rule says that s itself is reachable if it is in the subgraph. The second rule is the inductive case, specifying that a node is reachable if it is in the subgraph and there is a reachable node that has an edge to it. Additionally, say there are arbitrary constraints in our problem, e.g., only consider subgraphs where a certain node t′ is also reachable from s, etc. This can be done using the integrity constraint: ← ¬reach(t′). The number of stable models of this program is equal to the number of solutions of the problem.
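As a concrete baseline, the count defined above can be computed by brute force. The following Python sketch (the function name and graph encoding are ours, purely illustrative) enumerates node subsets and checks whether t is reachable from s using only edges between chosen nodes:

```python
from itertools import product

def count_reachable_subgraphs(nodes, edges, s, t):
    """Count node subsets (subgraphs) in which t is reachable from s
    via edges whose endpoints are both in the subset."""
    count = 0
    for bits in product([False, True], repeat=len(nodes)):
        sub = {v for v, b in zip(nodes, bits) if b}
        # fixpoint: nodes of the subset reachable from s inside the subset
        reach = {s} if s in sub else set()
        changed = True
        while changed:
            changed = False
            for u, v in edges:
                if u in reach and v in sub and v not in reach:
                    reach.add(v)
                    changed = True
        if t in reach:
            count += 1
    return count

# path s -> a -> t plus a direct edge s -> t: {s,t} and {s,a,t} qualify
print(count_reachable_subgraphs(["s", "a", "t"],
                                [("s", "a"), ("a", "t"), ("s", "t")],
                                "s", "t"))  # → 2
```

The two rules of the logic program compute exactly the reach set constructed by the inner fixpoint loop.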
There are at least two approaches to counting the stable models of a logic program. The first is to translate the given logic program to a propositional theory such that there is a one-to-one correspondence between the propositional models of the translated program and the stable models of the original program, and use SAT model counting [Gomes, Sabharwal, and Selman 2008]. We show that this approach does not scale well in practice since such translations, if done a priori, can grow exponentially with the input size. The second approach is to use an answer set programming (ASP) solver like clasp [Gebser et al. 2007] or dlv [Leone et al. 2006] and enumerate all models. This approach is extremely inefficient since model counting algorithms have several optimizations, like caching and dynamic decomposition, that are not present in ASP solvers. This motivates us to build a stable model counter that can take advantage of state-of-the-art ASP technology, which combines partial translation with lazy unfounded set detection [Van Gelder, Ross, and Schlipf 1988]. However, we first show that it is not correct to naively combine partial translation and lazy unfounded set detection with SAT model counters due to the aforementioned optimizations in model counters. We then suggest two approaches to properly integrate unfounded set propagation in a model counter.
We show that we can apply our algorithms to solve probabilistic logic programs [Raedt and Kimmig 2013]. Consider the probabilistic version of the above problem, also called the graph reliability problem [Arora and Barak 2009]. In this version, each node v can be in the subgraph with a certain probability p, or equivalently, fail with probability 1 − p. We can model this by simply attaching probabilities to the in variables. We can model observed evidence as constraints. E.g., if we have evidence that a certain node v is reachable from s, then we can model this as the unary constraint (not rule): reach(v). The goal of the problem is to calculate the probability of node t being reachable from node s given the evidence. The probabilistic logic programming solver Problog2 [Fierens et al. 2011] approaches this inference task by reducing it to weighted model counting of the translated propositional theory of the original logic program. We extend Problog2 to use our implementation of stable model counting on the original logic program and show that our approach is more scalable.
2 Preliminaries
We consider propositional variables drawn from a set V. Each x ∈ V is (also) a positive literal, and ¬x is a negative literal. The negation ¬l of a literal l is ¬x if l = x, and x if l = ¬x. An assignment A is a set of literals, representing the literals that are true in the assignment, where no variable appears both positively and negatively in A. If F is a formula or assignment, let vars(F) be the subset of V appearing in F. Given an assignment A, let A⁺ = {x | x ∈ A} and A⁻ = {x | ¬x ∈ A}. Two assignments A and B agree on a set of variables U, written A =U B, if A⁺ ∩ U = B⁺ ∩ U and A⁻ ∩ U = B⁻ ∩ U. Given a partial assignment A and a Boolean formula F, let F|A be the residual of F w.r.t. A. F|A is constructed from F by substituting each literal l ∈ A with true and each literal l with ¬l ∈ A with false and simplifying the resulting formula. For a formula F, count(F) is the number of assignments over vars(F) that satisfy F.
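The residual operation on clause sets can be sketched as follows (a hypothetical helper with literals encoded as signed integers, not from any particular solver):

```python
def residual(clauses, assignment):
    """Residual of a CNF (list of literal lists; positive int = x,
    negative int = ¬x) w.r.t. a partial assignment given as a set of
    true literals."""
    out = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue                                   # clause satisfied: drop it
        out.append([l for l in clause if -l not in assignment])  # drop false literals
    return out

# F = (x1 ∨ x2) ∧ (¬x1 ∨ x3); under {x1} the first clause is satisfied
# and the second simplifies to (x3)
print(residual([[1, 2], [-1, 3]], {1}))  # → [[3]]
```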
2.1 DPLL-Based Model Counting
State-of-the-art SAT model counters are very similar to SAT solvers, but have three important optimisations. The first optimisation is to count solution cubes (i.e., partial assignments whose every extension is a solution) instead of individual solutions. Consider the Boolean formula ϕ = (a ∨ b) ∧ (c ∨ d). Suppose the current partial assignment is {a, c}. The formula is already satisfied irrespective of the values of b and d. Instead of searching further and finding all 4 solutions, we can stop and record that we have found a solution cube containing 2^u solutions, where u is the number of unfixed variables.
The second important optimisation is caching. Different partial assignments can lead to identical subproblems which contain the same number of solutions. By caching such counts, we can potentially save significant repeated work. For a formula F and an assignment A, the number of solutions of F in the subtree below A is given by count(F|A). We can use the residual F|A as the key and cache the number of solutions the subproblem has. For example, consider ϕ again. Suppose we first encountered the partial assignment A = {a, ¬b}. Then ϕ|A = (c ∨ d). After searching this subtree, we find that this subproblem has 3 solutions and cache this result. The subtree under A thus has 3 solutions. Suppose we later encounter B = {¬a, b}. We find that ϕ|B is the same as ϕ|A. By looking it up in the cache, we can see that this subproblem has 3 solutions. Thus the subtree under B also has 3 solutions.
The last optimisation is dynamic decomposition. Suppose after fixing some variables, the residual decomposes into two or more formulas involving disjoint sets of variables. We can count the number of solutions for each of them individually and multiply the counts together to get the right result. Consider ϕ′ = (a ∨ b) ∧ (c ∨ d) ∧ (e ∨ f) ∧ (f ∨ g) and a partial assignment {a, ¬b}. The residual formula can be decomposed into two components ϕ1 = (c ∨ d) and ϕ2 = (e ∨ f) ∧ (f ∨ g) with variables {c, d} and {e, f, g} respectively. Their counts are 3 and 5 respectively; therefore, the number of solutions of ϕ′ that extend the assignment is 3 × 5 = 15. The combination of the three optimisations described above in a DPLL-style backtracking algorithm has been shown to be very efficient for model counting. See [Bacchus, Dalmao, and Pitassi 2003; Gomes, Sabharwal, and Selman 2008; Sang et al. 2004] for more details.
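A minimal counter combining all three optimisations might look as follows (our own sketch, not the implementation of any of the cited counters); clauses are lists of signed integers and the count is taken over the given variables:

```python
def count_models(clauses, variables):
    """#SAT with the three optimisations: solution cubes, caching of
    residuals, and dynamic decomposition into disjoint components."""
    cache = {}

    def residual(cls, lit):
        # drop satisfied clauses, remove the falsified literal elsewhere
        return [[l for l in c if l != -lit] for c in cls if lit not in c]

    def split(cls):
        # group clauses into variable-disjoint components
        comps = []
        for c in cls:
            cur_vars, cur_cls = {abs(l) for l in c}, [c]
            rest = []
            for gv, gc in comps:
                if gv & cur_vars:
                    cur_vars |= gv
                    cur_cls += gc
                else:
                    rest.append((gv, gc))
            comps = rest + [(cur_vars, cur_cls)]
        return comps

    def count(cls, free):
        if any(not c for c in cls):
            return 0                       # empty clause: conflict
        if not cls:
            return 2 ** len(free)          # solution cube over unfixed vars
        touched = {abs(l) for c in cls for l in c}
        total = 2 ** len(free - touched)   # cube over unconstrained vars
        for comp_vars, comp_cls in split(cls):
            key = tuple(sorted(tuple(sorted(c)) for c in comp_cls))
            if key not in cache:           # cache keyed on the residual
                x = next(iter(comp_vars))
                cache[key] = (count(residual(comp_cls, x), comp_vars - {x})
                              + count(residual(comp_cls, -x), comp_vars - {x}))
            total *= cache[key]
        return total

    return count(clauses, set(variables))
```

For instance, on (a ∨ b) ∧ (c ∨ d) with variables 1–4 it returns 3 × 3 = 9, and larger formulas exercise the cube, caching, and decomposition cases together.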
2.2 Answer Set Programming
We consider V split into two disjoint sets of variables: founded variables (VF) and standard variables (VS). An ASP-SAT program P is a tuple (V, R, C) where R is a set of rules of the form a ← b1 ∧ … ∧ bn such that a ∈ VF and each bi is a literal, and C is a set of constraints over the variables, represented as disjunctive clauses. A rule is positive if its body contains only positive founded literals. The least assignment of a set of positive rules R, written Least(R), is the one that satisfies all the rules and contains the least number of positive literals. Given an assignment A and a program P, the reduct of P w.r.t. A, written P^A, is a set of positive rules obtained as follows: for every rule r ∈ R, if any negative or standard body literal of r is false in A, then r is discarded; otherwise, all negative literals and standard variables are removed from the body of r and the resulting positive rule is included in the reduct. An assignment A is a stable model of a program P iff it satisfies all its constraints and its founded literals agree with Least(P^A). Given an assignment A and a set of rules R, the residual rules R|A are defined similarly to residual clauses by treating every rule as its logically equivalent clause. A program is stratified iff it admits a mapping level from VF to non-negative integers such that for each rule in the program, referring to the above rule form, level(a) ≥ level(bi) whenever bi is a positive founded literal, and level(a) > level(x) whenever bi = ¬x for a founded variable x. In ASP terms, standard variables, founded variables and constraints correspond to choice variables, regular ASP variables, and integrity constraints respectively. We opt for the above representation because it is closer to the SAT-based implementation of modern ASP solvers.
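These definitions translate almost directly into code. In the sketch below (our own encoding: rules are (head, body) pairs, body literals are (atom, positive?) pairs, and constraints are omitted for brevity), stability of a complete assignment is checked via the reduct and its least assignment:

```python
def least(pos_rules):
    """Least assignment of positive rules: forward chaining to fixpoint."""
    true = set()
    changed = True
    while changed:
        changed = False
        for head, body in pos_rules:
            if head not in true and all(b in true for b in body):
                true.add(head)
                changed = True
    return true

def reduct(rules, assignment, founded):
    """Keep a rule iff all its negative or standard body literals hold in
    the assignment; strip them, leaving only positive founded atoms."""
    out = []
    for head, body in rules:
        keep, pos_founded = True, []
        for atom, positive in body:
            if positive and atom in founded:
                pos_founded.append(atom)
            elif (atom in assignment) != positive:
                keep = False
                break
        if keep:
            out.append((head, pos_founded))
    return out

def is_stable(rules, founded, true_atoms):
    """true_atoms: atoms assigned true in a complete assignment."""
    return least(reduct(rules, true_atoms, founded)) == true_atoms & founded

# founded {a, b}, standard {s}; rules a <- b, b <- a, a <- s
rules = [("a", [("b", True)]), ("b", [("a", True)]), ("a", [("s", True)])]
print(is_stable(rules, {"a", "b"}, {"s", "a", "b"}))  # True
print(is_stable(rules, {"a", "b"}, {"a", "b"}))       # False: a, b unfounded without s
```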
3 SAT-Based Stable Model Counting
The most straightforward approach to counting the stable models of a logic program is to translate the program into a propositional theory and use a propositional model counter. As long as the translation produces a one-to-one correspondence between the stable models of the program and the solutions of the translated program, we get the right stable model count. Unfortunately, this is not a very scalable approach. Translations based on adding loop formulas [Lin and Zhao 2004] or the proof-based translation used in Problog2 [Fierens et al. 2011] require the addition of an exponential number of clauses in general (see [Lifschitz and Razborov 2006] and [Vlasselaer et al. 2014] respectively). Polynomial-sized translations based on level rankings [Janhunen 2004] do exist, but they do not produce a one-to-one correspondence between the stable models and the solutions and are thus inappropriate for stable model counting.
Current state-of-the-art SAT-based ASP solvers do not rely on a full translation to SAT. Instead, they rely on lazy unfounded set detection. In such solvers, only the rules are translated to SAT. There is an extra component in the solver which detects unfounded sets and lazily adds the corresponding loop formulas to the program as required [Gebser, Kaufmann, and Schaub 2012]. Such an approach is much more scalable for solving ASP problems. However, it cannot be naively combined with a standard SAT model counting algorithm. This is because the SAT model counter requires the entire Boolean formula to be available so that it can check whether all clauses are satisfied when calculating the residual program. However, in this case, the loop formulas are generated lazily and many of them are not yet available to the model counter. Naively combining the two can give wrong results, as illustrated in the next example.
Example 1.
Consider a program P1 with founded variables {a, b}, standard variable s, and rules: a ← b, b ← a, a ← s. There are only two stable models of the program: {s, a, b} and {¬s, ¬a, ¬b}. If our partial assignment is {a, b}, then the residual program contains an empty theory, which means that the number of solutions extending this assignment is 2 (or 2^|{s}|). This is clearly wrong, since {¬s, a, b} is not a stable model of the program.
Now consider P2, which is equal to P1 with these additions: founded variable c, standard variable t, and two rules: c ← t and c ← ¬b. Consider the partial assignment {s, a, b}; the residual program has only one rule: c ← t. It has two stable models, {t, c} and {¬t, ¬c}. Now, with the partial assignment {a, b}, we get the same residual program and the number of solutions should be 2 × 2 = 4, which is wrong since s cannot be false when a and b are true, i.e., {a, b, ¬s, t, c} and {a, b, ¬s, ¬t, ¬c} are not stable models of P2.
In order to create a stable model counter which can take advantage of the scalability of lazy unfounded set detection, we need to do two things: 1) identify the conditions under which the ASP program is fully satisfied and thus we have found a cube of stable models, and 2) identify what the residual of an ASP program is so that we can take advantage of caching and dynamic decomposition.
3.1 Searching on Standard Variables for Stratified Programs
The first strategy is simply to restrict the search to the standard variables. If the program is stratified, then the founded variables of the program are functionally defined by the standard variables of the program. Once the standard variables are fixed, all the founded variables are fixed through propagation (unit propagation on rules and the unfounded set propagation). It is important in this approach that the propagation on the founded variables is only carried out on the rules of the program, and not the constraints. Constraints involving founded variables should only be checked once the founded variables are fixed. The reader can verify that in Example 1, if we decide on standard variables first, then none of the problems occur. E.g., in P1, if s is fixed to either true or false, then we do not get any wrong stable model cubes. Similarly, in P2, if we replace the second assignment {a, b} with {s}, which propagates {a, b}, we still get the same residual program, but in this case, it is correct to use the cached value. Note that stratification is a requirement for all probabilistic logic programs under the distribution semantics [Sato 1995]. For such programs, given an assignment to the standard variables, the well-founded model of the resulting program is the unique stable model.
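The strategy can be sketched as follows for a simple stratified case, assuming rule bodies contain no negative founded literals and encoding constraints as Boolean functions on the completed model (all names are illustrative, and the rules below are the ones we assume for Example 1):

```python
from itertools import product

def count_by_standard_search(rules, founded, standard, constraints):
    """Count stable models by deciding standard variables first: for each
    standard assignment, founded atoms follow by propagation on the rules
    alone, and constraints are checked only on the completed model.
    Assumes rule bodies contain no negative founded literals."""
    count = 0
    for bits in product([False, True], repeat=len(standard)):
        truth = dict(zip(standard, bits))
        true_f = set()
        changed = True
        while changed:
            changed = False
            for head, body in rules:
                if head in true_f:
                    continue
                ok = all((truth[a] if a in truth else a in true_f) == pos
                         for a, pos in body)
                if ok:
                    true_f.add(head)
                    changed = True
        model = dict(truth, **{f: f in true_f for f in founded})
        if all(c(model) for c in constraints):
            count += 1
    return count

# P1 from Example 1 (assumed): a <- b, b <- a, a <- s
rules = [("a", [("b", True)]), ("b", [("a", True)]), ("a", [("s", True)])]
print(count_by_standard_search(rules, {"a", "b"}, ["s"], []))              # 2
print(count_by_standard_search(rules, {"a", "b"}, ["s"],
                               [lambda m: m["a"]]))                        # 1
```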
3.2 Modifying the Residual Program
In ASP solving, it is often very useful to make decisions on founded variables, as it can significantly prune the search space. For this reason, we present a novel approach to overcome the problem demonstrated in Example 1.
The root problem in Example 1, in both cases, is the failure to distinguish between a founded variable being true and being justified, i.e., inferable to be true from the rules and the current standard and negative literals. In the example, in P1, a and b are made true by search (and possibly propagation) but they are not justified, as they do not necessarily have externally supporting rules (they are not true under the stable model semantics if we set s to false). In ASP solvers, this is not a problem since the existing unfounded set detection algorithms guarantee that in complete assignments, a variable being true implies that it is justified. This is not valid for partial assignments, which we need for counting stable model cubes. Next, we show that if we define the residual rules (not constraints) of a program in terms of the justified subset of an assignment, then we can leverage a propositional model counter augmented with unfounded set detection to correctly compute stable model cubes of a program. In order to formalize and prove this, we need further definitions.
Given a program P and a partial assignment A, the justified assignment JA(A, P) is the subset of A that includes all standard and founded negative literals, plus all the positive founded literals implied by them using the rules of the program. More formally, let B = {l ∈ A : l is a standard literal or a negative literal}. Then, JA(A, P) = B ∪ {x ∈ A : x ∈ Least(R^B)}.
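A sketch of this computation (our own encoding: the assignment is a set of (atom, value) pairs, body literals are (atom, positive?) pairs, and the rules are the ones we assume for Example 1):

```python
def justified_assignment(assignment, rules, founded):
    """Standard and negative literals of the assignment, plus the positive
    founded literals of the assignment derivable from them via the rules."""
    just = {(a, v) for (a, v) in assignment if not v or a not in founded}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if (head, True) in just or (head, True) not in assignment:
                continue
            if all((atom, pos) in just for atom, pos in body):
                just.add((head, True))   # justified: derivable and in A
                changed = True
    return just

# P1 (assumed rules a <- b, b <- a, a <- s), founded {a, b}:
rules = [("a", [("b", True)]), ("b", [("a", True)]), ("a", [("s", True)])]
full = {("s", True), ("a", True), ("b", True)}
print(justified_assignment(full, rules, {"a", "b"}))
# with s unfixed, a and b are true but not justified:
print(justified_assignment({("a", True), ("b", True)}, rules, {"a", "b"}))  # set()
```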
Definition 1.
Given a program P = (V, R, C) and a partial assignment A, let J = JA(A, P) and U = vars(P) \ vars(J). The justified residual program of P w.r.t. A is written P|J and is equal to (U, R|J, C|A), where R|J are the residual rules w.r.t. J and C|A are the residual constraints w.r.t. A.
Example 2.
Consider a program with founded variables , standard variables and the following rules and constraints:
Let . Then, and . The justified residual program w.r.t. has all the rules in the first column and has the constraints: {, , }.
Theorem 1.
Given an ASP-SAT program P and a partial assignment A, let JA(A, P) be denoted by J. Let the remaining variables be W = vars(P) \ vars(J) \ vars(P|J) and let θ be a complete assignment over vars(P|J). Assume any founded variable for which there is no rule in P|J is false in θ.

1. If θ is a stable model of P|J, then for any assignment θ′ over the remaining variables W, J ∪ θ ∪ θ′ is a stable model of P.

2. For a given assignment θ′ over the remaining variables W, if J ∪ θ ∪ θ′ is a stable model of P, then θ is a stable model of P|J.
Corollary 2.
Let the set of rules and constraints of P|J decompose into ASP-SAT programs P1, …, Pk such that vars(Pi) ∩ vars(Pj) = ∅ for any distinct i, j in 1..k. Let the remaining variables be W = vars(P) \ vars(J) \ vars(P1) \ … \ vars(Pk) and let θ1, …, θk be complete assignments over vars(P1), …, vars(Pk) respectively.

1. If θ1, …, θk are stable models of P1, …, Pk respectively, then for any assignment θ′ over the remaining variables W, J ∪ θ1 ∪ … ∪ θk ∪ θ′ is a stable model of P.

2. For a given assignment θ′ over the remaining variables W, if J ∪ θ1 ∪ … ∪ θk ∪ θ′ is a stable model of P, then θi is a stable model of Pi for each i.
The first part of Theorem 1 shows that we can solve the justified residual program independently (as well as cache the result) and extend any of its stable models to a full stable model by assigning any value to the remaining variables of the original program. The second part of the theorem establishes that every full stable model of the original program is counted, since it is an extension of a stable model of the residual program. The corollary tells us that if the justified residual program decomposes into disjoint programs, then we can solve each one of them independently and multiply their counts to get the count for the justified residual program.
Example 3.
In Example 2, the justified residual program has only two stable models: and . It can be verified that the only stable assignments extending of are and where is any assignment on the standard variables . Therefore, the total number of stable models below is .
Now say we have another assignment . It can be seen that it produces the same justified residual program as that produced by for which we know the stable model count is . Furthermore, the set of remaining variables is . Therefore, the number of stable assignments below is .
In order to convert a model counter into a stable model counter, we can either modify its calculation of the residual program as suggested by Theorem 1, or we can modify the actual program and use its existing calculation in a way that the residual of the modified program correctly models the justified residual program. Let us describe one such approach and prove that it is correct. We post a copy of each founded variable and each rule such that the copy variable only becomes true when the corresponding founded variable is justified. More formally, for each founded variable f, we create a standard variable f′, add the constraint f′ → f, and for each rule f ← f1 ∧ … ∧ fk ∧ l1 ∧ … ∧ lm, where each fi is a positive founded literal and each lj is a standard or negative literal, we add the clause f′ ∨ ¬f1′ ∨ … ∨ ¬fk′ ∨ ¬l1 ∨ … ∨ ¬lm. Most importantly, we do not allow search to take decisions on any of these introduced copy variables. Let this transformation of a program P be denoted by copy(P).
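The clause generation of this transformation can be sketched as follows (our own illustration; copies are named f_c, the rules are the ones we assume for Example 1, and the constraint direction reflects the reading that a copy may only be true when its founded variable is):

```python
def copy_transform(rules, founded):
    """Clauses of the copy construction (a sketch). Rules are (head, body)
    with body literals as (atom, positive?) pairs; a clause is a list of
    such literals."""
    clauses = [[(f + "_c", False), (f, True)] for f in sorted(founded)]  # f' -> f
    for head, body in rules:
        clause = [(head + "_c", True)]
        for atom, pos in body:
            if pos and atom in founded:
                clause.append((atom + "_c", False))   # ¬fi'
            else:
                clause.append((atom, not pos))        # ¬li
        clauses.append(clause)
    return clauses

# P1 (assumed): rules a <- b, b <- a, a <- s with founded {a, b}
rules = [("a", [("b", True)]), ("b", [("a", True)]), ("a", [("s", True)])]
for c in copy_transform(rules, {"a", "b"}):
    print(c)
```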
We now show that it is correct to use the above approach for stable model counting. For the following discussion and results, let P be an ASP-SAT program and let Q denote copy(P). Let A, A1, A2 be assignments over vars(Q) and let B, B1, B2 be their projections over the non-copy variables vars(P). Let Q|A (similarly Q|A1, Q|A2) be a shorthand for the residual of Q w.r.t. A. The results assume that the assignments A, A1, A2 are closed under unit propagation and unfounded set propagation, i.e., both propagators have been run until fixpoint in the solver.
To prove the results, we define a function g that takes the residual Q|A of the copy program and maps it to the justified residual program of P w.r.t. the projection B of A on the non-copy variables, and then argue that Q|A correctly models the justified residual program. Formally, g(Q|A) is an ASP-SAT program constructed as follows. Add every constraint in Q|A that does not have a copy variable in it. For every constraint f′ ∨ ¬f1′ ∨ … ∨ ¬fk′ ∨ ¬l1 ∨ … ∨ ¬lm in Q|A, add the rule f ← f1 ∧ … ∧ fk ∧ l1 ∧ … ∧ lm. Let S be the set of founded variables f such that f is true but f′ is unfixed in A. For every f in S, add the constraint f. Define the variables of g(Q|A) as the variables of the added rules and constraints. Proposition 3 proves that we cannot miss any stable model of the original program if we use the copy approach.
Proposition 3.
If B cannot be extended to any stable model of P, then A cannot be extended to any stable model of Q.
Theorem 4 establishes that we can safely use Q|A to emulate the justified residual program of P w.r.t. B. Corollary 5 says that if we detect a stable model cube of Q, then we also detect a stable model cube of the same size for the justified residual program. This corollary and Proposition 3 prove that the stable model count of the actual program is preserved.
Theorem 4.
g(Q|A) = P|J, where J = JA(B, P).
Corollary 5.
If Q|A has no rules or constraints and there are u unfixed non-copy variables, then B is a stable model cube of P of size 2^u.
The next two corollaries prove that the copy approach can be used for caching and dynamic decomposition respectively.
Corollary 6.
If Q|A1 = Q|A2, then P|J1 = P|J2, where Ji = JA(Bi, P).
Corollary 7.
If Q|A decomposes into disjoint components Q1, …, Qk, then P|J, where J = JA(B, P), decomposes into disjoint components g(Q1), …, g(Qk), where g(Qi) is the projection of g(Q|A) on the variables of Qi.
4 Problog2 via Stable Model Counting
In this section, we describe how we apply stable model counting in the probabilistic logic programming solver Problog2 [Fierens et al. 2013]. A probabilistic logic program is a collection of mutually independent random variables, each of which is annotated with a probability, derived variables, evidence constraints, and rules for the derived variables. The distribution semantics [Sato 1995] says that for a given assignment over the random variables, the values of the derived variables are given by the well-founded model. Furthermore, the weight of that world is equal to the product of the probabilities of the values of the random variables. In our setting, it is useful to think of random variables, derived variables, evidence constraints, and rules as standard variables, founded variables, constraints, and rules respectively. Problog2 handles various inference tasks, but the focus of this paper is computing the marginal probability of query atoms given evidence constraints. The probability of a query atom is equal to the sum of the weights of the worlds where the query atom and the evidence are satisfied, divided by the sum of the weights of the worlds where the evidence is satisfied.
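The inference task itself is easy to state as a brute-force computation over worlds (our own sketch of the distribution semantics, not Problog2's weighted model counting pipeline; all names and numbers are hypothetical):

```python
from itertools import product

def marginal(random_vars, probs, derive, evidence, query):
    """P(query | evidence) = weight of worlds satisfying evidence and query
    divided by weight of worlds satisfying evidence. derive maps a world
    (dict over random vars) to the truth values of the derived atoms."""
    num = den = 0.0
    for bits in product([False, True], repeat=len(random_vars)):
        world = dict(zip(random_vars, bits))
        w = 1.0
        for v in random_vars:
            w *= probs[v] if world[v] else 1.0 - probs[v]
        model = {**world, **derive(world)}
        if evidence(model):
            den += w
            if query(model):
                num += w
    return num / den

# toy reliability: the target is reachable iff both nodes survive
probs = {"n1": 0.9, "n2": 0.8}
derive = lambda w: {"reach_t": w["n1"] and w["n2"]}
print(marginal(["n1", "n2"], probs, derive,
               lambda m: True, lambda m: m["reach_t"]))   # ≈ 0.72
```

Problog2 avoids this exponential enumeration by compiling the weighted counting problem, which is exactly where stable model counting enters.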
Figure 1 shows the execution of a Problog2 program. The input is a non-ground probabilistic logic program which is given to the grounder, which cleverly instantiates only the parts of the program that are relevant to the query atoms, similar to how the magic set transformation [Bancilhon et al. 1985] achieves the same goal in logic programming. The ground program and the evidence are then converted to CNF using the proof-based encoding that we discussed earlier. This CNF is passed on to a knowledge compiler like Dsharp [Muise et al. 2012]. Dsharp is an extension of sharpSAT [Thurley 2006] in which the DPLL-style search is recorded as a d-DNNF. The d-DNNF produced by the knowledge compiler is given to the parser of Problog2 along with the ground queries and the probabilities of the random variables. The parser evaluates the probability of each query by crawling the d-DNNF as described in [Fierens et al. 2013].
Our contribution is in the components in the dotted box in Figure 1. We have implemented stable model counting by extending the propositional model counter sharpSAT as described in the previous section. Since sharpSAT is part of the knowledge compiler Dsharp, our extension of sharpSAT automatically extends Dsharp to a stable model knowledge compiler. The CNF conversion component in the Problog2 chain is replaced by a simple translation of the ground program and evidence to our desired input format. In the first approach, where the search is restricted to standard variables, the evidence needs to be passed on to our stable model counter, which posts a nogood (the current assignment of the standard variables) each time an evidence atom is violated. In the approach given in Section 3.2, however, we post each evidence as a unit clause, much like Problog2 does in its CNF conversion step. Including evidence in the constraints in the second approach is safe since our residual program relies on the justified assignment only, and propagation on founded literals that makes them true due to constraints does not change that. Outside the dotted box in the figure, the rest of the Problog2 logic remains the same.
5 Experiments
We compare the two approaches, both based on our implementation of unfounded set detection as explained in Section 3, against the proof-based encoding of Problog2. We use two well-studied benchmarks: the Smokers-Friends problem [Fierens et al. 2011] and the graph reliability problem (GraphRel) [Arora and Barak 2009] with evidence constraints.
In both problems, the graph is probabilistic. In GraphRel, the nodes are associated with probabilities, while in Smokers-Friends, the edges have probabilities. Naturally, for n nodes, the number of random variables is in O(n) for GraphRel and O(n²) for Smokers-Friends. Due to this, GraphRel has significantly more loops per random variable in the dependency graph, which makes it more susceptible to the size problems of eager encoding. We refer to the fixed search approach of Section 3.1 as ASProblogS and the proper integration of unfounded set detection through the use of copy variables of Section 3.2 as ASProblog. All experiments were run on a machine running Ubuntu 12.04.1 LTS with 8 GB of physical memory and an Intel(R) Core(TM) i7-2600 3.4 GHz processor.
Instance | Problog2 | ASProblog | ASProblogS
n  p | Time  #V  #C  #D  aL  Sz | Time  #V  #C  #D  aL  #L  Sz | Time  #V  #C  #D  aL  #L  Sz
10  0.5 | 11.33  2214  7065  199  7.68  1.21 | 1.08  72  226  233  8.88  13  .057 | 1.13  60  171  333  8.75  124  .10
11  0.5 | 115.75  6601  21899  353  8.61  7.62 | 1.11  86  283  382  9.76  23  .10 | 1.12  73  216  354  9.38  107  .10
12  0.5 | —  16210  55244  —  —  — | 1.20  101  348  675  10.81  21  .19 | 1.32  87  267  904  10.47  405  .28
13  0.5 | —  59266  204293  —  —  — | 1.41  117  414  1395  12.16  44  .41 | 2.61  102  320  2737  11.33  1272  1.28
15  0.5 | —  —  —  —  —  — | 2.05  142  514  3705  13.42  59  1.23 | 4.78  125  398  7542  12.88  2028  2.71
20  0.5 | —  —  —  —  —  — | 31.82  246  966  83091  18.37  189  38.11 | 82.21  224  757  143188  18.31  32945  62.02
25  0.25 | —  —  —  —  —  — | 22.44  225  800  62871  18.70  231  27.23 | 53.63  198  620  128534  19.55  41811  43.06
30  0.1 | —  —  —  —  —  — | 3.71  168  468  7347  15.89  129  2.99 | 13.22  137  351  43968  19.31  2833  10.40
31  0.1 | —  37992  115934  —  —  — | 2.84  171  473  5054  15.06  52  2.23 | 12.67  140  356  19585  17.53  1293  11.18
32  0.1 | —  —  —  —  —  — | 7.93  185  528  17006  17.06  173  7.75 | 35.97  153  398  108916  21.42  5405  32.10
33  0.1 | —  —  —  —  —  — | 25.13  191  533  67929  18.49  343  31.06 | —  157  403  —  —  —  —
34  0.1 | —  —  —  —  —  — | 12.97  201  566  33338  19.41  155  14.66 | 112.27  165  429  324304  23.20  5502  124.21
35  0.1 | —  —  —  —  —  — | 101.40  222  663  249512  21.78  1567  123.62 | —  186  503  —  —  —  —
36  0.1 | —  —  —  —  —  — | 100.20  228  683  279273  21.41  1542  124.73 | —  190  518  —  —  —  —
37  0.1 | —  —  —  —  —  — | 65.86  227  659  159056  20.55  658  77.57 | —  188  499  —  —  —  —
38  0.1 | —  —  —  —  —  — | —  240  712  —  —  —  — | —  200  540  —  —  —  —
Table 1 shows the comparison between Problog2, ASProblog and ASProblogS on GraphRel on random directed graphs. An instance is specified by n, the number of nodes, and p, the probability of an edge between any two nodes. The solvers are compared on the following parameters: time in seconds (Time), number of variables and clauses in the input program of Dsharp (#V and #C resp.), number of decisions (#D), average decision level of backtrack due to conflict or satisfaction (aL), the number of loops produced during the search for ASProblog and ASProblogS (#L), and the size in megabytes of the d-DNNF produced by Dsharp (Sz). Each number in the table represents the median value of that parameter over 10 random instances of the size in the row. The median is only reported if at least 6 of the 10 instances produce an output value. A '—' represents memory exhaustion or a timeout of 5 minutes, whichever occurs first. A '—' in the columns Time, #D, aL, #L, Sz means that the solver ran out of memory but the grounding and encoding were done successfully, while a '—' in all columns of a solver means that it never finished encoding the problem. We show the comparison on three types of instances: small graphs with high density, medium graphs with high to medium density, and large graphs with low density.
Clearly, ASProblog and ASProblogS are far more scalable than Problog2. While Problog2 requires less search (since it starts with all loop formulae encoded), the overhead of the eager encoding is prohibitive. For all solved instances, ASProblog has the best running time and d-DNNF size, illustrating that the search restriction of ASProblogS degrades performance significantly. While the encoding for ASProblogS is always the smallest, the encoding with copy variables and rules of ASProblog is not significantly larger, and yields smaller search trees and fewer loop formulae. It is clearly the superior approach.
Figure 2 compares the performance of Problog2 and ASProblog on Smokers-Friends when the number of random variables is fixed to 31 and the problem size is increased. In the problem description, there are two sets of random variables: the stress and the influences variables. The first one exists for each person in the graph, while the latter exists for every edge in the graph. In our setting, for an instance with n persons and e probabilistic edges, the number of random variables is equal to n + e. The rest of the variables are fixed to true or false at run time. For the smallest instances of sizes 7 and 8, Problog2 and ASProblog have similar performance. For instances 9 to 12, Problog2 does better than ASProblog, and the latter cannot solve instances 11 and 12 due to memory exhaustion. The reason is that the complete encoding in Problog2 propagates better, and the extra unfounded set check at each node in the search tree in ASProblog does not pay off. But as the number of people increases and the number of probabilistic edges becomes smaller, the problem becomes easier for ASProblog but not for Problog2. The reason is that by fixing the probabilistic edges, we are left with few external rules and many internal rules, making many founded variables logically equivalent to each other. In the last instance, the number of loop formulas required for the problem is only one! Our lazy approach benefits from this structure in the problem, while Problog2 does not. Our experiments with the same range of instances but with the number of random variables fixed to 33 and 35 show similar behaviour: initially, Problog2 does better, followed by hard instances for both, and finally, ASProblog detects the structure and solves the last few instances in less than 2 seconds.
6 Conclusion
Stable model counting is required for reasoning about probabilistic logic programs with positive recursion in their rules. We demonstrate that the current approach of translating logic programs eagerly to propositional theories is not scalable because the translation explodes when there is a large number of recursive rules in the ground program. We give two methods to avoid this problem, which enable reasoning about significantly bigger probabilistic logic programs.
References
 [Arora and Barak 2009] Arora, S., and Barak, B. 2009. Computational Complexity: A Modern Approach (Chapter: Complexity of Counting). Cambridge University Press.

 [Bacchus, Dalmao, and Pitassi 2003] Bacchus, F.; Dalmao, S.; and Pitassi, T. 2003. DPLL with caching: A new algorithm for #SAT and Bayesian inference. Electronic Colloquium on Computational Complexity (ECCC) 10(003).
 [Bancilhon et al. 1985] Bancilhon, F.; Maier, D.; Sagiv, Y.; and Ullman, J. D. 1985. Magic sets and other strange ways to implement logic programs. In Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1–15. ACM.

 [Fierens et al. 2011] Fierens, D.; Van den Broeck, G.; Thon, I.; Gutmann, B.; and De Raedt, L. 2011. Inference in probabilistic logic programs using weighted CNF's. In Gagliardi Cozman, F., and Pfeffer, A., eds., Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, 2011, 211–220.
 [Fierens et al. 2013] Fierens, D.; Van den Broeck, G.; Renkens, J.; Shterionov, D. S.; Gutmann, B.; Thon, I.; Janssens, G.; and De Raedt, L. 2013. Inference and learning in probabilistic logic programs using weighted Boolean formulas. CoRR abs/1304.6810.
 [Gebser et al. 2007] Gebser, M.; Kaufmann, B.; Neumann, A.; and Schaub, T. 2007. Conflict-driven answer set solving. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, 386. MIT Press.
 [Gebser, Kaufmann, and Schaub 2012] Gebser, M.; Kaufmann, B.; and Schaub, T. 2012. Conflict-driven answer set solving: From theory to practice. Artificial Intelligence 187:52–89.
 [Gelfond and Lifschitz 1988] Gelfond, M., and Lifschitz, V. 1988. The stable model semantics for logic programming. In ICLP/SLP, 1070–1080.
 [Gomes, Sabharwal, and Selman 2008] Gomes, C. P.; Sabharwal, A.; and Selman, B. 2008. Model counting.
 [Janhunen et al. 2007] Janhunen, T.; Oikarinen, E.; Tompits, H.; and Woltran, S. 2007. Modularity aspects of disjunctive stable models. In Logic Programming and Nonmonotonic Reasoning, 9th International Conference, LPNMR 2007, Tempe, AZ, USA, May 15-17, 2007, Proceedings, 175–187.
 [Janhunen 2004] Janhunen, T. 2004. Representing normal programs with clauses. In Proceedings of the 16th European Conference on Artificial Intelligence, ECAI'2004.
 [Leone et al. 2006] Leone, N.; Pfeifer, G.; Faber, W.; Eiter, T.; Gottlob, G.; Perri, S.; and Scarcello, F. 2006. The DLV system for knowledge representation and reasoning. ACM Trans. Comput. Log. 7(3):499–562.
[Lifschitz and Razborov2006] Lifschitz, V., and Razborov, A. 2006. Why are there so many loop formulas? ACM Transactions on Computational Logic 7(2):261–268.
[Lin and Zhao2004] Lin, F., and Zhao, Y. 2004. ASSAT: computing answer sets of a logic program by SAT solvers. Artificial Intelligence 157(1–2):115–137.
[Muise et al.2012] Muise, C. J.; McIlraith, S. A.; Beck, J. C.; and Hsu, E. I. 2012. DSHARP: Fast d-DNNF compilation with sharpSAT. In Canadian Conference on AI, 356–361.
[Raedt and Kimmig2013] De Raedt, L., and Kimmig, A. 2013. Probabilistic programming concepts. arXiv preprint.
 [Sang et al.2004] Sang, T.; Bacchus, F.; Beame, P.; Kautz, H. A.; and Pitassi, T. 2004. Combining component caching and clause learning for effective model counting. In Proceedings of the 7th International Conference on Theory and Applications of Satisfiability Testing (SAT2004).
 [Sato1995] Sato, T. 1995. A statistical learning method for logic programs with distribution semantics. In ICLP, 715–729. MIT Press.
[Thurley2006] Thurley, M. 2006. sharpSAT: Counting models with advanced component caching and implicit BCP. In SAT, 424–429.
[Valiant1979] Valiant, L. G. 1979. The complexity of enumeration and reliability problems. SIAM Journal on Computing 8(3):410–421.
[Van Gelder, Ross, and Schlipf1988] Van Gelder, A.; Ross, K. A.; and Schlipf, J. S. 1988. Unfounded sets and well-founded semantics for general logic programs. In Proceedings of the ACM Symposium on Principles of Database Systems, 221–230. ACM.
 [Vlasselaer et al.2014] Vlasselaer, J.; Renkens, J.; den Broeck, G. V.; and Raedt, L. D. 2014. Compiling probabilistic logic programs into sentential decision diagrams. In Workshop on Probabilistic Logic Programming (PLP), Vienna.
Appendix A Proofs of theorems and their corollaries
Theorem 1.
Given an ASP-SAT program and a partial assignment , let be denoted by . Let the remaining variables be and be a complete assignment over . Assume that any founded variable for which there is no rule in is false in .

If is a stable model of , then for any assignment over the remaining variables, is a stable model of .

For a given assignment over remaining variables, if is a stable model of , then is a stable model of .
Proof.
Let . Note that there cannot be a founded variable in the remaining variables since if a founded variable is not true in and does not have a rule in , then it must be false in due to the given assumption.
The key point is to view as two separate sets of rules, and and argue that we can treat them separately for the purpose of least models. If there is any assignment that extends , then from , we can safely delete the rules whose bodies intersect with or as these rules are redundant since all assignments in are sufficient to imply the founded literals in . Furthermore, the least assignment of reduct of w.r.t. any assignment that extends will be exactly equal to the founded literals in . Therefore:
Since does not have any founded variables and we just argued that we can delete the rules in that have any variable from and is completely disjoint from by definition, we can simplify the above equality to: .

We are given that is a stable model of , i.e., , and . It is easy to show that this implies that . Moreover, from the above equality, we get . Since, due to constraints added in , and are consistent on founded variables in , we get . It is easy to see that any assignment can be used to extend without affecting satisfiability or the least model, which means that is a stable model of .

We are given that is a stable model of . By definition of residual programs, we know that and the intersection of variables in and is empty, which means that if is nonempty, then is not sufficient to satisfy which implies that (if is empty, then trivially ). A similar argument for the case can be made. Given that and are consistent, we can also see that which means that . For least model, from the equality that we discussed previously, we are given that . Again, since , we can derive that which means that is a stable model of .
∎
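Theorem 1 reasons throughout about reducts and least models. As a hedged illustration of the standard machinery being invoked (this is a textbook Gelfond–Lifschitz check for ground normal programs, not the paper's solver; the rule representation is an assumption of this sketch), the stable-model test can be written as:

```python
# Illustrative sketch only: the standard Gelfond-Lifschitz stable-model
# test for a ground normal program. A rule is a triple
# (head, positive_body, negative_body) over atom names; a candidate
# model is a set of atoms assumed true.

def reduct(rules, model):
    """Gelfond-Lifschitz reduct: drop every rule whose negative body
    intersects the model, then strip the remaining negative literals."""
    return [(h, pos) for (h, pos, neg) in rules
            if not (set(neg) & model)]

def least_model(definite_rules):
    """Least model of a definite (negation-free) program, computed as
    the fixpoint of the immediate-consequence operator."""
    lm = set()
    changed = True
    while changed:
        changed = False
        for h, pos in definite_rules:
            if h not in lm and set(pos) <= lm:
                lm.add(h)
                changed = True
    return lm

def is_stable(rules, model):
    """A model is stable iff it equals the least model of its reduct."""
    return least_model(reduct(rules, model)) == model
```

For example, for the program `p :- not q. q :- not p.`, both `{p}` and `{q}` pass the test, while `{p, q}` fails, and a self-supporting atom (`r :- r.`) is never justified.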
Corollary 2.
Let the set of rules and constraints of decompose into ASP-SAT programs such that for any distinct in , . Let the remaining variables be: and let be complete assignments over , respectively.

If are stable models of resp., then for any assignment over the remaining variables, is a stable model of .

For a given assignment over remaining variables, if is a stable model of , then is a stable model of for each .
Proof.
Theorem 1 says that the justified residual program can be solved in isolation and its results combined with the parent program. Furthermore, since all ASP programs are completely disjoint, both 1 and 2 follow from the Module Theorem (Theorem 1) of [Janhunen et al.2007], which states that two mutually compatible assignments that are stable models of two respective programs can be joined to form a stable model of their union program, and conversely, that a stable model of the combined program can be split into stable models of the individual programs, as long as there are no positive interdependencies between the two programs. Ours is a simple special case of Corollary 1 in [Janhunen et al.2007], where all programs and their sets of variables are completely disjoint. ∎
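The practical payoff of Corollary 2 is that the stable-model count of a union of variable-disjoint programs is the product of the components' counts. The brute-force counter below (a hypothetical illustration, not the paper's algorithm) makes this multiplicativity concrete on a tiny example:

```python
# Illustrative sketch: brute-force stable-model counting for ground
# normal programs, used to check that counts multiply across
# variable-disjoint subprograms. Rules are (head, pos_body, neg_body).

def stable_count(atoms, rules):
    """Count stable models by enumerating all interpretations and
    applying the Gelfond-Lifschitz test to each."""
    def least(defrules):
        lm, changed = set(), True
        while changed:
            changed = False
            for h, pos in defrules:
                if h not in lm and set(pos) <= lm:
                    lm.add(h)
                    changed = True
        return lm

    atoms = list(atoms)
    count = 0
    for bits in range(2 ** len(atoms)):
        m = {a for i, a in enumerate(atoms) if bits >> i & 1}
        red = [(h, pos) for (h, pos, neg) in rules if not (set(neg) & m)]
        if least(red) == m:
            count += 1
    return count
```

With `A = {p :- not q. q :- not p.}` and `B = {r :- not s. s :- not r.}` sharing no atoms, each has 2 stable models and their union has 2 × 2 = 4, as the corollary predicts.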
For the following discussion and results, let be an ASP-SAT program and let be denoted by . Let , , be assignments over and let , , be their projections onto non-copy variables (). Let (similarly for , ) be a shorthand for . The results assume that assignments , , are closed under unit propagation and unfounded set propagation, i.e., both propagators have been run until fixpoint in the solver.
To prove the results, we define a function that takes the copy program and and maps it to the justified residual program w.r.t. the projection of that on non-copy variables, and then argue that correctly models the justified residual program. Formally, is an ASP-SAT program constructed as follows. Add every constraint in that does not have a copy variable in . For every constraint in , add the rule in . Let be the set of founded variables such that is true but is unfixed in . For every in , add the constraint in . Define the variables of as those of and .
Proposition 3.
If cannot be extended to any stable model of , then cannot be extended to any stable model of .
Proof sketch.
Say has an extension that is a stable model of . We can show that running unit propagation on and yields a solution of that is an extension of , which contradicts what is given. ∎
Theorem 4.
.
Proof.
Recall the definition of justified assignment: . Also recall that any copy constraint in has the form: and, by definition, each is the copy of a rule in , which is . Recall that are copy variables of respectively, are positive literals in , and are either standard or negative literals in . We show that the sets of rules, constraints, and variables of and are equal; therefore, they are equal. We begin by reasoning about the sets of rules.
Let us focus on the seed of the justified assignment . It is easy to see that . Since and share all standard and negative literals, their residuals w.r.t. these literals will be simplified in exactly the same way, i.e., if , then . The core point of the proof is that running unit propagation on the set of copy rules is analogous to computing . Each application of unit propagation on that derives must also derive in . This means that if, due to this propagation, the set of copy variables is derived, then . Furthermore, since no decisions on copy variables are allowed, and there is no other constraint that can possibly derive a literal , unit propagation cannot derive any other positive copy literal. Now, let us view as individual applications of a function on each copy rule to produce . We can see that . Therefore, the set of rules in and is exactly the same.
From above, it also follows that the set in the construction of and the set in Definition 1 are equal. Since is just the restriction of on non-copy variables, the residual of any constraint that has non-copy variables only is the same, and since the consistency constraints due to are also the same, the sets of constraints and are equal. Since the rules and constraints are equal in and , their variables are also equal. ∎
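The core step of this proof is that unit propagation on copy rules derives exactly the heads whose bodies have become true, mirroring the fixpoint computation on the residual program. A minimal sketch of that propagation loop under a partial assignment (names and the rule encoding are assumptions of this illustration, not the paper's notation):

```python
# Illustrative sketch: unit propagation over definite rules under a
# partial assignment. Whenever every body literal of a rule is fixed
# true, the head is derived -- mirroring the claim that propagating on
# a copy rule derives its copy variable exactly when the residual
# fixpoint would.

def propagate(rules, assignment):
    """Extend a partial assignment (dict: var -> True/False) by firing
    every rule (head, body) whose body is entirely true; repeat until
    no new heads can be derived."""
    asn = dict(assignment)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if asn.get(head) is not True and all(asn.get(b) is True for b in body):
                asn[head] = True
                changed = True
    return asn
```

For instance, with rules `c1 :- a` and `c2 :- c1, b`, fixing `a` and `b` true derives both `c1` and `c2`, while fixing `b` false leaves `c2` underived, just as the corresponding residual rule would remain unfired.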
Corollary 5.
If has no rules or constraints and there are unfixed variables, then is a stable model cube of of size .
Proof.
Since is empty, is also empty, and since , this means that is a stable model cube. Furthermore, we can show that all founded variables and their copies must be fixed in , which means all remaining variables must be standard variables; therefore, the size of the stable model cube of is . ∎
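Corollary 5 is what lets a counter stop early: once no rules or constraints remain, the current partial assignment covers 2^k complete models, where k is the number of unfixed variables. A minimal DPLL-style counter sketch showing that cube step (clause encoding and names are assumptions of this illustration; real counters such as sharpSAT add caching and decomposition on top):

```python
# Illustrative sketch of a DPLL-style model counter. Clauses are lists
# of signed integer literals (e.g. [1, -2] means x1 or not x2). When no
# clauses remain, the partial assignment is a cube covering 2**k
# complete models for the k still-unassigned variables.

def count_models(clauses, variables):
    def rec(clauses, free):
        if any(len(c) == 0 for c in clauses):
            return 0                      # conflict: an empty clause
        if not clauses:
            return 2 ** len(free)         # model cube of size 2^k
        v = free[0]
        total = 0
        for lit in (v, -v):               # branch on both polarities
            simp = []
            for c in clauses:
                if lit in c:
                    continue              # clause satisfied, drop it
                simp.append([l for l in c if l != -lit])
            total += rec(simp, free[1:])
        return total
    return rec(clauses, list(variables))
```

For example, the single clause (x1 or x2) over three variables has 3 satisfying choices of (x1, x2) times 2 free values of x3, i.e. 6 models; the cube rule supplies the factor of 2 without ever branching on x3.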
Corollary 6.
If , then .
Proof.
Follows directly from Theorem 4 and definition of function. ∎
Corollary 7.
If decomposes into disjoint components , then decomposes into disjoint components such that , where is the projection of on .
Proof.
Note that the disjointness of the components of is really determined by its constraints, not its rules. The residual rules are always stronger (have fewer variables) than their respective residual copy constraints, because it is possible that a founded variable is true in but its copy variable is unfixed. The opposite is not possible; moreover, standard and negative literals are shared between a rule and its corresponding copy constraint. This property of residual rules is important since works by projecting each individual copy rule to its original form in and completely ignores the residual set of rules in .
It is clear from the definition of that the projection of each rule merely replaces the copy variables with their corresponding founded variables. This means that we can take each disjoint component of and map it to its counterpart in using the projection function. Non-copy constraint translation in is also straightforward and cannot combine multiple disjoint components in into one component in , or split one component in into multiple components in . Finally, the addition of the unary clauses for all variables in (as used in the definition of ) also does not affect the components; by definition, a variable in must have at least one copy constraint, which will be projected to in . Adding as a constraint does not affect the component in which this projected rule appears in . ∎
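The decomposition that Corollary 7 preserves is the one model counters exploit dynamically: constraints sharing variables belong to one component, and disjoint components can be counted independently. A small union-find sketch of that grouping step (purely illustrative; not the paper's or sharpSAT's implementation):

```python
# Illustrative sketch: group constraints into variable-disjoint
# components with union-find. Each constraint is a non-empty iterable
# of variable names; constraints sharing any variable land in the same
# component. Returns a list of lists of constraint indices.

def components(constraints):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Link all variables occurring in the same constraint.
    for con in constraints:
        vs = list(con)
        for v in vs[1:]:
            union(vs[0], v)

    # Bucket constraint indices by the root of any of their variables.
    groups = {}
    for i, con in enumerate(constraints):
        root = find(next(iter(con)))
        groups.setdefault(root, []).append(i)
    return list(groups.values())
```

On constraints over {a,b}, {b,c}, {x,y}, and {z}, the first two chain into one component through the shared variable b, while the others stay separate, matching the corollary's picture of components determined by shared variables.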