Symmetric Weighted First-Order Model Counting

12/03/2014 ∙ by Paul Beame, et al. ∙ University of Washington

The FO Model Counting problem (FOMC) is the following: given a sentence Φ in FO and a number n, compute the number of models of Φ over a domain of size n; the Weighted variant (WFOMC) generalizes the problem by associating a weight to each tuple and defining the weight of a model to be the product of weights of its tuples. In this paper we study the complexity of the symmetric WFOMC, where all tuples of a given relation have the same weight. Our motivation comes from an important application, inference in Knowledge Bases with soft constraints, like Markov Logic Networks, but the problem is also of independent theoretical interest. We study both the data complexity, and the combined complexity of FOMC and WFOMC. For the data complexity we prove the existence of an FO^3 formula for which FOMC is #P_1-complete, and the existence of a Conjunctive Query for which WFOMC is #P_1-complete. We also prove that all γ-acyclic queries have polynomial time data complexity. For the combined complexity, we prove that, for every fragment FO^k, k≥ 2, the combined complexity of FOMC (or WFOMC) is #P-complete.


1 Introduction

Probabilistic inference is becoming a central data management problem. Large knowledge bases, such as Yago [19], NELL [2], DeepDive [6], Reverb [11], Microsoft's Probase [43], or Google's Knowledge Vault [8], have millions to billions of uncertain tuples. These systems scan large corpora of text, such as the Web or complete collections of journal articles, and automatically extract billions of structured facts, representing large collections of knowledge. For an illustration, Google's Knowledge Vault [8] contains 1.6B triples of the form (subject, predicate, object), for example ⟨/m/02mjmr, /people/person/place_of_birth, /m/02hrh0_⟩, where /m/02mjmr is the Freebase id for Barack Obama and /m/02hrh0_ is the id for Honolulu [8]. The triples are extracted automatically from the Web, and each triple is annotated with a probability representing the confidence in the extraction.

A central and difficult problem in such systems is probabilistic inference or, equivalently, weighted model counting. The classical FO Model Counting problem (FOMC) is: given a sentence Φ in First-Order Logic (FO) and a number n, compute the number of structures over a domain of size n that satisfy Φ; in this paper we consider only labeled structures, i.e. isomorphic structures are counted as distinct. We denote the number of models by FOMC(Φ, n); for example, FOMC(∀x∃y R(x,y), n) = (2^n − 1)^n. (For a fixed x, there are 2^n assignments to the atoms R(x, 1), …, R(x, n), all of which satisfy ∃y R(x, y) except the one where all atoms are false; moreover, the models for the n values of x can be counted independently and multiplied.) In the Weighted FO Model Counting (WFOMC) variant, one further associates a real number w(t), called its weight, to each tuple t over the domain of size n, and defines the weight of a structure as the product of the weights of all tuples in that structure. The Weighted Model Count, WFOMC(Φ, n), is defined as the sum of the weights of all structures over a domain of size n that satisfy the sentence Φ. Weights map immediately to probabilities, in the following way: if each tuple t is included in the database independently with probability p(t) = w(t)/(1 + w(t)), then the probability that the formula Φ is true is P(Φ, n) = WFOMC(Φ, n)/W, where W is the sum of the weights of all structures.
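
As a concrete (if naive) illustration of these definitions, the following Python sketch — our own illustration, not code from the paper — counts the models of ∀x∃y R(x, y) by enumerating all 2^(n^2) binary relations over a domain of size n and checks the closed form (2^n − 1)^n mentioned above; the function name is made up for the example.

    from itertools import product

    def fomc_forall_exists(n):
        # Count structures R over domain {0, ..., n-1} satisfying
        # "forall x exists y R(x, y)" by brute-force enumeration.
        cells = [(a, b) for a in range(n) for b in range(n)]
        count = 0
        for bits in product([False, True], repeat=len(cells)):
            R = {cells[i] for i, v in enumerate(bits) if v}
            if all(any((a, b) in R for b in range(n)) for a in range(n)):
                count += 1
        return count

    # Check the closed form (2^n - 1)^n for small domains.
    for n in range(1, 4):
        assert fomc_forall_exists(n) == (2**n - 1)**n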

In this paper we study the symmetric WFOMC problem, where all tuples from the same relation R have the same weight, which we denote w(R). For example, a random graph is a symmetric structure, since every edge is present with the same probability 1/2 (equivalently: has weight 1), and FOMC is another special case, where all weights are set to 1. The symmetric WFOMC problem occurs naturally in Knowledge Bases with soft constraints, as we illustrate next.

Example.

A Markov Logic Network (MLN) [7] is a finite set of soft or hard constraints. Each constraint is a pair (φ, w), where φ is a formula, possibly with free variables x, and w is a weight. (In typical MLN systems, users specify the log of the weight rather than the weight: the pair (φ, θ) means that the weight of φ is e^θ. Using logs simplifies the learning task; we do not address learning and will omit logs, so (φ, w) means that φ has weight w.) For example, the soft constraint

(Female(x) ∧ Spouse(x, y) ⇒ Male(y),  w)     (1)

specifies that, typically, a female's spouse is male, and associates the weight w to this constraint. If w = ∞ then we call (φ, w) a hard constraint.

The semantics of MLNs naturally extend the Weighted Model Counting setting. Given a finite domain (a set of constants), an MLN defines a probability distribution over all structures for that domain (also called possible worlds). Every structure ω has a weight

W(ω) = ∏_{(φ(x), w) ∈ MLN} w^{#{a : φ(a) holds in ω}}.

In other words, for each soft constraint (φ(x), w), and for every tuple of constants a such that φ(a) holds in ω, we multiply ω's weight by w. For example, given the MLN that consists only of the soft constraint (1), the weight of a world ω is w^k, where k is the number of pairs of constants (a, b) for which Female(a) ∧ Spouse(a, b) ⇒ Male(b) holds in ω. The weight W(Φ) of a sentence Φ is defined as the sum of the weights of all worlds that satisfy both Φ and all hard constraints in the MLN; its probability P(Φ) is obtained by normalizing W(Φ). Notice that the symmetric WFOMC problem corresponds to the special case of an MLN consisting of one soft constraint (R(x), w(R)) for each relation symbol R.
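
To make the semantics concrete, here is a small Python sketch (our own illustration, not from the paper) that computes the distribution defined by an MLN with a single soft constraint φ(x, y) = R(x, y) ⇒ T(y) of weight w over a domain of size 2, and evaluates the probability of ∃x∃y R(x, y); the relation names and the weight are made up for the example.

    from itertools import product

    DOMAIN = [0, 1]
    W = 3.0   # weight of the single soft constraint  R(x, y) => T(y)

    def worlds():
        # A world fixes a binary relation R and a unary relation T.
        pairs = [(a, b) for a in DOMAIN for b in DOMAIN]
        for rbits in product([False, True], repeat=len(pairs)):
            R = {pairs[i] for i, v in enumerate(rbits) if v}
            for tbits in product([False, True], repeat=len(DOMAIN)):
                T = {DOMAIN[i] for i, v in enumerate(tbits) if v}
                yield R, T

    def weight(R, T):
        # Multiply W once for every pair (a, b) at which the constraint holds.
        k = sum(1 for a in DOMAIN for b in DOMAIN if (a, b) not in R or b in T)
        return W ** k

    Z = sum(weight(R, T) for R, T in worlds())                   # normalization constant
    num = sum(weight(R, T) for R, T in worlds() if len(R) > 0)   # worlds satisfying exists x,y R(x,y)
    print("P(exists x exists y R(x,y)) =", num / Z)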

Today's MLN systems (Alchemy [26], Tuffy [30, 44]) use an MCMC algorithm called MC-SAT [31] for probabilistic inference. The theoretical convergence guarantees of MC-SAT require access to a uniform sampler over the satisfying assignments of a set of constraints. In practice, MC-SAT implementations rely on SampleSAT [42], which provides no guarantees on the uniformity of solutions. Several complex examples are known in the literature where model counting based on SampleSAT leads to highly inaccurate estimates [16].

A totally different approach to computing the MLN probability is to reduce it to a symmetric WFOMC problem [39, 15, 37, 22], and this motivates our current paper. We briefly review one such reduction, adapted from [22, 37].

Example.

Given an MLN, replace every soft constraint (φ(x), w) by two new constraints: the hard constraint (∀x (R(x) ⇒ φ(x)), ∞) and the soft constraint (R(x), w − 1). Here R is a new relational symbol whose arity is the number of free variables of φ, and the constraint (R(x), w − 1) defines R as a relation where all tuples have weight w − 1. Therefore, the probability of a formula Φ in the MLN can be computed as a conditional probability over a symmetric, tuple-independent database: P_MLN(Φ) = P(Φ | Γ), where Γ is the conjunction of all hard constraints. (The reason why this works is the following: in the original MLN, each tuple of constants a contributes a factor of 1 or w, depending on whether φ(a) is false or true in ω; after the rewriting, the contribution of a is 1 when φ(a) is false, because in that case R(a) must be false, or 1 + (w − 1) = w when φ(a) is true, because R(a) can be either false or true. The ratio is the same, w.) Note that this reduction to WFOMC is independent of the finite domain under consideration.

For example, the soft constraint in (1) is translated into the hard constraint

∀x∀y (R(x, y) ⇒ (Female(x) ∧ Spouse(x, y) ⇒ Male(y)))

and a tuple-independent probabilistic relation R where all tuples have weight w − 1 or, equivalently, have probability (w − 1)/w.
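
The following Python sketch (ours, with made-up relation names) checks this reduction numerically on a domain of size 2: the MLN probability of ∃x∃y R(x, y) under the single soft constraint (R(x, y) ⇒ T(y), w) coincides with the conditional probability P(Φ | Γ) in a tuple-independent database where the fresh relation A has probability (w − 1)/w and Γ = ∀x∀y (A(x, y) ⇒ (R(x, y) ⇒ T(y))).

    from itertools import product

    DOMAIN = [0, 1]
    PAIRS = [(a, b) for a in DOMAIN for b in DOMAIN]
    W = 3.0

    def subsets(universe):
        for bits in product([False, True], repeat=len(universe)):
            yield {universe[i] for i, v in enumerate(bits) if v}

    def holds(R, T, a, b):          # the soft constraint  R(a,b) => T(b)
        return (a, b) not in R or b in T

    def mln_prob(query):
        # Direct MLN semantics: factor W for every pair at which the constraint holds.
        num = den = 0.0
        for R in subsets(PAIRS):
            for T in subsets(DOMAIN):
                wt = W ** sum(holds(R, T, a, b) for a, b in PAIRS)
                den += wt
                if query(R, T):
                    num += wt
        return num / den

    def reduced_prob(query):
        # Reduction: independent tuples; A has probability (W-1)/W.
        pA = (W - 1) / W
        num = den = 0.0
        for R in subsets(PAIRS):
            for T in subsets(DOMAIN):
                for A in subsets(PAIRS):
                    if not all((a, b) not in A or holds(R, T, a, b) for a, b in PAIRS):
                        continue                   # hard constraint Gamma violated
                    p = 1.0
                    for t in PAIRS:
                        p *= pA if t in A else 1 - pA
                    # R and T tuples have probability 1/2 each; that factor is the
                    # same in numerator and denominator, so it can be dropped.
                    den += p
                    if query(R, T):
                        num += p
        return num / den

    q = lambda R, T: len(R) > 0      # the query: exists x exists y R(x,y)
    assert abs(mln_prob(q) - reduced_prob(q)) < 1e-9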

Thus, our main motivation for studying the symmetric WFOMC problem is very practical, as symmetric models have been extensively researched in the AI community recently, for inference in MLNs and beyond [24, 39, 29, 41]. Some tasks on MLNs, such as parameter learning [38], naturally exhibit symmetries. For others, such as computing conditional probabilities given a large "evidence" database, the symmetric model is applicable when the database has bounded Boolean rank [36]. Moreover, the problem is of independent theoretical interest, as we explain below. We study both the data complexity and the combined complexity. In both settings we assume that the vocabulary is fixed, and so are the weights associated with the relations. In the data complexity, the formula Φ is fixed, and the only input is the number n representing the size of the domain. In this case WFOMC is a counting problem over a unary alphabet: given the input 1^n, compute WFOMC(Φ, n). It is immediate that this problem belongs to the class #P_1, which is the set of #P problems over a unary input alphabet [34]. In the combined complexity, both n and the formula Φ are inputs.

In this paper we present results on the data complexity and the combined complexity of the FOMC and WFOMC problems, and also some results on the associated decision problem.

Results on Data Complexity

In a surprising result, [37] proved that for FO^2 the data complexity of symmetric WFOMC is in PTIME (the proof is reviewed in Appendix C); PTIME data complexity for symmetric WFOMC is called domain-liftability in the AI and lifted inference literature [35]. This is surprising because FO^2 (the class of FO formulas restricted to two logical variables) contains many formulas for which the asymmetric problem was known to be #P-hard. An example is Φ = ∃x∃y (R(x) ∧ S(x, y) ∧ T(y)), which is #P-hard over asymmetric structures, but whose number of models is FOMC(Φ, n) = Σ_{k,m=0..n} C(n,k) C(n,m) (2^{n^2} − 2^{n^2 − km}), a number computable in time polynomial in n. (Fix the relations R and T and let their cardinalities be k and m; then a structure does not satisfy Φ iff S contains none of the km tuples in R × T, which proves the formula. Tractability of Φ was noted before in, for example, [32, 35].) More generally, the symmetric WFOMC problem for FO^2 is in PTIME.
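
A brute-force check of this closed form (our own sanity check, using the formula as stated above):

    from itertools import product
    from math import comb

    def fomc_rst(n):
        # Count structures (R, S, T) over domain {0,...,n-1} satisfying
        # "exists x exists y (R(x) and S(x,y) and T(y))".
        pairs = [(a, b) for a in range(n) for b in range(n)]
        count = 0
        for rbits in product([0, 1], repeat=n):
            for tbits in product([0, 1], repeat=n):
                for sbits in product([0, 1], repeat=len(pairs)):
                    if any(rbits[a] and tbits[b] and sbits[i]
                           for i, (a, b) in enumerate(pairs)):
                        count += 1
        return count

    def closed_form(n):
        return sum(comb(n, k) * comb(n, m) * (2**(n*n) - 2**(n*n - k*m))
                   for k in range(n + 1) for m in range(n + 1))

    for n in (1, 2, 3):
        assert fomc_rst(n) == closed_form(n)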

This raises the question: could it be that symmetric WFOMC is in PTIME for every FO formula? The answer was shown to be negative by Jaeger and Van den Broeck [21, 20], using the following argument. Recall that the spectrum, spec(Φ), of a formula Φ is the set of numbers n for which Φ has a model over a domain of size n [9]. Jaeger and Van den Broeck observed that the spectrum membership problem, "is n ∈ spec(Φ)?", can be reduced to WFOMC, by checking whether FOMC(Φ, n) > 0. Then, using a result in [23], if NETIME ≠ ETIME, there exists a formula Φ for which computing FOMC(Φ, n) is not in polynomial time. (Recall that ETIME = DTIME(2^{O(n)}) and NETIME = NTIME(2^{O(n)}); they are not to be confused with the more familiar classes EXPTIME and NEXPTIME, which are DTIME(2^{n^{O(1)}}) and NTIME(2^{n^{O(1)}}), respectively.) However, no hardness results for the symmetric WFOMC were known to date.

What makes the data complexity of the symmetric WFOMC difficult to analyze is the fact that the input is a single number n. Valiant already observed in [34] that such problems are probably not candidates for being #P-complete. Instead, he defined the complexity class #P_1 to be the set of counting problems for NP computations over a single-letter input alphabet. Very few hardness results are known for this class: we are aware only of a graph matching problem proven hard by Valiant, and of a language-theoretic problem by Bertoni and Goldwurm [1].

Our data complexity results are the following. First, we establish the existence of an FO^3 sentence for which the data complexity of the FOMC problem is #P_1-hard, and the existence of a conjunctive query for which the data complexity of the WFOMC problem is #P_1-hard. Second, we prove that every γ-acyclic conjunctive query without self-joins is in polynomial time, extending the result in [37] from FO^2 to γ-acyclic conjunctive queries. We now give more details about our results and explain their significance.

The tractability for FO^2 [37] raises a natural question: do other restrictions of FO, like FO^k for k ≥ 3, also have polynomial data complexity? By carefully analyzing the details of the construction of our #P_1-hard sentence, we prove that it is actually in FO^3. This implies a sharp boundary in the FO^k hierarchy where symmetric WFOMC transitions from tractable to intractable: between k = 2 and k = 3. The tractability of γ-acyclic queries raises another question: could all conjunctive queries be tractable for symmetric WFOMC? We answer this also in the negative: we prove that there exists a conjunctive query for which the symmetric WFOMC problem is #P_1-hard. It is interesting to note that the decision problem associated to WFOMC, namely: given n, does the query have a model over a domain of size n?, is trivial for conjunctive queries, since every conjunctive query has a model over any domain of size n ≥ 1. Therefore, our #P_1-hardness result for WFOMC is an instance where the decision problem is easy while the corresponding weighted counting problem is hard. We note that, unlike for WFOMC, we do not know the exact complexity of the unweighted FOMC problem for conjunctive queries.

0-1 Laws

Our data complexity hardness result sheds some interesting light on 0-1 laws. Recall that, if C is a class of finite structures and P is a property of these structures, then μ_n(P) denotes the fraction of labeled structures in C over a domain of size n that satisfy the property P [27]. (The attribute labeled means that isomorphic structures are counted as distinct; 0-1 laws for unlabeled structures also exist, but in this paper we discuss labeled structures only.) A logic has a 0-1 law over the class of structures C if, for any property P expressible in that logic, lim_{n→∞} μ_n(P) is either 0 or 1. Fagin [13] proved a 0-1 law for First-Order logic and the class of all structures, by using an elegant transfer theorem: there exists a unique countable structure R, which is characterized by an infinite set of extension axioms. He proved that, for every extension axiom σ, lim_{n→∞} μ_n(σ) = 1, and this implies lim_{n→∞} μ_n(Φ) = 1 if Φ is true in R, and lim_{n→∞} μ_n(Φ) = 0 if Φ is false in R. Compton [3] proved 0-1 laws for several classes of structures C. A natural question to ask is the following: does there exist an elementary proof of the 0-1 laws, obtained by computing a closed formula for every μ_n(Φ) and then using elementary calculus to prove that μ_n(Φ) converges to 0 or 1? For example, if Φ = ∃x∃y R(x, y), then μ_n(Φ) = (2^{n^2} − 1)/2^{n^2} and lim_{n→∞} μ_n(Φ) = 1; can we repeat this argument for every Φ? On a historical note, Fagin confirms in personal communication that he originally tried to prove the 0-1 law by trying to find such a closed formula, which failed as an approach. Our #P_1-hardness result for FO^3 shows that no such elementary proof is possible, because no closed formula for μ_n(Φ) can be computed in general (unless #P_1 is in PTIME).

Results on the Combined Complexity

Our main result on the combined complexity is the following. We show that, for any k ≥ 2, the combined complexity of FOMC for FO^k is #P-complete; membership is a standard application of Scott's reduction, while hardness is by reduction from the model counting problem for Boolean formulas. Recall that the vocabulary is always assumed to be fixed: if it were allowed to be part of the input, then every Boolean formula would be a special case of an FO formula, obtained by creating a new relational symbol of arity zero for each Boolean variable, and all hardness results for Boolean formulas would carry over immediately to FO.

The Associated Decision Problem

We also discuss and present some new results on the decision problem associated with (W)FOMC: "given Φ and n, does Φ have a model over a domain of size n?". The data complexity variant is, of course, the spectrum membership problem, which has been completely solved by Jones and Selman [23], who proved that the class of spectra coincides with NETIME. Their result assumes that the input n is represented in binary, thus the input size is log n. In this paper we are interested in the unary representation of n, as 1^n, also called the tally notation, in which case NETIME naturally identifies with NP; Fagin proved the corresponding characterization in the tally notation [12, Theorem 6, Part 2].

For the decision problem, our result is for the combined complexity: given both Φ and n, does Φ have a model over a domain of size n? We prove that this problem is NP-complete for FO^2 and PSPACE-complete for FO. The first of these results has an interesting connection to the finite satisfiability problem for FO^2, which we discuss here. Recall the classical satisfiability problem in finite model theory: "given a formula Φ, does it have a finite model?", which is equivalent to checking spec(Φ) ≠ ∅. Grädel, Kolaitis and Vardi [17] proved the following two results for FO^2: if a formula Φ is satisfiable then it has a finite model of size at most exponential in the size of Φ, and deciding whether Φ is satisfiable is NEXPTIME-complete in the size of Φ. These two results already show that the combined complexity of deciding membership in the spectrum cannot be in polynomial time (unless NEXPTIME = EXPTIME): otherwise, we could check satisfiability in EXPTIME by iterating n from 1 to a number exponential in the size of Φ, checking at each step whether Φ has a model of size n. Our result settles the combined complexity, proving that it is NP-complete.

The paper is organized as follows: we introduce the basic definitions in Section 2, present our results for the data complexity of the FOMC and WFOMC problems in Section 3, present all results on the combined complexity in Section 4, then conclude in Section 5.

2 Background

We review here briefly the main concepts, some already introduced in Section 1.

Weighted Model Counting (WMC)

The Model Counting problem is: given a Boolean formula F, compute the number #F of its satisfying assignments. In Weighted Model Counting we are given two real functions w, w̄ associating two weights to each Boolean variable X of F. The weighted model count is defined as:

WMC(F, w, w̄) = Σ_{θ : θ(F) = true} weight(θ)     (2)

where, for every assignment θ:

weight(θ) = ∏_{X : θ(X) = true} w(X) · ∏_{X : θ(X) = false} w̄(X)     (3)

The model count is the special case #F = WMC(F, 1, 1).

The standard definition of WMC in the literature does not mention w̄ but instead sets w̄ = 1; as we will see, our extension is non-essential. When w̄ = 1 we simply drop it from the notation and write WMC(F, w) instead of WMC(F, w, 1). In the probability computation problem, each variable X is set to true independently with some known probability p(X), and we want to compute P(F), the probability that F is true. All these variations are equivalent, because of the following identities:

WMC(F, w, w̄) = (∏_X w̄(X)) · WMC(F, w/w̄),   P(F) = WMC(F, p, 1 − p),   WMC(F, w, w̄) = (∏_X (w(X) + w̄(X))) · P(F) for p = w/(w + w̄)     (4)

Throughout the paper we write 1 for the constant function with value 1, and w/w̄ and w + w̄ for the functions w(X)/w̄(X) and w(X) + w̄(X), respectively.
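
A small Python sketch (ours) that checks these identities on a toy Boolean formula, with arbitrarily chosen weights:

    from itertools import product

    VARS = ["X", "Y", "Z"]
    F = lambda a: (a["X"] or a["Y"]) and (not a["Z"] or a["Y"])   # a toy formula

    def wmc(w, wbar):
        # Sum the weights of all satisfying assignments of F.
        total = 0.0
        for bits in product([True, False], repeat=len(VARS)):
            a = dict(zip(VARS, bits))
            if F(a):
                wt = 1.0
                for v in VARS:
                    wt *= w[v] if a[v] else wbar[v]
                total += wt
        return total

    w    = {"X": 2.0, "Y": 0.5, "Z": 3.0}
    wbar = {"X": 1.5, "Y": 4.0, "Z": 1.0}
    one  = {v: 1.0 for v in VARS}

    # WMC(F, w, wbar) = (prod_X wbar(X)) * WMC(F, w/wbar)
    lhs = wmc(w, wbar)
    prod_wbar = 1.0
    for v in VARS:
        prod_wbar *= wbar[v]
    assert abs(lhs - prod_wbar * wmc({v: w[v] / wbar[v] for v in VARS}, one)) < 1e-9

    # WMC(F, w, wbar) = (prod_X (w(X)+wbar(X))) * P(F), with p(X) = w(X)/(w(X)+wbar(X))
    p = {v: w[v] / (w[v] + wbar[v]) for v in VARS}
    prob = wmc(p, {v: 1 - p[v] for v in VARS})
    prod_sum = 1.0
    for v in VARS:
        prod_sum *= w[v] + wbar[v]
    assert abs(lhs - prod_sum * prob) < 1e-9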

Weighted First-Order Model Counting (WFOMC)

Consider FO formulas over a fixed relational vocabulary σ = {R_1, …, R_k} and the equality predicate =. Given a domain size n, denote by Tup(n) the set of ground tuples (i.e., ground atoms without equality) over the domain {1, …, n}; thus |Tup(n)| = Σ_i n^{arity(R_i)}. The lineage of an FO sentence Φ, denoted F_{Φ,n}, refers to a Boolean function over Tup(n) (a ground FO sentence), as well as to the corresponding Boolean function over propositional variables referring to ground tuples (a propositional sentence). It is defined inductively by: F_{t,n} = t for ground tuples t; F_{a=b,n} = true if a = b and false otherwise; F_{¬Φ,n} = ¬F_{Φ,n}; F_{Φ1∧Φ2,n} = F_{Φ1,n} ∧ F_{Φ2,n} and F_{Φ1∨Φ2,n} = F_{Φ1,n} ∨ F_{Φ2,n}; F_{∀x Φ,n} = ⋀_{a=1..n} F_{Φ[a/x],n} and F_{∃x Φ,n} = ⋁_{a=1..n} F_{Φ[a/x],n}. For any fixed sentence Φ, the size of its lineage is polynomial in n. Given a domain size n and weight functions w, w̄ on the ground tuples, the Weighted First-Order Model Count of Φ is WFOMC(Φ, n, w, w̄) = WMC(F_{Φ,n}, w, w̄).
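
As an illustration of the lineage (our sketch, not the paper's notation), the following Python code evaluates the lineage of a sentence, viewed as a Boolean function over the ground tuples, by expanding quantifiers over the domain, and checks that its model count agrees with the closed form for ∀x∃y R(x, y):

    from itertools import product

    # Formulas as nested tuples:
    #   ("atom", "R", ("x", "y")), ("eq", "x", "y"), ("not", f), ("and", f, g),
    #   ("or", f, g), ("forall", "x", f), ("exists", "x", f)
    def eval_lineage(phi, world, env, domain):
        # world: set of ground tuples like ("R", (a, b)); env: variable -> constant
        op = phi[0]
        if op == "atom":
            _, rel, args = phi
            return (rel, tuple(env[v] for v in args)) in world
        if op == "eq":
            return env[phi[1]] == env[phi[2]]
        if op == "not":
            return not eval_lineage(phi[1], world, env, domain)
        if op == "and":
            return eval_lineage(phi[1], world, env, domain) and eval_lineage(phi[2], world, env, domain)
        if op == "or":
            return eval_lineage(phi[1], world, env, domain) or eval_lineage(phi[2], world, env, domain)
        if op == "forall":
            return all(eval_lineage(phi[2], world, {**env, phi[1]: a}, domain) for a in domain)
        if op == "exists":
            return any(eval_lineage(phi[2], world, {**env, phi[1]: a}, domain) for a in domain)
        raise ValueError(op)

    def fomc(phi, n):
        # Vocabulary with a single binary relation R; count satisfying worlds.
        domain = range(n)
        tuples = [("R", (a, b)) for a in domain for b in domain]
        count = 0
        for bits in product([False, True], repeat=len(tuples)):
            world = {tuples[i] for i, v in enumerate(bits) if v}
            if eval_lineage(phi, world, {}, domain):
                count += 1
        return count

    phi = ("forall", "x", ("exists", "y", ("atom", "R", ("x", "y"))))
    for n in (1, 2, 3):
        assert fomc(phi, n) == (2**n - 1)**n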

Symmetric WFOMC

In the symmetric WFOMC, the weight of a tuple depends only on the relation name and not on the domain constants. We call a weighted vocabulary a triple (σ, w, w̄), where σ is a relational vocabulary and w, w̄ : σ → ℝ give the weights (real numbers) of the relational symbols. For any domain size n, we extend these weights to ground tuples by setting w(R(a)) = w(R) and w̄(R(a)) = w̄(R), and we define WFOMC(Φ, n, w, w̄) as above. Throughout this paper we assume that WFOMC refers to the symmetric variant, unless otherwise stated.

For a simple illustration, consider the sentence Φ = ∃x∃y R(x, y). Then WFOMC(Φ, n, w, w̄) = (w(R) + w̄(R))^{n^2} − w̄(R)^{n^2}, because the sum of the weights of all possible worlds is (w(R) + w̄(R))^{n^2}, and we have to subtract the weight of the single world where R is empty. For another example, consider Φ' = ∀x∃y R(x, y). The reader may check that WFOMC(Φ', n, w, w̄) = ((w(R) + w̄(R))^n − w̄(R)^n)^n. In particular, over a domain of size n, the formula Φ' has (2^n − 1)^n models (obtained by setting w(R) = w̄(R) = 1).
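
A quick numeric check of the two example counts (our sketch), by direct enumeration with the symmetric weights w(R) = 2 and w̄(R) = 3:

    from itertools import product

    def wfomc_R(n, sat, w=2, wbar=3):
        # Symmetric WFOMC over one binary relation R: sum the weights of all
        # worlds (subsets of the n*n ground tuples) that satisfy `sat`.
        cells = [(a, b) for a in range(n) for b in range(n)]
        total = 0
        for bits in product([False, True], repeat=len(cells)):
            R = {cells[i] for i, v in enumerate(bits) if v}
            if sat(R, n):
                total += w**len(R) * wbar**(n*n - len(R))
        return total

    exists_xy = lambda R, n: len(R) > 0
    forall_exists = lambda R, n: all(any((a, b) in R for b in range(n)) for a in range(n))

    for n in (1, 2, 3):
        assert wfomc_R(n, exists_xy) == (2 + 3)**(n*n) - 3**(n*n)
        assert wfomc_R(n, forall_exists) == ((2 + 3)**n - 3**n)**n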

Problem | Weights for R, S, and T tuples | Solution for WFOMC(Φ, n)
Symmetric FOMC | all weights equal to 1 | Σ_{k,m=0..n} C(n,k) C(n,m) (2^{n^2} − 2^{n^2−km})
Symmetric WFOMC | w(R), w̄(R), w(S), w̄(S), w(T), w̄(T) | Σ_{k,m=0..n} C(n,k) C(n,m) w(R)^k w̄(R)^{n−k} w(T)^m w̄(T)^{n−m} (W_S^{n^2} − w̄(S)^{km} W_S^{n^2−km}), where W_S = w(S) + w̄(S)
Asymmetric WFOMC | weights depend on the domain constants | #P-hard for Φ [4]
Table 1: Three variants of WFOMC, of increasing generality, illustrated on the sentence Φ = ∃x∃y (R(x) ∧ S(x, y) ∧ T(y)). This paper discusses the symmetric cases only.
Data Complexity and Combined Complexity

We consider the weighted vocabulary (σ, w, w̄) to be fixed. In the data complexity, we also fix the sentence Φ and study the complexity of the problem: given n, compute WFOMC(Φ, n). In the combined complexity, we study the complexity of the problem: given Φ and n, compute WFOMC(Φ, n). All our upper bounds continue to hold if the weights are part of the input. We also consider the data and combined complexity of the associated decision problem (where we ignore the weights): given Φ and n, does Φ have a model over a domain of size n?

Weights and Probabilities

While in practical applications the weights are positive real numbers and the probabilities are numbers in [0, 1], in this paper we impose no restrictions on the values of the weights and probabilities. The definition (2) of WMC applies equally well to negative weights and, in fact, to any semiring structure for the weights [25]. There is, in fact, at least one application of negative probabilities [22], namely the particular reduction from MLNs to WFOMC described in Section 1: a newly introduced relation has weight w − 1, which is negative when w < 1, and the associated probability (w − 1)/w is then negative as well.

As a final comment on negative weights, we note that the complexity of the symmetric WFOMC problem is the same for arbitrary weights as for positive weights. Indeed, the expression WFOMC(Φ, n, w, w̄) is a multivariate polynomial in the variables w(R_1), w̄(R_1), …, w(R_k), w̄(R_k), where each variable has degree at most n^{arity(R_i)}. The polynomial has real coefficients. Given access to an oracle computing this polynomial for arbitrary positive values of the weights, we can compute in polynomial time all of its coefficients, using as many calls to the oracle as there are coefficients; once we know the coefficients, we can evaluate the polynomial at any values, positive or negative.
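
To illustrate the interpolation argument (our sketch, on a made-up sentence), take Φ = ∃x R(x) over a unary relation R with w̄(R) = 1, so that WFOMC(Φ, n) = (w + 1)^n − 1 is a degree-n polynomial in w = w(R). The sketch queries an "oracle" at n + 1 positive points, recovers the coefficients exactly with rational arithmetic, and then evaluates the polynomial at a negative weight:

    from fractions import Fraction

    def oracle(n, w):
        # Symmetric WFOMC of "exists x R(x)" with w(R) = w, wbar(R) = 1.
        return (w + 1)**n - 1

    def recover_coefficients(n):
        # Solve the (n+1) x (n+1) Vandermonde system at points w = 1, ..., n+1
        # by Gauss-Jordan elimination over the rationals.
        pts = [Fraction(i) for i in range(1, n + 2)]
        A = [[x**j for j in range(n + 1)] + [Fraction(oracle(n, x))] for x in pts]
        for col in range(n + 1):
            piv = next(r for r in range(col, n + 1) if A[r][col] != 0)
            A[col], A[piv] = A[piv], A[col]
            A[col] = [v / A[col][col] for v in A[col]]
            for r in range(n + 1):
                if r != col and A[r][col] != 0:
                    factor = A[r][col]
                    A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
        return [row[-1] for row in A]   # coefficients c_0, ..., c_n

    n = 4
    coeffs = recover_coefficients(n)
    value_at_minus_2 = sum(c * Fraction(-2)**j for j, c in enumerate(coeffs))
    assert value_at_minus_2 == (-2 + 1)**n - 1    # matches the closed form at w = -2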

For all upper bounds in this paper we assume that the weights w(R), w̄(R), or the probabilities p(R), are given as rational numbers represented as fractions of two integers of ℓ bits each. We assume w.l.o.g. that all fractions have the same denominator: this can be enforced by replacing the denominators with their least common multiple, at the cost of increasing the number of bits of each integer by at most a polynomial factor. It follows that the weight of a world (Eq. (3)) and WFOMC(Φ, n) can be represented as ratios of two integers, each with a number of bits polynomial in n and ℓ.

Summary

Table 1 summarizes the taxonomy and illustrates the various weighted model counting problems considered in this paper. Throughout the rest of the paper, FOMC and WFOMC refer to the symmetric variant, unless otherwise mentioned.

3 Data Complexity

Recall that the language FO^k consists of the FO formulas that use at most k distinct logical variables.

3.1 Lower Bounds

Our first lower bound is for an FO^3 sentence:

Theorem 3.1.

There exists an FO^3 sentence s.t. the FOMC problem for it is #P_1-complete.

Van den Broeck et al. [37] have shown that the symmetric WFOMC problem for every FO^2 formula has polynomial time data complexity (the proof is reviewed in Appendix C); Theorem 3.1 shows that, unless #P_1 is in PTIME, this result cannot extend to FO^k for k ≥ 3.

Our second lower bound is for a conjunctive query or, dually, a positive clause without equality. Recall that a clause is a universally quantified disjunction of literals, for example ∀x∀y (¬R(x, y) ∨ S(y)). A positive clause is a clause in which all relational atoms are positive. A conjunctive query (CQ) is an existentially quantified conjunction of positive literals, e.g. ∃x∃y (R(x, y) ∧ S(y)). Positive clauses without the equality predicate are the duals of CQs, and therefore the WFOMC problem is essentially the same for positive clauses without equality as for CQs. Note that the dual of a clause with the equality predicate is a CQ with ≠; e.g., the dual of ∀x∀y (x = y ∨ ¬R(x, y)) is ∃x∃y (x ≠ y ∧ R(x, y)).

Corollary 3.1.

There exists a positive clause without equality s.t. the symmetric WFOMC problem for it is #P_1-hard. Dually, there exists a CQ s.t. the symmetric WFOMC problem for it is #P_1-hard.

Corollary 3.1 shows that the tractability result for γ-acyclic conjunctive queries (discussed below, Theorem 3.2) cannot be extended to all CQs. The proof of the Corollary follows easily from three lemmas, which are of independent interest, and which we present here; the proofs of the lemmas are in the appendix. We say that a vocabulary σ' extends σ if σ ⊆ σ', and that a weighted vocabulary (σ', w', w̄') extends (σ, w, w̄) if σ' extends σ and the weight functions w', w̄' agree with w, w̄ on σ.

Lemma.

Let (σ, w, w̄) be a weighted vocabulary and Φ an FO sentence over σ. Then there exist an extended weighted vocabulary (σ', w', w̄') and a sentence Φ' over σ' such that Φ' is in prenex-normal form with a quantifier prefix consisting only of universal quantifiers, and WFOMC(Φ, n, w, w̄) = WFOMC(Φ', n, w', w̄') for all n.

This lemma was proven in [37], and says that all existential quantifiers can be eliminated. The main idea is to replace a sentence of the form ∀x ∃y ψ(x, y) by ∀x ∀y (Z(x) ∨ ¬ψ(x, y)), where Z is a new relational symbol whose arity is the number of variables in x, with weights w(Z) = 1 and w̄(Z) = −1. For every value a, in a world where ∃y ψ(a, y) holds, Z(a) is forced to hold too and the new symbol contributes a factor 1 to the weight; in a world where ∃y ψ(a, y) does not hold, Z(a) may be true or false, and the weights of the two worlds cancel each other out.
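
A brute-force sanity check of this construction (our sketch; the weights 2 and 3 for R are chosen arbitrarily), comparing WFOMC(∀x∃y R(x, y)) with WFOMC(∀x∀y (Z(x) ∨ ¬R(x, y))) where w(Z) = 1 and w̄(Z) = −1:

    from itertools import product

    def relations(n):
        cells = [(a, b) for a in range(n) for b in range(n)]
        for bits in product([False, True], repeat=len(cells)):
            yield {cells[i] for i, v in enumerate(bits) if v}

    def wfomc_original(n, wR=2, wRb=3):
        # WFOMC of  forall x exists y R(x,y)
        return sum(wR**len(R) * wRb**(n*n - len(R))
                   for R in relations(n)
                   if all(any((a, b) in R for b in range(n)) for a in range(n)))

    def wfomc_skolemized(n, wR=2, wRb=3):
        # WFOMC of  forall x forall y (Z(x) or not R(x,y)),  w(Z)=1, wbar(Z)=-1
        total = 0
        for R in relations(n):
            for zbits in product([False, True], repeat=n):
                if all(zbits[a] or (a, b) not in R for a in range(n) for b in range(n)):
                    wz = (-1)**sum(1 for a in range(n) if not zbits[a])
                    total += wz * wR**len(R) * wRb**(n*n - len(R))
        return total

    for n in (1, 2, 3):
        assert wfomc_original(n) == wfomc_skolemized(n)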

Note that the lemma tells us nothing about the (unweighted) model counts of Φ and Φ', since for Φ' we are forced to set some negative weights. If we had FOMC(Φ, n) = FOMC(Φ', n), then we could reduce the satisfiability problem for an arbitrary FO sentence to that for a universally quantified sentence, which is impossible, since the former is undecidable while the latter is decidable.

The next lemma, also following the proof in [37], says that all negations can be eliminated.

Lemma.

Let (σ, w, w̄) be a weighted vocabulary and Φ a sentence over σ in prenex-normal form with a universal quantifier prefix. Then there exist an extended weighted vocabulary (σ', w', w̄') and a positive FO sentence Φ' over σ', also in prenex-normal form with a universal quantifier prefix, s.t. WFOMC(Φ, n, w, w̄) = WFOMC(Φ', n, w', w̄') for all n.

The idea is to create two new relational symbols for every negated subformula ¬ψ(x), replace ¬ψ(x) by an atom over the first new symbol, and add a positive sentence that forces this atom to be true whenever ψ(x) is false. By choosing the weights of the two new symbols appropriately (one of the four weights being −1), we ensure that, for every tuple of constants a, either ψ(a) is false, in which case the new atom is forced to be true and the two new symbols contribute a factor 1 to the weight, or ψ(a) is true, in which case the spurious worlds where the new atom is nevertheless true cancel each other out.
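
Since the exact clauses and weights are not fully spelled out here, the following sketch (ours) verifies one concrete instantiation that is consistent with the cancellation argument above, not necessarily the paper's exact construction: for Φ = ∀x (¬R(x) ∨ S(x)), replace ¬R(x) by P(x) and add the clauses ∀x (R(x) ∨ P(x)), ∀x (R(x) ∨ N(x)), ∀x (P(x) ∨ N(x)), with w(P) = w̄(P) = w(N) = 1 and w̄(N) = −1.

    from itertools import product

    def wfomc_orig(n, wR=2, wRb=3, wS=5, wSb=7):
        # WFOMC of  forall x (not R(x) or S(x))  over unary relations R, S.
        total = 0
        for rbits in product([0, 1], repeat=n):
            for sbits in product([0, 1], repeat=n):
                if all((not rbits[a]) or sbits[a] for a in range(n)):
                    w = 1
                    for a in range(n):
                        w *= wR if rbits[a] else wRb
                        w *= wS if sbits[a] else wSb
                    total += w
        return total

    def wfomc_positive(n, wR=2, wRb=3, wS=5, wSb=7):
        # Negation-free rewriting with fresh unary P, N and the three extra clauses.
        total = 0
        for rbits, sbits, pbits, nbits in product(product([0, 1], repeat=n), repeat=4):
            rewritten = all(pbits[a] or sbits[a] for a in range(n))
            clauses = all((rbits[a] or pbits[a]) and (rbits[a] or nbits[a])
                          and (pbits[a] or nbits[a]) for a in range(n))
            if rewritten and clauses:
                w = 1
                for a in range(n):
                    w *= wR if rbits[a] else wRb
                    w *= wS if sbits[a] else wSb
                    # w(P) = wbar(P) = 1 contributes nothing; N contributes 1 or -1.
                    w *= 1 if nbits[a] else -1
                total += w
        return total

    for n in (1, 2, 3):
        assert wfomc_orig(n) == wfomc_positive(n)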

Finally, we remove the equality predicate.

Lemma.

Let (σ, w, w̄) be a weighted vocabulary and Φ a sentence over σ. Then there exist an extended weighted vocabulary (σ', w', w̄') and a sentence Φ' over σ' without the equality predicate, such that, for all n, WFOMC(Φ, n, w, w̄) can be computed in polynomial time using O(n^2) calls to an oracle for WFOMC(Φ', n, w', w̄'), where (σ', w', w̄') extends (σ, w, w̄).

The idea is to introduce a new binary relational symbol E, replace every atom x = y with E(x, y), and add the sentence ∀x E(x, x). Give E the weights w'(E) = z (a variable) and w̄'(E) = 1. Then WFOMC(Φ', n) is a polynomial in z of degree at most n^2, in which every monomial has degree at least n, because the hard constraint forces all n tuples E(a, a) to be present. Moreover, the coefficient of z^n is precisely WFOMC(Φ, n), because it corresponds to the worlds where E is exactly the identity relation on the domain. We compute this coefficient using O(n^2) calls to an oracle for WFOMC(Φ', n).
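
A small numeric check of this argument (our sketch, on a made-up sentence): take Φ = ∀x∀y (x = y ∨ R(x, y)), rewrite it as Φ' = ∀x E(x, x) ∧ ∀x∀y (E(x, y) ∨ R(x, y)), give E the symbolic weight z (with w̄(E) = 1), and verify that the coefficient of z^n in WFOMC(Φ', n) equals WFOMC(Φ, n) computed with true equality.

    from itertools import product

    def wfomc_with_equality(n, wR=2, wRb=3):
        # WFOMC of  forall x forall y (x = y or R(x,y)),  "=" interpreted as equality.
        cells = [(a, b) for a in range(n) for b in range(n)]
        total = 0
        for bits in product([False, True], repeat=len(cells)):
            R = {cells[i] for i, v in enumerate(bits) if v}
            if all(a == b or (a, b) in R for a, b in cells):
                total += wR**len(R) * wRb**(len(cells) - len(R))
        return total

    def coeff_of_z_to_n(n, wR=2, wRb=3):
        # Rewrite "=" as a fresh relation E with w(E) = z, wbar(E) = 1, add forall x E(x,x),
        # and collect WFOMC(Phi', n) as a polynomial in z: coeff[d] multiplies z^d.
        cells = [(a, b) for a in range(n) for b in range(n)]
        coeff = [0] * (len(cells) + 1)
        for rbits in product([False, True], repeat=len(cells)):
            R = {cells[i] for i, v in enumerate(rbits) if v}
            wr = wR**len(R) * wRb**(len(cells) - len(R))
            for ebits in product([False, True], repeat=len(cells)):
                E = {cells[i] for i, v in enumerate(ebits) if v}
                if all((a, a) in E for a in range(n)) and \
                   all((a, b) in E or (a, b) in R for a, b in cells):
                    coeff[len(E)] += wr
        return coeff[n]

    for n in (1, 2):
        assert wfomc_with_equality(n) == coeff_of_z_to_n(n)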

Now we give the proof of Corollary 3.1. Starting with the #P_1-complete FO^3 sentence of Theorem 3.1, we apply the three lemmas and obtain a positive, universally quantified sentence Φ without the equality predicate that is #P_1-hard. We write it as a conjunction of clauses, Φ = C_1 ∧ ⋯ ∧ C_k (recall that a clause is universally quantified), and then apply the inclusion-exclusion formula: P(C_1 ∧ ⋯ ∧ C_k) = Σ_{∅ ≠ S ⊆ [k]} (−1)^{|S|+1} P(⋁_{i∈S} C_i). Since any disjunction of clauses is equivalent to a single clause, we have reduced the computation problem to computing the probabilities of clauses. By duality, this reduces to computing the probabilities of conjunctive queries, P(Q_1), …, P(Q_m). We can reduce this problem to that of computing the probability of a single conjunctive query Q, by the following argument. Create m copies of the relational symbols in the vocabulary and let Q be the conjunction of all m queries, where each query uses a fresh copy of the vocabulary. Then P(Q) = P(Q_1) ⋯ P(Q_m), because every two distinct queries now have distinct relational symbols. Using an oracle computing the probability of Q, we can compute any P(Q_j) by setting to 1 the probabilities of all relations occurring in Q_i, for every i ≠ j: in other words, the only possible world for a relation occurring in Q_i is the one where that relation is the full cartesian product of the domain; assuming n ≥ 1, Q_i is then true, P(Q_i) = 1, and hence P(Q) = P(Q_j). We repeat this for every j and compute P(Q_1), …, P(Q_m). This proves that the CQ Q is #P_1-hard; its dual is a #P_1-hard positive clause without equality. This proves Corollary 3.1.

3.2 Upper Bounds

A CQ is without self-joins if its atoms refer to distinct relational symbols. It is standard to associate a hypergraph with a CQ, where the variables are the nodes and the atoms are the hyperedges. We define a γ-acyclic conjunctive query to be a conjunctive query without self-joins whose associated hypergraph is γ-acyclic. We prove:

Theorem 3.2.

The data complexity of symmetric WFOMC for γ-acyclic CQs is in PTIME.

Fagin's definition of γ-acyclic hypergraphs [14] is reviewed in the proof of Theorem 3.2.

An open problem is to characterize the conjunctive queries without self-joins that are in polynomial time. While no such query has yet been proven to be hard (the query in Corollary 3.1 has self-joins), it is widely believed that, for every k ≥ 3, the symmetric WFOMC problem for the typed cycle of length k, C_k = ∃x_1 ⋯ ∃x_k (R_1(x_1, x_2) ∧ R_2(x_2, x_3) ∧ ⋯ ∧ R_k(x_k, x_1)), is hard.

Figure 1: A summary of data complexity results for conjunctive queries (or positive clauses). C_k-hardness is an informal concept described in the main text.

We discuss here several insights into finding the tractability border for conjunctive queries, summarized in Figure 1.

This boundary does not lie exactly at γ-acyclicity: there is a γ-cyclic query (see Fagin [14]) that nevertheless has PTIME data complexity. The key observation is that its γ-cycle allows one variable to appear in all atoms, turning it into a separator variable [5]; by symmetry, after processing the separator variable the query becomes isomorphic to the query of Table 1 and can be computed in polynomial time. A weaker notion of acyclicity, called jtdb (for join tree with disjoint branches), can be found in [10]. It also does not characterize the tractability boundary: jtdb contains the γ-cyclic query above, but it excludes other queries with PTIME data complexity.

Fagin [14] defines two increasingly weaker notions of acyclicity: β- and α-acyclicity. α-Acyclic queries are as hard as arbitrary conjunctive queries without self-joins. Indeed, if Q is a conjunctive query without self-joins, then the query Q' = Q ∧ U(x_1, …, x_m) is α-acyclic, where U is a new relational symbol containing all the variables of Q. By setting the probability of U to 1, we have P(Q') = P(Q). Thus, if all α-acyclic queries had PTIME data complexity, then all conjunctive queries without self-joins would have PTIME data complexity.

For all we know, β-acyclic queries could well coincide with the class of tractable conjunctive queries without self-joins. We present here some evidence that all β-cyclic queries are hard, by reduction from the typed cycles C_k. For that, we need to consider a slight generalization of WFOMC for conjunctive queries without self-joins, where each existential variable x_i ranges over a distinct domain of size n_i; the standard semantics corresponds to the special case where all domain sizes are equal. We prove that for any β-cyclic query Q there exists a k ≥ 3 such that this generalized WFOMC problem for C_k can be reduced to that for Q. Hence, the existence of a β-cyclic query with PTIME data complexity would imply PTIME data complexity for at least one C_k (this is what we informally call C_k-hardness in Fig. 1). The reduction is as follows. By definition, a β-cyclic query contains a weak β-cycle [14] of the form S_1, x_1, S_2, x_2, …, S_k, x_k, S_1 with k ≥ 3, where all the S_i and all the x_i are distinct, and each x_i occurs in both S_i and S_{i+1} (indices taken modulo k) but in no other S_j. Then we reduce the WFOMC for C_k to that for Q. First, for each relational symbol S in Q: if S = S_i appears in the cycle, we define w(S) = w(R_i) and w̄(S) = w̄(R_i); otherwise we set the probability of S to 1. Second, for every variable of Q that appears in the cycle we set its domain size to be the same as that of the corresponding variable of C_k; for every other variable we set the domain size to 1. Then C_k and Q have the same WFOMC.

Finally, we discuss a peculiar sentence, whose complexity we left open in [18]:

Theorem 3.3.

The data complexity of the symmetric WFOMC problem is in PTIME for the query studied in [18].

In [18] we showed that this query is in PTIME under a modified semantics, where the relation is required to be a bipartite graph; this means that the range of the x-variables is disjoint from the range of the y-variables. Here we extend the proof to the standard semantics used in this paper. What makes this query interesting is that the algorithm used to compute it requires a subtle use of dynamic programming, and none of the existing lifted inference rules in the literature suffice to compute it. This suggests that we do not yet have a candidate for a complete set of lifted inference rules for symmetric WFOMC.

3.3 Proofs

Proof of Theorem 3.1

We briefly recall the basic notions from Valiant's original papers [33, 34]. A counting Turing machine is a nondeterministic TM with a read-only input tape and a work tape that (magically) prints in binary, on a special output tape, the number of its accepting computations. The class #P_1 consists of all functions computed by some counting TM with polynomial (nondeterministic) running time and a unary input alphabet. A function f is #P_1-hard if, for every function g in #P_1, there exists a polynomial-time deterministic TM with access to an oracle for f that computes g; notice that g's input alphabet is unary. As usual, f is called #P_1-complete if it is both #P_1-hard and in #P_1.

Our proof of Theorem 3.1 consists of two steps. First we construct a #P_1-complete function f, computable by a linear-time counting TM M, which we call a universal #P_1 machine; in fact, we define f by describing M. A similar construction in [34] is sketched too briefly to see how the particular pairing function can work; we use a different pairing function and give full details. To later prove FO^3 membership, we also need to ensure that M runs in (nondeterministic) linear time, which requires some care given that the input is given in unary. Once we have defined f, the second step of the proof is a standard construction of an FO formula that simulates M: we follow Libkin [28, p. 167], but make several changes to ensure that the formula is in FO^3. The two steps are:

Lemma.

There exists a counting TM M with a unary input alphabet such that (i) M runs in linear time, and (ii) the function f that it computes is #P_1-hard.

It follows immediately that f is #P_1-complete.

Lemma.

Let M be any counting TM with a unary input alphabet computing some function g, and suppose M runs in time O(n^k) for some constant k ≥ 1. Then there exists an FO formula Φ over some relational vocabulary σ, using a number of logical variables that depends only on k, s.t. g(n) = FOMC(Φ, n) for all n.

Theorem 3.1 follows by applying this lemma to the machine M of the previous lemma; since M runs in linear time (k = 1), the resulting formula is in FO^3. By allowing runtimes O(n^k) with k > 1, the lemma implies that every #P_1 function is of the form FOMC(Φ, ·) for some FO sentence Φ; this is an extension of the classic result by Jones and Selman [23] which, restated for the tally notation, says that the class of spectra coincides with NP (see [12], [9, Sec. 5]). By considering FOMC over unlabeled structures, the correspondence becomes even stronger: there, all models that are identical up to a permutation of the constants are counted once.

Proof of the first lemma.

The idea for M is simple: its input 1^n is given in unary and encodes two numbers i and m via n = e(i, m), for some encoding function e defined below. M first computes i and m from n, then simulates the i-th counting TM in an enumeration of #P_1 machines on input 1^m. The difficult part is to ensure that M runs in linear time: every TM that it simulates runs in time polynomial in m, with an exponent that depends on i, so if we construct M naively, to simply simulate machine i on input 1^m, its runtime is no longer bounded by a fixed polynomial.

We start by describing an enumeration N_1, N_2, … of counting TMs in #P_1, with the property that the runtime of N_i on an input 1^m is polynomial in m, with an exponent bounded by a function of i alone. We first list all counting TMs over a unary input alphabet in the standard order M_1, M_2, M_3, …. Then we dovetail pairs of the form (M_j, k), where j is an index in the standard TM order and k is a number; the pair (M_j, k) represents the counting TM that simulates M_j on input 1^m with a timer of m^k steps. This machine can be constructed with at most quadratic slowdown over the timer (due to the need to increment the counter). We further ensure that the dovetailing advances j at least as fast as k. It follows that, for every i, the runtime of N_i on input 1^m is polynomial in m, with an exponent that can be bounded in terms of i. It remains to show that the list N_1, N_2, … enumerates precisely the #P_1 functions. Indeed, each function in this list is in #P_1, because the runtime of N_i is polynomial in the length of the input 1^m. Conversely, every function in #P_1 is computed by some N_i in our list, because it is computed by some M_j whose runtime on input 1^m is O(m^c) for some constant c, and some pair representing the same function with a sufficiently large timer appears in the list. This completes the construction of the enumeration N_1, N_2, ….

We now describe the counting machine M. Its input is a number n in unary, representing an encoding of two integers i and m. We choose the encoding function e below so that it satisfies three properties: (a) M can compute i and m from 1^n in linear time (using auxiliary tapes), (b) e(i, m) is at least as large as the runtime of N_i on input 1^m, and (c) for every fixed i, the function m ↦ e(i, m) can be computed in PTIME. We first prove the lemma assuming that e satisfies these three properties.

The counting machine M starts by computing a binary representation of its unary input 1^n on its work tape; this step takes time linear in n. Next, it extracts i and m, in time linear in n (by property (a)), and then simulates N_i on input 1^m. The runtime of the last step is at most linear in n (by property (b)), hence M runs in linear time in its input 1^n. It remains to prove that the function f computed by M is #P_1-hard. Consider any function g in #P_1; we describe a polynomial-time deterministic Turing machine with an oracle for f that computes g. Since g is in #P_1, there exists an i such that g is computed by N_i. On input 1^m, the machine computes n = e(i, m) in PTIME (by property (c)), writes 1^n on the oracle tape, then invokes the oracle and obtains the result g(1^m) = f(1^n).

It remains to describe the encoding function e. We take e(i, m) to be a product of a power of 2 and a power of 3 whose exponents are determined by m and i, respectively. To prove (a), note that m can be read off by counting the trailing zeroes of the binary representation of n, and i can then be read off from a ternary representation, again by counting trailing zeroes. Property (b) follows by a direct calculation, and (c) is straightforward. ∎

Proof of the second lemma.

We describe here the most important steps of the proof and delegate the details to Appendix B. We consider only the case k = 1, i.e. the counting TM runs in linear time; the case k > 1 is handled using a standard technique that encodes time stamps with a relation of higher arity. We briefly review Trakhtenbrot's proof from Libkin [28, p. 167]: for every deterministic TM, there is a procedure that generates a formula such that the TM has an accepting computation starting with an empty input tape iff the formula is satisfiable. The signature for