1 Introduction
During the last forty years, logics over finite structures have become a central pillar for studying the definability and complexity of computational problems. The focus is on understanding how the expressive power of logics over finite structures, or equivalently query languages over relational databases, relates to natural classes of computational complexity. The foundational result in this line of work is Fagin’s famous theorem [9], which states that the existential fragment of second-order logic over finite relational structures captures all decision problems that are accepted by a nondeterministic Turing machine in polynomial time; in other words, it captures the complexity class NP. This was extended by Stockmeyer [25] to an exact correspondence between the quantifier prefix classes of second-order logic and the levels of the polynomial-time hierarchy. Since then, most Turing complexity classes have been characterized in terms of the expressive power of logic languages (see e.g. the monographs by Immerman [16] and Libkin [19] or the collection [12]).
An advantage of second-order logic is that it provides a natural and high level of expressive power. A simple example which illustrates this point is provided by the set-containment join. A query that asks whether a patient has all the symptoms associated with a given disease can be written literally, provided the query language has the appropriate second-order constructs. By contrast, in first-order logic we cannot write this literally. We would need to say instead that for every symptom $x$, if $x$ is a symptom of the disease, then the patient has $x$. Unfortunately, the high expressiveness of second-order logic also yields a high complexity of evaluation of formulae, as shown by the Fagin-Stockmeyer theorems, which in principle makes it unsuitable for practical purposes. Nevertheless, second-order logic has been used in applied areas such as Knowledge Representation [3]. In that area it is usually known as “model expansion for first-order logic”, and SAT solving is used to find the existentially quantified relations. On the other hand, the SAT solvers are usually “helped” by adding explicit syntax for fixed points, both least and greatest. Nested fixed points (simultaneous induction) have also been used for this purpose [15].
Aiming at a better understanding of which features of second-order logic have a real impact on its expressive power and complexity, several semantic and syntactic restrictions have been considered in the literature. Among the syntactic restrictions, the results in [13] should be highlighted. The logics SO-Horn and SO-Krom, obtained by restricting second-order logic to Horn and Krom formulae, respectively, both collapse to their respective existential fragments. Moreover, in finite structures that include the successor relation, they provide characterizations of deterministic polynomial time and nondeterministic logspace, respectively. Also, the tractability/intractability frontier of the model-checking problem for prefix classes of existential second-order logic has been completely delineated (see [7, 8, 11]).
Regarding semantic restrictions of second-order logic, the logic $\mathrm{SO}^{\omega}$ introduced by A. Dawar in [5] and the related logic $\mathrm{SO}^{F}$ introduced in [14] are the source of inspiration for this paper. Both logics restrict the interpretation of second-order quantifiers to relations closed under equivalence of types of the tuples in the given relational structure. In the case of $\mathrm{SO}^{\omega}$, the second-order quantification is restricted to relations closed under equivalence of $\mathrm{FO}^{k}$ types of the tuples, where $\mathrm{FO}^{k}$ is the restriction of first-order logic to formulae with at most $k$ different variables. In $\mathrm{SO}^{F}$ the quantification is restricted to relations closed under equivalence of first-order types of the tuples, i.e., under isomorphic types. It was proven in [5], among other results, that the expressive power of the existential fragment of $\mathrm{SO}^{\omega}$ is equivalent to the expressive power of the nondeterministic inflationary fixed-point logic, and thus that $\mathrm{SO}^{\omega}$ is contained within the infinitary logic with finitely many variables $L^{\omega}_{\infty\omega}$. As shown in [14], $\mathrm{SO}^{F}$ is strictly more expressive than $\mathrm{SO}^{\omega}$. In the absence of a linear order, many natural NP-complete problems such as Hamiltonicity and clique are not expressible, since they are already not expressible in $L^{\omega}_{\infty\omega}$ (see [16] among other sources). On the other hand, it is easy to see that there are NP-complete problems that can be expressed in the existential fragment of $\mathrm{SO}^{\omega}$, since this logic captures NP on ordered structures. Through the study of different semantic restrictions over binary NP, i.e., existential second-order logic with second-order quantification restricted to binary relations, many interesting results regarding the properties of the class of problems expressible in this logic were established [6]. The semantic restrictions were based mainly on second-order quantification restricted to unary functions, order relations and graphs with degree bounds. Based on these restrictions, the authors were able to prove the existence of a strict hierarchy of binary NP problems.
Another relevant example of a semantic restriction over existential second-order logic can be found in [18]. It was shown in that work that context-free languages coincide with the class of those sets of strings that can be defined, on word models, by existential second-order sentences in which the second-order quantifiers range over a restricted class of binary relations called matchings.
So the question comes up: are there additional semantic restrictions of second-order logic that can result in elegant descriptive characterizations of meaningful computational complexity classes? It turns out that the approach of simply restricting second-order quantification to range over relations of size at most polylogarithmic in the size of the structure already leads to a positive answer. Indeed, using this approach we define a restricted second-order logic, namely $\mathrm{SO}^{plog}$, and prove a Fagin-style theorem showing that the Boolean queries which can be expressed in the existential fragment of $\mathrm{SO}^{plog}$ correspond exactly to the class of decision problems that can be computed by a nondeterministic Turing machine with random access to the input in time $O(\log^k n)$ for some $k$, i.e., to the class of problems computable in nondeterministic polylogarithmic time ($\mathrm{NPolylogTime}$ for short). It should be noted that, unlike Fagin’s theorem, which proves that the existential fragment of second-order logic captures NP over arbitrary finite structures, our result only holds over ordered finite structures, since $\mathrm{SO}^{plog}$ is too weak to define a total order of the domain. Nevertheless, $\mathrm{SO}^{plog}$ provides natural levels of expressibility within polylogarithmic space in a way which is closely related to how second-order logic provides natural levels of expressibility within polynomial space. In fact, we show an exact correspondence between the expressive power of the quantifier prefix classes of $\mathrm{SO}^{plog}$ and the levels of the nondeterministic polylogarithmic time hierarchy (polylog-time hierarchy from now on), analogous to the correspondence between the quantifier prefix classes of second-order logic and the polynomial-time hierarchy.
This is, to the best of our knowledge, the first descriptive characterization of $\mathrm{NPolylogTime}$ and of each subsequent level of the polylog-time hierarchy. An anonymous referee of the preliminary conference version of the current paper [10] pointed us, however, to a very relevant antecedent in the work of David A. Mix Barrington [21], where a semantically restricted second-order logic (let us denote it as $\mathrm{SO}^{b}$), related to our logic $\mathrm{SO}^{plog}$, is used to characterize a class of families of constant-depth quasipolynomial-size AND/OR circuits $\mathrm{qAC}^0$. In particular, it is shown there that the class of Boolean queries computable by uniform families of Boolean circuits of unbounded fan-in, size $2^{\log^{O(1)} n}$ and depth $O(1)$ coincides with the class of Boolean queries expressible in $\mathrm{SO}^{b}$. While this would imply that $\mathrm{SO}^{b}$ also captures the whole polylog-time hierarchy (see Section 7 for a detailed explanation), in the case of $\mathrm{SO}^{plog}$ this is an easy corollary of the one-to-one correspondence between its quantifier prefix classes and the levels of the polylog-time hierarchy. As we show in Section 7, this correspondence is very unlikely to hold for the quantifier prefix classes of $\mathrm{SO}^{b}$. It is also very unlikely that the existential fragment of $\mathrm{SO}^{b}$ can provide a descriptive characterization of $\mathrm{NPolylogTime}$, as it appears to be too powerful for that.
We further believe that the natural levels of expressive power provided by $\mathrm{SO}^{plog}$ are not matched by $\mathrm{SO}^{b}$. In this sense, we give examples of natural queries expressible in $\mathrm{SO}^{plog}$, such as the classes of satisfiable propositional formulas in disjunctive normal form and of propositional tautologies in conjunctive normal form, both defined as early as 1971 ([4]). The definition of such queries in $\mathrm{SO}^{plog}$ can be done by means of relatively simple and elegant formulae, despite a restriction we need to impose on the universal first-order quantification. This is not fortuitous, but the consequence of the fact that in the definition of $\mathrm{SO}^{plog}$ we use a more relaxed notion of second-order quantification than that used in the definition of $\mathrm{SO}^{b}$. Indeed, the second-order quantifiers in $\mathrm{SO}^{plog}$ range over arbitrary relations of polylog size in the number of elements of the domain, not just over relations defined on the set formed by the first elements of that domain as in $\mathrm{SO}^{b}$. The descriptive complexity of $\mathrm{SO}^{plog}$ is not increased by this more liberal definition of polylog-restricted second-order quantifiers.
We reach our results by following an inductive itinerary. After presenting some short but necessary preliminaries in Section 2, we introduce the logic $\mathrm{SO}^{plog}$ in Section 3. We do this in a comprehensive way, giving examples of problems expressible in $\mathrm{SO}^{plog}$. The fragments $\Sigma^{plog}_m$ and $\Pi^{plog}_m$ of formulae in quantifier prenex normal form are defined using the classical approach in second-order logic, showing that every $\mathrm{SO}^{plog}$ formula can be written in this normal form. This forms the basis for the definition of the hierarchy inside $\mathrm{SO}^{plog}$.
Section 4 shows how the first level of the hierarchy of quantifier prenex formulae of $\mathrm{SO}^{plog}$ can already define the (polylogarithmically) bounded binary arithmetic necessary to prove our main result, i.e., to prove that the existential fragment of $\mathrm{SO}^{plog}$ captures $\mathrm{NPolylogTime}$. We should stress that it is not immediately obvious that these operations can be expressed in $\Sigma^{plog}_1$, since this logic cannot express all existential second-order properties over relations of polylogarithmic size due to its restricted universal first-order quantification.
In Section 5 we concentrate on complexity classes inside polylogarithmic space. Analogously to the polynomial-time hierarchy inside polynomial space, we define a polylog-time hierarchy $\mathrm{PLH}$, where $\tilde{\Sigma}^{plog}_1$ is defined by capturing all decision problems that can be accepted by a nondeterministic Turing machine in time $O(\log^k n)$ for some $k$, where $n$ is the size of the input. In order to deal with the sublinear time constraint, we assume random access to the input, following the same approach as in [20]. Higher complexity classes $\tilde{\Sigma}^{plog}_m$ (and $\tilde{\Pi}^{plog}_m$) in the hierarchy are defined in a similar way using alternating Turing machines with a bound on the alternations.
Section 6 contains our main results. First we give a detailed, constructive proof of the fact that the existential fragment of $\mathrm{SO}^{plog}$, i.e. $\Sigma^{plog}_1$, captures the complexity class $\mathrm{NPolylogTime}$. After that, we follow the inductive path and establish the expressive power of the fragments $\Sigma^{plog}_m$ and $\Pi^{plog}_m$, for every $m \geq 1$, proving that each layer is characterized by a random-access alternating Turing machine with polylog time and $m$ alternations. The fact that $\mathrm{SO}^{plog}$ captures the whole polylog-time hierarchy follows as a simple corollary.
2 Preliminaries
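Unless otherwise stated, we work with ordered finite structures and assume that all vocabularies include the relation and constant symbols: $\leq$, $\mathrm{SUCC}$, $\mathrm{BIT}$, $0$, $1$, $\mathit{logn}$ and $\mathit{max}$. In every structure $\mathbf{A}$, $\leq$ is interpreted as a total ordering of the domain $A$ and $\mathrm{SUCC}$ is interpreted by the successor relation corresponding to the $\leq$ ordering. The constant symbols $0$, $1$ and $\mathit{max}$ are in turn interpreted as the minimum, second and maximum elements under the $\leq$ ordering and the constant $\mathit{logn}$ as $\lceil \log_2 |A| \rceil$. By passing to an isomorphic copy, we assume that $A$ is the set $\{0, 1, \ldots, n-1\}$ of natural numbers less than $n$, where $n$ is the cardinality of $A$. Then $\mathrm{BIT}$ is interpreted by the following binary relation:
$$\mathrm{BIT}^{\mathbf{A}} = \{(i, j) \in A^2 \mid \text{bit } j \text{ in the binary representation of } i \text{ is } 1\}.$$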
In this paper, always refers to the binary logarithm of , i.e. . We write as a shorthand for and finally as such as . We assume that all structures have at least three elements. This results in a cleaner presentation, avoiding the trivial cases of structures with only one element which would satisfy and structures with only two elements which would unnecessarily complicate the definition of the bounded binary arithmetic operations in Section 4.
3 $\mathrm{SO}^{plog}$: A Restricted Second-Order Logic
We define $\mathrm{SO}^{plog}$ as the restricted second-order logic obtained by extending existential first-order logic with (1) universal and existential second-order quantifiers that are restricted to range over relations of polylogarithmic size in the size of the structure, and (2) universal first-order quantifiers that are restricted to range over the tuples of such polylogarithmic-size relations.
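Definition 1 (Syntax of $\mathrm{SO}^{plog}$).
For every $r \geq 1$ and $k \geq 0$, the language of $\mathrm{SO}^{plog}$ extends the language of first-order logic with countably many second-order variables $X_1^{r,k}, X_2^{r,k}, \ldots$ of arity $r$ and exponent $k$. The set of well-formed formulae (wff) of $\mathrm{SO}^{plog}$ of vocabulary $\sigma$ is inductively defined as follows:

Every wff of vocabulary $\sigma$ in the existential fragment of first-order logic with equality is a wff.

If $X^{r,k}$ is a second-order variable and $t_1, \ldots, t_r$ are first-order terms, then both $X^{r,k}(t_1, \ldots, t_r)$ and $\neg X^{r,k}(t_1, \ldots, t_r)$ are wff’s.

If $\varphi$ and $\psi$ are wff’s, then $(\varphi \wedge \psi)$ and $(\varphi \vee \psi)$ are wff’s.

If $\varphi$ is a wff, $X^{r,k}$ is a second-order variable and $\bar{x}$ is an $r$-tuple of first-order variables, then $\forall \bar{x} (X^{r,k}(\bar{x}) \rightarrow \varphi)$ is a wff.

If $\varphi$ is a wff and $x$ is a first-order variable, then $\exists x \varphi$ is a wff.

If $\varphi$ is a wff and $X^{r,k}$ is a second-order variable, then both $\exists X^{r,k} \varphi$ and $\forall X^{r,k} \varphi$ are wff’s.
Note that the first-order terms $t_i$ in these rules are either first-order variables or constant symbols; we do not consider function symbols. Whenever the arity is clear from the context, we write $X^{k}$ instead of $X^{r,k}$.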
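Definition 2 (Semantics of $\mathrm{SO}^{plog}$).
Let $\mathbf{A}$ be a structure with domain $A$, where $|A| = n$. A valuation over $\mathbf{A}$ is any function val which assigns appropriate values to all first- and second-order variables and satisfies the following constraints:

If $x$ is a first-order variable, then $\mathit{val}(x) \in A$.

If $X^{r,k}$ is a second-order variable, then
$$\mathit{val}(X^{r,k}) \in \{R \subseteq A^r \mid |R| \leq (\lceil \log n \rceil)^k\}.$$
As usual, we say that a valuation $\mathit{val}$ is $V$-equivalent to a valuation $\mathit{val}'$ if $\mathit{val}(V') = \mathit{val}'(V')$ for all variables $V'$ other than $V$.
$\mathrm{SO}^{plog}$ extends the notion of satisfaction of first-order logic with the following rules:

$\mathbf{A}, \mathit{val} \models X^{r,k}(x_1, \ldots, x_r)$ iff $(\mathit{val}(x_1), \ldots, \mathit{val}(x_r)) \in \mathit{val}(X^{r,k})$.

$\mathbf{A}, \mathit{val} \models \neg X^{r,k}(x_1, \ldots, x_r)$ iff $(\mathit{val}(x_1), \ldots, \mathit{val}(x_r)) \notin \mathit{val}(X^{r,k})$.

$\mathbf{A}, \mathit{val} \models \exists X^{r,k} (\varphi)$ iff there is a valuation $\mathit{val}'$ which is $X^{r,k}$-equivalent to $\mathit{val}$ such that $\mathbf{A}, \mathit{val}' \models \varphi$.

$\mathbf{A}, \mathit{val} \models \forall X^{r,k} (\varphi)$ iff, for all valuations $\mathit{val}'$ which are $X^{r,k}$-equivalent to $\mathit{val}$, it holds that $\mathbf{A}, \mathit{val}' \models \varphi$.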
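The size constraint on second-order valuations can be made concrete with a small sketch. The Python code below checks whether a finite relation is an admissible value for a second-order variable of arity r and exponent k, under the assumptions that the domain is {0, ..., n-1} and that the bound is (⌈log₂ n⌉)^k. The function names `ceil_log2`, `polylog_bound` and `is_valid_valuation` are hypothetical and only serve as an illustration.

```python
def ceil_log2(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1, using exact integer arithmetic."""
    return (n - 1).bit_length()


def polylog_bound(n: int, k: int) -> int:
    """Assumed size bound for exponent k: (ceil(log2 n)) ** k."""
    return ceil_log2(n) ** k


def is_valid_valuation(relation, n: int, arity: int, k: int) -> bool:
    """Check that `relation` is a set of `arity`-tuples over {0..n-1}
    with at most polylog_bound(n, k) distinct tuples."""
    tuples = set(relation)
    well_typed = all(
        len(t) == arity and all(0 <= a < n for a in t) for t in tuples
    )
    return well_typed and len(tuples) <= polylog_bound(n, k)
```

With n = 16, for example, a unary variable of exponent 2 may hold up to 16 tuples, whereas exponent 1 allows at most 4.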
Remark 1.
The standard (unbounded) universal quantification of first-order logic formulae of the form $\forall x \varphi$ can be expressed in $\mathrm{SO}^{plog}$ by formulae of the form $\forall X \forall x (X(x) \rightarrow \varphi)$, where $X$ is a unary second-order variable. Thus, even though $\mathrm{SO}^{plog}$ only allows a restricted form of universal first-order quantification, it can nevertheless express every first-order query. This is, however, not applicable to its existential fragment.
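We denote by $\Sigma^{plog}_m$, where $m \geq 1$, the class of $\mathrm{SO}^{plog}$ formulae of the form:
$$\exists X_{11} \cdots \exists X_{1 s_1} \forall X_{21} \cdots \forall X_{2 s_2} \cdots Q X_{m1} \cdots Q X_{m s_m} \psi,$$
where $Q$ is either $\exists$ or $\forall$ depending on whether $m$ is odd or even, respectively, and $\psi$ is an $\mathrm{SO}^{plog}$ formula free of second-order quantifiers. Analogously, we denote by $\Pi^{plog}_m$ the class of $\mathrm{SO}^{plog}$ formulae of the form:
$$\forall X_{11} \cdots \forall X_{1 s_1} \exists X_{21} \cdots \exists X_{2 s_2} \cdots Q X_{m1} \cdots Q X_{m s_m} \psi,$$
where now $Q$ is $\forall$ if $m$ is odd and $\exists$ if $m$ is even.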
We say that an $\mathrm{SO}^{plog}$ formula is in quantifier prefix normal form (QNF) if it belongs to either $\Sigma^{plog}_m$ or $\Pi^{plog}_m$ for some $m \geq 1$.
Lemma 1.
For every $\mathrm{SO}^{plog}$ formula $\varphi$, there is an equivalent formula $\varphi'$ that is in QNF.
Proof.
An easy induction, using renaming of variables and the usual quantifier-shifting equivalences (which apply when the quantified variable is not free in the other subformula), shows that each $\mathrm{SO}^{plog}$ formula is logically equivalent to an $\mathrm{SO}^{plog}$ formula in prenex normal form, i.e., to a formula where all first- and second-order quantifiers are grouped together at the front, forming alternating blocks of consecutive existential or universal quantifiers. Yet the problem is that first- and second-order quantifiers might be mixed. Among the quantifiers of the same block, though, it is clearly possible to commute them so as to get those of second order at the beginning of the block. However, we certainly cannot commute different quantifiers without altering the meaning of the formula. What we can do is replace first-order quantifiers by second-order quantifiers, so that all quantifiers at the beginning of the formula are of second order and are then eventually followed by first-order quantifiers. This can be done using the following equivalences:
∎
Next we present examples of problems which are definable in $\mathrm{SO}^{plog}$. We start with a simple but useful example, and then move to examples which give a better idea of the actual expressive power of $\mathrm{SO}^{plog}$.
Example 1.
Let $X$ and $Y$ be $\mathrm{SO}^{plog}$ variables of the form $X^{r_1,k}$ and $Y^{r_2,k}$. The following formula, denoted as $|X| \leq |Y|$, expresses that the cardinality of (the relation assigned by the current valuation of) $X$ is less than or equal to that of $Y$.
,
where $R$ is an $\mathrm{SO}^{plog}$ variable of arity $r_1 + r_2$ and exponent $k$. In turn, $|X| = |Y|$ can be defined as $|X| \leq |Y| \wedge |Y| \leq |X|$.
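The idea behind this cardinality comparison, namely that one relation is no larger than another exactly when some relation pairs each tuple of the first with a distinct tuple of the second, can be illustrated outside the logic. The following Python sketch assumes that the witness is an injection stored as a set of pairs (x, y); the helper names `witnesses_injection` and `build_witness` are hypothetical, and the sketch is not a transcription of the formula itself.

```python
def witnesses_injection(X, Y, R) -> bool:
    """Return True if every x in X has a partner y in Y with (x, y) in R
    such that no other z in X is paired with the same y."""
    for x in X:
        has_private_partner = any(
            (x, y) in R and all(z == x or (z, y) not in R for z in X)
            for y in Y
        )
        if not has_private_partner:
            return False
    return True


def build_witness(X, Y):
    """Pair the sorted tuples of X with the sorted tuples of Y position by
    position; this yields an injection whenever len(X) <= len(Y)."""
    return set(zip(sorted(X), sorted(Y)))


def card_leq(X, Y) -> bool:
    """Decide |X| <= |Y| by constructing and checking a candidate witness."""
    return witnesses_injection(X, Y, build_witness(X, Y))
```

Equality of cardinalities is then `card_leq(X, Y) and card_leq(Y, X)`, mirroring the conjunction used in the example.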
It is not difficult to see that existential $\mathrm{SO}^{plog}$ can naturally define what we could call polylogarithmically bounded versions of NP-complete problems.
Example 2.
Let $G$ be an $n$-node undirected graph. The following sentence expresses a polylogarithmically bounded version of the NP-complete clique problem. It holds iff $G$ contains a clique of size $(\lceil \log n \rceil)^k$.
Here and are secondorder variables of arity and exponent and holds iff is interpreted with the ary relation . Clearly can be defined in existential as shown in (3) in our next section. Other bounded versions of classical Boolean NPcomplete problems that are easily expressible in are for instance to decide whether has an induced subgraph of size that is colourable, or whether a has an induced subgraph which is isomorphic to another given graph of at most polylog size w.r.t. the size of .
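As an illustration of the bounded clique problem (and not of the sentence above), the following Python sketch decides by brute force whether an n-node graph contains a clique of polylogarithmic size. It assumes nodes are numbered 0 to n-1 and that the target size is (⌈log₂ n⌉)^k; the names `ceil_log2` and `has_polylog_clique` are hypothetical.

```python
from itertools import combinations


def ceil_log2(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1, using exact integer arithmetic."""
    return (n - 1).bit_length()


def has_polylog_clique(n: int, edges, k: int) -> bool:
    """Check whether the undirected graph on nodes 0..n-1 with the given
    edges contains a clique of size ceil(log2 n) ** k (assumed bound)."""
    size = ceil_log2(n) ** k
    adjacent = {frozenset(e) for e in edges}
    # Try every candidate node set of the target size.
    return any(
        all(frozenset(pair) in adjacent for pair in combinations(nodes, 2))
        for nodes in combinations(range(n), size)
    )
```

The number of candidate sets examined is n choose (⌈log₂ n⌉)^k, which is quasipolynomial rather than exponential in n; this is the sense in which the bounded version is easier than the general clique problem.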
We conclude this section with an example of a sentence which expresses the standard version of DNFSAT.
Example 3.
Let DNFSAT denote the class of satisfiable propositional formulas in disjunctive normal form.
In the standard encoding of DNF formulae as word models over a fixed finite alphabet, DNFSAT is decidable in P [4]. In this encoding, the input formula is a disjunction of arbitrarily many clauses enclosed in pairs of matching parentheses. Each clause is the conjunction of an arbitrary number of literals. Each literal is a variable whose subindex is a binary string, possibly preceded by a negation symbol.
Obviously, the complement NODNFSAT of DNFSAT is also in P. In $\mathrm{SO}^{plog}$ we can define NODNFSAT by means of a sentence stating that for every clause there is a pair of complementary literals. Every clause is logically defined by a pair of matching parentheses such that there is no parenthesis in between. A pair of complementary literals is defined by a bijection (of polylogarithmic size) between the subindexes of two literals, which preserves the bit values and is such that exactly one of the literals is negated. The following sentence expresses this formally.
It is fairly easy to see that this formula can be translated into an equivalent formula in . Note that the (unbounded) firstorder universal quantifiers in the first line can be replaced by quantifiers of secondorder with variables of exponent as per Remark 1. The exponent of is as well.
Similarly, DNFSAT can be defined in $\mathrm{SO}^{plog}$ by a sentence stating that there is a clause that does not have a pair of complementary literals.
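The characterization used in this example, namely that a DNF formula is satisfiable iff some clause has no pair of complementary literals, can be sketched directly. The Python code below assumes a simplified representation in which a clause is a list of (variable index, negated) pairs rather than the word-model encoding of the paper; the function names are hypothetical.

```python
def has_complementary_pair(clause) -> bool:
    """Return True if some variable occurs both negated and unnegated."""
    positive = {var for var, negated in clause if not negated}
    negative = {var for var, negated in clause if negated}
    return bool(positive & negative)


def dnf_sat(clauses) -> bool:
    """A DNF formula is satisfiable iff at least one clause is free of
    complementary literals."""
    return any(not has_complementary_pair(clause) for clause in clauses)


def no_dnf_sat(clauses) -> bool:
    """Complement query: every clause contains a complementary pair."""
    return all(has_complementary_pair(clause) for clause in clauses)
```

In the logic, the comparison of two variable indices is what requires the polylogarithmic-size bijection between subindex positions; in this sketch that step is hidden in the equality test on integers.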
4 Bounded Binary Arithmetic Operations in $\Sigma^{plog}_1$
We define formulae that describe the basic (bounded) arithmetic operations of sum, multiplication, division and modulo among binary positive integers between $0$ and $2^{\log^k n} - 1$ for some fixed $k$. These formulae are later needed for proving our main result regarding the expressive power of the existential fragment of $\mathrm{SO}^{plog}$.
Note that it is not immediately obvious that these operations can indeed be expressed in $\Sigma^{plog}_1$, since this logic cannot express all existential second-order properties over relations of polylogarithmic size due to its restricted universal first-order quantification.
In our approach, binary numbers between and are represented by means of () relations.
Definition 3.
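Let $b = b_l \cdots b_0$ be a binary number, where $b_0$ and $b_l$ are the least and most significant bits of $b$, respectively, and $l < \log^k n$. Let $B = \{0, \ldots, \lceil \log n \rceil - 1\}$. The relation $R_b \subseteq B^k \times \{0, 1\}$ encodes the binary number $b$ if the following holds: $(a_0, \ldots, a_{k-1}, a_k) \in R_b$ iff $(a_0, \ldots, a_{k-1})$ is the $i$-th tuple in the lexicographical order of $B^k$, $a_k = 0$ if $b_i = 0$ or $i > l$, and $a_k = 1$ if $b_i = 1$.
Note that the size of $R_b$ is exactly $\log^k n$, and thus $R_b$ is a valid valuation for variables of the form $X^{k+1,k}$. The numerical order relation $\leq_k$ among $k$-tuples can be defined as follows: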
(1) 
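The encoding of Definition 3 can be illustrated with a short sketch. The Python code below assumes that B = {0, ..., ⌈log₂ n⌉ - 1}, that the i-th tuple of B^k in lexicographical order carries bit i of the number (least significant bit first), and that missing high-order bits are padded with 0. The function names `encode` and `decode` are hypothetical.

```python
from itertools import product


def ceil_log2(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1, using exact integer arithmetic."""
    return (n - 1).bit_length()


def encode(b: int, n: int, k: int):
    """Encode b as a set of (k+1)-tuples: the first k components address
    a bit position, the last component is the bit value."""
    base = ceil_log2(n)
    width = base ** k  # number of available bit positions
    if not 0 <= b < 2 ** width:
        raise ValueError("number out of the representable range")
    # product(...) enumerates B^k in lexicographical order.
    return {
        prefix + ((b >> i) & 1,)
        for i, prefix in enumerate(product(range(base), repeat=k))
    }


def decode(relation, n: int, k: int) -> int:
    """Recover the number from its relational encoding."""
    base = ceil_log2(n)
    position = {
        prefix: i for i, prefix in enumerate(product(range(base), repeat=k))
    }
    return sum(t[-1] << position[t[:-1]] for t in relation)
```

For n = 16 and k = 2 this gives 16 bit positions, so every number below 2^16 is represented by a relation with exactly 16 tuples.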
In our approach, we need a successor relation among the tuples in , where is the set of integers between and (cf. Definition 3).
It is useful to define an auxiliary predicate , where is a secondorder variable of arity and exponent , such that if . Please, note that we abuse the notation, writing for instance instead of . Such abuses of notation should nevertheless be clear from the context.
(3) 
The formula , where is a secondorder variable of arity and exponent , expresses that encodes (as per Definition 3) a binary number between and and can be written as follows.
However, since is of exponent , the semantics of determines that the number of tuples in any valid valuation of is always bounded by . Thus can also be expressed by the following equivalent, simpler formula.
(4) 
In the following, denotes the subformula of .
The comparison relations and ( is strictly smaller than ) among binary numbers encoded as secondorder relations are defined as follows:
(5) 
where .
(6) 
where .
Sometimes we need to determine if the binary number encoded in (the current valuation of) a secondorder variable of arity and exponent corresponds to the binary representation of an individual from the domain. The following formula holds whenever that is the case.
(7) 
We use to denote the subformula of .
We now proceed to define $\Sigma^{plog}_1$ formulae that describe basic (bounded) arithmetic operations among binary numbers. We start with $\mathrm{BSUM}_k(X, Y, Z)$, where $X$, $Y$ and $Z$ are free variables of arity $k+1$ and exponent $k$. This formula holds if (the current valuations of) $X$, $Y$ and $Z$ represent binary numbers between $0$ and $2^{\log^k n} - 1$, and $X + Y = Z$. The second-order variables $I$ and $W$ in the formula are of arity $k$ and $k+1$, respectively, and both have exponent $k$. We use the traditional carry method, bookkeeping the carried digits in $W$.
(8) 
where holds if the value of the least significant bit of is consistent with the sum of the least significant bits of and . Formula holds if the value of the bit in position of (i.e., the value of the carried bit) is consistent with the sum of the values of the bits in the position preceding of and . Finally, holds if the value of the bit in position of is consistent with the sum of the corresponding bit values of and . The actual subformulae , and can be written respectively as follows:
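Independently of the formulae, the carry bookkeeping described above can be sketched in Python. The sketch assumes numbers are given as integers with a fixed number of bit positions `width`, stores in `carry[i]` the bit carried into position i, and reports an overflow as `None` since the sum is bounded. The names `to_bits`, `from_bits` and `bounded_sum` are hypothetical.

```python
def to_bits(value: int, width: int):
    """Return the bits of value, least significant first, padded to width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits) -> int:
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bounded_sum(x: int, y: int, width: int):
    """Add x and y bit by bit with explicit carry bookkeeping.
    Returns (sum, carry list), or None if the result needs more than
    `width` bits. Assumes width >= 1 and 0 <= x, y < 2 ** width."""
    xb, yb = to_bits(x, width), to_bits(y, width)
    carry = [0] * width  # carry[i] is the bit carried INTO position i
    zb = [0] * width
    for i in range(width):
        if i > 0:
            # Carry is consistent with the bits in the preceding position.
            carry[i] = 1 if xb[i - 1] + yb[i - 1] + carry[i - 1] >= 2 else 0
        # Result bit is consistent with the operand bits and the carry.
        zb[i] = (xb[i] + yb[i] + carry[i]) % 2
    if xb[-1] + yb[-1] + carry[-1] >= 2:
        return None  # the sum does not fit in `width` bits
    return from_bits(zb), carry
```

The three checks in the loop correspond to the three consistency conditions in the text: the least significant bit (where no carry comes in), the carried bit, and the result bit at each later position.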
For the operation of (bounded) multiplication of binary numbers, we define a formula $\mathrm{BMULT}_k(X, Y, Z)$, where $X$, $Y$ and $Z$ are free variables of arity $k+1$ and exponent $k$. This formula holds if (the current valuations of) $X$, $Y$ and $Z$ represent binary numbers between $0$ and $2^{\log^k n} - 1$, and $X \cdot Y = Z$.
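The strategy to express the multiplication consists in keeping track of the (partial) sums of the partial products by means of a relation $R$ of size $\log^{2k} n$ (recall that $\log^{2k} n = (\lceil \log n \rceil)^{2k}$). We take $X$ to be the multiplicand and $Y$ to be the multiplier. Let $\bar{a}$ be the $i$-th tuple in the numerical order of $B^k$, let $R|_{\bar{a}}$ denote the restriction of $R$ to those tuples starting with $\bar{a}$, i.e., $R|_{\bar{a}} = \{(\bar{b}, c) \mid (\bar{a}, \bar{b}, c) \in R\}$, and let $\bar{a} - 1$ denote the immediate predecessor of $\bar{a}$ in the numerical order of $B^k$. Then the following holds:

If $i = 0$ and the $i$-th bit of $Y$ is $0$, then $R|_{\bar{a}}$ encodes the binary number $0$.

If $i = 0$ and the $i$-th bit of $Y$ is $1$, then $R|_{\bar{a}} = X$.

If $i > 0$ and the $i$-th bit of $Y$ is $0$, then $R|_{\bar{a}} = R|_{\bar{a}-1}$.

If $i > 0$ and the $i$-th bit of $Y$ is $1$, then (the binary number encoded by) $R|_{\bar{a}}$ results from adding $R|_{\bar{a}-1}$ to the $i$-bit arithmetic left shift of $X$.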
holds if for . Following this strategy, we can write as follows.
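The partial-sum strategy can also be sketched outside the logic. The Python code below assumes integers with a fixed number of bit positions `width` and keeps one partial sum per bit of the multiplier, playing the role of the rows of the auxiliary relation; an overflow is reported as `None`. The name `bounded_mult_rows` is hypothetical.

```python
def bounded_mult_rows(x: int, y: int, width: int):
    """Return the list of partial sums, one per bit position of the
    multiplier y, or None if some partial sum needs more than `width`
    bits. Assumes 0 <= x, y < 2 ** width."""
    rows = []
    for i in range(width):
        previous = rows[-1] if rows else 0  # row 0 starts from zero
        if (y >> i) & 1:
            # Bit i of the multiplier is 1: add x shifted left by i bits.
            current = previous + (x << i)
        else:
            # Bit i of the multiplier is 0: copy the previous partial sum.
            current = previous
        if current >= 1 << width:
            return None  # the product is out of the bounded range
        rows.append(current)
    return rows


def bounded_mult(x: int, y: int, width: int):
    """The product is the last partial sum, if it exists."""
    rows = bounded_mult_rows(x, y, width)
    return None if rows is None else rows[-1]
```

For x = 6, y = 5 and width 8, the partial sums are 6, 6, 30 and then 30 for all remaining rows, so the final row holds the product.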