It is generally well known that the areas of mathematics sometimes meet at unexpected places, leading to problems where tools from different areas can be used. Yet it is always exciting to witness this happening, as it both validates the common goals and justifies diverse techniques in their pursuit.
This paper arose at the meeting of two areas: mathematical logic (specifically, decidability of theories) and discrete geometry (integer points in polytopes). In recent years, there has been much effort by the authors, pursuing somewhat different but related goals, to understand when variations on Presburger Arithmetic (PA) become intractable. The meaning of “intractable” is traditionally different in the two areas, of course. In logic it is qualitative and stands for undecidable, while in computational geometry and integer optimization it is quantitative and stands for computationally hard (as in NP-hard, PSPACE-hard, etc.).
In this paper we bridge the gap between the areas by asking questions of common interest. In fact, we get very close to completely filling the gap (see 8.1). As the reader shall see, we use similar number theoretic tools (Ostrowski representation of integers), as well as computational ideas.
Part of the problem in presenting the results is addressing both audiences, so we structure the paper a little differently. We first state the results and then give concise yet lengthy backgrounds in both areas separately, emphasizing advances from both directions.
1.2. Main results
Let be a fixed irrational number. The reader can always assume that is algebraic, although some results in the paper hold in full generality.
Let . This is a first order theory over the reals, with a predicate for the integers, which allows addition and scalar multiplication by . This is an extension of Presburger Arithmetic. It is still decidable when is quadratic [H2], but undecidable otherwise [HTy] (see below).
An integer sentence in , is a sentence whose quantified variables are constrained to integer values. Such sentences have the form:
where are alternating quantifiers, and is a Boolean combination of linear inequalities in with coefficients and constant terms in . As the number of alternating quantifier blocks and the dimensions increase, such sentences become harder to decide, and determining exactly how hard is an important problem in computational complexity.
Sentences in (1.1) have a nice geometric interpretation in many special cases. When the Boolean formula is a conjunction of linear equations and inequalities, it gives a convex polyhedron defined over . For and , the sentence asks for existence of an integer point in :
In a special case of , and , the sentence asks whether projections of integer points in a convex polyhedron cover all integer points in another polyhedron :
Here both and are defined over . Further variations on the theme and increasing number of quantifiers allow most general formulas with integer valuations of the polytope algebra (see e.g. [Bar]).
We think of as being given by its defining polynomial of degree , with a rational interval to single out a unique root. We say that is quadratic if . Similarly, the elements are represented in the form , where . For example, is quadratic, is given by , so that .
For , the encoding length is the total bit length of ’s defined above. Similarly, the encoding length is defined to be the total bit length of all symbols in , with integer coefficients and constants represented in binary. In the following results, the constants vary from one context to another.
Let be a quadratic irrational number, and let . An integer sentence in with alternating quantifier blocks can be decided in time at most
where the constants depend only on .
In the opposite direction, we have the following lower bound:
Let be a quadratic irrational number, and let . Then, deciding integer sentences in with alternating quantifier blocks and at most variables and inequalities requires space at least:
where the constants only depend on .
These results should be compared with the triply exponential upper bound and doubly exponential lower bounds for PA (discussed below). In the borderline case of , the problem is especially interesting. We give the following lower bound, which only needs a few variables:
Let be a quadratic irrational number. Then, deciding integer sentences in with at most inequalities is PSPACE-hard, where the constant depends only on . Furthermore, for , one can take .
This should be compared with Grädel’s theorem on -completeness for integer sentences in PA [Grä] (also discussed below).
On the other hand, for non-quadratic irrational numbers, we have:
Let be a non-quadratic irrational number. Then integer sentences in are undecidable, where .
1.3. Complexity background
Presburger Arithmetic PA is the decidable first order theory of , first introduced by Presburger in [Pre] and extensively studied by Skolem and others. A quantifier elimination algorithm for PA was given by Cooper [Coo] to effectively solve the decision problem of PA. General PA sentences have no bounds on the numbers of quantifiers, variables and Boolean operations. Oppen [Opp] showed that such sentences can be decided in at most triply exponential time (see also [RL]). In the opposite direction, a nondeterministic doubly exponential lower bound was obtained by Fischer and Rabin [FR] (see also [W1]) for deciding general PA sentences. As one restricts the number of alternations, the complexity of PA drops down by roughly one exponent (see [Für, Sca, RL]), but still remains exponential.
For a bounded number of variables, two important cases are known to be polynomial time decidable, namely the analogues of (1.2) and (1.3) with rational polyhedra and . These are classical results by Lenstra [Len] and Kannan [Kan], respectively. Scarpellini [Sca] showed that all -sentences are still polynomial time decidable for every fixed. However, for two alternating quantifiers, Schöning proved in [Sch] that deciding is NP-complete. Here any Boolean combination of linear inequalities in two variables, instead of those in the particular form (1.3). This improved on an earlier result by Grädel in [Grä], who also showed that PA sentences with alternating quantifier blocks and variables are complete for the -th level in the Polynomial Hierarchy PH. In these results, the number of inequalities (atoms) in is still part of the input, i.e., allowed to vary.
Much of the recent work concerns the most restricted PA sentences:
for which the number of alternations (), number of variables and number of inequalities in are all fixed. Thus, the input of (1.4) is essentially a bounded list of integer coefficients and constants in , encoded in binary. For , such sentences are polynomial time decidable by [Woo]. For , Nguyen and Pak [NP] showed that deciding PA-sentences with at most inequalities is NP-complete. More generally, they showed that such sentences with alternations, variables and inequalities are complete for the -th level in PH. Thus, limiting the “format” of a PA formula does not reduce the complexity by a lot. This is our main motivation for the lower bounds in theorems 1.3 and 1.2 for .
We emphasize that the sudden jump from polynomial hierarchy in PA to super-exponential complexity in is due to the power of irrational quadratics. Specifically, any irrational quadratic has an infinite periodic continued fraction. From here, we can work with Ostrowski representations of integers in base
, and code string relations such as shifts, suffix/prefix and subset, which were not all possible in PA. Such operations are rich enough to encode arbitrary automata computation, and in fact Turing Machine computation in bounded space.
Finally, in [KP], Khachiyan and Porkolab prove that for a bounded number of variables, one can decide in polynomial time if a convex semialgebraic set contains an integer point (see Theorem 8.3). In particular, for linear equations and inequalities, this implies the Integer Programming with algebraic coefficients:
Theorem 1.5 ([KP]).
Let be the field of algebraic numbers. For every fixed , sentences of the form with can be decided in polynomial time.
Note that the system in the theorem can involve arbitrary algebraic irrationals. This is a rare positive result on irrational polyhedra. In fact, for a non-quadratic , this gives the only positive result on that we know of (cf. 8.2).
1.4. Decidability background
It has long been known that the theory of , equivalently the theory of for rational , is decidable (arguably due to Skolem [Sko] and later rediscovered independently by Weispfenning [W2] and Miller [Mil]). However, the decidability of the theory of for irrational was determined only recently.
Hieronymi and Tychonievich showed in [HTy] that if an expansion of can define a discrete set and also satisfies a certain reasonable denseness condition, then it can actually define every subset of for every . As an application, they proved the following result:
Theorem 1.6 ([HTy]).
For any that are -linearly independent, the structure defines multiplication, and thus its theory is undecidable.
Since are -linearly independent for a non-quadratic , the theory of is undecidable for such . Indeed, a careful analysis of their work shows that this result can be further specialized to give undecidability of integer sentences in :
The main contrast between Theorem 1.5 and Corollary 1.7 is that the former only considers -sentences. Neither Corollary 1.7 nor an upper bound on in (1.1) needed for undecidability was stated explicitly in [HTy], but both can be obtained by careful analysis of the proof. In Theorem 1.4, we not only give a proof of Corollary 1.7, but also explicitly quantify this result by showing that alternating quantifier blocks are enough for undecidability. While our argument is based on the ideas in [HTy], substantial extra work is necessary to reduce the number of alternations to from the upper bound implicit in the proof of Theorem 1.6.
When is quadratic, Hieronymi proved the following surprising result:
For quadratic, integer sentences (1.1) of are decidable. More generally, the structure defines a model of Monadic Second Order Logic (MSO), and vice versa.
By this result for quadratic, to decide integer sentences (1.1), one can translate them into corresponding sentences in MSO and then decide the latter. Thus, upper and lower complexity bounds for decision in MSO can theoretically be transferred to . However, an efficient direct translation between and MSO was not described in [H1, H2]. Ideally, one would like to translate a sentence from to MSO, and vice versa, with as few extra alternations as possible. In theorems 1.1 and 1.2, we explicitly quantify this translation.
1.5. Proofs outline
The most powerful feature of is that we can talk about Ostrowski representation of integers, which will be used as the main encoding tool. We first obtain the upper bound in Theorem 1.1 by directly translating (1.1) into the language of automata using Ostrowski encoding. Next, we show the lower bound for alternating quantifiers (Theorem 1.3) by a general argument on the Halting Problem with polynomial space constraint, again using Ostrowski encoding.
We generalize the above argument to get a lower bound for any alternating quantifier blocks (Theorem 1.2). This is done by first translating sentences from the weak Monadic Second Order Logic (WMSO) to sentences with only one extra alternation, and then invoking a known tower lower bound for WMSO. Overall, the paper makes transitions between , finite automata and WMSO, all of which are different incarnations of the same logical theory.
Finally in the proof of Theorem 1.4, we can again use the expressibility of Ostrowski representation to reduce the upper bound of the number of alternating quantifier blocks needed for undecidability in for non-quadratic . The use of Ostrowski representations allows us to replace more general arguments from [HTy] by explicit computations, and thereby reduce the quantifier-complexity of certain integer sentences in .
Ostrowski representation and continued fractions play a crucial role throughout the paper, and are first introduced in Section 3. We use the following notation:
denotes the set of convergents with non-zero coefficients in the Ostrowski representation of .
We write if is a convergent with a non-zero coefficient in the Ostrowski representation of .
3.1. Continued fractions and Ostrowski representation
Let be any irrational, with . The convergents of follow the recurrence relation:
This can be written as:
where . Let . They have the properties:
Each has a unique -Ostrowski representation:
where , and whenever .
See [RS, Ch. II-§4]. ∎
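As an illustration of the fact above, here is a minimal Python sketch that computes an Ostrowski representation greedily. It is not taken from the paper and relies on the standard conventions, stated here as assumptions because the displayed formulas are not reproduced above: the denominators satisfy q_{-1} = 0, q_0 = 1, q_k = a_k q_{k-1} + q_{k-2}, an integer N is written as the sum of b_{k+1} q_k, and the digits satisfy 0 <= b_1 < a_1, 0 <= b_{k+1} <= a_{k+1}, and b_k = 0 whenever b_{k+1} = a_{k+1}. The names `denominators`, `ostrowski`, `value`, `is_valid` and the callback `cf_tail` are hypothetical helpers introduced only for this example.

```python
def denominators(cf_tail, bound):
    """Return [q_0, q_1, ...], stopping at the first q_k that exceeds bound.

    cf_tail(k) must return the continued fraction coefficient a_k for k >= 1.
    Assumed recurrence: q_{-1} = 0, q_0 = 1, q_k = a_k * q_{k-1} + q_{k-2}.
    """
    q_prev, q, qs, k = 0, 1, [], 1
    while True:
        qs.append(q)
        if q > bound:
            return qs
        q_prev, q = q, cf_tail(k) * q + q_prev
        k += 1


def ostrowski(n, cf_tail):
    """Greedy digits [b_1, b_2, ...] with n = sum of b_{k+1} * q_k."""
    qs = denominators(cf_tail, n)          # the last entry exceeds n
    digits = [0] * (len(qs) - 1)
    for k in range(len(qs) - 2, -1, -1):   # most significant position first
        digits[k], n = divmod(n, qs[k])
    return digits


def value(digits, cf_tail):
    """Evaluate sum of b_{k+1} * q_k for the given digit list."""
    total, q_prev, q = 0, 0, 1             # q_{-1}, q_0
    for k, b in enumerate(digits):         # b is the coefficient of q_k
        total += b * q
        q_prev, q = q, cf_tail(k + 1) * q + q_prev
    return total


def is_valid(digits, cf_tail):
    """Check the digit constraints of the (assumed) Ostrowski conditions."""
    for k, b in enumerate(digits):         # b = b_{k+1}
        a = cf_tail(k + 1)
        if k == 0 and not 0 <= b < a:
            return False
        if k > 0 and not 0 <= b <= a:
            return False
        if k > 0 and b == a and digits[k - 1] != 0:
            return False
    return True
```

For example, with all coefficients equal to 1 (the golden ratio), `ostrowski(10, lambda k: 1)` returns `[0, 0, 1, 0, 0, 1]`, which corresponds to 10 = 2 + 8. With all coefficients equal to 2 (as for the square root of 2), `ostrowski(20, lambda k: 2)` returns `[1, 1, 1, 1]`, which corresponds to 20 = 1 + 2 + 5 + 12.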
From now on, when and are clear from the context, we refer to (3.6) simply as the Ostrowski representation of . We also denote the coefficient by . Denote by the set of with .
We set , so that . Let . Define to be the function that maps to , where is the unique natural number such that . In other words:
Define , so that .
Let . We have:
where the coefficients are from (3.6). Also is a dense subset of the interval .
3.2. Periodic continued fractions
An irrational is quadratic if and only if it has a periodic continued fraction . Let . It is clear that for some . Therefore, sentences in the theory can be expressed in and vice versa. Thus, for our complexity purposes, we can always assume that our quadratic irrational is purely periodic, i.e.,
with the minimum period .
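To make the periodicity concrete, the following Python sketch computes the continued fraction of the square root of a non-square positive integer d using exact integer arithmetic. This is the classical textbook recurrence, not a procedure from the paper, and the function name `sqrt_continued_fraction` is hypothetical. It covers only irrationals of the form sqrt(d), which is an assumption made to keep the example short.

```python
from math import isqrt


def sqrt_continued_fraction(d):
    """Return (a0, period) such that sqrt(d) = [a0; period, period, ...].

    Uses the classical recurrence on complete quotients (sqrt(d) + m) / den.
    The period is complete once a coefficient equal to 2 * a0 appears.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0
    period = []
    while a != 2 * a0:
        m = den * a - m
        den = (d - m * m) // den   # this division is always exact
        a = (a0 + m) // den
        period.append(a)
    return a0, period
```

For instance, `sqrt_continued_fraction(7)` returns `(2, [1, 1, 1, 4])`. Adding a0 to sqrt(d) replaces the leading coefficient by 2 * a0 and yields a purely periodic expansion, which is one simple instance of the reduction to the purely periodic case described above.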
Let . There exist such that for every with , we have:
The coefficients can be computed in time .
There are fixed such that
for every .
Again from (3.2), for every :
Since , we have:
Note that are constants. From here we easily get and . ∎
3.3. Logical formulas for working with Ostrowski representation
Let be any irrational, not just quadratic. The convergents can be characterized by the best approximation property. Namely, with is a convergent if and only if
From this, we have and if and only if they satisfy
Note that is a -formula. More generally, consider the formula:
Then is true if and only if for some with , i.e., consecutive convergents of .
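The best approximation property can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the standard formulation in which q is a convergent denominator when the distance from q times the irrational to the nearest integer is smaller than for every smaller positive denominator, it works in floating point rather than exactly, and the helper names are hypothetical. The paper's own formula is not reproduced here and may be phrased differently.

```python
from math import sqrt


def dist_to_nearest_integer(x):
    """Distance from x to the closest integer."""
    return abs(x - round(x))


def record_denominators(alpha, limit):
    """Return all q <= limit that beat every smaller denominator.

    A denominator q is recorded when min over p of |q * alpha - p| is
    strictly smaller than the same quantity for each 1 <= d < q.
    """
    records = []
    best = float("inf")
    for q in range(1, limit + 1):
        d = dist_to_nearest_integer(q * alpha)
        if d < best:
            best = d
            records.append(q)
    return records
```

For the square root of 2, `record_denominators(sqrt(2), 100)` returns `[1, 2, 5, 12, 29, 70]`, which are the denominators of the convergents 1/1, 3/2, 7/5, 17/12, 41/29 and 99/70.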
Hereafter, we assume , i.e., and for some .
Define the following quantifier free relations:
if and only if holds for some .
and if and only if holds for some .
Also is uniquely determined by if After or holds.
(Similar to lemmas 4.6, 4.7 and 4.8 in [H2])
is odd. If , then its Ostrowski representation is for some . From Fact 3.2, we have . By (3.3), we have if is odd and if is even. Combined with , we have:
By (3.5), this can be written as . By (3.7), we have , where is unique such that . Also note that and . So the above inequalities can be written as . When is even, the inequalities reverse to . Thus implies . The converse direction can be proved similarly, using (3.4) and (3.5).
ii) The only difference here is that can be at most . Details are left to the reader. ∎
The relation , meaning that appears in , is -definable:
and also -definable:
To see this, note that if and only if for some with and .
We will need one more quantifier-free formula:
This is satisfied if and only if
If , then in is strictly less than (by ).
In other words, Compatible is satisfied if and only if and can be directly concatenated at the point to form (see (3.6)).
4. Quadratic irrationals: Upper bound
In this section we prove Theorem 1.1. It should be emphasized that the tower height in Theorem 1.1 only depends on the number of alternating quantifiers, but not on the number of variables in the sentence . First, we consider the case of a quantifier free formula.
Let be a quantifier free (integer) formula in , i.e., a Boolean combination of linear inequalities in with coefficients/constants in . Then there is an automaton of size recognizing the set of solutions of . The constant only depends on .
Each variable in takes value over , but can be replaced by for two variables . So we can assume that all variables take values over . Recall that coefficients/constants in are given in the form with . So now each inequality in can be reorganized into the form:
Here are tuples of coefficients in , and are subtuples of . Now, for each homogeneous term , we add in an additional variable and replace each appearance of in the inequalities by . By doing so, we introduce extra variables, but still keep the length linear. Now our formula splits into two parts. The first part consists of integer linear equalities:
The second part consists of inequalities of the form:
We encode integer variables by their Ostrowski representations, and build an automaton that recognizes the solutions of . In other words, each is encoded by the string , where the ’s are from (3.6). Here only a finite number of ’s are nonzero, so is a finite string. Since ’s are periodic (3.9) and , we are working with a finite alphabet.
First, by the result in [HTe], integer addition in Ostrowski representation is recognizable by a finite automaton. In other words, the function is regular. Now we rewrite each equality into single additions, using the doubling trick. For example, the equality is equivalent to the following system:
Again, we are introducing additional variables while keeping linear. Each single addition is recognizable by a finite automaton. Taking product of all such automata, one for each addition, we get a single automaton of size that recognizes the first part . Here is some constant dependent on .
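The doubling trick can be sketched as follows. The Python code below is an illustration under our own naming conventions (the variable names `d0`, `s1`, and so on, and both functions, are hypothetical and not from the paper). It rewrites a product c * x, with c a positive integer given in binary, as a list of single additions whose length is linear in the bit length of c.

```python
def scale_by_additions(c, var="x"):
    """Express c * var through single additions.

    Returns (additions, result), where each addition is a triple
    (left, right, out) meaning out = left + right, and result names the
    variable that holds c * var.
    """
    if c < 1:
        raise ValueError("c must be a positive integer")
    additions = []
    power = var      # always names 2^i * var for the current bit i
    result = None    # names the partial sum over the bits seen so far
    while c:
        if c & 1:
            if result is None:
                result = power
            else:
                out = f"s{len(additions)}"
                additions.append((result, power, out))
                result = out
        c >>= 1
        if c:        # another bit remains, so double the current power
            out = f"d{len(additions)}"
            additions.append((power, power, out))
            power = out
    return additions, result


def evaluate(additions, result, value, var="x"):
    """Run the additions with var set to value and return the result."""
    env = {var: value}
    for left, right, out in additions:
        env[out] = env[left] + env[right]
    return env[result]
```

For c = 5 this produces the three additions d0 = x + x, d1 = d0 + d0 and s2 = x + d1, so that s2 = 5x. In general the number of additions is at most twice the bit length of c, which is why the total length stays linear.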
Now we build an automaton for each inequality , and later take their product automaton. Recall and from (3.7) and Fact 3.2. We have for every . Here and always lies in the unit length interval . For , we have if and only if:
So the proof is done if we can show that for input :
The relation is recognizable by a finite automaton.
The relation is recognizable by a finite automaton.
The function is recognizable by a finite automaton.
Tasks i) and ii) are straightforward from basic properties of Ostrowski representation. We have if and only if is lexicographically smaller than when read from right to left. Also if and and is the smallest index where , then:
(see [H2, Fact 2.13]). We have iii) left to show. ∎
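The order comparison in task i) can be checked experimentally. The sketch below is an illustration only, under the same assumed conventions as the standard greedy construction (q_{-1} = 0, q_0 = 1, q_k = a_k q_{k-1} + q_{k-2}, digits stored least significant first); the helper names are hypothetical. It compares two integers by padding their digit strings to a common length and comparing them lexicographically from the most significant digit.

```python
def ostrowski(n, cf_tail):
    """Greedy digits [b_1, b_2, ...] with n = sum of b_{k+1} * q_k.

    cf_tail(k) returns the continued fraction coefficient a_k for k >= 1.
    """
    q_prev, q, qs, k = 0, 1, [], 1
    while True:                      # collect q_0, q_1, ... until q_k > n
        qs.append(q)
        if q > n:
            break
        q_prev, q = q, cf_tail(k) * q + q_prev
        k += 1
    digits = [0] * (len(qs) - 1)
    for i in range(len(qs) - 2, -1, -1):
        digits[i], n = divmod(n, qs[i])
    return digits


def msd_first(digits, width):
    """Pad with zeros to the given width and reverse the digit order."""
    return (digits + [0] * (width - len(digits)))[::-1]


def less_by_digits(x, y, cf_tail):
    """Decide x < y using only the Ostrowski digit strings."""
    dx, dy = ostrowski(x, cf_tail), ostrowski(y, cf_tail)
    width = max(len(dx), len(dy))
    # Python compares lists lexicographically, element by element.
    return msd_first(dx, width) < msd_first(dy, width)
```

Since the comparison only needs to remember which string was larger at the most recent position where they differ, it can be carried out by a finite automaton reading both strings in parallel, which is the point of task i).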
The function is recognizable by a finite automaton with Ostrowski encoding.
Proof of Lemma 4.2.
For with Ostrowski representation we define:
In other words, if then . So is clearly recognizable by a finite automaton. By Fact 3.2:
Since is a linear combination of and and linear equations are regular ([HTe]), we have an automaton for . (This is done by clearing denominators in and building automata for single additions.)
Proof of Theorem 1.1.
Given the sentence (1.1), by negation, we can assume . First, we build an automaton of size to recognize the quantifier free part . (Actually, we first need to make so that additional variables in the proof of Lemma 4.1 can be inserted after . After that we make . Apply negations whenever necessary.) Then we apply the power set construction (see e.g. [HUM, §2.3.5]) to eliminate . This blows up the size of by at most exponentiations. Thus, the resulting automaton has size at most a tower of height in . Now we still have the outer quantifier remaining, i.e., we still need to decide if has a solution. This is doable by a simple reachability argument, which runs in linear time relative to the size of . ∎
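The two automata-theoretic steps used in this proof, the power set construction and the final reachability test, can be sketched generically in Python. This is a textbook-style illustration rather than the construction from the paper: the automaton encoding (a dictionary from (state, symbol) pairs to sets of states) and all function names are assumptions made for this example. Projecting away an existentially quantified variable generally makes an automaton nondeterministic, and determinizing it is what allows the subsequent negation by swapping accepting and non-accepting states; each such round may cost one exponential.

```python
from collections import deque


def determinize(alphabet, delta, start, accepting):
    """Power set construction, restricted to reachable subsets.

    delta maps (state, symbol) to a set of NFA states.
    Returns (states, dfa_delta, dfa_start, dfa_accepting), where the DFA
    states are frozensets of NFA states.
    """
    dfa_start = frozenset([start])
    dfa_delta, dfa_accepting = {}, set()
    queue, states = deque([dfa_start]), {dfa_start}
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in alphabet:
            target = frozenset(
                t for s in current for t in delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in states:
                states.add(target)
                queue.append(target)
    return states, dfa_delta, dfa_start, dfa_accepting


def accepts(dfa_delta, dfa_start, dfa_accepting, word):
    """Run the DFA on word and report acceptance."""
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


def is_nonempty(alphabet, dfa_delta, dfa_start, dfa_accepting):
    """Breadth-first reachability: is some accepting state reachable?"""
    queue, seen = deque([dfa_start]), {dfa_start}
    while queue:
        current = queue.popleft()
        if current in dfa_accepting:
            return True
        for symbol in alphabet:
            nxt = dfa_delta[(current, symbol)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Complementation of the resulting deterministic automaton amounts to passing `states - dfa_accepting` as the accepting set, and the reachability search visits each state and transition at most once, in line with the linear-time claim at the end of the proof.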
5. Quadratic irrationals: PSPACE-hardness
In this section we prove Theorem 1.3. We will first show the lower bound for a general quadratic irrational (Theorem 5.1), and then specialize to (Corollary 5.3). By a short sentence, we mean an integer sentence in with a bounded number of variables, quantifiers and atoms (inequalities).
Let be a fixed quadratic irrational and . Then deciding short sentences in the theory is PSPACE-hard.
The most important property of any quadratic irrational is the periodicity of its continued fraction. Before proving Theorem 5.1, we construct in 5.1 some explicit formulas in to deal with the Ostrowski representation of an integer, in this case exploiting the periodicity of . Then we recall the definitions of Turing machine computations in Subsection 5.2. The proof of Theorem 5.1 is in Subsection 5.3, which translates Turing machine computations into Ostrowski representations of integers. Explicit bounds on the number of variables and inequalities for the constructed short sentences are given in 5.4, where we also treat the case .
5.1. Ostrowski representation for quadratic irrationals
We only need to consider a purely periodic with minimum period (see Section 3.2). Let .
Also by Remark 3.4, we have