
Lecture notes on complexity of quantifier elimination over the reals

These are lecture notes for a course I gave in the mid-1990s for MSc students at the University of Bath. They present an algorithm with singly exponential complexity for the existential theory of the reals, in the spirit of J. Renegar. The aim was to convey the main underlying ideas, so many of the proofs and finer details of the algorithms are either missing or just sketched. I changed nothing in the original notes except for adding references and a bibliography, and correcting obvious typos.


Introduction

These are lecture notes for a course I gave in the mid-1990s for MSc students at the University of Bath. They present an algorithm with singly exponential complexity for the existential theory of the reals, in the spirit of J. Renegar [7]. Some ideas are borrowed from [6] and [5]. The aim was to convey the main underlying ideas, so many of the proofs and finer details of the algorithms are either missing or just sketched. Full proofs and algorithms can be found in the aforementioned papers or, in a somewhat different form, in the fundamental monograph [1].

I changed nothing in the original notes except adding references, bibliography, and correcting obvious typos.

1. Semialgebraic sets

1.1. Objectives of the course

We are going to study algorithms for solving systems of polynomial equations and inequalities. The emphasis is on the complexity, that is, on the running time of algorithms. Elementary procedures, like Gröbner basis or cylindrical decomposition, have high complexity. Effective algorithms require more advanced mathematical tools.

Computational problems we will be dealing with go beyond just solving systems. The most general problem we are going to consider is quantifier elimination in the first order theory of the reals.

1.2. Formulas and semialgebraic sets

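Let $\mathbb{R}$ denote the field of all real numbers.

Suppose that we are interested in deciding whether or not a system of inequalities

$$f_1(X_1, \ldots, X_n) \ge 0,\ \ldots,\ f_k(X_1, \ldots, X_n) \ge 0$$

is consistent, that is, has a solution in $\mathbb{R}^n$. Here $f_1, \ldots, f_k$ are polynomials with real coefficients in variables $X_1, \ldots, X_n$.

We can express the statement that the system is consistent by writing

$$\exists X_1 \exists X_2 \cdots \exists X_n\ (f_1 \ge 0 \wedge \cdots \wedge f_k \ge 0). \tag{1.1}$$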

This statement is either true (if there is a real solution to the system) or false.

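We can generalize formula (1.1) in two natural ways. Firstly, we can consider formulae in which not all the variables are bound by the quantifier $\exists$.

Example 1.1.

$$\exists X_{n_1+1} \exists X_{n_1+2} \cdots \exists X_n\ (f_1 \ge 0 \wedge \cdots \wedge f_k \ge 0) \tag{1.2}$$

for a certain $n_1 < n$.

In (1.2) variables $X_{n_1+1}, \ldots, X_n$ are called bound, while the remaining variables $X_1, \ldots, X_{n_1}$ are free.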

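Secondly, we can consider formulae with both types of quantifiers, $\exists$ as well as $\forall$.

Example 1.2.

A formula with the prefix

$$\exists X_{n_1+1} \forall X_{n_1+2} \exists X_{n_1+3} \cdots \forall X_n$$

has free variables $X_1, \ldots, X_{n_1}$ and both types of quantifiers. Formulae of such kind are called formulae of the first order theory of $\mathbb{R}$ (or of the reals).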

Formulae of the first order theory of the reals can be used to express a great variety of statements, properties, and problems about real numbers, and therefore about the geometry of $\mathbb{R}^n$.

Formulae without free variables express statements (true or false), while the ones with free variables describe properties.

A set of points $(x_1, \ldots, x_{n_1}) \in \mathbb{R}^{n_1}$, which after substituting their coordinates instead of free variables in a formula give a true statement, is called semialgebraic.

1.3. Tarski-Seidenberg Theorem

The most important fact about real formulae and semialgebraic sets is the following

Theorem 1.3 (Tarski-Seidenberg).

Any semialgebraic set can be described as a finite union of sets, each of which is the set of all solutions of a system of polynomial inequalities.

In other words, a set defined by a formula with quantifiers can be defined by a formula without quantifiers. The process of passing from one representation to another is called quantifier elimination.

Remark 1.4.

All the above definitions can be modified to suit the field $\mathbb{C}$ of complex numbers rather than the reals. Inequalities $f > 0$ should then be replaced by $f \neq 0$. Subsets of $\mathbb{C}^n$ defined by formulae with quantifiers are called “constructible”. An analogue of the Tarski-Seidenberg Theorem is also true.

1.4. Formal input

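The general problem we are concerned with in this course is quantifier elimination. The input for this problem is of the kind

$$(Q_1 X^{(1)}) \cdots (Q_q X^{(q)})\ P(Y, X^{(1)}, \ldots, X^{(q)}).$$

Here:

  • $Q_1, \ldots, Q_q$ are quantifiers $\exists$ or $\forall$;

  • $X^{(i)} = (X^{(i)}_1, \ldots, X^{(i)}_{n_i})$: bound variables;

  • $Y = (Y_1, \ldots, Y_{n_0})$: free variables;

  • $P$: a Boolean combination of $k$ “atomic” formulae of the kind $f > 0$ or $f = 0$, and for every $f$ the degree $\deg(f) < d$.

Set $n = n_0 + n_1 + \cdots + n_q$.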

2. Complexity

2.1. Complexity for systems of inequalities

Model of computation: Turing machine (or RAM — random access machine).

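Consider the system of inequalities

$$f_1 \ge 0,\ \ldots,\ f_k \ge 0 \tag{2.1}$$

with $f_i \in \mathbb{Z}[X_1, \ldots, X_n]$ (as usual, $\mathbb{Z}$ stands for the ring of all integers), degrees $\deg(f_i) < d$, and the maximum of bit lengths of coefficients of $f_i$ less than $M$.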

We consider first the problem of deciding consistency of system (2.1).

There are some deep reasons to look for upper complexity bounds of the kind

$$(kd)^{E(n)} M^{O(1)}.$$

The first algorithm for the problem was proposed by A. Tarski in the late 1940s [8]. In his bound $E(n)$ can’t be estimated from above by any tower of exponents. Such bounds are called “non-elementary” and indicate that the corresponding algorithm is absolutely impractical.

In the mid-1970s G. Collins [3] and, independently, H.R. Wüthrich [9] published an algorithm, called “Cylindrical Algebraic Decomposition”, for which $E(n) = 2^{O(n)}$. Note that the Gröbner basis algorithm for the similar problem over complex numbers has essentially the same complexity.

Our aim is to construct an algorithm with $E(n) = O(n)$. Thus, the complexity bound we are going to get is $(kd)^{O(n)} M^{O(1)}$.

2.2. Complexity for quantifier elimination

For the quantifier elimination problem the upper complexity bound has the same form as for deciding consistency of systems:

$$(kd)^{E(n)} M^{O(1)}.$$

For Cylindrical Algebraic Decomposition (CAD) the value of $E(n)$ is again equal to $2^{O(n)}$.

Further in this course we are going to describe, in some detail, an algorithm with

$$E(n) = \prod_{0 \le i \le q} O(n_i) \le (cn)^{q+1}$$

for a certain constant $c$.

2.3. “Real numbers” model

Along with the Turing machine (bit) model of computation, one can consider another model in which the elementary step is an arithmetic operation or comparison operation over real numbers. This model is more abstract in the sense that no constructive representation of real numbers is assumed; the latter are viewed merely as symbols. The real numbers model, therefore, can’t serve as a foundation of the theory of algorithms in the classical sense; however, a formal theory of computation based on this model exists (Blum-Shub-Smale theory).

Complexity results for the real numbers model are usually stronger than similar results for the bit model. One reason is that computations in the bit model can take advantage of a clear connection between the size of the input and metric characteristics of definable semialgebraic sets, which sometimes allows one to use rapidly converging methods of numerical analysis. Another technical advantage of the bit model is the possibility of using effective polynomial factorization as a subroutine, which is not always possible for polynomials with abstract coefficients.

On the weak side, a real number model algorithm, being applied to polynomials with essentially constructive (e.g., integer) coefficients, can’t control the growth of intermediate data. Thus, it is theoretically possible that the bit complexity of an algorithm is significantly worse than its real number complexity. In practice, however, this almost never happens.

Tarski’s algorithm works in real number model as well as for Turing machines. Collins’ CAD in original version can’t handle symbolic coefficients, but probably could be redesigned for real number model.

In this lecture course we follow the approach of James Renegar. Our algorithms will work equally well in both models.

The complexity of algorithms in the real number model will be of the form $(kd)^{E(n)}$ with $E(n) = O(n)$ for the problem of deciding consistency and $E(n) = \prod_{0 \le i \le q} O(n_i)$ for the quantifier elimination problem.

2.4. Lower bounds

Observe that the number of monomials (terms of the kind $a X_1^{i_1} \cdots X_n^{i_n}$) in a polynomial in $n$ variables of degree $d$ can be equal to $\binom{n+d}{n}$,

so the complexity for deciding consistency is quite close to the “best possible” (for dense representation of polynomials, i.e., if we agree to write zero coefficients). The best possible would be , but the existence of an algorithm with such complexity is a known open problem.
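The count of monomials can be checked numerically. The following is a small Python sketch added for illustration (the function name `dense_monomial_count` is hypothetical); it assumes the dense representation, in which every monomial of total degree at most $d$ in $n$ variables gets a coefficient.

```python
from math import comb

def dense_monomial_count(n: int, d: int) -> int:
    """Number of monomials of total degree at most d in n variables."""
    return comb(n + d, n)

# For fixed d the count grows exponentially with the number of variables n.
for n in (1, 2, 5, 10):
    print(n, dense_monomial_count(n, 4))
```

For instance, a dense quadratic polynomial in two variables has the six monomials $1, X_1, X_2, X_1^2, X_1 X_2, X_2^2$.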

For the quantifier elimination problem there is a result of Davenport and Heintz [4] proving that polynomials occurring in the quantifier-free formula after quantifier elimination can have degrees doubly exponential in the number of variables. It follows that Renegar’s algorithm for quantifier elimination is “close to the best possible”.

3. Univariate case: Sturm’s theorem

3.1. Computational problems for univariate polynomials

The idea behind almost all methods for solving the problems under consideration is to reduce the problems to the univariate case. All straightforward methods, like CAD, do that, and our effective algorithms will use the same trick.

Strangely enough, we shall not consider the problem that seems the most natural here: of finding solutions of univariate equations or inequalities. This may mean producing a rational approximation with a given accuracy to a root. This problem belongs, in fact, to numerical analysis in the classical sense and is rather far from the methods of this course. However, it’s worth mentioning that such an approximation indeed can be produced in time polynomial in the length of the description of input polynomials and accuracy bound.

So, if we don’t “compute” roots (in the above sense), what is left algorithmically? Over $\mathbb{C}$ (complex numbers) there is really not much to do since, according to the Fundamental Theorem of Algebra, any polynomial $f$ has exactly $\deg(f)$ roots (counted with multiplicities), and if we want to “define” a root by a polynomial irreducible over the rational numbers, we simply have to factorize $f$. Also, if we have a system of equations $f_1 = \cdots = f_k = 0$, we can decide its consistency by computing the GCD (greatest common divisor) over the rational numbers of $f_1, \ldots, f_k$.

In the real case, already deciding consistency of $f = 0$ is a nontrivial task.

3.2. Sturm’s Theorem: formulation

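Let $f \in \mathbb{R}[X]$ be a polynomial of degree $\deg(f) < d$, and let $[a, b] \subset \mathbb{R}$ be an interval.

Let $f'$ be the derivative of $f$.

Set

$f_1 = f$;

$f_2 = f'$;

$f_{i+1}$: the remainder of the division of $f_{i-1}$ by $f_i$, taken with the opposite sign: $f_{i-1} = q_i f_i - f_{i+1}$, $\deg(f_{i+1}) < \deg(f_i)$. Thus, $f_{i+1} = -\mathrm{rem}(f_{i-1}, f_i)$.

Continue the process of producing polynomials $f_i$ while the division is possible (i.e., until the remainder becomes a constant). If the last (constant) remainder $f_t = 0$ (in this case $f$ and $f'$ are not relatively prime, and $f_{t-1}$ is their greatest common divisor), then divide all $f_i$’s by $f_{t-1}$. The resulting sequence of polynomials is called Sturm’s sequence. If $f_t \neq 0$, then $f_1, \ldots, f_t$ is the Sturm sequence.

For any $x \in \mathbb{R}$ let $V(x)$ be the number of sign changes in the sequence of values of the Sturm’s sequence at $x$. Formally, $V(x)$ is the number of pairs $(i, j)$ of indices in Sturm’s sequence such that $i < j$, $f_i(x) f_j(x) < 0$, and $f_l(x) = 0$ if $i < l < j$.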

Theorem 3.1 (Sturm’s Theorem).

The number of distinct real roots of $f$ in $[a, b]$ is $V(a) - V(b)$, provided that $f(a) \neq 0$.

Proof.

Can be easily found in literature. ∎
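The construction of this section can be sketched in code. The following Python illustration is not part of the original notes: it assumes polynomials are stored as coefficient lists (index = degree) with exact rational arithmetic, and all helper names (`trim`, `divmod_poly`, `sturm_sequence`, `count_roots`) are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (p[i] is the coefficient of X**i)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def divmod_poly(a, b):
    """Euclidean division a = q*b + r with deg r < deg b (b nonzero)."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = a[:]
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coef = r[-1] / b[-1]
        q[shift] = coef
        for i, c in enumerate(b):
            r[i + shift] -= coef * c
        r = trim(r)  # the leading term cancels exactly
    return trim(q), r

def sturm_sequence(f):
    f = trim([Fraction(c) for c in f])
    seq = [f, derivative(f)]
    while len(seq[-1]) > 1:              # last polynomial is not a constant
        _, r = divmod_poly(seq[-2], seq[-1])
        seq.append([-c for c in r])      # remainder with the opposite sign
    if not seq[-1]:                      # zero remainder: f is not squarefree
        seq.pop()
        g = seq[-1]                      # greatest common divisor of f and f'
        seq = [divmod_poly(p, g)[0] for p in seq]
    return seq

def evaluate(p, x):
    result = Fraction(0)
    for c in reversed(p):                # Horner's rule
        result = result * x + c
    return result

def sign_changes(values):
    signs = [v > 0 for v in values if v != 0]   # zeros are skipped
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_roots(f, a, b):
    """V(a) - V(b); assumes f(a) != 0."""
    seq = sturm_sequence(f)
    V = lambda x: sign_changes([evaluate(p, x) for p in seq])
    return V(a) - V(b)

print(count_roots([0, -1, 0, 1], -2, 2))   # X^3 - X on [-2, 2]
```

Exact rational arithmetic is used here because the sign tests are only meaningful when the remainders are computed without rounding errors.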

3.3. Sturm’s Theorem: discussion

Sturm’s Theorem gives an algorithm for deciding whether $f$ has a real root. Indeed, the formulation shows how to decide this for an interval $[a, b]$. It’s well known that all real roots of a univariate polynomial $f = a_m X^m + \cdots + a_1 X + a_0$, $a_m \neq 0$, belong to the interval $(-R, R)$ with

$$R = 1 + \max_{0 \le i < m} \left| \frac{a_i}{a_m} \right|.$$

So, it is enough to compute $V(-R) - V(R)$. If this difference is positive, then there is a real root.

A simpler way to pass from intervals to the whole $\mathbb{R}$ is to notice that for any polynomial $g = b_l X^l + \cdots + b_0$, $b_l \neq 0$, the sign of $g(+\infty)$ (i.e., the sign of $g$ at any sufficiently large point $x$) coincides with the sign of $b_l$ (or, equivalently, with the sign of $b_l X^l$ at $1$), while the sign of $g(-\infty)$ coincides with the sign of $b_l X^l$ at $-1$.

Observe also that Sturm’s Theorem provides conditions in terms of polynomial equations and inequalities on coefficients of $f$, which are true if and only if $f$ has a root in $\mathbb{R}$.
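Both ways of passing from an interval to the whole real line are easy to illustrate. The Python sketch below is an added illustration with hypothetical function names; it assumes the classical Cauchy bound $1 + \max_i |a_i / a_m|$ on the absolute values of the roots, which may differ from the bound intended in the notes, and it reads off the signs at $\pm\infty$ from the leading coefficient and the parity of the degree.

```python
from fractions import Fraction

def cauchy_bound(p):
    """Cauchy bound R for p[0] + p[1]*X + ... + p[m]*X**m with p[m] != 0, m >= 1.

    Every real (indeed complex) root x of p satisfies |x| < R.
    """
    lead = p[-1]
    return 1 + max(abs(Fraction(c) / lead) for c in p[:-1])

def sign_at_infinity(p, positive=True):
    """Sign (+1 or -1) of a nonzero polynomial p at +infinity or -infinity."""
    degree = len(p) - 1
    sign = 1 if p[-1] > 0 else -1
    if not positive and degree % 2 == 1:
        sign = -sign          # odd degree: the sign flips at -infinity
    return sign

f = [0, -1, 0, 1]             # X^3 - X, with roots -1, 0, 1
print(cauchy_bound(f), sign_at_infinity(f), sign_at_infinity(f, positive=False))
```

The second approach avoids evaluating the polynomials of the Sturm sequence at a possibly large rational point.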

4. Univariate case: Ben-Or – Kozen – Reif algorithm

4.1. Towards generalization of Sturm’s Theorem

Our nearest goal is to generalize Sturm’s Theorem to the case of several univariate polynomials.

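Let $f_1, \ldots, f_k \in \mathbb{R}[X]$ with degrees $\deg(f_i) < d$ for all $1 \le i \le k$.

Definition 4.1.

Consistent sign assignment for $f_1, \ldots, f_k$ is a string

$$\sigma = (\sigma_1, \ldots, \sigma_k),$$

where $\sigma_i \in \{<, =, >\}$, such that the system $f_1\, \sigma_1\, 0, \ldots, f_k\, \sigma_k\, 0$ has a root in $\mathbb{R}$.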

We want to generalize Sturm’s Theorem and construct an algorithm for listing all consistent sign assignments. Moreover, we want to get polynomial conditions on coefficients, guaranteeing the existence of a common root.

The following algorithm belongs to M. Ben-Or, D. Kozen, and J. Reif [2].

4.2. Preparation of the input

First we “prepare” the input family to make it simpler.

(1) Make all ’s squarefree and relatively prime. It is sufficient to compute greatest common divisor (GCD) of and its derivative for all (with the help of Euclidean algorithm), and then to divide each by GCD of the family . Now we can reconstruct all consistent sign assignments of from all consistent sign assignments of .

Throughout this portion of the notes we assume that all $f_i$’s are squarefree and pairwise relatively prime. It follows that the sets of all roots of the polynomials $f_i$ are pairwise disjoint, thus any consistent sign assignment for the family $f_1, \ldots, f_k$ has at most one “=” sign.

(2) Add a new member in the family , so that any consistent sign assignment for has exactly one “=” sign, and each consistent sign assignment for can be obtained from a consistent sign assignment for . Let the product . Define

Then we can take

Now the problem can be reformulated as follows: list all consistent assignments of signs $<$ and $>$ for $f_1, \ldots, f_k$ at roots of $f$.

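4.3. Cases $k = 0$ and $k = 1$

For $k = 0$ use Sturm’s sequence obtained by the Euclidean algorithm.

Denote $S(f, f') = V(-\infty) - V(+\infty)$ (recall that we start with Euclidean division of $f$ by $f'$).

In case $k = 1$ we have a family of two polynomials $f, f_1$.

Denote:

$$C_1 = \{x \in \mathbb{R} : f(x) = 0,\ f_1(x) > 0\}, \qquad \bar{C}_1 = \{x \in \mathbb{R} : f(x) = 0,\ f_1(x) < 0\}.$$

For a finite set $C$, let $|C|$ be the number of elements in $C$. Obviously, $S(f, f') = |C_1| + |\bar{C}_1|$.

The following trick is very important for the generalization of Sturm’s Theorem. Construct Sturm’s sequence of remainders starting with dividing $f$ by $f' f_1$ (rather than simply by $f'$ as in the original Sturm’s sequence). Denote the difference of numbers of sign changes at $-\infty$ and at $+\infty$ by $S(f, f' f_1)$.

Lemma 4.2.

$S(f, f' f_1) = |C_1| - |\bar{C}_1|$.

Proof.

Trivial modification of a proof of the original Sturm’s Theorem. We don’t consider the proof in this course. ∎

The original Sturm’s Theorem and this Lemma imply that the following system of linear equations with the unknowns $|C_1|, |\bar{C}_1|$ is true:

$$\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} |C_1| \\ |\bar{C}_1| \end{pmatrix} = \begin{pmatrix} S(f, f') \\ S(f, f' f_1) \end{pmatrix}.$$

Denote the $(2 \times 2)$-matrix of this system by $A_1$.

The value $|C_1|$ determines whether $f = 0, f_1 > 0$ is consistent. The value $|\bar{C}_1|$ determines whether $f = 0, f_1 < 0$ is consistent. Hence, after the linear system is solved all consistent sign assignments will be listed.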
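The case of two polynomials can be illustrated by a short Python sketch, added here and not taken from the notes. It assumes exact rational arithmetic, coefficient lists indexed by degree, and that $f$ is squarefree and relatively prime with $f_1$; all function names (`sturm_query`, `sign_counts`, and the helpers) are hypothetical. The two right-hand sides are computed from sign changes at $\pm\infty$, and the two-by-two system is then solved by hand.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def poly_rem(a, b):
    """Remainder of Euclidean division of a by a nonzero trimmed b."""
    r = [Fraction(c) for c in a]
    while len(r) >= len(b):
        coef, shift = r[-1] / b[-1], len(r) - len(b)
        for i, c in enumerate(b):
            r[i + shift] -= coef * c
        r = trim(r)
    return r

def sign_changes_at_infinity(seq, side):
    signs = []
    for p in seq:
        s = 1 if p[-1] > 0 else -1
        if side < 0 and len(p) % 2 == 0:   # odd degree flips the sign at -infinity
            s = -s
        signs.append(s)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def sturm_query(f, g):
    """Sign changes at -infinity minus sign changes at +infinity
    in the sequence of negated remainders that starts with f, g."""
    seq = [trim(f), trim(g)]
    while seq[-1]:
        seq.append([-c for c in poly_rem(seq[-2], seq[-1])])
    seq.pop()                              # drop the final zero polynomial
    return sign_changes_at_infinity(seq, -1) - sign_changes_at_infinity(seq, +1)

def sign_counts(f, f1):
    """(number of roots of f with f1 > 0, number of roots of f with f1 < 0)."""
    s0 = sturm_query(f, derivative(f))                 # all real roots of f
    s1 = sturm_query(f, poly_mul(derivative(f), f1))   # Lemma 4.2
    return (s0 + s1) // 2, (s0 - s1) // 2

# f = X^3 - X has roots -1, 0, 1; f1 = 2X - 1 is positive only at the root 1.
print(sign_counts([0, -1, 0, 1], [-1, 2]))
```

Both sign conditions are consistent in this example, since both counts are nonzero.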

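4.4. Case $k = 2$

In this case we come to the following system of linear equations:

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} |C_1 \cap C_2| \\ |\bar{C}_1 \cap C_2| \\ |C_1 \cap \bar{C}_2| \\ |\bar{C}_1 \cap \bar{C}_2| \end{pmatrix} = \begin{pmatrix} S(f, f') \\ S(f, f' f_1) \\ S(f, f' f_2) \\ S(f, f' f_1 f_2) \end{pmatrix},$$

where

$$C_i = \{x \in \mathbb{R} : f(x) = 0,\ f_i(x) > 0\}, \qquad \bar{C}_i = \{x \in \mathbb{R} : f(x) = 0,\ f_i(x) < 0\}, \qquad i = 1, 2,$$

and $S(f, g)$ denotes the difference between the numbers of sign changes at $-\infty$ and at $+\infty$ in the Sturm’s sequence that starts with dividing $f$ by $g$.

Denote the $(4 \times 4)$-matrix of this system by $A_2$.

Observe that if this system of linear equations is true, then all consistent sign assignments for $f, f_1, f_2$ will be found.

Checking that the system is indeed true is straightforward:

(1) $|C_1 \cap C_2| + |C_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1 > 0$, while $|\bar{C}_1 \cap C_2| + |\bar{C}_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1 < 0$, so the left-hand part of the first equation is the whole number of roots of $f$;

(2) $|C_1 \cap C_2| + |C_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1 > 0$, while $|\bar{C}_1 \cap C_2| + |\bar{C}_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1 < 0$, so the left-hand part of the second equation equals the difference between these two quantities, which is $S(f, f' f_1)$ by Lemma 4.2;

(3) similar to (2);

(4) $|C_1 \cap C_2| + |\bar{C}_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1$ and $f_2$ having the same sign (i.e., $f_1 f_2 > 0$), while $|\bar{C}_1 \cap C_2| + |C_1 \cap \bar{C}_2|$ is the number of all roots of $f$ with $f_1$ and $f_2$ having distinct signs (i.e., $f_1 f_2 < 0$), so the left-hand part of the last equation is $S(f, f' f_1 f_2)$ by Lemma 4.2 (applied to $f_1 f_2$ instead of $f_1$).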

4.5. Tensor product of matrices

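Observe that $A_2 = A_1 \otimes A_1$, where $A_1 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ is the matrix of the system in Section 4.3, $A_2$ is the matrix of the system in Section 4.4, and $\otimes$ is the operation of tensor product.

Definition 4.3.

Let $A = (a_{ij})$ be an $(m \times n)$-matrix and $B = (b_{kl})$ be a $(p \times q)$-matrix. Then the tensor product of $A$ by $B$ is the $(mp \times nq)$-matrix

$$A \otimes B = \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{pmatrix},$$

where $a_{ij} B$ denotes the block obtained by multiplying every element of $B$ by $a_{ij}$.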
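The tensor product of matrices is also known as the Kronecker product. The short Python sketch below is an added illustration (the names `tensor_product` and `tensor_power` are hypothetical): matrices are lists of rows, and the block structure of the definition is reproduced by the nesting order of the loops.

```python
def tensor_product(A, B):
    """Tensor (Kronecker) product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def tensor_power(A, k):
    """A tensored with itself k times; the 1x1 identity for k = 0."""
    result = [[1]]
    for _ in range(k):
        result = tensor_product(result, A)
    return result

A1 = [[1, 1],
      [1, -1]]

for row in tensor_product(A1, A1):
    print(row)
```

The order of the $k$-th tensor power of a $(2 \times 2)$-matrix is $2^k$, which is the source of the exponential size of the straightforward linear system.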

4.6. Case of arbitrary $k$

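Let us consider now the case of a family $f, f_1, \ldots, f_k$ for arbitrary $k$. Define the matrix $A_k = A_1 \otimes \cdots \otimes A_1$ ($k$ times). Let $c$ be the vector of all possible elements of the form

$$|C_1^{*} \cap \cdots \cap C_k^{*}|,$$

where $C_i^{*} \in \{C_i, \bar{C}_i\}$; and let $s$ be the vector of all possible elements of the form $S(f, f' f_{i_1} \cdots f_{i_j})$ for all $\{i_1, \ldots, i_j\} \subset \{1, \ldots, k\}$. Herewith, the component of $c$ having “barred” sets $\bar{C}_{i_1}, \ldots, \bar{C}_{i_j}$ corresponds to the component $S(f, f' f_{i_1} \cdots f_{i_j})$ of $s$.

Lemma 4.4.

$A_k c = s$.

Proof. Straightforward observation (similar to the particular case of $k = 2$ in Section 4.4).

The square matrix $A_k$ is nonsingular, so the system has a unique solution, which describes all consistent sign assignments for $f, f_1, \ldots, f_k$. Solving the system by any of the standard methods (e.g., Gaussian elimination) we list all consistent sign assignments. Let us estimate the complexity of this algorithm.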

From the definition of tensor product it follows that the order of the matrix of the system is $2^k$. The vector of unknowns and the vector of right-hand sides obviously have $2^k$ components each. Each component of the right-hand side can be computed (in the real numbers model) by Euclidean division in time polynomial in $kd$, and we need to find $2^k$ components. In the bit model, supposing that the bit sizes of integer coefficients of the polynomials are less than $M$, the right-hand side can be computed with complexity polynomial in $M 2^k d$. The complexity of the Gaussian algorithm is polynomial in the order of the matrix, i.e., in $2^k$ (in the real numbers model) and $M 2^k$ (in the bit model). Thus, the total running time of finding all consistent sign assignments via the system is polynomial in $2^k d$ (real numbers) or in $M 2^k d$ (bit).

We want, however, to obtain an algorithm whose running time is polynomial in $kd$ (real numbers) and in $Mkd$ (bit).

4.7. Divide-and-conquer algorithm

A key observation to get a better algorithm is that since and , the number of roots of is less than , and therefore, the total number of all consistent sign assignments is less than .

It follows that among the components of the solution of the system there are less than nonzero ones. Thus, is equivalent to a much smaller (for large ) -linear system. Now the problem is how to construct that smaller system effectively.

We shall apply the usual divide-and-conquer recursion on $k$ (supposing for simplicity of notations that $k$ is even).

Suppose that we can do all the computation in the case of $k/2$. Namely, assume that we already have the solutions for two problems for two families of polynomials:

This means, we have constructed:

(1) two square nonsingular matrices and with elements from , of orders and respectively, ;

(2) two vectors and of sizes and respectively with elements of the forms

and

respectively, such that the systems have solutions respectively with nonzero coordinates only of the forms

and

respectively;

(3) solutions and which correspond to all consistent sign assignments for and respectively.

Now let us construct a new linear system. As a matrix take , which is a square nonsingular -matrix.

Two vectors are combined in a new column vector consisting of column vectors placed end-to-end, each of length .

The th component of is

where is the th component of , is the th component of , and denotes the symmetric difference of two sets of polynomials:

Finally, two vectors and are combined into a new column vector .

Components of are indexed by pairs . The th component of is where is the th component of and is th component of .

Lemma 4.5.

.

We don’t consider a proof of Lemma 4.5 in the course.

Observe that describes all consistent sign assignments for . However, unlike vectors and , the vector may have zero components. As was already noted, the new system has the order . It may happen that . In this case indeed contains zero components, not less than .

By solving system, we find , and after that reduce the system . Namely, we delete all zero components of and the corresponding columns of to obtain a rectangular system of order at most . After that, choose a basis among the rows of the matrix and form a new -matrix. Delete all components of not corresponding to basis rows.

The resulting system is the output of the recursive step of the algorithm.

4.8. Complexity of Ben-Or–Kozen–Reif algorithm

It’s easy to see that the algorithm has the running time polynomial in $kd$ (and $M$, in the bit model). Indeed, denoting the complexity function by $T(k)$, we get a functional inequality

$$T(k) \le 2\, T(k/2) + p(kd)$$

for a certain polynomial $p$, whose solution $T(k)$ is bounded by a polynomial in $kd$.

5. Systems of polynomial inequalities

5.1. Deciding consistency of systems of polynomial inequalities

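We address the basic problem of deciding consistency of systems of inequalities

$$f_1 \ge 0, \ldots, f_{k_1} \ge 0,\ f_{k_1+1} > 0, \ldots, f_k > 0. \tag{5.1}$$

Here polynomials $f_i \in \mathbb{Z}[X_1, \ldots, X_n]$ (for the bit model) or $f_i \in \mathbb{R}[X_1, \ldots, X_n]$ (for the real numbers model). A version of the algorithm presented below belongs to James Renegar.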

It is convenient to deal with systems of nonstrict inequalities only. Introduce a new variable and replace strict inequalities in (5.1) by

Obviously, (5.1) is consistent if and only if the new system is consistent. Changing the notations, we shall consider, in what follows, the problem of consistency of a system of nonstrict inequalities

$$f_1 \ge 0, \ldots, f_k \ge 0. \tag{5.2}$$

Because it is easy to check whether or not the origin is a solution of (5.2), we shall assume that the origin does not satisfy (5.2).

Finally, we suppose that the set $\{f_1 \ge 0, \ldots, f_k \ge 0\}$ is bounded, if nonempty. The unbounded case can be reduced to the bounded one by intersecting the set with a ball in $\mathbb{R}^n$ of a sufficiently large radius; we are not considering the details of this construction in the course.

5.2. Towards reduction to a system of equations

Let the degrees for all , and be the minimal even integer not less than . Introduce a new variable , and consider the polynomial

Lemma 5.1.

If (5.2) is consistent, then there is a solution of (5.2) which is a limit of a root of

(5.3)

as .

Proof.

Let . Then for all sufficiently small positive values of

hence . Fix for a time being one of these values of .

Let be any connected component of such that . Note that at all points the sign of is the same for each , hence .

Observe that is bounded. Indeed, suppose is not bounded. Then, there exists a point at an arbitrarily large distance from . In particular, for such there is at least one , with , having a large distance . It follows from basic facts of analytic geometry (Łojasiewicz inequality), that for sufficiently larger than , we have , i.e., . Since is connected, there is a connected (semialgebraic) curve containing and . Because and , there exists a point with . We get , which contradicts .

At every point in the boundary of polynomial . It follows that attains a local maximum at a point . It is well known that any such point is a root of the system of equations

Also, since , we have

Passing to limit in the latter system of inequalities, we get

where as .

The lemma is proved. ∎

5.3. Homogeneous polynomials and projective spaces

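Any polynomial $h \in \mathbb{C}[X_1, \ldots, X_n]$ is an expression of the form

$$h = \sum_{(i_1, \ldots, i_n)} a_{i_1 \ldots i_n} X_1^{i_1} \cdots X_n^{i_n}.$$

A polynomial $h$ is called a homogeneous polynomial or form if $i_1 + \cdots + i_n$ is a non-zero constant for all $(i_1, \ldots, i_n)$ with $a_{i_1 \ldots i_n} \neq 0$.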

Examples.

(i) Homogeneous polynomial: ;

(ii) Homogeneous polynomial: ;

(iii) Nonhomogeneous polynomial: .

Observe that zero is always a root of a homogeneous polynomial. Also, if $x = (x_1, \ldots, x_n)$ is a root of a homogeneous polynomial $h$, then for any $a \in \mathbb{C}$ the point $ax = (a x_1, \ldots, a x_n)$ is again a root of $h$. For a fixed $x \neq 0$ the set of all points of the form $ax$ is a straight line passing through the origin. So, if $x$ is a root of $h$, then the line passing through $x$ and the origin consists of roots of $h$. Thus, we can say that the line itself is a root of $h$.

This motivates the idea to consider (in the homogeneous case) a space of straight lines passing through the origin, instead of the usual space of points. It is called an $(n-1)$-dimensional projective space and is denoted by $\mathbb{P}^{n-1}(\mathbb{C})$. An element of $\mathbb{P}^{n-1}(\mathbb{C})$, which is a line passing through $x = (x_1, \ldots, x_n) \neq 0$, is denoted by $(x_1 : \cdots : x_n)$ (i.e., we use a colon instead of a comma). Of course, for any $a \neq 0$ the elements of $\mathbb{P}^{n-1}(\mathbb{C})$:

$$(x_1 : \cdots : x_n) \quad \text{and} \quad (a x_1 : \cdots : a x_n)$$

coincide. The usual space of points is called affine if there is a possibility of confusion.

Observe that our construction leads to a projective space of dimension $n - 1$, which is quite natural, as can be seen from the example of $\mathbb{P}^1(\mathbb{C})$. The latter is the space of all straight lines on the plane of (complex) coordinates $X_1, X_2$, passing through the origin. All elements of the projective space $\mathbb{P}^1(\mathbb{C})$, except one, correspond bijectively to all points of the line $\{X_2 = 1\}$. Namely, if $(x_1 : x_2) \in \mathbb{P}^1(\mathbb{C})$ and $x_2 \neq 0$, then $(x_1 : x_2) = (x_1 / x_2 : 1)$, and the corresponding point of $\{X_2 = 1\}$ is $(x_1 / x_2, 1)$. The only element for which this correspondence fails is $(x_1 : 0)$ (for all $x_1 \neq 0$ this is the same element, called the point at infinity). Thus, we have a parametrization of $\mathbb{P}^1(\mathbb{C})$ minus the point at infinity by points of the line. This justifies the definition of the dimension of $\mathbb{P}^1(\mathbb{C})$ as “one”.

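An arbitrary polynomial $h \in \mathbb{C}[X_1, \ldots, X_n]$ can be naturally homogenized. Namely, let $\deg(h) = d$. Introduce a new variable $X_0$ and consider the homogeneous polynomial $\mathrm{hom}(h) = X_0^d\, h(X_1 / X_0, \ldots, X_n / X_0)$. If $(x_1, \ldots, x_n)$ is a root of $h$, then $(1 : x_1 : \cdots : x_n)$ is a root of $\mathrm{hom}(h)$. On the other hand, if $(y_0 : y_1 : \cdots : y_n)$ is a root of $\mathrm{hom}(h)$ and $y_0 \neq 0$, then $(y_1 / y_0, \ldots, y_n / y_0)$ is a root of $h$. So, there is a bijective correspondence between roots of $h$ and roots of $\mathrm{hom}(h)$ with $y_0 \neq 0$. However, $\mathrm{hom}(h)$ can have a root of the form $(0 : y_1 : \cdots : y_n)$ (then, of course, at least one of $y_1, \ldots, y_n$ is different from 0). This root does not correspond to any root of $h$ (in the sense described above); it is called a “root at infinity”.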
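Homogenization is easy to illustrate on a sparse representation. In the following Python sketch, added for illustration, a polynomial is assumed to be a dictionary mapping exponent tuples $(i_1, \ldots, i_n)$ to nonzero coefficients; the function names `homogenize` and `evaluate` are hypothetical. The new variable is placed first, so an exponent tuple of the result has $n + 1$ entries.

```python
from fractions import Fraction

def homogenize(poly):
    """Multiply each monomial by the power of the new variable that brings
    its total degree up to the degree of the polynomial."""
    d = max(sum(exps) for exps in poly)
    return {(d - sum(exps),) + tuple(exps): c for exps, c in poly.items()}

def evaluate(poly, point):
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

# h = X1*X2 - 1 (a hyperbola); its homogenization is X1*X2 - X0^2.
h = {(1, 1): 1, (0, 0): -1}
H = homogenize(h)

root = (2, Fraction(1, 2))              # a root of h
print(evaluate(h, root))                # value of h at the affine root
print(evaluate(H, (1,) + root))         # the corresponding projective root
print(evaluate(H, (0, 1, 0)))           # a root at infinity
```

In this example the homogenized polynomial vanishes at $(0 : 1 : 0)$ and $(0 : 0 : 1)$, which correspond to the two asymptotic directions of the hyperbola and to no affine root.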

Example.

for any , is the only root at infinity of .

The relation described above between the roots of a polynomial and its homogenization can be extended to roots of systems of polynomial equations. A system can have no roots (i.e., be inconsistent), while its homogenization can have roots.

Example.

homogenization:

The first system is inconsistent, while the homogenized one has the unique root at infinity .

We conclude that the homogenized system has no fewer roots than the original system. This implies, for instance, that if the homogenized system has at most a finite number of roots, then the original system has at most a finite number of roots as well. We shall use this remark further on.

The reason for considering homogeneous equations in projective space instead of arbitrary ones in affine space, is that the projective case is much richer in useful properties.

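5.4. $u$-resultant

Consider a system of $n$ homogeneous polynomial equations in variables $X_0, X_1, \ldots, X_n$:

$$h_1 = \cdots = h_n = 0. \tag{5.4}$$

Let $\deg(h_i) = d_i$ for each $i$.

For some deep algebraic reasons, which we are not discussing in this course, the system (5.4) is always consistent in $\mathbb{P}^n(\mathbb{C})$. It can have either a finite or an infinite number of roots.

Introduce new variables $U_0, U_1, \ldots, U_n$.

Lemma 5.2.

There exists a homogeneous polynomial $R(U_0, \ldots, U_n)$, called the $u$-resultant, such that $R \equiv 0$ (is identically zero) if and only if (5.4) has an infinite number of roots. Moreover, if $R \not\equiv 0$ (and, thus, (5.4) has a finite number of roots), then it can be decomposed into linear factors:

$$R = \prod_{1 \le j \le D} \left( \xi_0^{(j)} U_0 + \xi_1^{(j)} U_1 + \cdots + \xi_n^{(j)} U_n \right),$$

where $(\xi_0^{(j)} : \cdots : \xi_n^{(j)})$, $1 \le j \le D$, are all the roots of (5.4) in $\mathbb{P}^n(\mathbb{C})$. The degree of $R$ is $D = d_1 \cdots d_n$.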

We are not considering the full proof of this important lemma in this course. At a certain point our algorithm will need to compute the $u$-resultant. At that point we shall give a sketch of the procedure together with some hints on a proof of Lemma 5.2.

5.5. System (5.3) has a finite number of roots

Lemma 5.3.

For all sufficiently small positive values of , if (5.2) is consistent, then (5.3) has a finite number of roots.

Proof.

We already know (Lemma 5.1) that if (5.2) is consistent, then (5.3) has at least one root. We also know (end of Section 5.3) that it’s sufficient to prove that the homogenization of (5.3) has finite number of roots in the projective space .

Let be the homogenization of . We shall now prove that

(5.5)

has finite number of roots in for all sufficiently small .

Homogeneous polynomial is of the form:

where is a homogeneous polynomial (w.r.t. variables ).

The degree of (w.r.t. ) is . Thus, for any fixed values of , we have as .

Observe that for the system (5.5) has the same set of roots as

(5.6)

The -resultant of (5.6) is either identically zero (as a polynomial in ) for all values of , or is identically zero for at most finite number of values of .

The first alternative does not occur, because for large positive values of , the system (5.6) tends to