# Counting Polynomial Roots in Isabelle/HOL: A Formal Proof of the Budan-Fourier Theorem

Many problems in computer algebra and numerical analysis can be reduced to counting or approximating the real roots of a polynomial within an interval. Existing verified root-counting procedures in major proof assistants are mainly based on the classical Sturm theorem, which only counts distinct roots. In this paper, we have strengthened the root-counting ability in Isabelle/HOL by first formally proving the Budan-Fourier theorem. Subsequently, based on Descartes' rule of signs and Taylor shift, we have provided a verified procedure to efficiently over-approximate the number of real roots within an interval, counting multiplicity. For counting multiple roots exactly, we have extended our previous formalisation of Sturm's theorem. Finally, we combine verified components in the developments above to improve our previous certified complex-root-counting procedures based on Cauchy indices. We believe those verified routines will be crucial for certifying programs and building tactics.

• 11 publications
• 22 publications
08/16/2022

### Sturm's Theorem with Endpoints

Sturm's Theorem is a fundamental 19th century result relating the number...
03/29/2019

### Corrigendum to "Counting Database Repairs that Satisfy Conjunctive Queries with Self-Joins"

The helping Lemma 7 in [Maslowski and Wijsen, ICDT, 2014] is false. The ...
09/23/2018

### A Revised and Verified Proof of the Scalable Commutativity Rule

This paper explains a flaw in the published proof of the Scalable Commut...
06/26/2019

### Counting Roots of a Polynomial in a Convex Compact Region by Means of Winding Number Calculation via Sampling

In this paper, we propose a novel efficient algorithm for calculating wi...
04/11/2018

### Evaluating Winding Numbers and Counting Complex Roots through Cauchy Indices in Isabelle/HOL

In complex analysis, the winding number measures the number of times a p...
10/19/2020

### On the Robustness of Winners: Counting Briberies in Elections

We study the parameterized complexity of counting variants of Swap- and ...
03/05/2018

### Natural Deduction and the Isabelle Proof Assistant

We describe our Natural Deduction Assistant (NaDeA) and the interfaces b...

## 1. Introduction

Counting the real and complex roots of a univariate polynomial has always been a fundamental task in computer algebra and numerical analysis. For example, given a routine that counts the number of real roots of a polynomial within an interval, we can compute each real root to arbitrary precision through bisection, in which sense we have solved the polynomial equation. Another example would be the Routh-Hurwitz stability criterion (Arnold, 1992, Section 23)(Marden, 1949, Chapter 9), where the stability of a linear system can be tested by deciding the number of complex roots of the characteristic polynomial within the left half-plane (left of the imaginary axis).

Numerous methods have been invented in the symbolic and numerical computing community to efficiently count (or test) real and complex roots of a polynomial (Yap and Sagraloff, 2011; Wilf, 1978; Collins and Krandick, 1992). However, in the theorem proving community, where procedures are usually formally verified in a foundational proof assistant (e.g., Coq, Isabelle, HOL and PVS), our choices are typically limited to relying on Sturm’s theorem to count distinct roots within an interval through signed remainder sequences.

In this paper, we aim to reinforce our root-counting ability in the Isabelle theorem prover (Paulson, 1994). In particular, our main contributions are the following:

• We have mechanised a proof of the Budan-Fourier theorem and a subsequent roots test based on Descartes’ rule of signs and Taylor shift. This roots test efficiently over-approximates the number of real roots within an open interval, counting multiplicity.

• We have made a novel extension to our previous formalisation of Sturm’s theorem to count real roots with multiplicity.

• Benefited from the developments above, we have extended our previous verified complex-root-counting procedures to more general cases: zeros on the border are allowed when counting roots in a half-plane; we can now additionally count roots within a ball.

All results of this paper have been formalised in Isabelle/HOL without using extra axioms, and the source code is available from the following URL:

https://bitbucket.org/liwenda1990/src-cpp-2019

To reuse the results in this paper, consult our entries in the Archive of Formal Proofs (Li, 2018, 2017), which we will keep updating.

This paper continues as follows: after introducing some frequently used notations (§2), we first present a formal proof of the Budan-Fourier theorem (§3), which eventually leads to the Descartes roots test. Next, we extend our previous formalisation of the classical Sturm theorem to count multiple real roots (§4). As an application, we apply those newly-formalised results to improve our previous complex-root-counting procedures (§5). After that, we discuss related work (§6) and some experiments (§7). Finally, we conclude the paper in §8.

## 2. Notations

Below, we will present definitions and proofs in both natural language and the formal language of Isabelle. Some common notations are as follows:

• we often use p and q in our proof scripts to denote polynomials. However, when presenting them in mathematical formulas, we will switch to capitalised letters— and .

• poly p a in Isabelle means : the value of the univariate polynomial evaluated at the point .

• denotes the multiplicity/order of as a root of the polynomial . In Isabelle, this becomes order a p.

• denotes the extended real numbers:

 ¯¯¯¯R=R∪{−∞,+∞}.
• and denote the number of real and complex roots of  counting multiplicity within the set , while in Isabelle they both correspond to proots_count p s—they are distinguished by the type of the polynomial p.

• Similarly, and denote the number of distinct real and complex roots of within . In Isabelle, they correspond to card (proots_within p s) differentiated by the type of the polynomial p.

Many of the theorems presented in this paper will involve sign variations, so we give a definition here:

###### Definition 2.1 (Sign variations).

Given a list of real numbers , we use to denote the number of sign variations after dropping zeros. Additionally, we abuse the notation by letting

 Var([P0,P1,...,Pn];a) =Var([P0(a),P1(a),...,Pn(a)]) Var([P0,P1,...,Pn];a,b) =Var([P0,P1,...,Pn];a) −Var([P0,P1,...,Pn];b),

where is a sequence of univariate polynomials, and is interpreted as the number of sign variations of evaluated at . Finally, given , we also use to denote sign variations of the coefficient sequence of :

 Var(P)=Var([a0,a1,...,an]).
###### Example 2.2 ().

By Definition 2.1, we can have calculations like the following :

 Var([1,−2,0,3])=Var([1,−2,3])=2,Var([x2,x−2];0,1)=Var([x2,x−2];0)−Var([x2,x−2];1)=Var([0,−2])−Var([1,−1])=0−1=−1,Var(1−x2+2x3)=Var([1,0,−1,2])=2.

## 3. From the Budan-Fourier Theorem to the Descartes Roots Test

In this section, we first formalise the proof of the Budan-Fourier theorem. We then apply this to derive Descartes’ rule of signs, which effectively over-approximates the number of positive real roots (counting multiplicity) of a real polynomial by calculating the sign variations of its coefficient sequence. We then use this to show the Descartes roots test111 There does not seem to be a uniform name for this test (Akritas et al., 2008; Akritas, 2008; Collins et al., 2002)—here we follow the one used in Arno Eigenwillig’s PhD thesis (Eigenwillig, 2008) where he refers this test as ”the Descartes test for roots”.: given a polynomial of degree and a bounded interval , we can apply Descartes’ rule of signs to a base-transformed polynomial

 (1) PI(x)=(x+1)nP(ax+bx+1),

to over-approximate the number of real roots of over . Note that the base transformation (1) is commonly referred as Taylor shift in the literature (Kobel et al., 2016).

Our formal proof of the Budan-Fourier theorem and Descartes’ rule of signs roughly follows the textbook by Basu et al. (Basu et al., 2006), while that of the Descartes roots test is inspired by various sources (Collins and Akritas, 1976; Eigenwillig, 2008; Kobel et al., 2016).

### 3.1. The Budan-Fourier Theorem

###### Definition 3.1 (Fourier sequence).

Let be a univariate polynomial of degree . The Fourier sequence of is generated through polynomial derivatives:

 Der(P)=[P,P′,...,P(n)].
###### Theorem 3.2 (The Budan-Fourier theorem).

Let , be two extended real numbers (i.e., ) such that . Through Fourier sequences and sign variations, the Budan-Fourier theorem over-approximates and the difference is an even number:

• and is even.

To prove Theorem 3.2, the key idea is to examine sign variations near a root of :

 Var(Der(P);c−ϵ,c)andVar(Der(P);c,c+ϵ),

where is a possible root of and is a small real number.

Regarding , the property we have derived in Isabelle/HOL is the following:

###### Lemma 3.3 (budanfourierauxleft).

fixes c d1real and preal poly  assumes d1  c and p  0  assumes nonzerox d1  x  x  c                q  set pders p poly q x  0  shows changesitvder d1 c p  order c p          even changesitvder d1 c p  order c p

where

• order c p is : the order/multiplicity of as a root of the polynomial p, as we described in §2;

• changes_itv_der d1 c p stands for ;

• the assumption non_zero asserts that (i.e., poly q x  0) for all and . Here, pders p is the Fourier sequence (i.e., ) and set (pders p) converts this sequence/list into a set of polynomials.

Essentially, in Lemma 3.3 can be considered as , since is asserted to be closer to from the left than any root of polynomials in . Therefore, Lemma 3.3 claims that always exceeds by an even number.

###### Proof of Lemma 3.3.

By induction on the degree of . For the base case (i.e., the degree of is zero), the proof is trivial since both and are equal to 0.

For the inductive case, through the induction hypothesis, we have

 (2) Var(Der(P′);d1,c)≥μ(c,P′)∧even(Var(Der(P′);d1,c)−μ(c,P′)).

First, we consider the case when . In this case, we can derive

 (3) μ(c,P)=μ(c,P′)+1,
 (4) Var(Der(P);d1)=Var(Der(P′);d1)+1,
 (5) Var(Der(P);c)=Var(Der(P′);c).

Combining (2), (3), (4) and (5) yields

 (6) Var(Der(P);d1,c)≥μ(c,P)∧even(Var(Der(P);d1,c)−μ(c,P)),

which concludes the proof.

As for , we can similarly have

 (7)
 (8)

where is the equivalence function in propositional logic. By putting together (3), (7) and (8), we can derive (6) through case analysis, and conclude the whole proof. ∎

Considering , we have an analogous proposition:

###### Lemma 3.4 (budanfourierauxright).

fixes c d2real and preal poly  assumes c  d2 and p  0  assumes x c  x  x  d2                q  set pders p poly q x  0    shows changesitvder c d2 p  0

which indicates that , since can be treated as .

###### Proof of Lemma 3.4.

Similar to that of Lemma 3.3: by induction on the degree of and case analysis. ∎

With Lemma 3.4, we can generalise Lemma 3.3 a bit by allowing in the assumption:

###### Lemma 3.5 (budanfourierauxleft).

fixes c d1real and preal poly  assumes d1  c and p  0  assumes nonzerox d1  x  x  c                q  set pders p poly q x  0  shows changesitvder d1 c p  order c p          even changesitvder d1 c p  order c p

###### Proof.

Let . Lemma 3.4 and 3.3 respectively yield

 (9) Var(Der(P);d1,d)=0,
 (10) Var(Der(P);d,c)≥μ(c,P)∧even(Var(Der(P);d,c)−μ(c,P)).

Moreover, by definition,

 (11) Var(Der(P);d1,c)=Var(Der(P);d1,d)+Var(Der(P);d,c).

From (9), (10) and (11), the conclusion follows. ∎

Finally, we come to our mechanised statement of the bounded interval case of Theorem 3.2:

###### Theorem 3.6 (budanfourierinterval).

fixes a breal and preal poly  assumes a  b and p  0  shows changesitvder a b p              prootscount p x a  x  x  b          even changesitvder a b p                  prootscount p x a  x  x  b

where prootscount p x a  x  x  b denotes , which is the number of real roots of (counting multiplicity) within the interval .

###### Proof of Theorem 3.6.

By induction on the number of roots of polynomials in within the interval . For the base case, we have

 (12) Q(x)≠0,for all x∈(a,b) and Q∈Der(P),

and then, by Lemma 3.5,

 (13) Var(Der(P);a,b)≥μ(b,P)∧even(Var(Der(P);a,b)−μ(b,P)).

 (14) NumR(P;(a,b])=μ(b,P).

We finish the base case by combining (13) and (14).

Regrading the inductive case, let be the largest root within the interval of the polynomials from :

 (15) b′=max{x∈(a,b)∣∃Q∈Der(P). Q(x)=0}.

With the induction hypothesis, we have

 (16) NumR(P;(a,b′])≤Var(Der(P);a,b′)∧even(NumR(P;(a,b′])−Var(Der(P);a,b′)).

Also, considering there is no root of within (otherwise it will be larger than , contradicting (15)), we have

 (17) NumR(P;(a,b])=NumR(P;(a,b′])+μ(b,P).

Finally, Lemma 3.5 yields

 (18) Var(Der(P);b′,b)≥μ(b,P)∧even(Var(Der(P);b′,b)−μ(b,P)).

Putting together (16), (17), and (18) finishes the proof. ∎

Note that Theorem 3.6 only corresponds to the bounded interval case of Theorem 3.2. In the formal development, we also have versions for , or both.

An interesting corollary of the Budan-Fourier theorem is that when all roots are real, the over-approxiamation (i.e., ) becomes exact:

###### Corollary 3.7 (budanfourierreal).

fixes a breal and preal poly  assumes allrootsreal p  shows     prootscount p x x  a  changesleder a p    a  b  prootscount p x a  x  x  b                              changesitvder a b p    prootscount p x b  x  changesgtder b p

where

• allrootsreal p is formally defined as every complex root of having a zero imaginary part,

• changesleder a p encodes ,

• changesitvder a b p encodes ,

• changesgtder b p encodes .

###### Proof of Corollary 3.7.

Let

 t1=Var(Der(P);−∞,a)−NumR(P;(−∞,a])t2=Var(Der(P);a,b)−NumR(P;(a,b])t3=Var(Der(P);a,b)−NumR(P;(b,+∞))

As a result of Theorem 3.2, we have

 (19) t1≥0∧t2≥0∧t3≥0.

Additionally, by the definition of we derive

 (20) Var(Der(P);−∞,a)+Var(Der(P);a,b)+Var(Der(P);a,+∞)=deg(P),

and the assumption (i.e., all roots are real) brings us

 (21) NumR(P;(−∞,a])+NumR(P;(a,b])+NumR(P;(b,+∞))=deg(P).

Joining (20) with (21) yields

 (22) t1+t2+t3=0.

Finally, putting (19) and (22) together concludes the proof. ∎

### 3.2. Descartes’ Rule of Signs

Given , and a polynomial , the Budan-Fourier theorem (Theorem 3.2) in the previous section grants us an effective way to over-approximate (by an even number) through calculating .

Nevertheless, the approximation still requires calculating a Fourier sequence () and a series of polynomial evaluations. When and , the approximation can be refined to counting the number of sign variations of the coefficient sequence of , which requires almost no calculation! Approximating using (rather than ) is the celebrated Descartes’ rule of signs:

###### Theorem 3.8 (descartessign).

fixes preal poly  assumes p  0  shows changes coeffs p              prootscount p x 0  x            even changes coeffs p                  prootscount p x 0  x

where changes coeffs p encodes —sign variations of the coefficient sequence of .

###### Proof.

Let . is as follows:

 (23) [a0+a1x+a2x2+⋯+xn−1xn−1+anxn,a1+2a2x+⋯+(n−1)an−1xn−1+nanxn−1,⋮(n−1)!an−1+n!anxn!an]

where is the factorial of . From (23), it can be derived that has no sign variation when evaluated at :

 (24) Var(Der(P);+∞)=0.

Also, evaluating at gives , hence its sign variations should equal :

 (25) Var(Der(P);0)=Var(P).

Joining (24) and (25) gives , with which we apply Theorem 3.2 to finish the proof. ∎

### 3.3. Base Transformation and Taylor Shift

Given , with Descartes’ rule of signs (Theorem 3.8) in the previous section, we can efficiently approximate . However, in many cases, we are interested in roots within a bounded interval: , where and . Can we still exploit the efficiency from Descartes’ rule of signs? The answer is yes, via an operation to transform into such that

 (26) NumR(P;I)=NumR(PI;(0,+∞)).

In order to develop this transformation operation, we first define the composition of a univariate polynomial and a rational function (i.e., a function of the form , where and are polynomials) in Isabelle/HOL: definition fcompose   a field poly  a poly  a poly  a poly   where    fcompose p q1 q2          fst foldcoeffs a r1r2              r2  a  q1  r1q2  r2 p 01 where

• q1 and q2 are respectively the numerator and denominator of a rational function,

• fst gives the first part of a pair,

• a is a constant polynomial lifted from the value .

Also, foldcoeffs is the classical foldr operation on a coefficient sequence: definition foldcoeffs      azero  b  b  a poly  b  b  where foldcoeffs f p  foldr f coeffs p Essentially, let , , and be three univariate polynomials over some field such that is of degree . Our composition operation over these three polynomials (i.e., fcompose p q1 q2) gives the following polynomial:

 (27) (Q2(x))nP(Q1(x)Q2(x)).

The idea of (27) can be illustrated by the following mechanised lemma:

###### Lemma 3.9 (polyfcompose).

fixes p q1 q2afield poly  assumes poly q2 x  0  shows poly fcompose p q1 q2 x                        poly p poly q1 x  poly q2 x                             poly q2 x  degree p

where poly p x gives the value of the polynomial p when evaluated at x.

When and , (27) yields a transformation (i.e., Taylor shift):

 (28) PI(x)=(x+1)nP(ax+bx+1),

with which we have achieved (26):

###### Lemma 3.10 (prootscountposinterval).

fixes a breal and preal poly  assumes p  0 and a  b  shows prootscount p x a  x  x  b              prootscount fcompose p ba 11                                           x 0  x

where

• ba encodes the polynomial ,

• 11 stands for the polynomial .

### 3.4. The Descartes Roots Test

Finally, we come to the Descartes roots test. Given , and , the Descartes roots test is : the number of sign variations on the coefficient sequence of the Taylor-shifted polynomial : definition descartesrootstest    real  real  real poly  nat   where    descartesrootstest a b p  nat changes                coeffs fcompose p ba 11 where

• fcompose p ba 11 encodes Taylor shift as in (28),

• coeffs converts a polynomial into its coefficient sequence,

• changes calculates the number of sign variations (i.e., ),

• nat converts an integer into a natural number.

Just like , whose root approximation property has been reflected in Theorem 3.6, has a similar theorem related to :

###### Theorem 3.11 (descartesrootstest).

fixes a breal and preal poly  assumes p  0 and a  b  shows prootscount p x a  x  x  b                      descartesrootstest a b p          even descartesrootstest a b p                    prootscount p x a  x  x  b

which claims that always exceeds by an even number.

###### Proof of Theorem 3.11.

Lemma 3.10 yields

 NumR(P;(a,b))=NumR(PI;(0,+∞)),

with which we apply Theorem 3.8 to conclude the proof. ∎

As an approximation, it is natural to ask when the Descartes roots test () is exact. From Theorem 3.11, it is easy to see that it would be exact as least when and . Also, it is analogous to Corollary 3.7 that is exact when all roots are real:

###### Corollary 3.12 (descartesrootstestreal).

fixes a breal and preal poly  assumes allrootsreal p and a  b   shows prootscount p x a  x  x  b                            descartesrootstest a b p

### 3.5. Remarks

Ever since the seminal paper by Collins and Akritas (Collins and Akritas, 1976), the Descartes roots test has been closely linked to modern real root isolation (Kobel et al., 2016; Eigenwillig, 2008; Rouillier and Zimmermann, 2004), where an effective method is needed for testing if an interval has zero or exactly one root. Although Sturm’s theorem (which has already been formalised in Isabelle (Li, 2014; Li and Paulson, 2016a; Eberl, 2015a)

) is also up to the task of root testing, it is considered too slow in modern computer algebra. Our mechanised version of the Descartes roots test is, by no means, state of the art; it is probably the most straightforward and naive implementation. Improvements over our current implementation are mainly about avoiding exact arithmetic, and the approaches include partial Taylor shift

(Kobel et al., 2016) and bitstream arithmetic (Eigenwillig, 2008, Chapter 3).

## 4. Extending Sturm’s Theorem to Exactly Count Multiple Roots

With the Descartes roots test we obtained from the previous section, we have an effective method to over-approximate the number of roots (with multiplicity) within an interval. However, we may sometimes want to know the exact number, as we will describe below (§5). For now, we only have the classical Sturm theorem available (in Isabelle/HOL), which only counts distinct real roots. In this section, we extend our previous formalisation of Sturm’s theorem so that we will be able to count roots with multiplicity and exactly.

Our mechanised proof follows Rahman and Schmeisser (Rahman and Schmeisser, 2002, Theorem 10.5.6).

###### Theorem 4.1 (Sturm’s theorem).

Let , such that , , and . Sturm’s theorem claims

 NumDR(P;(a,b))=Var(SRemS(P,P′);a,b)

where

• is the number of distinct roots of the polynomial within the interval ,

• is the first derivative of ,

• is as in Definition 2.1,

• is the signed remainder sequence:

 (29) [P1,P2,...,Pn],

such that , , (), and .

The core idea of our extended Sturm’s theorem is to extend the remainder sequence ():

###### Definition 4.2 (Extended signed remainder sequence).

Let . The extended signed remainder sequence

 SRemSE(P,Q)=[P1,P2,...,Pm]

is defined as , , and for :

 (30)

until such that by (30).

In Isabelle/HOL, and are respectively mechanised as smods and smods_ext: function smods     real poly  real poly  real poly list   where    smods p q  if p  0 then                    else p  smods q  p mod q                   function smodsext    real poly  real poly  real poly list   where     smodsext p q            if p  0 then                           else if p mod q  0 then               p  smodsext q  p mod q           else               p  smodsext q pderiv q           where [] is an empty list and is the Cons operation on lists—adding one element at the start of a list.

As extends (from the back), it is natural to consider as a prefix of :

###### Lemma 4.3 (smodsextprefix).

fixes p qreal poly  defines r  last smods p q   assumes p  0 and q  0  shows smodsext p q  smods p q                  tl smodsext r pderiv r

where

• last gives the last element of a list,

• @ concatenates two lists,

• tl removes the head of a list,

• pderiv returns the first derivative of a polynomial.

Moreover, we may need to realise that the last element of is actually the greatest common divisor (gcd) of and up to some scalar:

###### Lemma 4.4 (lastsmodsgcd).

fixes p qreal poly  defines r  last smods p q   assumes p  0  shows r  smult leadcoeff r gcd p q

where

• smult multiplies a polynomial with a scalar,

Finally, we can state (the bounded version of) our extended Sturm’s theorem:

###### Theorem 4.5 (sturmextinterval).

fixes a breal and preal poly  assumes a  b and poly p a  0       and poly p b  0  shows prootscount p x a  x  x  b             changesitvsmodsext a b p pderiv p

where changesitvsmodsext a b p pderiv p encodes . Essentially, Theorem 4.5 claims that under some conditions

 NumR(P;(a,b))=Var(SRemSE(P,P′);a,b).
###### Proof of Theorem 4.5.

By induction on the length of , and case analysis on whether . When , the proof is trivial since both and provided .

When , we let be the last element of , and Lemma 4.4 gives us

 (31) R=lc(R)gcd(P,P′),

where is the leading coefficient of .

The essential part of the proof is to relate and :

 NumR(P;(a,b)) (32) =∑x:P(x)=0∧x∈(a,b)μ(x,P) (33) =∑x:P(x)=0∧x∈(a,b)(1+μ(x,R)) (34) =NumDR(P;(a,b))+∑x:P(x)=0∧x∈(a,b)μ(x,R) (35) =NumDR(P;(a,b))+∑x:R(x)=0∧x∈(a,b)μ(x,R) (36) =NumDR(P;(a,b))+NumR(R;(a,b)).

In particular, (33) has been derived by

 μ(x,P)=1+μ(x,P′)=1+min(μ(x,P′),μ(x,P))=1+μ(x,gcd(P,P′))=1+μ(x,R),

provided and (31). Also, (35) is because and for all such that and . With (32) - (36), we have

 (37)

Moreover, the induction hypothesis yields

 (38) NumR(R;(a,b))=Var(SRemSE(R,R′);a,b),

and the classical Sturm theorem (Theorem 4.1) yields

 (39) NumDR(P;(a,b))=Var(SRemS(P,P′);a,b).

Also, by joining Lemma 4.3 and definition of , we may have

 (40) Var(SRemSE(P,P′);a,b)=Var(SRemS(P,P′);a,b)+Var(SRemSE(R,R′);a,b).

Finally, putting together (37), (38), (39), and (40) yields

 NumR(P;(a,b))=Var(SRemSE(P,P′);a,b),

concluding the proof. ∎

Be aware that Lemma 4.5 only corresponds to the bounded version of the extended Sturm’s theorem. Our formal development also contains unbounded versions (i.e., when or ).

## 5. Applications to Counting Complex Roots

In the previous sections (§3 and §4), we have demonstrated our enhancements for counting real roots in Isabelle/HOL. In this section, we will further apply those enhancements to improve existing complex-root-counting procedures (Li and Paulson, 2018).

In particular, we will first review the idea of counting complex roots through Cauchy indices in §5.1. After that, we will apply the extended Sturm’s theorem (§4) to remove the constraint of forbidding roots on the border when counting complex roots in the upper half-plane (§5.2). In §5.3, we will combine the improved counting procedure (for roots in the upper half-plane) and the base transformation in §3.3 to build a verified procedure to count complex roots within a ball. Finally, we give some remarks about counting complex roots (§5.4).

### 5.1. Number of Complex Roots and the Cauchy Index

In this section we will briefly review the idea of counting complex roots through Cauchy indices, For a more detailed explanation, the reader can refer to our previous work (Li and Paulson, 2018).

Thanks to the argument principle, the number of complex roots can be counted by evaluating a contour integral:

 (41) 12πi∮γP′(x)P(x)dx=N,

where , is the first derivative of and is the number of complex roots of (counting multiplicity) inside the loop . Also, by the definition of winding numbers, we have

 (42) n(P∘γ,0)=12πi∮γP′(x)P(x)dx,

where is function composition and is the winding number of the path around . Combining (41) and (42) enables us to count (complex) roots by evaluating a winding number:

 (43) N=n(P∘γ,0).

Now the question becomes how to evaluate the winding number . One of the solutions is to utilise the Cauchy index.

To define the Cauchy index, we need to first introduce the concept of jumps:

###### Definition 5.1 (Jump).

For and , we define

We can now proceed to define Cauchy indices by summing up these jumps over an interval and along a path.

###### Definition 5.2 (Cauchy index).

For and , the Cauchy index of over the interval is defined as

 Indba(f)=∑x∈[a,b)jump+(f,x)−∑x∈(a,b]jump−(f,x).
###### Definition 5.3 (Cauchy index along a path).

Given a path and a point , the Cauchy index along about is defined as

 Indp(γ,z0)=Ind10(f),

where

 f(t)=Im(γ(t)−z0)Re(γ(t)−z0).

As the Cauchy index captures the way that crosses the line , we can evaluate the winding number through the Cauchy index:

###### Theorem 5.4 ().

Given a valid path and a point , such that is a loop and is not on the image of , we have

 n(γ,z0)=−Indp(γ,z0)2.

Combining Theorem 5.4 and (43) gives us a way to count complex polynomial roots:

 (44) N=−Indp(P∘γ,z0)2.

What is more interesting is that (or ) can be calculated through remainder sequences and sign variations when (or ) is a rational function. That is, the right-hand side of (44) becomes executable, and we have a procedure to count .

### 5.2. Resolving the Root-on-the-Border Issue when Counting Roots within a Half-Plane

Fundamentally, the complex-root-counting procedure in the previous section relies on the winding number and the argument principle, both of which disallow roots of on the border . As a result, both mechanised procedures — counting roots within a rectangle and within a half-plane — in our previous work will fail whenever there is a root on the border. In this section, we will utilise our newly mechanised extended Sturm’s theorem to resolve the root-on-the-border issue when counting roots within a half-plane. Note that the root-on-the-border issue for the rectangular case, unfortunately, remains: we leave this issue for future work.

Considering that any half-plane can be transformed into the upper half-plane through a linear-transformation, we only need to focus on the upper-half-plane case: definition prootsupper  complex poly  nat     where  prootsupper p  prootscount p z Im z  0 where prootsupper p encodes

— the number of complex roots of within the upper half-plane .

Previously, we relied on the following lemma to count :

###### Lemma 5.5 (prootsuppercindexeq).

fixes pcomplex poly  assumes leadcoeff p  1       and norealroots xproots p Im x  0     shows prootsupper p  degree p                cindexpolyubd mappoly Im p                               mappoly Re p  2

where

• leadcoeff p  1 asserts the polynomial to be monic,

• the assumption no_real_roots asserts that does not have any root on the real axis (i.e., the border). This assumption is, as mentioned earlier, because the argument principle disallows roots on the border.

• cindexpolyubd mappoly Im p mappoly Re p encodes the Cauchy index

 Ind+∞−∞(λx.Im(P(x))Re(P(x))),

which can be computed (through remainder sequences and sign variations) due to being a rational function.

To solve the root-on-the-border issue in Lemma 5.5, we observe the effect of removing roots from a horizontal border:

###### Lemma 5.6 (cindexErootsonhorizontalborder).

fixes p q r complex poly and stcomplex     and a b sreal  defines linepath st st  ofreal s  assumes p  q  r and leadcoeff r  1       and xproots r Im x  Im st    shows cindexE a b t Im poly p   t                            Re poly p   t           cindexE a b t Im poly q   t                            Re poly q   t

where the polynomial is the result of after removing some roots on the horizontal border . Lemma 5.6 claims that

 Indab(λx.Im(P(γ(x))Re(P(γ(x)))=Indab(λx.Im(Q(γ(x))Re(Q(γ(x))).

That is, the Cauchy index will remain the same if we only drop roots on a horizontal border.

We can now refine Lemma 5.5 by dropping the no_real_roots assumption:

###### Lemma 5.7 (prootsuppercindexeq).

fixes pcomplex poly  assumes leadcoeff p  1  shows prootsupper p                degree p  prootscount p x Im x0               cindexpolyubd mappoly Im p                                 mappoly Re p 2

To compare Lemma 5.7 with Lemma 5.5, we may note there is an extra term prootscount p x Im x0 in the conclusion. This term encodes , and we can have

 (45) NumC(P;{z∣Im(z)=0})=NumR(gcd(Re(P),Im(P));(−∞,+∞)),

where are respectively the real and complex part of a complex polynomial such that . The rationale behind (45) is that each root of on the real axis () is actually real and is also a root of both and . More importantly, the right-hand side of (45) is where we will apply our extended Sturm’s theorem in §4.

###### Proof of Lemma 5.7.

Let be the polynomial after removing all the roots on the border (i.e., the real axis) such that

 (46) Im(z)≠0 whenever z is a complex root of Q.

By the definition of and (46), we can apply Lemma 5.5 to derive

 (47) NumC(P;{z∣Im(z)>0})=NumC(Q;{z∣Im(z)>0})=deg(Q)−Ind+∞−∞(λx.Im(Q(x)Re(Q(x)).

Moreover, and are related by the fundamental theorem of algebra:

 (48) deg(Q)=deg(P)−NumC(P;{z∣Im(z)=0}),

and Lemma 5.6 brings us the equivalence between two Cauchy indices:

 (49) Ind+∞−∞(λx.Im(Q(x)Re(Q(x))=Ind+∞−∞(λx.Im(P(x)Re(P(x)).

Putting (47), (48), and (49) together yields

 (50) NumC(P;{z∣Im(z)>0})=deg(P)−NumC(P;{z∣Im(z)=0})−Ind+∞−∞(λx.Im(P(x)Re(P(x)),

which concludes the proof. ∎

Finally, we can have a code equation (i.e., executable procedure) from the refined Lemma 5.7 to compute :

###### Lemma 5.8 (prootsuppercode1code).

prootsupper p      if p  0 then       let pm  smult inverse leadcoeff p p            pI  mappoly Im pm            pR  mappoly Re pm            g  gcd pI pR        in            nat degree p                      changesRsmodsext g pderiv g                     changesRsmods pR pI div 2                             else       Codeabort STR prootsupper fails when p0                                  prootsupper p

where

• pm is a monic polynomial produced by divided by its leading coefficient, and this monic polynomial is required by the assumption leadcoeff p  1 in Lemma 5.7.

• changesRsmodsext g pderiv g encodes

 Var(SRemSE(G,G′);−∞,+∞)where G=gcd(Re(P),Im(P)),

which computes in Lemma 5.7. Note that this part is where our extended Sturm’s theorem in §4 has been utilised.

• changesRsmods pR pI stands for

 Var(SRemS(PR,PI);−∞,+∞),

which computes in Lemma 5.7.

• The command Codeabort raises an exception in the case of .

Overall, Lemma 5.8 asserts that

 NumC(P;{z∣Im(z)>0})

is equivalent to an executable expression: the right-hand side of Lemma 5.8. Because it is declared as a code equation, it implements a verified procedure to compute . We can now type the following command in Isabelle/HOL: value prootsupper 1 2 1 to compute , which was not possible in our previous work (Li and Paulson, 2018) since the polynomial has a root on the border (i.e., the real axis).

### 5.3. Counting Roots within a Ball

In this section, we will introduce a verified procedure to count , the number of complex roots of a polynomial within the ball . The core idea is to use the base transformation in §3.3 to convert the current case to the one of counting roots within the upper half-plane, and then make use of the procedure in §5.2 to finish counting.

Let prootsball denote : definition prootsball    complex poly  complex  real  nat   where     prootsball p z0 r  prootscount p ball z0 r

With the transformation operation (fcompose) we developed in §3.3, we can derive the following equivalence relation in the number of roots:

###### Lemma 5.9 (prootsballplaneeq).

fixes pcomplex poly  assumes p  0  shows prootscount p ball 0 1              prootscount fcompose p 1 1                                          z 0  Im z

That is,

 (51) NumC(P;{z∣|z|<1})=NumC((i+x)nP(i−xi+x);{z∣Im(z)>0}),

where is the degree of .

Moreover, we can relate roots between different balls using normal polynomial composition:

###### Lemma 5.10 (prootsuballeq).

fixes pcomplex poly and z0complex and rreal  assumes p  0 and r  0  shows prootscount p ball z0 r                 prootscount p p z0 ofreal r                                            ball 0 1

where p encodes the composition operation between two polynomials. Overall, Lemma 5.10 claims

 (52)