Implementation of a Near-Optimal Complex Root Clustering Algorithm

06/27/2018 ∙ by Rémi Imbach, et al. ∙ CUNY Law School NYU college Technische Universität Kaiserslautern 0

We describe Ccluster, a software for computing natural ϵ-clusters of complex roots in a given box of the complex plane. This algorithm from Becker et al. (2016) is near-optimal when applied to the benchmark problem of isolating all complex roots of an integer polynomial. It is one of the first implementations (Irina Voiculescu informed us that her student Dan-Andrei Gheorghe has independently implemented the same algorithm in a Masters Thesis Project (May 18, 2017) at Oxford University. Sewon Park and Martin Ziegler at KAIST, Korea, have implemented a modified version of Becker et al.(2016) for polynomials having only real roots being the eigenvalues of symmetric square matrices with real coefficients) of a near-optimal algorithm for complex roots. We describe some low level techniques for speeding up the algorithm. Its performance is compared with the well-known MPSolve library and Maple.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The problem of root finding for a polynomial is a classical problem from antiquity, but remains the subject of active research to the present [6]. We consider a classic version of root finding:

Local root isolation problem: Given: a polynomial , a box , . Output: a set of pairwise-disjoint discs of radius , each containing a unique root of in .

It is local because we only look for roots in a locality, as specified by . The local problem is useful in applications (especially in geometric computation) where we know where to look for the roots of interest. There are several variants of this problem: in the global version, we are not given , signifying that we wish to find all the roots of . The global version is easily reduced to the local one by specifying a that contains all roots of . If we omit , it amounts to setting , representing the pure isolation problem.

Our main interest is a generalization of root isolation, to the lesser-studied problem of root clustering [10, 12, 8]. It is convenient to introduce two definitions: for any set , let denote the set of roots of in , and let count the total multiplicity of the roots in . Typically, is a disc or a box. For boxes and discs, we may write (for any ) to denote the dilation of by factor , keeping the same center. The following problem was introduced in [16]:

Local root clustering problem: Given: a polynomial , a box , . Output: a set of pairs where ’s are pairwise-disjoint discs of radius , for all , and

This generalization of root isolation is necessary when we consider polynomials whose coefficients are non-algebraic (or when is an analytic function, as in [16]). The requirement that ensures that our output clusters are natural [1]; a polynomial of degree has at most natural clusters (see [16, Lemma 1]). The local root clustering algorithm for analytic functions of [16] has termination proof, but no complexity analysis. By restricting to a polymomial, Becker et al. [2] succeeded in giving an algorithm and also its complexity analysis based on the geometry of the roots. When applied to the benchmark problem, where is an integer polynomial of degree with -bit coefficients, the algorithm can isolate all the roots of with bit complexity . Pan [13] calls such bounds near-optimal (at least when ). The clustering algorithm studied in this paper comes from [1], which in turn is based on [2]. Previously, the Pan-Schönhage algorithm has achieved near-optimal bounds with divide-and-conquer methods [13], but [2, 1] was the first subdivision algorithm to achieve the near-optimal bound for complex roots. For real roots, Sagraloff-Mehlhorn [15] had earlier achieved near-optimal bound via subdivision.

Why the emphasis on “subdivision”? It is because such algorithms are implementable and quite practical (e.g., [14]). Thus the near-optimal real subdivision algorithm of [15] was implemented shortly after its discovery, and reported in [11] with excellent results. In contrast, all the asymptotically efficient root algorithms (not necessarily near-optimal) based on divide-and-conquer methods of the last 30 years have never been implemented; a proof-of-concept implementation of Schönhage’s algorithm was reported in Gourdon’s thesis [9]). Computer algebra systems mainly rely on algorithms with a priori guarantees of correctness. But in practice, algorithms without such guarantees are widely used. For complex root isolation, one of the most highly regarded multiprecision software is MPSolve [3]. The original algorithm in MPSolve was based on Erhlich-Aberth (EA) iteration; but since 2014, a “hybrid” algorithm [4] was introduced. It is based on the secular equation, and combines ideas from EA and eigensolve  [7]. These algorithms are inherently global solvers (they must approximate all roots of a polynomial simultaneously). Another theoretical limitation is that the global convergence of these methods is not proven.

In this paper, we give a preliminary report about Ccluster, our implementation of the root clustering algorithm from [1].

Figure 1: Left: the connected components isolating all roots of the Bernoulli polynomial of degree 100. Right: the connected components isolating all roots of the Spiral polynomial of degree 64.

To illustrate the performance for the local versus global problem, consider the Bernoulli polynomials where ’s are the Bernoulli numbers. Figure 1(Left) shows the graphical output of Ccluster for . Table 1 has four timings (for ) in seconds: is the time for solving the local problem over a box ; is the time for the global problem over the box (which contains all the roots). The other two timings from MPSolve ( for unisolve, for secsolve) will be explained later. For each instance, we also indicate the numbers of solutions (#Sols) and clusters (#Clus). When #Sols equals #Clus, we know the roots are isolated. Subdivision algorithms like ours naturally solve the local problem, but MPSolve can only solve the global problem. Table 1 shows that MPSolve remains unchallenged for the global problem. But in applications where locality can be exploited, local methods may win, as seen in the last two rows of the table. The corresponding time for Maple’s fsolve is also given; fsolve is not a guaranteed algorithm and may fail.

Ccluster  local () Ccluster  global () unisolve secsolve fsolve
d (#Sols:#Clus) (depth:size) (s) (#Sols:#Clus) (depth:size) (s) (s) (s) (s)
64 (4:4) (9:164) 0.12 (64:64) (17:1948) 2.10 0.13 0.01 0.1
128 (4:4) (9:164) 0.34 (128:128) (16:3868) 9.90 0.55 0.05 6.84
191 (5:5) (9:196) 0.69 (191:191) (17:5436) 32.5 2.29 0.16 50.0
256 (4:4) (9:164) 0.96 (256:256) (17:7300) 60.6 3.80 0.37
383 (5:5) (9:196) 2.06 (383:383) (17:11188) 181 1.17
512 (4:4) (9:164) 2.87 (512:512) (16:14972) 456 3.63
767 (5:5) (9:196) 6.09 (767:767) (17:22332) 1413 10.38
Table 1: Bernoulli Polynomials with five timings: local (), global (), unisolve (), secsolve () and Maple’s fsolve().

Overview of Paper In Section 2, we describe the experimental setup for Ccluster. Sections 3-5 describe some techniques for speeding up the basic algorithm. We conclude with Section 6.

2 Implementation and Experiments

The main implementation of Ccluster is in C language. We have an interface for Julia222https://julialang.org/. Download our code in https://github.com/rimbach/Ccluster. . We based our big number computation on the arb333http://arblib.org/. Download our code in https://github.com/rimbach/Ccluster.jl. library. The arb library implements ball arithmetic for real numbers, complex numbers and polynomials with complex coefficients. Each arithmetic operation is carried out with error bounds.

Test Suite

We consider 7 families of polynomials, classic ones as well as some new ones constructed to have interesting clustering or multiple root structure.

  1. The Bernoulli polynomial of degree is described in Section 1.

  2. The Mignotte polynomial for a positive integer , has two roots whose separation is near the theoretical minimum separation bound.

  3. The Wilkinson polynomials .

  4. The Spiral Polynomial See Figure 1(Right) for .

  5. Wilkinson Multiple: . has degree where the root has multiplicity (for ).

  6. Mignotte Cluster: . This polynomial has degree (assuming ) and has a cluster of roots near and a cluster of roots near .

  7. Nested Cluster: has degree and is defined by induction on : with roots where . Inductively, if the roots of are , then we define See Figure 2 for the natural -clusters of .

Figure 2: Left: 3 clusters of found with . Right: Zoomed view of 9 clusters of found with . Note: The initial box is in thick lines; the thin lines show the subdivisions tree.

Timing

Running times are sequential times on a Intel(R) Core(TM) i3 CPU 530 @ 2.93GHz machine with linux. Ccluster implements the algorithm described in [1] with differences coming from the improvements described in Sections 3-5 below. Unless explicitly specified, the value of for Ccluster is set to ; roughly speaking, it falls back to asking for 15 guaranteed decimal digits.

MPSolve

For external comparison, we use MPSolve. It was shown to be superior to major software such as Maple or Mathematica [3]. There are two root solvers in MPSolve: the original unisolve [3] which is based on the Ehrlich-Aberth iteration and the new hybrid algorithm called secsolve [4]. These are called with the commands mpsolve -au -Gi -o -j1 and mpsolve -as -Gi -o -j1 (respectively). -Gi means that MPSolve tries to find for each root a unique complex disc containing it, such that Newton iteration is guaranteed to converge quadratically toward the root starting from the center of the disc. -o means that is used as an escape bound, i.e., the algorithm stops when the complex disc containing the root has radius less that , regardless of whether it is isolating or not. Unless explicitly specified, we set . -j1 means that the process is not parallelized. Although MPSolve does not do general local search, it has an option to search only within the unit disc. This option does not seem to lead to much improvement.

3 Improved Soft Pellet Test

is implicit argument Output:           ASSERT: if , then       , , ,       getApproximation( , )       TaylorShift( , )       While           Let be the -th Graeffe iteration of                     While               IntCompare( , ,) 444 compares -bit approximations of real numbers and . It returns true (resp. false) only if (resp. or and are closest than a constant). It returns unresolved when is too small to conclude.               While                                     getApproximation( , )                   TaylorShift( )                   Let be -th Graeffe iteration of                   IntCompare( , ,)               If then Return                                Return 

Figure 3: . is the absolute value of the coefficient of the monomial of degree of , for .

The key predicate in [1] is a form of Pellet test denoted (with implicit ). This is modified in Figure 3 by adding an outer while-loop to control the number of Graeffe-Dandelin iterations. We try to get a definite decision (i.e., anything other than a unresolved) from the soft comparison for the current Graeffe iteration. This is done by increasing the precision for approximating the coefficients of in the innermost while-loop. Thus we have two versions of our algorithm: (V1) uses the original in [1], and (V2) uses the modified form in Figure 3. Let V1 and V2 be timings for the 2 versions. Table 2 shows the time V1 (in seconds) and the ratio V1/V2. We see that (V2) achieves a consistent 2.3 to 3-fold speed up.

V1 V2 V3
(n1, n2, n3) V1 (n1, n2, n3) V1/V2 (n1, n2, n3) V1/V3
(2308,686,20223) 19.6 (2308,686,6028) 2.84 (2308,8,2291) 7.06
(2060,622,18018) 17.3 (2060,622,5326) 3.03 (2060,20,2080) 7.68
(2148,674,18053) 23.6 (2148,674,5692) 2.74 (2148,0,2140) 7.23
(2512,728,22176) 22.2 (2512,728,6596) 2.39 (2512,15,2670) 4.46
(724,202,6174) 9.69 (724,202,2684) 2.30 (724,18,2065) 3.37
(2092,618,18515) 20.0 (2092,618,5600) 3.00 (2092,12,2481) 6.57
(3532,1001,30961) 90.2 (3532,1001,9654) 3.09 (3532,24,4588) 6.81
Table 2: Solving within the initial box with with versions (V1), (V2) and (V3) of Ccluster. n1: number of discarding tests. n2: number of discarding tests returning -1 (inconclusive). n3: total number of Graeffe iterations. V1 (resp. V2, V3): sequential time for V1 (resp. V2, V3) in seconds.

In (V2), as in [1], we use (defined as ) to prove that a box has no root. We propose a new version (V3) that uses (defined as , where is the degree of ) instead of to achieve this goal: instead of just showing that has no root, it upper bounds . Although counter-intuitive, this yields a substantial improvement because it led to fewer Graeffe iterations overall. The timing for (V3) is V3, but we display only the ratio V1/V3 in the last column of Table 2. This ratio shows that (V3) enjoys a 3.3-7.7 fold speedup. Comparing for (V2) and (V3) explains this speedup.

4 Filtering

A technique for speeding up the evaluation of predicates is the idea of filters (e.g., [5]). The various Pellet tests can be viewed as a box predicate that maps a box to a value555 We treat two-valued predicates for simplicity; the discussion could be extended to predicates (like ) which returns a finite set of values. in . If is another box predicate with property that implies , we call a falsehood filter. If is efficient relatively to , and “efficacious” (informally, is likely to yield ), then it is useful to first compute . If , we do not need to compute . The predicate used in Ccluster  is defined as follows: is true if returns (then contains no root of ) and is false if returns or (then may contain some roots of ). We next present the falsehood filter for .

Let denote the Taylor shift of in , its -th Graeffe iterate, the -th coefficient of , and the absolute value of the -th coefficient. Let be the degree of . The assertion below is a direct consequence of the classical test of Pellet (see [2][p. 12]) and justify the correctness of our filters:
if then returns or .
Our filter computes , and and checks hypothesis of using IntCompare. and can respectively be computed as and . can be computed with the following well known formula:

(1)

Obtaining with eq. (1) requires to know coefficients of , coefficients of and finally coefficients of . In particular, it requires to compute entirely the iterations such that , and it is possible to do it more efficiently that with eq. (1) (for instance with the formula given in definition 2 of [2]).

Our filter takes as input a precision , the Taylor shift of the bit approximation of and its -th Graeffe iteration such that and . It computes , and the first coefficients of for with eq. (1). Then it checks the hypothesis of using IntCompare, and returns false if it is verified, and true otherwise. In practice, it is implemented within the procedure implementing .

Incorporating into Version (V3), we obtain (V4) and the speed up can be seen in Table 3. Filtering with becomes more effective as degree grows and this is because one has for smaller (recall that ).

V3 V4
n3 V3 n3 V3/V4
2291 2.61 2084 1.08
4496 14.5 3983 1.13
8847 94.5 7714 1.19
15983 620 11664 1.42
19804 1832 13863 1.53
2080 2.41 1808 1.22
3899 12.1 3257 1.21
7605 88.3 6339 1.33
15227 674 10405 1.57
2140 3.27 1958 1.05
2240 10.0 1942 1.09
2414 36.6 2108 1.21
2557 129 1841 1.43
2670 4.43 2364 1.08
5090 28.8 4405 1.07
9746 182 8529 1.10
19159 1340 14786 1.19
2065 2.87 1818 1.14
2313 3.95 2053 1.12
2649 5.89 2336 1.18
2892 8.56 2537 1.29
2481 2.94 2145 1.13
4166 14.4 3555 1.16
7658 86.0 6523 1.27
15044 650 10472 1.63
1628 0.77 1459 1.07
4588 13.2 4085 1.12
13056 358 11824 1.26
Table 3: Solving within the initial box with with versions (V3), (V4) of Ccluster. n3: number of Graeffe iterations. V3 and V4: sequential time in seconds.

5 Escape Bound

The parameter is usually understood as the precision desired for roots. But we can also view it as an escape bound for multiple roots as follows: we do not refine a disc that contains a simple root, even if its radius is . But for clusters of size greater than one, we only stop when the radius is . MPSolve has a similar option. This variant of (V4) is denoted (V4’). We see from Table 4 that (V4’) gives a modest improvement (up to 25% speedup) over (V4) when . This improvement generally grows with (but shows no difference).

(V4) (V4’)
:   
53 (s) 530/53 5300/53 53 (s) 530/53 5300/53
2.42 1.26 4.22 1.99 0.94 0.94
1.97 1.63 4.56 1.61 1.45 1.38
3.22 1.10 2.16 2.91 0.96 1.01
4.09 1.33 5.25 3.05 0.95 0.95
2.51 1.12 2.03 2.50 1.13 1.98
2.60 1.89 4.15 2.20 1.70 1.80
11.9 1.08 2.67 10.4 1.00 0.99
Table 4: Solving within the box with versions (V4) and (V4’) of Ccluster  with three values of . 53 (resp. 530, 5300): sequential time for (V4) and (V4’) in seconds.

6 Conclusion

Implementing subdivision algorithms is relatively easy but achieving state-of-art performance requires much optimization and low-level development. This paper explores several such techniques. We do well compared to fsolve in Maple, but the performance of MPSolve is superior to the global version of Ccluster. But Ccluster can still shine when looking for local roots or when is large.

References

  • [1] R. Becker, M. Sagraloff, V. Sharma, J. Xu, and C. Yap. Complexity analysis of root clustering for a complex polynomial. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, pages 71–78. ACM, 2016.
  • [2] R. Becker, M. Sagraloff, V. Sharma, and C. Yap. A near-optimal subdivision algorithm for complex root isolation based on the pellet test and newton iteration. Journal of Symbolic Computation, 86:51–96, 2018.
  • [3] D. A. Bini and G. Fiorentino. Design, analysis, and implementation of a multiprecision polynomial rootfinder. Numerical Algorithms, 23(2-3):127–173, 2000.
  • [4] D. A. Bini and L. Robol. Solving secular and polynomial equations: A multiprecision algorithm. Journal of Computational and Applied Mathematics, 272:276–292, 2014.
  • [5] H. Brönnimann, C. Burnikel, and S. Pion. Interval arithmetic yields efficient dynamic filters for computational geometry. Discrete Applied Mathematics, 109(1-2):25–47, 2001.
  • [6] I. Z. Emiris, V. Y. Pan, and E. P. Tsigaridas. Algebraic algorithms. In Computing Handbook, Third Edition: Computer Science and Software Engineering, pages 10: 1–30. Chapman and Hall/CRC, 2014.
  • [7] S. Fortune. An iterated eigenvalue algorithm for approximating roots of univariate polynomials. Journal of Symbolic Computation, 33(5):627–646, 2002.
  • [8] M. Giusti, G. Lecerf, B. Salvy, and J.-C. Yakoubsohn. On location and approximation of clusters of zeros of analytic functions. Foundations of Computational Mathematics, 5(3):257–311, 2005.
  • [9] X. Gourdon. Combinatoire, Algorithmique et Géométrie des Polynomes. PhD thesis, École Polytechnique, 1996.
  • [10] V. Hribernig and H. J. Stetter. Detection and validation of clusters of polynomial zeros. Journal of Symbolic Computation, 24(6):667–681, 1997.
  • [11] A. Kobel, F. Rouillier, and M. Sagraloff. Computing real roots of real polynomials… and now for real! In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, pages 303–310. ACM, 2016.
  • [12] X.-M. Niu, T. Sakurai, and H. Sugiura. A verified method for bounding clusters of zeros of analytic functions. Journal of computational and applied mathematics, 199(2):263–270, 2007.
  • [13] V. Y. Pan. Univariate polynomials: nearly optimal algorithms for numerical factorization and root-finding. Journal of Symbolic Computation, 33(5):701–733, 2002.
  • [14] F. Rouillier and P. Zimmermann. Efficient isolation of polynomial’s real roots. Journal of Computational and Applied Mathematics, 162(1):33–50, 2004.
  • [15] M. Sagraloff and K. Mehlhorn. Computing real roots of real polynomials. Journal of Symbolic Computation, 73:46–86, 2016.
  • [16] C. Yap, M. Sagraloff, and V. Sharma. Analytic root clustering: A complete algorithm using soft zero tests. In Conference on Computability in Europe, pages 434–444. Springer, 2013.