1 Introduction
This
paper
considers the fundamental problem of finding the complex
solutions of a system f(z)=0
of n polynomial equations in n complex variables
z=(z1,…,zn). The system f=(f1,…,fn):Cn→Cn is triangular in the sense
that fi∈C[z1,…,zi] for 1≤i≤n,
where dzi(fi)≥1.
As in [7], we assume
that the system is regular: this means that
for each i, if (α1,…,αi−1) is a zero of
fi−1 and ci(z1,…,zi−1) is the leading coefficient
of zi in fi, then ci(α1,…,αi−1)≠0.
Thus f is a 0dimensional system.
But unlike [7], we
do not assume that the system is squarefree:
indeed the goal of this paper is to demonstrate
new techniques that can properly
determine the multiplicity of the root clusters of f,
up to any ϵ>0 resolution.
Throughout
this paper, we use boldface symbols to denote vectors and tuples; for
instance
0 stands for
(0,…,0).
We are interested in finding clusters of solutions of triangular systems
and in counting the total multiplicity of solutions in clusters.
Solving triangular systems is a fundamental task in polynomial equations
solving, since there are many algebraic techniques to
decompose the original system into triangular systems.
The problem of isolating the complex solutions of a polynomial system
in an initial regionofinterest (ROI) is defined as follows: let
Zero(B,f) denote the set of solutions of f in
B, regarded as
a multiset.
Local Isolation Problem (LIP):
Given: a polynomial map f:Cn→Cn, a polybox B⊂Cn, ϵ>0
Output: a set {Δ1,…,Δl} of
pairwise disjoint polydiscs of radius ≤ϵ where
Output:  Zero(B,f)=⋃lj=1Zero(Δj,f).
Output:  each Zero(Δj,f) is a
singleton.
This is “local” because we restrict attention to roots in a ROI
B. There are two issues with (LIP) as formulated above:
deciding if Zero(Δj,f) is a singleton, and deciding if
such a singleton lies in B, are two “zero problems” that
require exact computation. Generally, this can only be decided if
f is algebraic. Even in the algebraic case, this may be very
expensive. In [27, 4] these
two issues are sidestepped by defining the local clustering problem
which is described next.
Before proceeding, we fix some general notations for this paper. A
polydisc Δ is a vector (Δ1,…,Δn) of
complex discs. The center of Δ is the vector of the
centers of its components and the radius r(Δ)
of Δ
is the vector of the radii
of its components. If δ is any positive real number, we denote
by δΔ the polydisc
(δΔ1,…,δΔn) that has the same center
as Δ and radius δr(Δ).
We also say r(Δ)≤δ if each component of
r(Δ) is ≤δ. A (square complex) box B is
a complex interval [ℓ1,u1]+i([ℓ2,u2]) where
u2−ℓ2=u1−ℓ1 and i:=√−1; the width
0ptB of B is u1−ℓ1 and the center of B is
u1+0ptB2+i(u2+0ptB2). A
polybox
B⊆Cn
is the set ∏ni=1Bi which is represented by the
vector (B1,…,Bn) of boxes.
The center of B is the vector of the centers of its
components; the width 0ptB of B is the max of
the widths of its components.
If δ is any positive real number, we denote by δB
the polybox (δB1,…,δBn) that has the same
center than B and width δ0ptB.
It is also convenient to identify Δ as the subset
∏ni=1Δi of Cn; a similar remark applies
to B.
We introduce three notions to define the local solution clustering
problem. Let a∈Cn be a solution of
f(z)=0.
The multiplicity of a in f, also called the
intersection multiplicity of a in f is classically
defined by localization of rings as in [28, Def. 1,
p. 61], we denote it by #(a,f). An
equivalent definition uses dual spaces, see [12, Def. 1,
p. 117].
For any set S⊆Cn, we denote by Zero(S,f) the
multiset of zeros (i.e., solutions) of f in S, and
#(S,f) the total multiplicity of Zero(S,f).
If S is a polydisc, we
call Zero(S,f) a cluster if it is nonempty, and S is
an isolator of the cluster. If in addition, we have that
Zero(S,f)=Zero(3⋅S,f), we call
Zero(S,f) a natural cluster and call S a natural
isolator.
In the context of numerical algorithm, the notion of cluster of
solutions is more meaningful than that of solution with multiplicity
since the perturbation of a multiple solution generates a cluster. We
thus “soften” the problem of isolating the solutions of a triangular
system of polynomial equations while counting their multiplicities by
translating it into the local solution clustering problem defined as
follows:
Local Clustering Problem (LCP):
Given: a polynomial map f:Cn→Cn, a polybox B⊂Cn, ϵ>0
Output: a set of pairs
{(Δ1,m1),…,(Δl,ml)} where:
Output:  the Δjs are pairwise disjoint
polydiscs of radius ≤ϵ,
Output:  each mj=#(Δj,f)=#(3Δj,f)
Output:  Zero(B,f)⊆⋃lj=1Zero(Δj,f)⊆Zero(2B,f).
In this (LCP) reformulation of (LIP), we have removed the two “zero
problems” noted above: we output clusters to avoid the first problem,
and we allow the output to contain zeroes outside the ROI B to
avoid the second one. We choose 2B for simplicity; it is easy
to replace the factor of 2 by 1+δ for any desired δ>0.
Overview.
In the remaining of this section we explain our
contribution, summarize previous work and the local univariate
clustering method of [4]. In
Sec. 2, we define the notion of tower
of clusters together with a recursive method to compute the sum of
multiplicities of the solutions it contains.
Sec. 3 analyzes the loss of precision induced by
approximate specialization.
Our algorithm for solving
the local clustering problem for triangular systems is
introduced in Sec. 4. The implementation and
experimental results are presented in Sec. 5.
1.1 Our contributions
We propose an algorithm for solving the complex clustering
problem for a triangular system f(z)=0 with a
zerodimensional solution set. To this end, we propose a formula to
count the sum of multiplicities of solutions in a cluster.
Our formula is derived from a result of [28] that
links the intersection multiplicity of a solution of a triangular
system to multiplicities in fibers.
We define towers of clusters to encode clusters of solutions
of a triangular system in stacks (or towers) of clusters of roots of
univariate polynomials.
Our algorithm exploits the triangular form of f=(f1,…,fn):
the standard idea is to recursively find roots
of the form (α1,…,αn−1) of
f1=⋯=fn−1=0, then substituting them into
fn to obtain a univariate polynomial
gn(zn)=fn(α1,…,αn−1,zn).
If αn is a root of gn(zn), then we
have have extended the solution to (α1,…,αn)
of the original f.
The challenge is to extend this idea to compute clusters
of zeros of f from clusters of
zeros of f1=⋯=fn−1=0.
Moreover, we want to allow
the coefficients of each fi to be oracle numbers.
The use of oracle numbers allows us to treat polynomial
systems whose coefficients are algebraic numbers and beyond.
To compute clusters of roots of a univariate polynomial given as an
oracle, we rely on the recent algorithm described in
[4],
based on a predicate introduced in [5] that combines
Pellet’s theorem and Graeffe iterations to determine the number of
roots counted with multiplicities in a complex disc; this predicate is
called soft because it only requires the polynomial to be known
as approximations. It is used in a subdivision framework combined with
Newton iterations to achieve a near optimal complexity.
We implemented our algorithm and made it available
as the Julia
package Ccluster.jl .
Our experiments show that it advantageously compares
to major homotopy solvers
for solving random dense triangular systems
in terms of solving times and reliability
(i.e. getting the correct number of solutions
and the correct multiplicity structures).
Homotopy solving is more general because
it deals with any polynomial system.
We also propose experiments with
triangular systems obtained with elimination
procedures.
1.2 Related work
There is a vast literature on solving polynomial systems and we can only refer
to book surveys and references therein, see for instance
[13, 25]. On the algebraic side,
symbolic tools like Groebner basis, resultant, rational univariate
parametrization or triangularization,
find an equivalent triangular system or set
thus reducing the problem to the univariate case.
Being symbolic,
these methods handle all input, in particular with solutions with
multiplicities, and are certified but at the price of a high complexity that
limits their use in practice.
Implementations of hybrid symbolicnumeric solvers are available
for instance in Singular
via solve.lib
or in Maple via RootFinding[Isolate].
On the numerical side, one can find subdivision and homotopy methods. The main
advantage of subdivision methods is their locality: the practical complexity
depends on the size of the solving domain and the number of solutions in this
domain. Their main drawback is that they are only practical for low dimensional
systems. On the other hand, homotopy methods are efficient for high dimensional
systems, they are not local but solutions are computed independently from one
another.
Numerical methods only work for restricted classes of systems and the
certification of the output remains a challenge. Multiprecision arithmetic, interval
analysis, deflation and αtheory are now classical tools to address this
certification issue [15, 22, 6, 26].
In the univariate case, practical certified algorithms are now available for
real and complex solving that match the best known complexity bounds together
with efficient implementations [19, 17].
For the bivariate case, the problem of solving a triangular system can be seen
as a univariate isolation in an extension field. The most recent contributions
in this direction presenting algorithms together with complexity analysis are
[23, 24].
Only a few work address the specific problem of solving triangular polynomial
systems. The solving can then be performed coordinate by coordinate by
specialization and univariate solving in fibers. When the systems only have
regular solutions, extensions of classical univariate isolation algorithms to
polynomial with interval coefficients have been proposed
[9, 14, 7]. In the presence of multiple
solutions, one approach is to use a symbolic preprocessing to further decompose
the system in regular subsystems. Another approach is the sleeve method with
separation bounds [8]. The authors of [28]
propose a formula to compute the multiplicity of a solution of a triangular
system: the latter multiplicity is the product of the multiplicities of the
components of a solution in the fibers. Then, by using square free factorization
of univariate polynomials specialized in fibers, they describe an algorithm to
retrieve the real solutions of a triangular system with their multiplicities.
In [20], the method of Local Generic Position is adapted to the
special case of triangular systems with the advantage of only using resultant
computations (instead of Goebner basis), multiplicities are also computed.
1.3 Definitions and Notation
Convention for Vectors.
We introduce some general conventions for vectors that will
simplify the following development.
Vectors are indicated by bold fonts. If v=(v1,…,vn)
is an nvector, and i=1,…,n, then the ith component
vi is
denoted vi and
the ivector (v1,…,vi) is denoted v(i).
Thus v=(v(n−1),vn), and
“v=v(n)” is an idiomatic way
of saying that v is an nvector.
Because of the subscript convention, we
will superscripts such as v1,v2, etc, to distinguish
among a set of related nvectors.
Normed Vector Spaces.
In order to do error analysis, we need to treat
C[z] and Cn as normed vector spaces:
for f∈C[z] and b∈Cn,
let
∥f∥ and ∥b∥ denote the infinity norm
on polynomials and vectors, respectively.
We use the following perturbation convention:
let δ≥0. Then we will write
f±δ to denote some polynomial
˜f∈C[z] that satisfies ∥f−˜f∥≤δ.
Similarly, b±δ denotes some vector ˜b∈Cn
that satisfies ∥b−˜b∥≤δ.
If δ≤2−L then ˜b and ˜f
are called Lbit approximations
of b and f, respectively.
We define the degree sequence of f∈C[z]
to be d=d(f) where di is the degree of zi in f.
If b∈Ck (k=1,…,n), let
f(b) denote the polynomial that results
from the substitution
zi→bi (for i=1,…,k).
The result is a polynomial
f(b)∈C[zk+1,…,zn] called the
specialization of f by b.
Note that f(b) is a polynomial in at most n−k variables.
In particular, when n=k, then f(b) is a constant
(called the evaluation of f at b).
For instance, suppose b∈Cn,
then f(b(n−1)) is a polynomial in zn
and f(b(n−1))(bn)=f(b).
If B⊆C is a box
with center c and width w,
we denote by Δ(B) the disc with center c and
radius 34w. Note that Δ(B) contains B.
If B⊂Cn is a polybox,
let Δ(B) be the polydisc
where Δ(B)i=Δ(Bi).
Oracle Computational Model.
We use two kinds of numbers in our algorithms:
an explicit kind which is standard in computing,
and an implicit kind which we call “oracles”.
Our explicit numbers are dyadic numbers (i.e., bigFloats),
D:={n2m:n,m∈Z}. A pair (n,m)
of integers represents the nominal value of n2m∈D.
However, we also want this pair to represent the interval
[(n−12)2m,(n+12)2m]. To distinguish between them,
we write (n,m)0 for the nominal value, and (n,m)1
for the interval of width 2m.
Call (n,m)1 an Lbit dyadic interval
if m≤−L (so the interval has width at most 2−L).
Note that (2n,m)1 and (n,m+1)1 are different despite having the
same nominal value. As another example, note that (0,m)1
is the interval [−2m−1,2m−1].
When we say a box, disc, polybox, etc, is dyadic, it means that
all its parameters are given by dyadic numbers.
The set of closed intervals with dyadic endpoints is
denoted \framebox(4.0,7.0)D. Also, let \framebox(4.0,7.0)nD denote
ndimensional dyadic boxes.
The implicit numbers in our algorithms are functions: for any
real number x∈R, an oracle for x is a function
O:Z→\framebox(4.0,7.0)D such that Ox(L) is an Lbit
dyadic interval containing x. There is normally no confusion in
identifying the real number x with any
oracle function Ox for x. Moreover, we write (x)L
instead of Ox(L).
E.g., if x is a real algebraic number with defining polynomial
p∈Z[X] and isolating interval I, we may define
an oracle Ox=O(p,I) for x in a fairly standard way.
Next, an oracle Oz for a complex number z=x+iy
is a function Oz:Z→\framebox(4.0,7.0)2D such that
Oz(L)=Ox(L)+iOy(L) where
Ox,Oy are oracles for x and y. Again, we may
identify z with any oracle Oz, and write (z)L
instead of Oz(L). For polynomials f∈C[z(n)]
in n≥2 variables, we assume a sparse representation,
f=∑α∈Supp(f)fαzα
with fixed support Supp(f)⊆Nn,
with coefficients fα∈C∖{0}, and
zα:=∏ni=1zαii
are power products.
An oracle Of for f amounts to having oracles for
each coefficient fα of f. Moreover Of(L)
may be written (f)L
is the interval polynomial whose coefficients are (fα)L.
Call (f)L a dyadic interval polynomial.
1.4 Oracles for Root Cluster of Univariate Polynomials
The starting point for this paper is the fundamental result
that the Local Clustering Problem (LCP) has been
solved in the univariate setting:
Proposition 1 (See [4, 5])
There is an algorithm Cluster(f,B,ϵ)
that solves the Local Clustering Problem
when f:C→C is a univariate oracle polynomial.
In other words, the output of
Cluster(f,B,ϵ) is a set {(Δi,mi):i=1,…,k}
such that each Δi is a natural ϵisolator, and

Zero(B,f)⊆k⋃i=1Zero(Δi,f)⊆Zero(2B,f). 

To make this result the basis of our multivariate
clustering algorithm, we need to generalize this result.
In particular, we need to be able to further
refine each output (Δi,mi) of this algorithm.
If (Δi,mi) represents the cluster Ci of roots,
we want to get better approximation of Ci, i.e.,
we want to treat Ci like number oracles.
Fortunately, the algorithm in [4, 5]
already contains the tools to do this. What is lacking is
a conceptual framework to capture this.
Our goal is to extend the concept of number oracles
to “cluster oracles”. To support the several modifications
which are needed, we revise our previous view of “oracles as
functions”. We now think of an oracle
O as a computational object with state information,
and which can transform itself in order to
update its state information.
For any L∈Z, recall that O(L) is
a dyadic object that is at least Lbit accurate.
E.g., if O is the oracle for x∈R,
O(L) is an interval containing x of width ≤2−L.
But now, we say that oracle is transformed to a
new oracle which we shall denote by “(O)L”
whose state information is O(L). In general,
let σ(O) denote the state information in O.
Next,
for a cluster C⊆C of roots of a univariable polynomial
p(z)∈C[z], its oracle OC has state σ(OC)
that is a pair (Δ,m) where Δ⊆C is a dyadic disc
satisfying C=Zero(Δ,p)=Zero(3Δ,p)
and m is the total multiplicity of C.
Thus C is automatically a natural cluster.
We say OC is Lbit accurate if the radius
of Δ is at most 2−L.
Intuitively, we expect (OC)L to be
an oracle for C that is Lbit accurate.
Unfortunately, this may be impossible unless C is a singleton
cluster. In general, we may have to split C into two
or more clusters. We therefore need one more extension:
the map L↦(OC)L returns a set

{OC1,…,OCk},(for some k≥1) 

of cluster oracles with the property that
C=∪ki=1Ci (union of multisets), and
each OCi is Lbit accurate.
We generalize Proposition 1 so that it outputs
a collection of cluster oracles:
Proposition 2 (See [4, 5, 17])
Let Of be an oracle for a univariate polynomial
f:C→C.
There is an algorithm ClusterOracle(Of,B,L)
that returns a set {OCi:i=1,…,k}
of cluster oracles such that

Zero(B,f)⊆k⋃i=1Ci⊆Zero(2B,f). 

and each OCi is Lbit accurate.
3 Error Analysis of Approximate Specializations
The proofs for this section is found in the Appendix.
Given f,˜f∈C[z]=C[z(n)] and b,˜b∈Cn,
our basic goal is to bound the evaluation error
in terms of
δf:=∥f−˜f∥
and
δb:=∥b−˜b∥.
This will be done by induction on n.
Our analysis aims not just
to produce some error bound, but to express this error
in terms that are easily understood, and which reveals
the underlying inductive structure.
Towards this end, we introduce the following βbound
function: if d is a positive integer and b∈C,
Let d=d(f),
i.e., di=degzi(f) for each i.
The support of f is Supp(f)⊆Nn where
f=∑α∈Supp(f)cαzα
where cα∈C∖{0}. Here,
zα:=∏ni=1zαii.
We assume that Supp(˜f)⊆Supp(f).
Our induction variable is k=1,…,n.
For α∈Nn, let
πk(α):=(0,…,0,αk+1,…,αn).
E.g., if k=n then πk(α)=0.
Thus α−πk(α)=(α1,…,αk,0,…,0).
Next define
Suppk(f):={πk(α):α∈Supp(f)}.
With this notation, we can write
where each fα∈C[z(k)].
E.g., if k=n then Suppk(f)={0} and so f0=f.
Assume that we are given f,˜f∈C[z]=C[z(n)]
and b,˜b∈C. Also the degree sequences satisfies
d(˜f)≤d(f), that is the inequality holds
componentwise. Then we may define these quantities
for k=1,…,n:

δkb:=bk−˜bk,δkf:=∥f(b(k))−˜f(˜b(k))∥(with δ0f=∥f−˜f∥),βk:=β(dk,bk)˜βk:=β(dk, 
Comments
There are no comments yet.