Dynamical perturbation theory for eigenvalue problems

02/28/2020
by   Maseim Kenmoe, et al.

Many problems in physics, chemistry and other fields are perturbative in nature, i.e. differ only slightly from related problems with known solutions. Prominent among these is the eigenvalue perturbation problem, wherein one seeks the eigenvectors and eigenvalues of a matrix with small off-diagonal elements. Here we introduce a novel iterative algorithm to compute these eigenpairs based on a reformulation of the eigenvalue problem as an algebraic equation in complex projective space. We show from explicit and numerical examples that our algorithm outperforms the usual Rayleigh-Schrödinger expansion on three counts. First, since it is not defined as a power series, its domain of convergence is not a priori confined to a disk in the complex plane; we find that it indeed usually extends beyond the standard perturbative radius of convergence. Second, it converges at a faster rate than the Rayleigh-Schrödinger expansion, i.e. fewer iterations are required to reach a given precision. Third, the (time and space) algorithmic complexity of each iteration does not increase with the order of the approximation, allowing for higher-precision computations. Because this complexity is merely that of matrix multiplication, our dynamical scheme also scales better with the size of the matrix than general-purpose eigenvalue routines such as the shifted QR or divide-and-conquer algorithms. Whether matrices are dense, sparse, symmetric or unsymmetric, we confirm that dynamical diagonalization quickly outpaces LAPACK drivers as their size grows; for the computation of just the dominant eigenvector, our method converges orders of magnitude faster than the Arnoldi algorithm implemented in ARPACK.



Code Repositories

dynamical-PT

Mathematica code for the paper [Kenmoe, Smerlak, Zadorin. 2020. Dynamical perturbation theory for eigenvalue problems]

I Introduction

Computing the eigenvalues and eigenvectors of a matrix that is already close to being diagonal (or diagonalizable in a known basis) is a central problem in many branches of applied mathematics. The standard approach, outlined a century ago by Rayleigh rayleigh1894 and Schrödinger Schr_dinger_1926, consists in assuming a power series in a perturbation parameter ε and extracting the coefficients order by order from the eigenvalue equation. Historically, this technique played a central role in the development of quantum mechanics, providing analytical insight into e.g. level splitting in external fields Schr_dinger_1926, radiative corrections Bethe_1947 or localization in disordered systems Anderson_1958. Suitably adapted to the problem at hand M_ller_1934; Epstein_1926; 1955, Rayleigh-Schrödinger (RS) perturbation theory remains the workhorse of many ab initio calculations in quantum many-body theory ostlund1989 and quantum field theory itzykson2006. Outside the physical sciences, RS perturbation theory underlies the quasispecies theory of molecular evolution Eigen_1988, among countless other applications.

The shortcomings of the RS series are well known. First, its explicit form requires that all unperturbed eigenvalues be simple, else the perturbation must first be diagonalized in the degenerate eigenspace through some other method. Second, the number of terms to evaluate grows with the order of the approximation; to keep track of them, cumbersome diagrammatic techniques are often required. Third, the RS series may not converge for all (or, in infinite dimensions, any) values of ε Dyson_1952; Kato_1966; Simon_1991. Indeed, since the RS series is a power series, its domain of convergence is confined to a disk in the complex ε-plane: if a singularity exists at some location, then the RS series will necessarily diverge for any value of ε of larger modulus. This means that a non-physical (e.g. imaginary) singularity can contaminate the RS expansion, turning it into a divergent series in regions where the spectrum is, in fact, analytic. Of course, divergent power series can still be useful: the partial sums often approach the solution rather closely before blowing up; what is more, a wealth of resummation techniques, such as Borel resummation or Padé approximants, can be used to extract convergent approximations from the coefficients kleinert2007.

In this paper we introduce an explicit perturbation scheme for finite-dimensional (Hermitian or non-Hermitian) eigenvalue problems alternative to the RS expansion. Instead of assuming a power series in ε, we build a sequence of approximations of the eigenpairs by iterating a non-linear map in matrix space, i.e. through a discrete-time dynamical system. As we will see, this alternative formulation of perturbation theory compares favorably with the conventional RS series—which can be recovered through truncation—with regard to the second (complexity) and third (convergence) points above. Our novel scheme is also surprisingly fast from a computational viewpoint, offering an efficient alternative to standard diagonalization algorithms for large perturbative matrices.

Our presentation combines theoretical results with numerical evidence. We prove that (i) our dynamical scheme reduces to the RS partial sum when truncated to order p, and (ii) there exists a neighborhood of ε = 0 where it converges exponentially to the full set of perturbed eigenvectors. The latter result is analogous to Kato's lower bound on the radius of convergence of RS perturbation theory Kato_1966; like the latter, it provides sufficient but not necessary conditions for convergence at finite ε. To illustrate the performance of our algorithm, we consider several examples of perturbative matrix families, including a quantum harmonic oscillator perturbed by a δ-potential and various ensembles of (dense and sparse) random perturbations.

Figure 1: Convergence of perturbation theory in the two-dimensional example (5). Left: In the complex ε-plane, RS perturbation theory (RS-PT) converges inside a circle (orange line) bounded by the exceptional points where eigenvalues have branch-point singularities and the matrix is not diagonalizable. Dynamical perturbation theory (D-PT) converges inside the domain bounded by the blue cardioid, which is larger—especially along the real axis, where there is no singularity. Outside this domain, the map can converge to a periodic cycle, be chaotic or diverge to infinity, following flip bifurcations (along the real axis) and fold bifurcations (at the singularities). The domain where the map remains bounded (black area) is a conformal transformation of the Mandelbrot set. Right: The bifurcation diagram for the quadratic map along the real ε-axis illustrates the period-doubling route to chaos as ε increases away from 0 (in absolute value). Orange and blue vertical lines indicate the boundaries of the convergence domains of RS-PT and D-PT respectively.

II Results

II.1 Eigenvectors as an affine variety

We begin with the observation that, being defined up to a multiplicative constant, the eigenvectors of an  matrix  are naturally viewed as elements of the complex projective space . Let  denote the homogeneous coordinates of . Fix an index , and consider an eigenvector  of  such that . From the eigenvalue equation  we can extract the eigenvalue as ; inserting this back into  we find that  is a projective root of the system of  homogeneous quadratic equations
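The elided display can be reconstructed from the preceding sentence (the notation below is ours): substituting λ = (Mz)_i / z_i back into Mz = λz gives, for every j ≠ i,

(Mz)_j z_i − (Mz)_i z_j = 0,

a set of N−1 homogeneous quadratic equations in the homogeneous coordinates z_1, …, z_N.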

Denoting  the projective variety defined by these equations and  the affine chart in which the -th homogeneous coordinate is nonzero, we arrive at the conclusion that the set of eigenvectors of  is the affine variety , see SI.1 for details. Since eigenvectors generically have non-zero coordinates in all directions, each one of these affine varieties normally contains the complete set of eigenvectors.

II.2 Eigenvectors as fixed points

We now further assume a perturbative partitioning  where the diagonal part  consists of unperturbed eigenvalues  and  is a perturbation parametrized by ε. Provided the unperturbed eigenvalues are all simple (non-degenerate), we can rewrite the polynomial system above as

where . Within the chart  we are free to set ; this results in the fixed point equation  with  the map with components

(1)

where the matrix of inverse gaps is completed with zeros on the diagonal, and the remaining symbols are the components of the identity matrix.

As noted above, each one of these maps  generically has the full set of eigenvectors of  as fixed points. Here, however, we wish to compute these fixed points dynamically, as limits of the iterated map

For this procedure to converge, the map  must be contracting in the neighborhood of at least one fixed point; for small  this is always the case for the fixed point closest to the unperturbed eigenvector , but usually not for the other fixed points. To compute all the eigenvectors of , we must therefore iterate not one, but all the maps  together, ideally in parallel.

We can achieve this by bundling all candidate eigenvectors in a list and applying the map to the -th component:  

Introducing the matrix whose i-th row is the i-th candidate eigenvector, this can be written in matrix form as

(2)

Here prime denotes the transpose, ∘ the Hadamard (element-wise) product of matrices, and the matrix of inverse gaps is as defined above. The dynamical system in the title is then obtained through iterated applications of this map to the identity matrix of unperturbed eigenvectors, generating the sequence of approximating eigenvectors

(3)

The eigenvalues  at order  are given by .
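To make the scheme concrete, here is a minimal NumPy sketch of the iteration (2)-(3), reconstructed from the component map above; the paper's own implementation is in Mathematica, so the function name, the convention that row i of Z holds the i-th candidate eigenvector (with its i-th component pinned to 1), and the test matrix are illustrative assumptions, not the authors' code.

import numpy as np

def dpt_eigensystem(D, H, eps, tol=1e-12, max_iter=500):
    # Iterate F(Z) = 1 + eps * E o (Z H' - diag(Z H') Z), starting from the
    # identity matrix of unperturbed eigenvectors; E holds the inverse gaps
    # 1/(D_i - D_j) off the diagonal and zeros on it.
    D = np.asarray(D, dtype=float)
    n = len(D)
    E = np.zeros((n, n))
    off = ~np.eye(n, dtype=bool)
    E[off] = 1.0 / (D[:, None] - D[None, :])[off]   # requires simple eigenvalues
    Z = np.eye(n)
    for _ in range(max_iter):
        ZH = Z @ H.T                                # entry (i, j) is (H z_i)_j
        Z_new = np.eye(n) + eps * E * (ZH - np.diag(ZH)[:, None] * Z)
        if np.linalg.norm(Z_new - Z) < tol:
            Z = Z_new
            break
        Z = Z_new
    lam = D + eps * np.diag(Z @ H.T)                # eigenvalue of the i-th pair
    return lam, Z

# Quick check against NumPy on a small near-diagonal matrix (our example):
rng = np.random.default_rng(0)
n = 8
D0 = np.arange(n, dtype=float)
H0 = rng.uniform(-1.0, 1.0, (n, n))
lam, Z = dpt_eigensystem(D0, H0, eps=0.05)
ref = np.sort(np.linalg.eigvals(np.diag(D0) + 0.05 * H0).real)
print(np.allclose(np.sort(lam), ref))               # True

Note that the per-iteration cost is one matrix product with the perturbation plus element-wise operations, in line with the complexity claims below.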

II.3 Convergence at small ε

It is not hard to show that the scheme always converges at small ε. Let ‖·‖ denote the spectral norm of matrices (the largest singular value), which is sub-multiplicative with respect to both the matrix and Hadamard products r2012, and assume  without loss of generality. First, this implies that the map sends a closed ball of radius r centered on the identity matrix onto itself whenever ε is small enough. Next, from (2) we have

Hence,  is contracting in  provided . Under these conditions the Banach fixed-point theorem implies that  converges exponentially to a unique fixed point within this ball as . Choosing the optimal radius

we see that  guarantees convergence to the fixed point.
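The elided bounds can be fleshed out as follows; the constants are ours and the paper's exact expressions may differ. Writing F(Z) = 1 + ε E ∘ (ZH′ − diag(ZH′)Z) as in (2), sub-multiplicativity of the spectral norm under both products gives

‖F(Z) − 1‖ ≤ |ε| ‖E‖ ‖H‖ (‖Z‖ + ‖Z‖²),

so F maps the ball ‖Z − 1‖ ≤ r into itself once |ε| ‖E‖ ‖H‖ ((1+r) + (1+r)²) ≤ r, and

‖F(Z) − F(W)‖ ≤ |ε| ‖E‖ ‖H‖ (1 + ‖Z‖ + ‖W‖) ‖Z − W‖,

so F is a contraction on that ball once |ε| ‖E‖ ‖H‖ (3 + 2r) < 1.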

II.4 Contrast with RS perturbation theory

Let us now contrast this iterative method with conventional RS perturbation theory. There, the eigenvectors are sought as power series in ε. Identifying the terms of the same order in ε in the eigenvalue equation, one obtains for the matrix of eigenvectors at order p a partial sum in which the matrix coefficients are obtained from the perturbation via the recursion (SI.2)

(4)

We show in SI.3 that the dynamical scheme completely contains this RS series, in the sense that the p-th iterate reproduces the RS partial sum of order p up to terms of higher order in ε. For example, to order p we have

and

This means that we can recover the usual perturbative expansion of  to order  by iterating  times the map  and dropping all terms . Moreover, the parameter whose smallness determines the convergence of the RS series is the product of the perturbation magnitude  with the inverse diagonal gaps  Kato_1966, just as it determines the contraction property of .

These similarities notwithstanding, the dynamical scheme differs from the RS series in two key ways. First, the complexity of each iteration is constant (one matrix product with  and one Hadamard product with , which is faster), whereas computing the RS coefficients  involves the sum of increasingly many matrix products. Second, not being defined as a power series, the convergence of  when  is not a priori restricted to a disk in the complex -plane. Together, these two differences suggest that our dynamical scheme has the potential to converge faster, and in a larger domain, than RS perturbation theory. This is what we now examine, starting from an elementary but explicit example.

II.5 An explicit example

Consider the symmetric matrix

(5)

This matrix has eigenvalues , both of which are analytic inside a disk around the origin but have branch-point singularities at a pair of imaginary values of ε. (These singularities are in fact exceptional points, i.e. the matrix is not diagonalizable at these values.) Because the RS series is a power series, these imaginary points contaminate its convergence also on the real axis, where no singularity exists: the series diverges for any ε outside the disk whose radius they set, and in particular for real ε beyond that radius.

Considering instead our iterative scheme, one easily computes

where the superscripts indicate n-fold iterates. This one-dimensional map has two fixed points. Of these two fixed points, one is always unstable, while the other is stable for small ε and loses its stability in a flip bifurcation as ε grows along the real axis. At yet larger values of ε, the iterated map—hence the fixed-point iteration—follows the period-doubling route to chaos familiar from the logistic map May_1976. For values of ε along the imaginary axis, we find that the map is stable up to the exceptional points, where it loses stability in a fold bifurcation. The full domain of convergence of the system is strictly larger than the RS disk, as shown in Fig. 1. Further details on the algebraic curve bounding this domain are given in SI.4-.5.

We also observe that, in the disk where both schemes converge, the dynamical scheme does so at a better rate than RS perturbation theory: we check that , while the remainder of the RS series decays as . This is a third way in which the dynamical scheme outperforms the usual RS series, at least in this case: not only is each iteration computationally cheaper, but the number of iterations required to reach a given precision is lower.
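A tiny numeric illustration of this period-doubling scenario; the one-component form z ↦ ε(1 − z²)/gap is our reading of the reduction spelled out in SI.4-.5 (with gap the difference of the two unperturbed eigenvalues), so the specific thresholds below assume gap = 1.

def orbit(eps, gap=1.0, n_iter=400):
    # Iterate the single nontrivial component of the 2x2 map (our reading).
    z = 0.0                          # unperturbed eigenvector
    for _ in range(n_iter):
        z = eps * (1.0 - z * z) / gap
    return z

print(orbit(0.4))                    # inside the cardioid: converges to a fixed point
# Past the flip bifurcation (|eps| = sqrt(3)/2 for gap = 1) the orbit
# settles on a 2-cycle instead; print two successive iterates to see it:
z = orbit(0.9)
print(z, 0.9 * (1.0 - z * z))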

II.6 Quantum harmonic oscillator with δ-potential

As a second illustration let us consider the Hamiltonian of a quantum harmonic oscillator perturbed by a δ-potential at the origin,

(6)

where the unperturbed Hamiltonian (in brackets) is diagonal in the basis of Hermite functions with eigenvalues . For this example, familiar from elementary quantum mechanics Viana_Gomes_2011, it was estimated in Kvaal_2011 that the RS series converges for all . Since (by parity) the perturbation only affects the even-number eigenfunctions, we consider the matrices  and  with the basis truncated at some value .

Fig. 2 compares the convergence of the matrix of eigenvectors with the order p under the RS and dynamical schemes. While convergence is manifestly exponential in both cases, the rate is faster in dynamical perturbation theory, particularly at larger values of ε. For instance, at one value of ε it takes over  orders in conventional perturbation theory to reach the same precision as  iterations of the map; at larger ε the RS series no longer converges, but the orbits of (2) still do. In the limit where the truncation is removed (the complete quantum mechanical problem) we found that the RS series has a radius of convergence , whereas dynamical perturbation theory converges up to .

Figure 2: Convergence of the RS perturbative series (RS-PT, orange symbols) and of our dynamical scheme (D-PT, blue symbols) for the perturbed harmonic oscillator in (6) with  and increasing values of . The latter converges at a faster rate, and in a larger domain, than the RS series. 

II.7 An improved perturbation theory

These improved convergence properties are not restricted to the matrix families (5) or (6). To see this, let us now consider an ensemble of (nonsymmetric) random matrices of the form  where the off-diagonal entries are random numbers uniformly distributed between  and . For given size and perturbation strength, we compute the success rate (probability of convergence) of each method and, in cases where both converge, the orders required by each to converge at hundred-digit precision.

Fig. 3 plots the difference between these orders for matrices of size  at various values of ε. In a large majority of cases (out of  samples for each ε), the difference turns out positive, implying that dynamical perturbation theory converges at a faster rate than RS perturbation theory. The success rate (inset) is also better with the former, confirming the patterns observed in the two-dimensional and δ-potential examples.
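An illustrative, double-precision analog of this experiment using the two sketches above (the paper works at hundred-digit precision; size, seed, tolerance and ε here are ours):

rng = np.random.default_rng(1)
n, eps = 10, 0.08
D0 = np.arange(n, dtype=float)
H0 = rng.uniform(-1.0, 1.0, (n, n))
# Converged reference (reuses dpt_eigensystem and rs_eigensystem from above):
ref = dpt_eigensystem(D0, H0, eps, tol=1e-14, max_iter=2000)[1]
for k in (2, 4, 8, 16):
    Z_rs = rs_eigensystem(D0, H0, eps, order=k)[1]
    Z_dp = dpt_eigensystem(D0, H0, eps, max_iter=k)[1]   # k iterations only
    print(k, np.linalg.norm(Z_rs - ref), np.linalg.norm(Z_dp - ref))
# Typically the D-PT error column shrinks faster with k than the RS column.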

Figure 3: Comparison of the convergence rates of dynamical perturbation theory (D-PT) and RS perturbation theory (RS-PT) for random matrices with , equally spaced diagonal entries and uniformly distributed off-diagonal elements. A positive value of  indicates that D-PT reaches hundred-digit precision for the full set of eigenvectors with fewer iterations than RS-PT; this is true in a large majority of cases, though not always. The inset shows that D-PT is also more likely to converge (for a given ε) than RS-PT.

II.8 A subcubic eigenvalue algorithm

We noted earlier that the algorithmic complexity of each iteration of the map is limited by that of matrix multiplication by the perturbation. The complexity of matrix multiplication is theoretically O(n^2.373) Coppersmith_1990; Le_Gall_2014; the Strassen algorithm Strassen_1969 often used in practice performs in O(n^2.807), or better for sparse matrices. This implies that, within its domain of convergence (i.e. for matrices with small off-diagonal elements), our dynamical scheme scales better than classical diagonalization algorithms such as shifted Hessenberg QR (for nonsymmetric matrices) or divide-and-conquer (for Hermitian matrices), which are all O(n^3) if all eigenpairs are requested Demmel_1997.

We tested this premise by comparing the timing of the complete dynamical diagonalization of large matrices with general-purpose LAPACK routines d1999, implemented in Mathematica 12 in the function Eigensystem. To this aim we considered two classes of random perturbations of the harmonic oscillator in (6): dense nonsymmetric matrices with uniformly distributed entries, and sparse Laplacian (hence symmetric) matrices of critical Erdős-Rényi random graphs (i.e. with  vertices and  edges). The perturbation parameter was set to  and timings were made both at machine precision and at fixed precision ( digits). Finally, we used a partitioning of the matrices in which the perturbation has vanishing diagonal elements; this partitioning is known in quantum chemistry as Epstein-Nesbet perturbation theory Epstein_1926; 1955. In each case dynamical perturbation theory proved (much) faster than LAPACK routines at large matrix sizes.

Figure 4: Performance of perturbative (dynamical, D-PT, and Rayleigh-Schrödinger, RS-PT) diagonalization vs. the DSYEVR LAPACK routine (implemented as Eigensystem in Mathematica 12) for sparse symmetric matrices, given as ratios of CPU time to matrix multiplication time (MMT). Unlike standard cubic algorithms, the complexity of our dynamical scheme is just that of matrix multiplication. As a result, D-PT quickly outpaces LAPACK at both machine (left) and—even more impressively—fixed (right) precision. The values given are averages over  repetitions with a time constraint of one hour.

Representative results are given in Fig. 4, where we plot the ratio of the CPU time required to compute all the eigenvectors of a sparse matrix to the CPU time needed to compute its square (the matrix multiplication time, MMT). While this ratio increases sharply with size for standard algorithms, it does not with our iterative algorithm. As a result, dynamical perturbation theory quickly outperforms LAPACK (here the DSYEVR routine for symmetric matrices), and especially its fixed-precision version in Mathematica. The large improvement is remarkable because symmetric eigenvalue solvers are among the best understood, and most fine-tuned, algorithms in numerical mathematics Golub_2000. Our implementation of (2), on the other hand, was written entirely in Mathematica, i.e. without the benefit of compiled languages.

Finally we considered the computation of just the dominant eigenvector (the one with the largest eigenvalue). In contrast with the complete eigenproblem, subcubic algorithms exist for finding such extremal eigenvectors, such as the Arnoldi and Lanczos algorithms or Rayleigh quotient iteration Demmel_1997. Thanks to their iterative nature—similar to our dynamical scheme—these algorithms perform especially well with sparse matrices, and are generally viewed as setting a high bar for computational efficiency. Yet, for sparse perturbative matrices (such that the dominant eigenvector is the one associated with the largest unperturbed eigenvalue), we found that the iteration of the single-vector map converges orders of magnitude faster than these iterative algorithms (Fig. 5). For instance, it took just  iterations and  seconds on a desktop machine to compute at machine precision the dominant eigenpair of a sparse symmetric matrix with  non-zero elements ( GB in memory); by contrast ARPACK's version of the Lanczos algorithm had not converged within a cutoff of  seconds for a matrix  times smaller.
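A single-vector sketch of this computation, iterating only the row of the map (1) attached to the largest unperturbed eigenvalue; scipy.sparse is used so each step costs one sparse matrix-vector product. Names, sizes and defaults are ours.

import scipy.sparse as sp

def dominant_eigenpair(D, H, eps, tol=1e-12, max_iter=1000):
    # D: unperturbed eigenvalues (1-D array); H: perturbation (may be sparse).
    D = np.asarray(D, dtype=float)
    i = int(np.argmax(D))
    gaps = D[i] - D
    gaps[i] = 1.0                   # dummy value; component i stays pinned to 1
    e = np.zeros(len(D)); e[i] = 1.0
    z = e.copy()
    for _ in range(max_iter):
        Hz = H @ z                  # the only O(nnz) operation per step
        z_new = e + eps * (Hz - Hz[i] * z) / gaps
        if np.linalg.norm(z_new - z) < tol:
            return D[i] + eps * (H @ z_new)[i], z_new
        z = z_new
    raise RuntimeError("did not converge: eps may lie outside the domain")

# Example on a sparse random perturbation (ours):
n = 20000
rng = np.random.default_rng(2)
H = sp.random(n, n, density=1e-4, random_state=rng)
H = (H + H.T) / 2                   # symmetrize for definiteness
lam_max, v = dominant_eigenpair(np.arange(n, dtype=float), H, eps=0.1)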

Figure 5: CPU time to compute the dominant eigenvector of sparse perturbative matrices at machine precision with the Lanczos algorithm in Mathematica's ARPACK routine (with starting vector provided), with the single-vector map (1) (D-PT), and with Rayleigh quotient iteration (RQI). Dynamical perturbation theory is much faster, even in our non-compiled Mathematica implementation.

Overall, these timings suggest that, although certainly not a general-purpose method due to its limited convergence domain, dynamical diagonalization can be an efficient approach to the numerical eigenvalue problem, whether all eigenpairs are needed (in which case it is embarrassingly parallel) or just a few. When the matrix of interest has off-diagonal elements which are not small, it might still be possible to compute approximate eigenvectors through another method (say to  accuracy) and then refine them by using dynamical diagonalization for the final convergent steps.

III Discussion

We reconsidered the classic problem of computing the eigensystem of a matrix given in diagonal form plus a small perturbation. Departing from the usual prescription based on power series expansions, we identified a quadratic map in matrix space whose fixed points are the sought-for eigenvectors, and proposed to compute them through fixed-point iteration, i.e. as a discrete-time dynamical system. We showed empirically that this algorithm outperforms the usual RS series on three counts: each iteration has lower computational complexity, the rate of convergence is usually higher (so that fewer iterations are needed to reach a given precision), and the domain of convergence of the scheme (with respect to the perturbation parameter ε) is usually larger. Within its domain of convergence, our algorithm runs much faster than standard exact diagonalization routines because its complexity is merely that of matrix multiplication (by the perturbation).

Several extensions of our method can be considered. For instance, when ε is too large and the dynamical system does not converge, we may reduce it to a fraction of its value, diagonalize the matrix with this smaller perturbation, restart the algorithm using the diagonal matrix thus obtained as initial condition, and repeat the operation until the full perturbation is accounted for. This approach is similar to the homotopy continuation method for finding polynomial roots Morgan_2009 and can effectively extend the domain of convergence of the dynamical scheme. Another avenue is to leverage the geometric structure outlined above, for instance by using other charts on the complex projective space, which would lead to different maps with different convergence properties. A third possibility is to use the freedom in choosing the diagonal matrix—e.g. by optimizing the product of norms—to construct maps with larger convergence domains, a trick which is known to sometimes improve the convergence of the RS series Remez_2018; Szabados_1999.
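A sketch of this restart strategy in the same NumPy notation; the step count k and the change-of-basis bookkeeping are our reading of the prescription, not the authors' implementation.

def dpt_homotopy(D, H, eps, k=4, **kwargs):
    # Split eps into k equal increments. After each partial diagonalization,
    # rows of Z are eigenvectors, so columns of V = Z.T diagonalize the
    # current matrix; the remaining perturbation is rotated into that basis
    # and the procedure repeats (reuses dpt_eigensystem from Sec. II.2).
    D = np.asarray(D, dtype=float)
    H = np.asarray(H, dtype=float)
    P = np.eye(len(D))
    for _ in range(k):
        lam, Z = dpt_eigensystem(D, H, eps / k, **kwargs)
        V = Z.T
        P = P @ V                                # accumulated eigenvectors
        D = lam                                  # new "unperturbed" diagonal
        H = np.linalg.solve(V, H @ V)            # perturbation in the new basis
    return D, P        # approximate eigenpairs of diag(D0) + eps * H0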

Finally, it would be desirable to obtain sharper theoretical results on the convergence and speed of our iterative diagonalization algorithm. This can be difficult: the bifurcations of discrete dynamical systems are notoriously hard to study in dimensions greater than one, and estimating convergence rates requires evaluating the Jacobian of the map at its fixed points—which by definition we do not know a priori. It is perhaps worth recalling, however, that estimates of radii of convergence for the RS series were only obtained several decades after the method was widely applied in continuum and quantum mechanics Kato_1966, and that their computation in practice remains an area of active research Kvaal_2011. We leave this challenge for future work.

Acknowledgements.
We thank Rostislav Matveev and Alexander Heaton for useful discussions. Funding for this work was provided by the Alexander von Humboldt Foundation in the framework of the Sofja Kovalevskaja Award endowed by the German Federal Ministry of Education and Research.


Supplementary Material

SI.1 Eigenvectors and affine charts

Let us start with the system of homogeneous polynomial equations

for all and . The projective variety defined by this full system is the union of the eigenvectors of seen as projective points. In other words, the full system for all and (without assumptions ) is equivalent to the initial eigenvector problem: it is the statement that the exterior product of with vanishes, which is equivalent to .

Assume that belongs to a perturbative one-parameter family . Fix some and consider the subset of equations that corresponds only to this and to all ( equations in total). In the affine chart defined by , this subsystem becomes with given by (1). The fixed points of this dynamical system are in one-to-one correspondence with the eigenvectors of visible in this chart.

Indeed, we need to prove that for the two following propositions are equivalent: is a root of and . The direction is proven in the main text. The other direction is easy too. From and , we conclude that for all . If we now denote the common factor in these expressions as , then immediately follows.

It should be noted, however, that the correspondence holds only in the chart . The system (for the fixed , as before) may have other solutions in , but all of them by necessity are at infinity in . These solutions are not related to the eigenvectors of . Indeed, consider in the form

(7)

for some fixed  and set  to find solutions in  that are at infinity in . Clearly, any  with  that obeys  is a solution of (7). Thus, in general, there is a whole ()-dimensional complex projective subspace of such solutions (corresponding to a ()-dimensional complex hyperplane in the affine space).

Finally, at  there is only one solution in each chart for each corresponding system, namely the origin of the chart (which represents the corresponding unperturbed eigenvector), while for a generic value of ε all solutions are present in each chart for its corresponding system of equations. As soon as ε becomes different from zero, the other solutions move into the chart from infinity.

SI.2 Rayleigh-Schrödinger perturbation theory

Here we recall the derivation of the Rayleigh-Schrödinger recursion (4), given e.g. in Roth_2010. Let  and  respectively be the -th eigenvalue and eigenvector of the perturbed matrix, corresponding to the unperturbed eigenvalue and eigenvector, chosen so that  and  for all  (a choice known as "intermediate normalization"). We start by expanding these in powers of ε:

Substituting these ansätze into the eigenvalue equation and making use of the Cauchy product formula yields  at zeroth order (hence ) and, for higher orders,

It is convenient to expand in the basis of the eigenstates of as with and for . This gives for each and

The equation for the eigenvalue correction is extracted by setting  in this equation and using . Injecting the result back into the equation above we arrive at

This is (4) in the main text.

SI.3 Dynamical perturbation theory contains the RS series

We prove by induction. Obviously . Suppose that or, more specifically,

where the matrices are from the expansion (4). Then from (2) we have

From this expression it is easy to see that the term of -th order in ε in the iterate is given by the terms with , viz.

This term matches exactly the RS coefficient as given by (4). This concludes the proof.

SI.4 Convergence domain of the dynamical perturbation theory

Consider a map for some fixed . An attracting equilibrium point of the corresponding dynamical system loses its stability when the Jacobian matrix of the map has an eigenvalue (or multiplier) with absolute value equal to 1 at this point.

Consider the system of polynomial equations of complex variables ( for , , and )

The variable here plays the role of a multiplier of a steady state. Either by successively computing resultants or by constructing a Groebner basis with the correct lexicographic order, one can eliminate the other variables from this system, which results in a single polynomial of two variables. This polynomial defines a complex one-dimensional variety. The projection to the ε-plane of the real one-dimensional variety it defines corresponds to some curve. A more informative way is to represent this curve as a complex function of a real variable implicitly defined by the polynomial.

The curve is the locus where a fixed point of the map has a multiplier on the unit circle. In particular, the fixed point that at ε = 0 corresponds to the unperturbed eigenvector loses its stability along a particular subset of this curve. The convergence domain of dynamical perturbation theory is the domain that is bounded by these parts of the curve and that contains ε = 0.

This curve is smooth except at cusps (return points), which correspond to the values of ε such that the matrix has a nontrivial Jordan form (is non-diagonalizable). In a typical case, all cusps are related to the merging of a pair of eigenvectors. For the dynamical system (1), the cusps thus correspond to fold bifurcations of its steady states DOLOTIN_2008. One of the multipliers equals 1 at such a merged point Kuznetsov_2004, so these values of ε can be found as a subset of the roots of a univariate polynomial. Not all its roots generally correspond to cusps and fold bifurcations, though, as we will see later. On the other hand, all the cusp/fold bifurcation points are among the roots of the discriminant polynomial. These roots may correspond to both geometric (when two eigenvectors merge and the matrix acquires a nontrivial Jordan form) and algebraic (when two eigenvalues become equal but the matrix retains a trivial Jordan form) degeneration. Algebraic degenerations are not related to cusps. Thus, the cusp points can be found as the intersection of the root sets of the two polynomials.

The convergence domain of the dynamical system (2) for the whole matrix of eigenvectors is equal to the intersection of the convergence domains for its individual rows (1).

The explicit example of the main text results in the same curve for each row of the eigenvector matrix. The cusps of this curve are at the exceptional points. In this particular case, the convergence circle of RS perturbation theory is completely contained in the convergence domain of dynamical perturbation theory, and their boundaries intersect only at the cusp points.

This convergence domain is directly related to the main cardioid of the classical Mandelbrot set: the set of complex values of the parameter that lead to a bounded trajectory of the classical quadratic (holomorphic) dynamical system. The main cardioid of the Mandelbrot set (the domain of stability of a steady state) is bounded by a known closed curve. The boundary of the stability domain of our example is simply a conformal transform of this cardioid by two complementary branches of the square root function composed with the sign inversion. The origin of this relation becomes obvious after a parameter change followed by a variable change, which brings the classical system to the dynamical system of the only nontrivial component of the first row in our example. The nontrivial component of the second row follows an equivalent equation, up to a sign change of the variable.
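Explicitly, in our normalization (gap Δ between the two unperturbed eigenvalues): the nontrivial component obeys z ↦ ε(1 − z²)/Δ, and the rescaling w = −(ε/Δ) z conjugates this to the standard quadratic family,

w ↦ w² + c,   c = −ε²/Δ²,

so the stability boundary is the preimage of the main cardioid c(θ) = e^{iθ}/2 − e^{2iθ}/4 under ε = ±Δ√(−c). In particular the cusp c = 1/4 maps to the exceptional points ε = ±iΔ/2, and the real flip point c = −3/4 to ε = ±(√3/2)Δ, consistent with the bifurcations described in the main text.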

SI.5 Explicit examples

The example in the main text is very special. In fact, any two-dimensional case is special in the following sense. The iterative approximating sequence for any  and  takes the form

where the components are univariate quadratic polynomials related by a simple substitution. The first special feature of this recursion scheme is that it is equivalent to a one-dimensional quadratic discrete-time dynamical system for each row of the eigenvector matrix. This implies that the only critical point of either map (or of their common form, when the diagonal elements are equal) is necessarily attracted by the stable fixed point whenever one exists (there is at most one). The second special feature is the fact that both rows have exactly the same convergence domains in the ε-plane. To see this, suppose that  and  are the roots of the first polynomial. Then  and  are the roots of the second. As  and (as for any quadratic polynomial) , we also see that , and thus . Therefore, the fixed point of the dynamical system defined by the first polynomial corresponding to one eigenvector and the fixed point of the second corresponding to the other eigenvector are stable or unstable simultaneously.

These two properties are not generic when the dimension exceeds two. Therefore, we provide another explicit example of a matrix to foster some intuition for more general cases:

The polynomial that defines the fixed-point degeneration curve (see SI.4) here takes the following forms: for the first row,

(8)

for the second row,

(9)

and for the third row,

(10)

As we can see, the dynamical systems for the three rows of the eigenvector matrix have different domains of convergence in the ε-plane. The corresponding curves are depicted in Fig. S1, Fig. S2, and Fig. S3, respectively.

The curves for individual rows also differ from those of the two-dimensional case. Note that the curve alone does not contain enough information to find the convergence domain itself. The domains in Figs. S1-S3 were found empirically. Of course, they are bounded by some parts of the curve and include the point ε = 0. The reason some parts of the curve do not form the boundary of the domain is that its different parts correspond to different eigenvectors. In other words, they belong to different branches of a multivalued eigenvector function of ε, the cusps being the branching points.

Consider as a particular example the case of the second row. The curve intersects itself at a point on the real axis. Above the real axis near this point, one of the two intersecting branches of the curve forms the convergence boundary; below the real axis, the other one takes its place. This indicates that the two branches correspond to different multipliers of the same fixed point. On one side, one of them crosses the unit circle at the boundary of the convergence domain; on the other side, the other one does. At the self-intersection point, both cross the unit circle simultaneously. This situation corresponds, thus, to a Neimark-Sacker bifurcation (the discrete-time analog of the Andronov-Hopf bifurcation). Both branches are in fact parts of the same continuous curve that passes through this point. Around this point, the curve poses no problem to the convergence of the dynamical system. The reason is that an excursion around cusps (branching points of eigenvectors) permutes some eigenvectors. As a consequence, the curve on one side corresponds to the loss of stability of the eigenvector that is the continuation of the unique stable eigenvector at ε = 0 along a straight path; the same curve on the other side indicates a multiplier of unit absolute value for an eigenvector that is not a continuation of the initial one along such a path.

Not all features of the curves depicted for this case are generic either. The particular symmetry of the curve and of its cusps with respect to complex conjugation is not generic for general complex matrices, but it is a generic feature of matrices with real components. In this particular case, due to this symmetry, the only possible bifurcations for real values of ε are the flip bifurcation (a multiplier equal to -1, typically followed by the cycle-doubling cascade), the Neimark-Sacker bifurcation (two multipliers assuming complex conjugate values on the unit circle), and, if the matrices are not symmetric, the fold bifurcation (a multiplier equal to 1). With symmetric real matrices, the fold bifurcation is not encountered because the cusps cannot be real but instead form complex conjugate pairs. These features are consequences of the behavior of the curve with respect to complex conjugation and of the fact that symmetric real matrices cannot have nontrivial Jordan forms.

Likewise, Hermitian matrices result in complex conjugate nonreal cusp pairs, but the curve itself is not necessarily symmetric. As a result, there are many more ways for a steady state to lose its stability, from which the fold bifurcation is, however, excluded. Generic bifurcations here consist in a multiplier crossing the unit circle at a generic point. The situation for general complex matrices lacks any symmetry at all. Here steady states lose their stability through a multiplier crossing the unit circle at any point, and thus the fold bifurcation, although possible, is not generic. It is generic for one-parameter (in addition to ε) families of matrices.

As already noted, for the holomorphic dynamics of any two-dimensional case the unique critical point is guaranteed to be attracted by the unique attracting periodic orbit, if the latter exists. This, in turn, guarantees that for any perturbation with zero diagonal the iteration of (2) starting from the identity matrix converges to the needed solution, provided ε is in the convergence domain. This is no longer true in higher dimensions, starting from the fact that there are no discrete critical points there. The problem of finding a good initial condition becomes non-trivial. As can be seen in Fig. S2, our particular example encounters this problem for the second row. The naive iteration from the identity initial condition does not converge to the existing attracting fixed point of the dynamical system near some boundaries of its convergence domain. Our current understanding of this phenomenon is that the boundary of the attraction basin (in phase space) crosses the initial point. This boundary is generally fractal, which perhaps explains the eroded appearance of the empirical convergence domain of the autonomous iteration.

To somewhat mitigate this complication, we applied a nonautonomous iteration scheme in which the map at step t is applied with an effective parameter that grows toward ε as the iteration proceeds (we omit the details). The idea of this ad hoc approach is the continuation of the steady state in the extended phase space from values of the parameter that put the iterate inside the convergence domain of that steady state. Doing so, we managed to empirically recover the theoretical convergence domain (see Fig. S2).
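One plausible reading of this scheme in the notation of the earlier sketches; the ramp profile and the constant q are our guesses for the elided details.

def dpt_ramped(D, H, eps, q=0.9, tol=1e-12, max_iter=5000):
    # Apply the map with a parameter eps_t that grows toward eps, so early
    # iterates stay within the (moving) basin of the continued steady state.
    D = np.asarray(D, dtype=float)
    n = len(D)
    E = np.zeros((n, n))
    off = ~np.eye(n, dtype=bool)
    E[off] = 1.0 / (D[:, None] - D[None, :])[off]
    Z = np.eye(n)
    for t in range(max_iter):
        eps_t = eps * (1.0 - q ** (t + 1))       # eps_t -> eps as t grows
        ZH = Z @ H.T
        Z_new = np.eye(n) + eps_t * E * (ZH - np.diag(ZH)[:, None] * Z)
        if abs(eps_t - eps) < tol and np.linalg.norm(Z_new - Z) < tol:
            return Z_new
        Z = Z_new
    return Z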

Finally, we would like to point out an interesting generic occurrence of a multiplier of unit absolute value without a fold bifurcation. For the first row, this situation takes place at ; for the second, at ; and for the third, at . All three points are on the real axis, as expected from the symmetry considerations above. There is no cusp at these points and no fold bifurcation (no merging of eigenvectors), as it should be for symmetric real matrices. Instead, another multiplier of the same fixed point goes to infinity at the same value of ε (the point becomes super-repelling). As a result, the center manifold reduction theorem is not applicable.

Figure S1: The convergence domain in the ε-plane for the first row of the eigenvector matrix (the first eigenvector) in the example. The Mandelbrot-like set (domain where orbits remain bounded) of the iterative scheme is shown in black and grey. The empirical convergence domain is shown in black. Its largest component corresponds to the stability of a steady state (the applicability domain of the iterative method). Small components correspond to the stability of various periodic orbits. Various shades of grey show the values of ε that lead to divergence to infinity (the darker, the slower the divergence). In red are the values of ε for which the matrix is non-diagonalizable.
Figure S2: Same as Fig. S1 for the second eigenvector (the second row of the eigenvector matrix).
Figure S3: Same as Fig. S1 for the third eigenvector (the third row of the eigenvector matrix).