## 1. Introduction

A theorem of von Neumann and Wigner states that, generically, a two-parameter family of real symmetric matrices has multiple eigenvalues at isolated points [22]. In other words, the matrices with multiple eigenvalues have co-dimension 2 in the manifold of real symmetric matrices [1, Appendix 10]. In this paper, we address the problem of locating these isolated points of eigenvalue multiplicity in the 2-dimensional parameter space. To be more precise, we consider the following problem.

###### Problem.

Given a smooth real symmetric matrix valued function , locate the values of the parameters which yield a matrix with degenerate eigenvalues.

###### Example 1.1.

The function

has eigenvalues as shown in Figure 1. Note that the degeneracies occur at isolated points.
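Since the matrix entries of the example are not reproduced here, the following minimal sketch uses a stand-in family with the same qualitative behaviour: the hypothetical matrix A(x, y) = [[x, y], [y, -x]] has eigenvalues ±√(x² + y²), so its two eigenvalue sheets form a cone touching at the single isolated point (0, 0).

```python
import numpy as np

# Illustrative stand-in family (not the paper's Example 1.1):
# A(x, y) = [[x, y], [y, -x]] has eigenvalues +-sqrt(x^2 + y^2),
# so the two sheets touch only at the isolated point (0, 0).
xs = np.linspace(-1.0, 1.0, 41)
gap = lambda x, y: np.ptp(np.linalg.eigvalsh(np.array([[x, y], [y, -x]])))
gaps = np.array([[gap(x, y) for x in xs] for y in xs])

assert gaps.min() < 1e-10                  # the degeneracy is hit on the grid
assert np.count_nonzero(gaps < 1e-6) == 1  # and it is a single isolated point
```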

For a family of complex Hermitian matrices, the co-dimension of the matrices with multiple eigenvalues is 3. Therefore the analogous question can be posed about locating multiple eigenvalues of a Hermitian . We will formulate an extension of our results to complex Hermitian matrices but will concentrate on the real symmetric case in our proofs.

The problem of locating the points of eigenvalue multiplicity is of practical importance. In condensed matter physics [2] the wave propagation through a periodic medium is studied via the Floquet–Bloch transform [15, 16], which results in a parametric family of self-adjoint operators (or matrices) with discrete spectrum. The eigenvalue surfaces (sheets of the “dispersion relation”) may touch, see Fig. 1, which has a profound effect on wave propagation and its sensitivity to a small perturbation of the medium. This touching corresponds precisely to a multiplicity in the eigenvalue spectrum. To give a well-studied example, the unusual electron properties of graphene occur due to the presence of eigenvalue multiplicity [5, 18].

The question of locating eigenvalue multiplicity in a family of real symmetric matrices has a straightforward solution (which also illustrates why the co-dimension is 2). The discriminant of can be written as a sum of two squares,

(1) |

By definition, the discriminant is if and only if two eigenvalues coincide, therefore we have two conditions that must simultaneously be met for the multiplicity to occur:

(2) |

Unfortunately, for larger matrices the discriminant quickly becomes unwieldy and cannot be used in practical computations. The discriminant can still be written as a sum of squares [13, 17, 19, 6], but the number of terms grows fast with the size of the matrix.
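For the 2×2 case the sum-of-two-squares form (1) is explicit: for A = [[a, b], [b, c]] one has (λ₂ − λ₁)² = (a − c)² + (2b)², which is the computational content of the two conditions (2). A quick numerical confirmation, with arbitrarily chosen entries:

```python
import numpy as np

def disc_2x2(a, b, c):
    """Discriminant of [[a, b], [b, c]] as a sum of two squares.

    It equals (lam2 - lam1)^2 and vanishes iff a == c and b == 0
    hold simultaneously -- the two conditions of (2)."""
    return (a - c) ** 2 + (2 * b) ** 2

# arbitrary entries for the check
a, b, c = 1.3, 0.4, -0.2
lam = np.linalg.eigvalsh(np.array([[a, b], [b, c]]))
assert abs(disc_2x2(a, b, c) - (lam[1] - lam[0]) ** 2) < 1e-12
```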

Thus, for an real symmetric matrix depending on two parameters and there is only one easily computable function whose root, in variables and , we are seeking.^1 However, to apply a standard method with quadratic convergence, such as the Newton–Raphson algorithm, one needs 2 functions for 2 variables.

^1 Here, without loss of generality, we have assumed that one is interested in the degeneracy

One can change the basis to make block-diagonal, with a block corresponding to eigenvalues and . The existence of this change of basis in a neighborhood of the multiplicity point is assured if remain bounded away from the rest of the spectrum. However, the new basis will depend on the parameters and is not directly accessible for numerical computations. Despite this obstacle, we will show that a “naive” approach produces equally good convergence: one can use a *constant* eigenvector basis which is recomputed at each point of the Newton–Raphson iteration. More precisely, we establish the following theorem.

###### Theorem 1.2.

Let be a real symmetric matrix valued function which is twice continuously differentiable in each entry, with a non-degenerate conical point (defined below) between and at the parameter point . For any , define by

(3) |

where

(4) |

denote the eigenvalues of at the point and denote the corresponding eigenvectors.

Then there exists an open neighborhood of and a constant such that for all , the corresponding

satisfies the estimate

(5) |

Before we prove this theorem in Section 4, we explain in Section 2 the geometrical picture behind the iterative procedure (3) and also point out the main differences between (3) and the Newton–Raphson method in a conventional setting. We also review related literature in Section 2.1, once the relevant notions have been introduced. The precise definition and properties of a “non-degenerate conical point” are given in Section 3. Section 5 contains some computational examples.
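As a concrete preview, here is a minimal numerical sketch of one step of the iteration (3)-(4); it is an illustration, not the authors' implementation. The matrix is re-expressed in the eigenvector basis computed at the current point, and a Newton–Raphson step is taken on the half-gap and off-diagonal conditions. In this sketch the derivatives of A are approximated by central finite differences, and the test family is a hypothetical linear one with its conical point at the origin, for which a single step already lands on the degeneracy (cf. Section 5.1).

```python
import numpy as np

def conical_step(A, x, i=0, h=1e-6):
    """One step of iteration (3)-(4), sketched: the residual f and its
    Jacobian are formed in the eigenvector basis of A at the current
    point x.  Derivatives of A are taken by central finite differences
    here; analytic derivatives work the same way."""
    lam, V = np.linalg.eigh(A(x))
    v1, v2 = V[:, i], V[:, i + 1]
    # In its own eigenbasis the off-diagonal entry vanishes, so only
    # the half-gap contributes to the residual.
    f = np.array([0.5 * (lam[i] - lam[i + 1]), 0.0])
    J = np.zeros((2, 2))
    for k in range(2):
        e = np.zeros(2); e[k] = h
        dA = (A(x + e) - A(x - e)) / (2 * h)
        J[0, k] = 0.5 * (v1 @ dA @ v1 - v2 @ dA @ v2)
        J[1, k] = v1 @ dA @ v2
    return x - np.linalg.solve(J, f)

# hypothetical linear family with its conical point at the origin
A = lambda x: np.array([[1.0 + x[0], x[1]], [x[1], 1.0 - x[0]]])
x = np.array([0.3, -0.2])
for _ in range(5):
    x = conical_step(A, x)
```

For this linear family the conditions are affine in the parameters, so the first step is already exact up to roundoff and the remaining iterations simply stay at the conical point.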

### 1.1. Notation

We let denote the set of matrix valued functions mapping to with each entry being twice continuously differentiable. The eigenvalues of the matrix function are numbered in increasing order and, without loss of generality, we will look for such that . Naturally, all results apply equally well to any pair of consecutive eigenvalues. We remark that the functions are continuous but not necessarily smooth: the points of eigenvalue multiplicity are typically the points where the eigenvalues involved are not differentiable, see Fig. 1.

For any real symmetric matrix valued function and any point , we let denote the representation of in the eigenvector basis computed at the point . That is, is a fixed orthogonal matrix whose columns are the eigenvectors of . The eigenvectors are assumed to be numbered according to the eigenvalue ordering. This means that is a diagonal matrix at the point but not necessarily anywhere else. We let

(6) |

denote the submatrix of corresponding to the coalescing eigenvectors. By definition of , we have

(7) |

Throughout the paper will denote the row vector of derivatives taken with respect to the parameters . If is a vector-function, is a matrix with 2 columns. We use the notation to denote the derivative evaluated at the point , i.e.

We use the notation to denote the Jacobian of ,

(9) |

where are the eigenvectors of and the derivatives and have been evaluated at the point . We remark that in Theorem 1.2 can be calculated as . The factor in the definition of arises naturally in calculations; it can also be used to put the second-row terms in the more symmetric form,

Finally, we remark that by our definitions and . Therefore, the tilde (defined in equation (6)) will usually be omitted once we invoke functions and .

## 2. Discussion

### 2.1. Geometric interpretation

What is described in this paper is a variation of the Newton–Raphson method applied to the objective function . This is only one condition on two parameters (in the real case), and leads to an underdetermined Newton–Raphson iteration. In particular, given an initial guess , we would like to update our guess to such that

(10) |

However, there is a whole line of points satisfying this condition, as illustrated in Figure 2.

To incorporate our knowledge that the degeneracy occurs at an isolated point, we use a heuristic derived from the Berry phase [10, 3, 21], a phenomenon which underlies the inability to find a smooth diagonalisation around a degeneracy: on a loop in the parameter space around a non-degenerate conical point, a continuous choice of eigenvectors must rotate by (as opposed to 0 mod ). But if smoothly going in a loop around the degeneracy rotates the eigenvectors, the direction of minimal rotation is a direction *towards the point of degeneracy*. Let be a smooth choice of normalized eigenvectors around the point (this is possible because is not a point of eigenvalue multiplicity). Then we are looking for the direction in the parameter space in which the eigenvector, as a function of , does not rotate in the plane spanned by (it may still rotate “out of the plane”). This condition can be written as

(11) |

Conditions (10) and (11) together generically define a unique point (see Sections 3 and 4 for a precise formulation), which can be taken as the next step in the iteration. We can solve for it explicitly using the well-known perturbation formulas [4, 14],

(12) | |||

(13) |

where

(14) |

We stress that in equation (14) the eigenvectors are evaluated at the point and do not depend on .
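The first-order formulas above are the classical (Hellmann–Feynman/Kato) perturbation expressions for simple eigenvalues. A quick numerical sanity check on an arbitrarily chosen one-parameter family (our own illustration, not an example from the paper):

```python
import numpy as np

# Check d(lambda_i)/dt = v_i^T A'(t) v_i for simple eigenvalues,
# on an arbitrarily chosen one-parameter family.
A  = lambda t: np.array([[2.0 + t, 0.5 * t], [0.5 * t, -1.0 + t * t]])
dA = lambda t: np.array([[1.0, 0.5], [0.5, 2.0 * t]])  # analytic derivative

t, h = 0.3, 1e-6
lam, V = np.linalg.eigh(A(t))
pred = np.array([V[:, i] @ dA(t) @ V[:, i] for i in range(2)])
# compare with a central finite difference of the eigenvalues
fd = (np.linalg.eigvalsh(A(t + h)) - np.linalg.eigvalsh(A(t - h))) / (2 * h)
assert np.allclose(pred, fd, atol=1e-6)
```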

The tangent plane condition (10) and the non-rotation condition (11) can now be written succinctly as

(15) |

or, less succinctly, as

which immediately leads to (3).

The Berry phase also lies at the heart of another set of works devoted to locating points of eigenvalue multiplicity. Pugliese, Dieci and co-authors [20, 8, 7] developed a procedure which uses the Berry phase to grid-search the available space and identify regions with conical points. For the final convergence they used the standard Newton–Raphson method to locate the critical point of . The convergence of this final step is quadratic, as in Theorem 1.2.

### 2.2. Relation to Newton–Raphson method

Recalling the definition of and in particular equation (7), we have

This allows us to rewrite equation (15) as

which is the same as a single step of the Newton–Raphson iteration applied to . In other words, is chosen to be a solution to

(16) |

for some .

To understand how our algorithm differs from the conventional Newton–Raphson method, we need to revisit the computation of . It can be viewed as first expressing in the eigenvector basis computed *at the point* and then extracting the -subblock of the resulting matrix.

In this notation, the problem of finding the degeneracy is equivalent to finding a point such that

(17) |

In contrast, solving equation (16) is a first step in finding a point such that

(18) |

Going all the way to find the solution to equation (18) is pointless; this is not the equation we need to solve. Instead, we go one step, computing the first Newton–Raphson approximation , and then update our target equation to

compute the first Newton–Raphson approximation to
*that* equation and so on.

### 2.3. Complex Hermitian matrices

Let us now consider a complex Hermitian matrix-valued function . To find a point of eigenvalue multiplicity, we typically need three real parameters (the off-diagonal terms can be complex, which introduces an additional degree of freedom), which we still denote by . The conditions can now be written as

(19) |

where

(20) |

One can equivalently use the objective function

(21) |
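The codimension-3 count is visible already for a 2×2 Hermitian matrix: the squared eigenvalue gap is a sum of three squares, (a − c)² + (2 Re b)² + (2 Im b)², matching the three real conditions above. A minimal check with arbitrarily chosen entries (our own illustration):

```python
import numpy as np

# For a 2x2 Hermitian [[a, b], [conj(b), c]] the squared gap splits into
# three squares: (a - c)^2 + (2 Re b)^2 + (2 Im b)^2.  Three conditions
# must vanish simultaneously, hence codimension 3.
a, c, b = 0.7, -0.4, 0.3 + 0.5j
H = np.array([[a, b], [np.conj(b), c]])
lam = np.linalg.eigvalsh(H)
gap_sq = (lam[1] - lam[0]) ** 2
three_squares = (a - c) ** 2 + (2 * b.real) ** 2 + (2 * b.imag) ** 2
assert abs(gap_sq - three_squares) < 1e-12
```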

## 3. Conical Intersection

Let be a point in the parameter space such that has a double eigenvalue . The existence of an eigenvalue multiplicity precludes a smooth diagonalization in a region containing the degeneracy. However, a smooth block diagonalization exists. The standard construction (see, for example, [14, II.4.2 and Remark 4.4 therein]) uses the Riesz projector.

We can choose a contour with enclosing and no other point in the spectrum of . This property of must persist when is in a small neighborhood of . The Riesz projector

(22) |

projects onto the continuation of the eigenspace of

at [11]. The projector itself is smooth, since the points on the contour are all in the resolvent set of (and so has a bounded inverse for all ). Starting with an arbitrary eigenvector basis at , we can obtain a basis at a nearby by applying the Gram–Schmidt procedure to the set , which preserves smoothness. We can do the same with the orthogonal complement and a complementary basis to . To summarize, for some region with , we find a change of basis such that

(23) |

where and . We can further diagonalise both and at any point to obtain

(24) |

where , and both

are diagonal at . A stronger result of Hsieh and Sibuya [12] and of Gingold [9] states that such a block-diagonalization exists even for matrices that are not necessarily Hermitian, and for any closed rectangular region that contains an isolated degeneracy.
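The Riesz projector (22) is also easy to realize numerically: the contour integral over a circle enclosing exactly the target eigenvalue cluster converges geometrically under the trapezoid rule. The following sketch (the 4×4 matrix is our own illustrative choice) compares it with the spectral projector onto the two lowest eigenvectors.

```python
import numpy as np

# Riesz projector (22) via the trapezoid rule on a circular contour
# enclosing exactly the two lowest eigenvalues of an illustrative
# 4x4 symmetric matrix.
A = np.diag([0.0, 0.1, 3.0, 5.0]) + 0.05 * np.ones((4, 4))
lam, V = np.linalg.eigh(A)

center = 0.5 * (lam[0] + lam[1])   # midpoint of the target cluster
radius = 1.0                       # excludes the rest of the spectrum
M = 64                             # nodes; error decays geometrically in M

P = np.zeros((4, 4), dtype=complex)
for k in range(M):
    w = np.exp(2j * np.pi * k / M)
    z = center + radius * w
    dz = 2j * np.pi * radius * w / M          # z'(theta) * dtheta
    P += np.linalg.inv(z * np.eye(4) - A) * dz
P /= 2j * np.pi

# agrees with the spectral projector onto the two lowest eigenvectors
assert np.allclose(P.real, V[:, :2] @ V[:, :2].T, atol=1e-8)
assert np.abs(P.imag).max() < 1e-8
```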

Note that since is a matrix which has an eigenvalue
multiplicity at the point , is a multiple of the
identity. The eigenvalue multiplicity is detected by the
*discriminant* of which in the case is defined as

(25) |

The discriminant achieves its minimum value 0 at the point . It is also a function of , and its Hessian is well-defined.

###### Definition 3.1.

A point of eigenvalue multiplicity is a *non-degenerate conical point* if has a non-degenerate conical point at .

In other words, there is a positive definite matrix such that

and, along any ray originating at , the eigenvalues separate at a non-zero linear rate. This picture justifies the use of the term “conical”.

Unfortunately, while the existence of is assured, it is not easily accessible analytically. The following theorem provides a more practical method of checking whether is non-degenerate.

###### Theorem 3.2.

The Hessian of at is given by

(26) |

Consequently, is a non-degenerate conical point if and only if .
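This link between the Hessian of the discriminant and the Jacobian can be checked numerically on a 2×2 linear family; the matrices B1, B2 below are our own illustrative choices. Note that with the normalization of f used in this sketch (factor ½ on the first row of the Jacobian, as in (9)), the proportionality constant between the Hessian of the discriminant and JᵀJ works out to 8; in particular the Hessian is positive definite exactly when det(J) ≠ 0.

```python
import numpy as np

# Linear 2x2 family A(x) = I + x1*B1 + x2*B2 with a conical point at 0;
# B1, B2 are arbitrary illustrative choices.
B1 = np.array([[1.0, 0.2], [0.2, -0.5]])
B2 = np.array([[0.3, 1.0], [1.0, 0.3]])
A = lambda x: np.eye(2) + x[0] * B1 + x[1] * B2

def disc(x):
    M = A(x)
    return (M[0, 0] - M[1, 1]) ** 2 + 4 * M[0, 1] ** 2

# Jacobian of f = ((A11 - A22)/2, A12) with respect to the parameters
J = np.array([[0.5 * (B1[0, 0] - B1[1, 1]), 0.5 * (B2[0, 0] - B2[1, 1])],
              [B1[0, 1], B2[0, 1]]])

# finite-difference Hessian of the discriminant at the origin
h = 1e-4
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        ei = np.zeros(2); ei[i] = h
        ej = np.zeros(2); ej[j] = h
        H[i, j] = (disc(ei + ej) - disc(ei - ej)
                   - disc(ej - ei) + disc(-ei - ej)) / (4 * h * h)

assert np.allclose(H, 8 * J.T @ J, atol=1e-6)
assert abs(np.linalg.det(J)) > 1e-12   # non-degenerate conical point
```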

We remark that it is the same that appears in the denominator in Theorem 1.2. The condition

has a nice geometric meaning: it is precisely the condition that the manifold

of real symmetric matrices is transversal to the line of symmetric matrices with repeated eigenvalues. The choice of basis in the definition of is assumed to align with the choice of basis used to compute , i.e. the first two columns of are the eigenvectors used to compute . This choice does not affect the definition of a non-degenerate point because of the following lemma.

###### Lemma 3.3.

Let be a matrix-valued function of . Then for any orthogonal matrix there is an orthogonal matrix such that for all we have

(27) |

and therefore

(28) |

###### Proof.

This identity for matrix-functions can be checked by direct computation, but the details are excessively tedious. Instead we use a more generalizable approach.

We fix an orthogonal and let denote the linear space of real symmetric matrices. The map , see equation (8), acts as a linear transformation from to . It is obviously onto, and its kernel consists of multiples of the identity. On the other hand, conjugation by (namely the map ) is a linear transformation of to itself. It maps multiples of the identity to themselves and therefore induces a linear transformation from the quotient space to itself. This linear transformation, via the isomorphism between and , induces a linear transformation on mapping to . We summarize the above in the commutative diagram

In other words, for a given orthogonal , there exists a constant matrix such that

From the identity (see (25) for the definition of discriminant)

we conclude that is orthogonal. Finally, taking derivatives we get

since the determinant of an orthogonal matrix is either or . ∎

###### Lemma 3.4.

###### Proof.

We remark that identity (29) is only claimed for the Jacobian evaluated at the point where both and are diagonal, therefore .

For all , are orthonormal and differentiating we get

(30) |

We can now relate the derivatives of to the derivatives of ,

The calculation is identical for derivatives. ∎

## 4. Proof of the main result

Here we restate the procedure used to locate the degeneracy in the notation that has been introduced.

###### Theorem 4.1.

For a family of matrix functions , define by

(32) |

Let have a non-degenerate conical point at between eigenvalues and . Then there exists an open with and , such that for all ,

(33) |

where the matrix-function is defined by

(34) |

with the constant matrix whose columns are the eigenvectors of .

We remark that the assumption of non-degeneracy of the conical point is justified, for example, by the fact that any degenerate conical point can be made non-degenerate by a small perturbation of the function .

We recall that the superscript in refers to the basis which is computed at the point and in which the matrix is represented. The derivatives of that are taken to compute in (32) are also evaluated at the point . The result of evaluating is explicitly written out in equations (3)-(4).

###### Proof.

We present a brief outline of the proof, which combines several facts established in the remainder of this section.

Now we establish the lemmas used in the proof of Theorem 4.1.

###### Lemma 4.2.

There exists with and such that

(35) |

when .

###### Proof.

This is the usual Newton–Raphson method applied to conical point search for the matrix . For completeness we provide the proof. For the function , we have the Taylor expansion around the point which is evaluated at the point ,

where the constant in is *independent* of as long as it is in a neighborhood of . The dot denotes matrix-by-vector multiplication (to distinguish it from the argument of the function ).

By assumption , and, by smoothness, we know that is boundedly invertible in some region containing . Therefore, for the point , or equivalently,

we have

with the estimate (35) following by inverting . ∎

###### Lemma 4.3.

For any and constant, orthogonal , we have

(36) |

###### Proof.

###### Lemma 4.4.

There exists with and such that

(37) |

when .

###### Proof.

By the assumption that is a non-degenerate conical point and equation (26), we have that and therefore has a bounded inverse in a region around . By equation (29) we conclude that also has a bounded inverse in some region around where is small. We can express the difference of the inverses as

and so, using boundedness of and its derivatives, we get

We also recall that by definition of and ,

## 5. Examples

### 5.1. Elements of A are linear in parameters

If is linear in each parameter, we have , where and for some , that depend on , and . The eigenvalues of this matrix are values of where ,

which is a cone in the new parameter space. In fact, a simple calculation shows that the degeneracy of the function , which has the same eigenvectors and shifted eigenvalues, can be located using a single step of the above rule.

### 5.2. Non-linear examples

Consider the following matrix-function example,

(38) |

Since is a rank-one perturbation of a diagonal matrix, it can be shown that there is a double eigenvalue at the point given by

or . The results of running the algorithm of Theorem 1.2 with random starting points in the rectangle are shown in Figure (a).
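Since the entries of (38) are elided above, the following self-contained sketch runs the same iteration on a hypothetical nonlinear two-parameter family with a non-degenerate conical point at the origin; the decay of the recorded errors illustrates the quadratic convergence of Theorem 1.2.

```python
import numpy as np

# Hypothetical nonlinear family (not the matrix of (38)) with a
# non-degenerate conical point at the origin: A(0) = 2*I.
A = lambda x: np.array(
    [[2.0 + x[0] + 0.1 * x[1] ** 2, x[1] + 0.2 * x[0] * x[1]],
     [x[1] + 0.2 * x[0] * x[1], 2.0 - x[0] + 0.3 * x[0] ** 2]])

def step(x, h=1e-6):
    # one step of iteration (3)-(4); derivatives by central differences
    lam, V = np.linalg.eigh(A(x))
    v1, v2 = V[:, 0], V[:, 1]
    f = np.array([0.5 * (lam[0] - lam[1]), 0.0])
    J = np.zeros((2, 2))
    for k in range(2):
        e = np.zeros(2); e[k] = h
        dA = (A(x + e) - A(x - e)) / (2 * h)
        J[0, k] = 0.5 * (v1 @ dA @ v1 - v2 @ dA @ v2)
        J[1, k] = v1 @ dA @ v2
    return x - np.linalg.solve(J, f)

x = np.array([0.2, 0.15])
errs = []
for _ in range(6):
    x = step(x)
    errs.append(np.linalg.norm(x))    # distance to the conical point

assert errs[1] < errs[0] and errs[-1] < 1e-8
```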

The complex Hermitian case described in Section 2.3 is demonstrated in Figure (b). The matrix

(39) |

corresponds to the discrete Laplacian of the graph shown in Figure 4, with dashed edges carrying a magnetic potential ( and correspondingly). The parameter is introduced artificially, and the conical point found numerically has value . Since the location of the conical point is not known analytically, the error is estimated using the norms of the updates instead of . The results of several runs of the algorithm are shown in Figure (b).

### 5.3. Avoided crossing

While a non-degenerate conical point is stable under small perturbations of the real symmetric matrix-function
