 # On the optimality and sharpness of Laguerre's lower bound on the smallest eigenvalue of a symmetric positive definite matrix

Lower bounds on the smallest eigenvalue of a symmetric positive definite matrix $A \in \mathbb{R}^{m \times m}$ play an important role in condition number estimation and in iterative methods for singular value computation. In particular, the bounds based on $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ have attracted attention recently because they can be computed in $O(m)$ work when $A$ is tridiagonal. In this paper, we focus on these bounds and investigate their properties in detail. First, we consider the problem of finding the optimal bound that can be computed solely from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ and show that the so-called Laguerre lower bound is the optimal one in terms of sharpness. Next, we study the gap between the Laguerre bound and the smallest eigenvalue. We characterize the situation in which the gap becomes largest in terms of the eigenvalue distribution of $A$ and show that the gap becomes smallest when $\mathrm{Tr}(A^{-2})/\{\mathrm{Tr}(A^{-1})\}^2$ approaches 1 or $1/m$. These results will be useful, for example, in designing efficient shift strategies for singular value computation algorithms.


## 1. Introduction

Let $A \in \mathbb{R}^{m \times m}$ be a symmetric positive definite matrix and denote the smallest eigenvalue of $A$ by $\lambda_m(A)$. In this paper, we are interested in a lower bound on $\lambda_m(A)$. If the Cholesky factorization of $A$ is $A = BB^{\top}$, where $B$ is a nonsingular lower triangular matrix, the smallest singular value of $B$ can be written as $\sigma_m(B) = \sqrt{\lambda_m(A)}$. Hence, finding a lower bound on $\lambda_m(A)$ is equivalent to finding a lower bound on $\sigma_m(B)$.

A lower bound on $\lambda_m(A)$ or $\sigma_m(B)$ plays an important role in various scientific computations. For example, when combined with an upper bound on the largest eigenvalue $\lambda_1(A)$, a lower bound on $\lambda_m(A)$ can be used to give an upper bound on the condition number of $A$. In singular value computation algorithms such as the dqds algorithm, the orthogonal qd algorithm and the mdLVs algorithm, a lower bound on $\sigma_m(B)$ is used as a shift to accelerate the convergence. In the latter case, the matrix $B$ is usually a lower bidiagonal matrix as a result of preprocessing by the Householder method.

Several types of lower bounds on $\lambda_m(A)$ or $\sigma_m(B)$ have been proposed so far. There are bounds based on eigenvalue inclusion theorems such as Gershgorin's circle theorem or Brauer's ovals of Cassini. The norm of the inverse, $\|A^{-1}\|$, can also be used to bound the maximum eigenvalue of $A^{-1}$ from above, and therefore to bound $\lambda_m(A)$ from below. There are also bounds based on the traces of the inverses, namely, $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$. Among them, the last class of bounds is attractive in the context of singular value computation, because these bounds always give a valid (positive) lower bound, as opposed to the bounds based on the eigenvalue inclusion theorems, and they can be computed in $O(m)$ work using efficient algorithms [9, 11, 13]. Examples of lower bounds of this type include the Newton bound, the generalized Newton bound [9, 1] and the Laguerre bound.

In this paper, we focus on the lower bounds on $\lambda_m(A)$ derived from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ and investigate their properties. In particular, we address the following two questions. The first is to identify an optimal formula for a lower bound on $\lambda_m(A)$ that is based solely on $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$. Here, the word "optimal" means that the formula always gives a sharper (that is, larger) bound than any other formula using only $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$. As a result of our analysis, we show that the Laguerre bound mentioned above is the optimal formula in this sense. The second question is to evaluate the gap between the Laguerre bound and $\lambda_m(A)$. Unlike the Laguerre bound, $\lambda_m(A)$ is not determined uniquely from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ alone. Hence, for some matrices, there must be a gap between the bound and $\lambda_m(A)$. Our problem is to quantify the maximum possible gap and identify the conditions under which the maximum gap is attained. These results will be useful, for example, in designing an efficient shift strategy for singular value computation algorithms, which combines the Laguerre bound with other bounds with complementary characteristics.

The rest of this paper is structured as follows. In Section 2, we investigate the lower bounds on $\lambda_m(A)$ derived from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ and show that the Laguerre bound is an optimal one in terms of sharpness. Section 3 deals with the gap between the Laguerre bound and $\lambda_m(A)$. In particular, we characterize the situation in which the gap becomes largest in terms of the eigenvalue distribution of $A$. Section 4 gives some concluding remarks.

## 2. An optimal lower bound based on $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$

### 2.1. Lower bounds based on $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$

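Let $A$ be an $m \times m$ real symmetric positive definite matrix. We denote the $k$th largest eigenvalue of $A$ by $\lambda_k(A)$, or $\lambda_k$ for short. Let $f(\lambda) = \det(A - \lambda I)$ be the characteristic polynomial of $A$. To find a lower bound on the smallest eigenvalue $\lambda_m$, we consider applying a root finding method for an algebraic equation to $f(\lambda) = 0$ starting from the initial value $\lambda^{(0)} = 0$. There are several root finding methods, such as Bailey's (Halley's) method, Householder's method and Laguerre's method [14, 10], for which the iteration formulas can be written as follows:

$$\lambda_B^{(n+1)} = \lambda^{(n)} - \frac{f(\lambda^{(n)})}{f'(\lambda^{(n)})} \cdot \frac{1}{1 - \dfrac{f(\lambda^{(n)})\, f''(\lambda^{(n)})}{2 f'(\lambda^{(n)})^2}}, \tag{2.1}$$

$$\lambda_H^{(n+1)} = \lambda^{(n)} - \frac{f(\lambda^{(n)})}{f'(\lambda^{(n)})} \left\{ 1 + \frac{f(\lambda^{(n)})\, f''(\lambda^{(n)})}{2 f'(\lambda^{(n)})^2} \right\}, \tag{2.2}$$

$$\lambda_L^{(n+1)} = \lambda^{(n)} - \frac{f(\lambda^{(n)})}{f'(\lambda^{(n)})} \times \frac{m}{1 + \sqrt{(m-1)\left\{ m \cdot \dfrac{f'(\lambda^{(n)})^2 - f(\lambda^{(n)})\, f''(\lambda^{(n)})}{f'(\lambda^{(n)})^2} - 1 \right\}}}. \tag{2.3}$$

Eqs. (2.1), (2.2) and (2.3) represent the iteration formulas of Bailey's method, Householder's method and Laguerre's method, respectively. When applied to $f(\lambda) = 0$ starting from $\lambda^{(0)} = 0$, these formulas produce a sequence that increases monotonically and converges to $\lambda_m$. Hence, all of $\lambda_B^{(1)}$, $\lambda_H^{(1)}$ and $\lambda_L^{(1)}$ can be used as a lower bound on $\lambda_m$.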
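As a quick numerical illustration of this monotone convergence, the following minimal Python sketch applies the three iterations to the characteristic polynomial written in product form, $f(\lambda) = \prod_j (\lambda_j - \lambda)$, starting from zero. The function names and the test spectrum are illustrative assumptions, not part of the original analysis, and the derivatives are obtained by differentiating the product directly, which is only sensible for very small $m$.

```python
import math

def poly_derivs(eigs, lam):
    """Return f, f', f'' for f(lam) = prod_j (eig_j - lam), by direct differentiation."""
    d = [e - lam for e in eigs]
    m = len(d)
    f = math.prod(d)
    f1 = -sum(math.prod(d[j] for j in range(m) if j != k) for k in range(m))
    f2 = sum(math.prod(d[j] for j in range(m) if j not in (k, l))
             for k in range(m) for l in range(m) if l != k)
    return f, f1, f2

def bailey_step(lam, f, f1, f2, m):
    # Eq. (2.1); m is unused but kept so that all steps share one signature.
    return lam - (f / f1) / (1.0 - f * f2 / (2.0 * f1 ** 2))

def householder_step(lam, f, f1, f2, m):
    # Eq. (2.2)
    return lam - (f / f1) * (1.0 + f * f2 / (2.0 * f1 ** 2))

def laguerre_step(lam, f, f1, f2, m):
    # Eq. (2.3); max(..., 0) only guards against tiny negative rounding errors.
    inner = (m - 1) * (m * (f1 ** 2 - f * f2) / f1 ** 2 - 1.0)
    return lam - (f / f1) * m / (1.0 + math.sqrt(max(inner, 0.0)))

def iterate(step, eigs, n_steps=8):
    """Run n_steps iterations of the given method from the initial value 0."""
    lam, seq = 0.0, [0.0]
    for _ in range(n_steps):
        f, f1, f2 = poly_derivs(eigs, lam)
        lam = step(lam, f, f1, f2, len(eigs))
        seq.append(lam)
    return seq

eigs = [5.0, 3.0, 2.0, 0.5]  # hypothetical spectrum; the smallest eigenvalue is 0.5
sequences = {s.__name__: iterate(s, eigs)
             for s in (bailey_step, householder_step, laguerre_step)}
for name, seq in sequences.items():
    print(name, [round(v, 6) for v in seq[:4]])
```

The first entry after the initial zero in each printed sequence is the corresponding one-step lower bound; later entries show how quickly each method closes the remaining gap for this particular spectrum.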

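Noting that $f(\lambda) = \prod_{j=1}^{m} (\lambda_j - \lambda)$, we have

$$f'(\lambda) = -\sum_{k=1}^{m} \prod_{j \ne k} (\lambda_j - \lambda) = -\prod_{j=1}^{m} (\lambda_j - \lambda) \sum_{k=1}^{m} \frac{1}{\lambda_k - \lambda} = -f(\lambda)\, \mathrm{Tr}\left((A - \lambda I)^{-1}\right), \tag{2.4}$$

$$f''(\lambda) = -f'(\lambda)\, \mathrm{Tr}\left((A - \lambda I)^{-1}\right) - f(\lambda) \sum_{k=1}^{m} \frac{1}{(\lambda_k - \lambda)^2} = -f'(\lambda)\, \mathrm{Tr}\left((A - \lambda I)^{-1}\right) - f(\lambda)\, \mathrm{Tr}\left((A - \lambda I)^{-2}\right). \tag{2.5}$$

Hence,

$$\frac{f(\lambda)}{f'(\lambda)} = -\frac{1}{\mathrm{Tr}\left((A - \lambda I)^{-1}\right)}, \tag{2.6}$$

$$\frac{f(\lambda)\, f''(\lambda)}{f'(\lambda)^2} = 1 - \frac{\mathrm{Tr}\left((A - \lambda I)^{-2}\right)}{\left\{\mathrm{Tr}\left((A - \lambda I)^{-1}\right)\right\}^2}. \tag{2.7}$$

Inserting these into Eqs. (2.1), (2.2) and (2.3) with $\lambda^{(0)} = 0$, we obtain the following lower bounds on $\lambda_m$:

$$L_B(A) = \frac{2\, \mathrm{Tr}(A^{-1})}{\{\mathrm{Tr}(A^{-1})\}^2 + \mathrm{Tr}(A^{-2})}, \tag{2.8}$$

$$L_H(A) = \frac{1}{\mathrm{Tr}(A^{-1})} \left[ \frac{3}{2} - \frac{1}{2} \cdot \frac{\mathrm{Tr}(A^{-2})}{\{\mathrm{Tr}(A^{-1})\}^2} \right], \tag{2.9}$$

$$L_L(A) = \frac{1}{\mathrm{Tr}(A^{-1})} \cdot \frac{m}{1 + \sqrt{(m-1)\left[ m \cdot \dfrac{\mathrm{Tr}(A^{-2})}{\{\mathrm{Tr}(A^{-1})\}^2} - 1 \right]}}. \tag{2.10}$$

We call $L_B(A)$, $L_H(A)$ and $L_L(A)$ the Bailey bound, the Householder bound and the Laguerre bound, respectively. In addition to these, we also have a simple bound:

$$L_N(A) = \{\mathrm{Tr}(A^{-2})\}^{-\frac{1}{2}} = \left( \sum_{k=1}^{m} \frac{1}{\lambda_k^2} \right)^{-\frac{1}{2}} < \lambda_m, \tag{2.11}$$

which is called the Newton bound of order 2 [10, 9, 1]. In the case where $A$ is a tridiagonal matrix, both $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ can be computed in $O(m)$ work from its Cholesky factor [9, 11, 13]. Accordingly, any of these bounds can be employed in a practical shift strategy for singular value computation algorithms. The problem then is which of the four lower bounds, or possibly another bound derived from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$, is optimal in terms of sharpness.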
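To make the four formulas concrete, the sketch below evaluates the Bailey, Householder, Laguerre and Newton bounds for a small hypothetical spectrum. For simplicity it forms the two traces directly from the eigenvalues; this is an assumption made for illustration only and is not the $O(m)$ tridiagonal algorithm of [9, 11, 13]. The function name `trace_bounds` is likewise hypothetical.

```python
import math

def trace_bounds(eigs):
    """Evaluate the bounds (2.8)-(2.11) from Tr(A^-1) and Tr(A^-2)."""
    m = len(eigs)
    t1 = sum(1.0 / e for e in eigs)        # Tr(A^-1)
    t2 = sum(1.0 / e ** 2 for e in eigs)   # Tr(A^-2)
    alpha = t2 / t1 ** 2
    # max(..., 0) only guards against tiny negative rounding errors.
    root = math.sqrt(max((m - 1) * (m * alpha - 1.0), 0.0))
    return {
        "bailey": 2.0 * t1 / (t1 ** 2 + t2),
        "householder": (1.5 - 0.5 * alpha) / t1,
        "laguerre": m / (t1 * (1.0 + root)),
        "newton": 1.0 / math.sqrt(t2),
    }

eigs = [5.0, 3.0, 2.0, 0.5]  # hypothetical spectrum; the smallest eigenvalue is 0.5
bounds = trace_bounds(eigs)
for name, value in sorted(bounds.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {value:.6f}")
```

For this spectrum all four values lie below 0.5 and the Laguerre bound is the largest of them.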

### 2.2. The optimal lower bound

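To answer the question, we reformulate the problem as follows. Assume that $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$ are specified for a symmetric positive definite matrix $A$. Then, how small can the smallest eigenvalue $\lambda_m(A)$ be? If this bound can be obtained explicitly as a function of $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$, then it will be the optimal formula for the lower bound on $\lambda_m(A)$.

Now, let $a = \mathrm{Tr}(A^{-1})$, $b = \mathrm{Tr}(A^{-2})$ and $x_k = 1/\lambda_k$ ($k = 1, 2, \ldots, m$). Then, the upper bound on $x_m$ (the reciprocal of the lower bound on $\lambda_m$) can be obtained by solving the following constrained optimization problem:

$$\text{maximize} \quad x_m \tag{2.12}$$

$$\text{s.t.} \quad \sum_{k=1}^{m} x_k = a, \tag{2.13}$$

$$\sum_{k=1}^{m} x_k^2 = b, \tag{2.14}$$

$$x_k > 0 \quad (k = 1, 2, \ldots, m), \tag{2.15}$$

$$x_1 \le x_2 \le \cdots \le x_m. \tag{2.16}$$

Actually, the ordering constraint (2.16) is redundant, because if $(x_1, \ldots, x_{m-1}, x_m)$ is a solution of the optimization problem without constraint (2.16), then, from symmetry, $(x_{\sigma(1)}, \ldots, x_{\sigma(m-1)}, x_m)$ is also a solution for any permutation $\sigma$ of $\{1, 2, \ldots, m-1\}$, and therefore we can choose a solution that satisfies (2.16). Hence we omit (2.16) in the following.

To solve the optimization problem (2.12)–(2.15), we remove the positivity constraint (2.15) and consider a relaxed problem described by (2.12)–(2.14). By introducing the Lagrange multipliers $\mu$ and $\nu$, we can write the Lagrangian as

$$L = x_m - \mu \left( \sum_{k=1}^{m} x_k - a \right) - \nu \left( \sum_{k=1}^{m} x_k^2 - b \right). \tag{2.17}$$

Then the solution to the relaxed problem (2.12)–(2.14) must satisfy

$$\frac{\partial L}{\partial x_m} = 1 - \mu - 2\nu x_m = 0, \tag{2.18}$$

$$\frac{\partial L}{\partial x_k} = -\mu - 2\nu x_k = 0 \quad (k = 1, 2, \ldots, m-1), \tag{2.19}$$

$$\frac{\partial L}{\partial \mu} = -\left( \sum_{k=1}^{m} x_k - a \right) = 0, \tag{2.20}$$

$$\frac{\partial L}{\partial \nu} = -\left( \sum_{k=1}^{m} x_k^2 - b \right) = 0. \tag{2.21}$$

From (2.19), we have either $\nu = 0$ or $x_1 = x_2 = \cdots = x_{m-1} = -\mu/(2\nu)$. However, when $\nu = 0$, we have $\mu = 0$ from (2.19) and $\mu = 1$ from (2.18), which is a contradiction. Thus $x_1 = x_2 = \cdots = x_{m-1}$ must hold. Inserting this into (2.20) and (2.21) leads to

$$x_m + (m-1) x_1 - a = 0, \tag{2.22}$$

$$x_m^2 + (m-1) x_1^2 - b = 0. \tag{2.23}$$

Solving these simultaneous equations with respect to $x_m$ gives

$$x_m^{\pm} = \frac{a \pm \sqrt{m(m-1)b - (m-1)a^2}}{m}. \tag{2.24}$$

Note that the $x_m^{\pm}$ given by (2.24) is real, since

$$m(m-1)b - (m-1)a^2 = (m-1) \left\{ m \sum_{k=1}^{m} x_k^2 - \left( \sum_{k=1}^{m} x_k \right)^2 \right\} = (m-1) \sum_{k=1}^{m} \sum_{l=1}^{k-1} (x_k - x_l)^2 \ge 0. \tag{2.25}$$

Now we return to the relaxed optimization problem (2.12)–(2.14). Since the feasible set of this problem is compact and both the objective function and the constraints are differentiable, it must have a minimum and a maximum at a point where the gradient of the Lagrangian is zero. Furthermore, since the objective function is $x_m$ itself, the maximum is attained when $x_m = x_m^{+}$. Then, from Eq. (2.22), we have

$$x_1 = x_2 = \cdots = x_{m-1} = \frac{(m-1)a - \sqrt{m(m-1)b - (m-1)a^2}}{m(m-1)}. \tag{2.26}$$

Hence, Eq. (2.26) and $x_m = x_m^{+}$ are the solution of the relaxed optimization problem.

Finally, we consider the positivity constraint (2.15). It is clear from (2.24) that $x_m^{+} > 0$. To investigate the positivity of the other variables, note that

$$a^2 - b = \left( \sum_{k=1}^{m} \frac{1}{\lambda_k} \right)^2 - \sum_{k=1}^{m} \frac{1}{\lambda_k^2} = 2 \sum_{k=1}^{m} \sum_{l=1}^{k-1} \frac{1}{\lambda_k} \cdot \frac{1}{\lambda_l} > 0, \tag{2.27}$$

where we used the fact that $a$ and $b$ are the traces of $A^{-1}$ and $A^{-2}$ for a matrix $A$ with positive eigenvalues. Then (2.26) can be rewritten as

$$x_1 = x_2 = \cdots = x_{m-1} = \frac{(m-1)^2 a^2 - \{ m(m-1)b - (m-1)a^2 \}}{m(m-1) \left\{ (m-1)a + \sqrt{m(m-1)b - (m-1)a^2} \right\}} = \frac{m(m-1)(a^2 - b)}{m(m-1) \left\{ (m-1)a + \sqrt{m(m-1)b - (m-1)a^2} \right\}} > 0. \tag{2.28}$$

This shows that the solution to the relaxed problem (2.12)–(2.14) automatically satisfies the positivity constraint (2.15). Hence it is also a solution to the original problem (2.12)–(2.15). Returning to the original variables $\lambda_k = 1/x_k$, we know that the smallest value that $\lambda_m$ can take is

$$\frac{1}{x_m^{+}} = \frac{1}{\mathrm{Tr}(A^{-1})} \cdot \frac{m}{1 + \sqrt{(m-1)\left[ m \cdot \dfrac{\mathrm{Tr}(A^{-2})}{\{\mathrm{Tr}(A^{-1})\}^2} - 1 \right]}}. \tag{2.29}$$

This gives the optimal lower bound on $\lambda_m(A)$ in terms of $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$. Since Eq. (2.29) is exactly the Laguerre bound (2.10), we arrive at the following theorem.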

###### Theorem 2.1.

Among the lower bounds on $\lambda_m(A)$ computed from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$, the Laguerre bound (2.10) is optimal in terms of sharpness.
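The optimality argument can be checked numerically. The following Python sketch, with hypothetical function names and hypothetical values of $a$ and $b$, builds the extremal reciprocal spectrum from (2.24) and (2.26), confirms that it reproduces the prescribed traces and attains the Laguerre bound, and then verifies on random spectra that the smallest eigenvalue never falls below the bound. Sampling eigenvalues uniformly from an interval is an assumption of this sketch.

```python
import math
import random

def laguerre_bound(a, b, m):
    """Laguerre bound (2.10) written in terms of a = Tr(A^-1) and b = Tr(A^-2)."""
    inner = max((m - 1) * (m * b / a ** 2 - 1.0), 0.0)  # guard against rounding
    return m / (a * (1.0 + math.sqrt(inner)))

def extremal_inverse_spectrum(a, b, m):
    """Reciprocal eigenvalues x_1 = ... = x_{m-1} from (2.26) and x_m = x_m^+ from (2.24)."""
    s = math.sqrt(m * (m - 1) * b - (m - 1) * a * a)
    x_rest = ((m - 1) * a - s) / (m * (m - 1))
    x_max = (a + s) / m
    return [x_rest] * (m - 1) + [x_max]

m, a, b = 4, 3.0, 4.0  # hypothetical values satisfying 1/m <= b/a^2 < 1
x = extremal_inverse_spectrum(a, b, m)
lam_min = 1.0 / x[-1]
print("extremal lambda_m:", lam_min, "Laguerre bound:", laguerre_bound(a, b, m))

# Random spectra: the smallest eigenvalue should never fall below the bound.
random.seed(0)
worst_margin = float("inf")
for _ in range(1000):
    eigs = [random.uniform(0.1, 10.0) for _ in range(m)]
    ta = sum(1.0 / e for e in eigs)
    tb = sum(1.0 / e ** 2 for e in eigs)
    worst_margin = min(worst_margin, min(eigs) - laguerre_bound(ta, tb, m))
print("smallest observed margin:", worst_margin)
```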

## 3. The gap between the Laguerre bound and the smallest eigenvalue

Now that we have established that the Laguerre bound is the optimal lower bound, we next study the gap between the bound and the minimum eigenvalue. We begin with a lemma that holds for a $3 \times 3$ matrix and then proceed to the general case. In the course of the discussion, we also allow infinite eigenvalues to make the arguments simpler.

Assume that $A$ is a $3 \times 3$ symmetric positive definite matrix with $\mathrm{Tr}(A^{-1}) = a$ and $\mathrm{Tr}(A^{-2}) = b$. Let the eigenvalues of $A$ be $\lambda_1 \ge \lambda_2 \ge \lambda_3 > 0$. To evaluate the gap, we consider how large $\lambda_3$ can be under the fixed values of $a$ and $b$. First, we show the following lemma.

###### Lemma 3.1.

For fixed $a$ and $b$, $\lambda_3$ can take a maximum only when $\lambda_2 = \lambda_3$ or $\lambda_1 = \infty$.

###### Proof.

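Let $x = 1/\lambda_3$, $y = 1/\lambda_2$ and $z = 1/\lambda_1$. Since we allow infinite eigenvalues, the point $(x, y, z)$ lies in a region $D$ of the $xyz$ space specified by $x + y + z = a$, $x^2 + y^2 + z^2 = b$, and $x \ge y \ge z \ge 0$. Since $D$ is a compact set, the continuous function $x$ attains a minimum somewhere in $D$. Hence, if we can show that $x$ does not attain a minimum when $x > y$ and $z > 0$, it means that $x$ attains a minimum when $x = y$ or $z = 0$.

Assume that the point $(x, y, z)$ is in $D$ and both $x > y$ and $z > 0$ hold. Then, let $\epsilon > 0$ be some small quantity and $t$ a real number, and consider changing $(x, y, z)$ to $(x', y', z')$ as follows:

$$x' = x - \epsilon, \tag{3.1}$$

$$y' = y + t\epsilon, \tag{3.2}$$

$$z' = z + (1 - t)\epsilon. \tag{3.3}$$

Clearly, the new point lies on the plane $x + y + z = a$. We determine $t$ so that it is also on the sphere $x^2 + y^2 + z^2 = b$. The condition can be written as

$$(x - \epsilon)^2 + (y + t\epsilon)^2 + \{ z + (1 - t)\epsilon \}^2 = x^2 + y^2 + z^2, \tag{3.4}$$

or

$$\epsilon t^2 + (y - z - \epsilon) t + (-x + z + \epsilon) = 0. \tag{3.5}$$

Solving this with respect to $t$ gives

$$t_{\pm} = \frac{-(y - z - \epsilon) \pm \sqrt{(y - z - \epsilon)^2 + 4\epsilon (x - z - \epsilon)}}{2\epsilon}. \tag{3.6}$$

In the following, we adopt the solution $t_{+}$. Now we consider two cases. First, consider the case of $y = z$. Then, we have from Eq. (3.6),

$$t_{+} \epsilon = \frac{\epsilon + \sqrt{\epsilon^2 + 4\epsilon (x - z - \epsilon)}}{2} = O(\sqrt{\epsilon}). \tag{3.7}$$

Inserting this into (3.1) through (3.3), we know that the changes in $x$, $y$ and $z$ are at most $O(\sqrt{\epsilon})$ when $\epsilon$ is small.

Next, consider the case of $y > z$. In this case, we can rewrite (3.6) as

$$t_{+} = \frac{2(x - z - \epsilon)}{(y - z - \epsilon) + \sqrt{(y - z - \epsilon)^2 + 4\epsilon (x - z - \epsilon)}}. \tag{3.8}$$

Since $x - z - \epsilon > 0$ and $y - z - \epsilon > 0$ for a sufficiently small $\epsilon$, we have

$$0 < t_{+} < \frac{x - z - \epsilon}{y - z - \epsilon}. \tag{3.9}$$

Hence, $t_{+} = O(1)$ when $\epsilon$ is small and therefore the changes in $x$, $y$ and $z$ are at most $O(\epsilon)$ in this case.

In summary, in both cases, the changes of $x$, $y$ and $z$ can be made arbitrarily small. Thus, by choosing $\epsilon$ sufficiently small, we can make $x'$ smaller than $x$ while keeping the relations $x' \ge y'$ and $z' \ge 0$ (Fig. 1). The relation $y' \ge z'$ may not hold, but in that case, we can interchange $y'$ and $z'$. In this way, we can obtain another point in $D$ which attains a smaller value of $x$. Hence $x$ cannot attain a minimum when both $x > y$ and $z > 0$ hold and the lemma is proved. ∎

Figure 1. The values of x, y and z before and after the perturbation.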
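The perturbation used in the proof is easy to reproduce. The short Python sketch below applies (3.1) through (3.3) with $t = t_{+}$, evaluated in the rationalized form (3.8), which coincides with (3.6) and remains usable when $y = z$. The function name `perturb` and the two sample points are hypothetical; the script checks that the sum and the sum of squares are preserved while the largest coordinate decreases.

```python
import math

def perturb(x, y, z, eps):
    """Apply (3.1)-(3.3) with t = t_+, written in the rationalized form (3.8)."""
    p = y - z - eps
    r = x - z - eps
    t = 2.0 * r / (p + math.sqrt(p * p + 4.0 * eps * r))
    return x - eps, y + t * eps, z + (1.0 - t) * eps

# Two hypothetical points: one with y > z and one with y == z.
cases = [(0.6, 0.3, 0.1, 1e-2), (0.6, 0.2, 0.2, 1e-4)]
results = []
for x, y, z, eps in cases:
    xp, yp, zp = perturb(x, y, z, eps)
    results.append(((x, y, z), (xp, yp, zp)))
    print((x, y, z), "->", (round(xp, 6), round(yp, 6), round(zp, 6)))
```

In the second case the change in the two smaller coordinates is visibly larger than the step applied to the largest one, in line with the $O(\sqrt{\epsilon})$ estimate for $y = z$.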

Using this lemma, we can prove the following theorem.

###### Theorem 3.2.

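Let $a = \mathrm{Tr}(A^{-1})$ and $b = \mathrm{Tr}(A^{-2})$ be fixed and $q$ be an integer satisfying $a^2/b - 1 \le q < a^2/b$. Then, $\lambda_m$ takes a maximum when $\lambda_1 = \cdots = \lambda_{m-q-1} = \infty$ and $\lambda_{m-q+1} = \cdots = \lambda_m$. The maximum is given as

$$\lambda_m^{*}(A) = \frac{q(q+1)}{aq + \sqrt{q \{ (q+1)b - a^2 \}}}. \tag{3.10}$$

###### Proof.

Let $x_k = 1/\lambda_k$ ($k = 1, 2, \ldots, m$). First, assume that there are two or more eigenvalues which are neither an infinite eigenvalue nor equal to $\lambda_m$. In this case, as we will show in the following, we can make $x_m$ smaller by adding appropriate perturbations. We divide the cases depending on the multiplicity $q$ of the smallest eigenvalue.

When $q = 1$, from the assumption, both $\lambda_{m-1}$ and $\lambda_{m-2}$ are neither an infinite eigenvalue nor equal to $\lambda_m$. Thus, we have $x_m > x_{m-1} \ge x_{m-2} > 0$. Then, by picking up these three variables and adding the same perturbations as in Lemma 3.1, we can make $x_m$ smaller while keeping the conditions $x_m \ge x_{m-1}$ and $x_{m-2} \ge 0$ (Fig. 2). Clearly, the values of $\sum_{k} x_k$ and $\sum_{k} x_k^2$ are unchanged by this perturbation. Hence, $x_m$ cannot take a minimum in this case.

Figure 2. The values of $x_m$, $x_{m-1}$ and $x_{m-2}$ before and after the perturbation.

When $q \ge 2$, $x_{m-q+1} > x_{m-q} \ge x_{m-q-1} > 0$ holds from the assumption. Then, by picking up the three variables $x_{m-q+1}$, $x_{m-q}$ and $x_{m-q-1}$ and adding the perturbations as in Lemma 3.1, we can make $x_{m-q+1}$ smaller while keeping $x_{m-q+1} \ge x_{m-q}$ and $x_{m-q-1} \ge 0$. This does not change the smallest eigenvalue, but reduces its multiplicity from $q$ to $q-1$ (Fig. 3). Moreover, the condition that there are two or more eigenvalues which are neither an infinite eigenvalue nor equal to $\lambda_m$ still holds. Hence, we can repeat this procedure and reduce $q$ to 1, while keeping the value of the smallest eigenvalue unchanged. But in this last situation, $x_m$ cannot take a minimum, as concluded in the analysis of the $q = 1$ case.

Figure 3. The values of $x_{m-q-1}, x_{m-q}, \ldots, x_m$ before and after the perturbation.

From the above analysis, we can conclude that $x_m$ cannot take a minimum when there are two or more eigenvalues which are neither an infinite eigenvalue nor equal to $\lambda_m$. Thus, the only possible case is when $x_1 = \cdots = x_{m-q-1} = 0$ and $x_{m-q+1} = \cdots = x_m$ holds for some $q$. In this case, we have

$$x_{m-q} + q x_m = a, \tag{3.11}$$

$$x_{m-q}^2 + q x_m^2 = b, \tag{3.12}$$

or

$$x_m^{\pm} = \frac{aq \pm \sqrt{q \{ (q+1)b - a^2 \}}}{q(q+1)}, \tag{3.13}$$

$$x_{m-q}^{\pm} = \frac{a \mp \sqrt{q \{ (q+1)b - a^2 \}}}{q+1}. \tag{3.14}$$

For $x_m^{\pm}$ and $x_{m-q}^{\pm}$ to be real, $q$ must satisfy $q \ge a^2/b - 1$. Then, for $x_m \ge x_{m-q}$ to hold, we have to choose $x_m^{+}$ and $x_{m-q}^{-}$. In addition, for $x_{m-q}^{-} > 0$ to hold, we must have $q < a^2/b$. From the condition $a^2/b - 1 \le q < a^2/b$, $q$ is determined uniquely. Hence, there is only one set of $q$, $x_m$ and $x_{m-q}$ that satisfies the condition for minimum $x_m$. Since the feasible region of $(x_1, \ldots, x_m)$, specified by $\sum_{k} x_k = a$, $\sum_{k} x_k^2 = b$ and $x_m \ge \cdots \ge x_1 \ge 0$, is compact, $x_m$ must have a minimum somewhere in this region. Accordingly, we conclude that $x_m$ takes a minimum when $x_1 = \cdots = x_{m-q-1} = 0$, $x_{m-q} = x_{m-q}^{-}$, and $x_{m-q+1} = \cdots = x_m = x_m^{+}$. Eq. (3.10) is obtained from $\lambda_m = 1/x_m^{+}$. ∎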
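The extremal configuration of the theorem can be computed in a few lines. The sketch below is a minimal illustration under the assumption that $q$ is the unique integer with $a^2/b - 1 \le q < a^2/b$; the function name and the values of $a$ and $b$ are hypothetical. It returns $q$ together with the reciprocal eigenvalues from (3.13) and (3.14) and checks them against (3.11) and (3.12).

```python
import math

def extremal_configuration(a, b):
    """Return q, x_{m-q}^- and x_m^+ from (3.13)-(3.14) for given a and b."""
    # Assumption of this sketch: q is the integer with a^2/b - 1 <= q < a^2/b.
    q = math.ceil(a * a / b) - 1
    s = math.sqrt(max(q * ((q + 1) * b - a * a), 0.0))  # guard against rounding
    x_m = (a * q + s) / (q * (q + 1))
    x_mq = (a - s) / (q + 1)
    return q, x_mq, x_m

a, b = 3.0, 4.0  # hypothetical traces with a^2/b = 2.25
q, x_mq, x_m = extremal_configuration(a, b)
lam_max = 1.0 / x_m  # largest possible smallest eigenvalue for these traces
print("q =", q, " x_{m-q} =", x_mq, " x_m =", x_m, " lambda_m* =", lam_max)
```

With these values the smallest eigenvalue has multiplicity two at the extremum, one further eigenvalue is finite, and any remaining eigenvalues are infinite.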

To measure the gap between the Laguerre bound and the smallest eigenvalue, we use the quantity $L_L(A)/\lambda_m(A)$, which becomes one when there is no gap and approaches zero as the gap grows. Let $\alpha = \mathrm{Tr}(A^{-2})/\{\mathrm{Tr}(A^{-1})\}^2 = b/a^2$ and $q$ be the integer specified in Theorem 3.2. Then, we have from Eqs. (3.10) and (2.10),

$$\frac{L_L(A)}{\lambda_m^{*}(A)} = \frac{m}{q(q+1)} \cdot \frac{q + \sqrt{q \{ (q+1)\alpha - 1 \}}}{1 + \sqrt{(m-1)(m\alpha - 1)}}. \tag{3.15}$$

Thus, we have obtained an expression for the maximum possible gap as a function of $m$ and $\alpha$ (note that $q$ is determined from $\alpha$ uniquely).

So far, we have allowed infinite eigenvalues. However, of course, actual matrices have only finite eigenvalues. Accordingly, except for the case of $q = m-1$, for which no infinite eigenvalues are required for $\lambda_m$ to take a maximum, the right-hand side of (3.15) is a lower bound that can be approached arbitrarily closely.

Finally, we investigate the behavior of the right-hand side of (3.15) as a function of $\alpha$. Note that $1/m \le \alpha < 1$ from (2.25) and (2.27). We consider three extreme cases.

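• When $1/m \le \alpha < 1/(m-1)$, we have $q = m-1$ and therefore

$$\frac{L_L(A)}{\lambda_m^{*}(A)} = \frac{1}{m-1} \cdot \left\{ 1 + \frac{m-2}{1 + \sqrt{(m-1)(m\alpha - 1)}} \right\}. \tag{3.16}$$

This is a decreasing function in $\alpha$ and takes the maximum value 1 at $\alpha = 1/m$ and approaches $m/\{2(m-1)\}$ as $\alpha \to 1/(m-1)$. Hence $L_L(A)/\lambda_m^{*}(A) > 1/2$ all over the region.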

• When , we have and therefore .

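• When $1/2 \le \alpha < 1$, we have $q = 1$ and therefore

$$\frac{L_L(A)}{\lambda_m^{*}(A)} = \frac{m}{2} \cdot \frac{1 + \sqrt{2\alpha - 1}}{1 + \sqrt{(m-1)(m\alpha - 1)}}. \tag{3.17}$$

For $m \ge 3$, this is an increasing function in $\alpha$ that takes the minimum value

$$\frac{1}{\sqrt{2}} \cdot \frac{1}{\dfrac{\sqrt{2}}{m} + \sqrt{\left( 1 - \dfrac{1}{m} \right) \left( 1 - \dfrac{2}{m} \right)}} \tag{3.18}$$

at $\alpha = 1/2$ and approaches 1 as $\alpha \to 1$. Thus, when $m$ is large, $L_L(A)/\lambda_m^{*}(A)$ is larger than approximately $1/\sqrt{2}$ all over the region.

In summary, we can conclude that the Laguerre bound is fairly tight when $\alpha$ is close to $1/m$ or greater than $1/2$ and can be loose when $\alpha$ is in the intermediate region.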
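This behavior can be tabulated directly from (3.15). The following Python sketch evaluates the right-hand side of (3.15) for a hypothetical dimension $m = 100$ at a few sample values of $\alpha$. The function name `gap_ratio` is an assumption, as is the convention that $q$ is the integer with $1/\alpha - 1 \le q < 1/\alpha$; the clamp on $q$ only protects against floating-point rounding at the ends of the range.

```python
import math

def gap_ratio(m, alpha):
    """Right-hand side of (3.15) for 1/m <= alpha < 1."""
    # Assumed convention: q is the integer with 1/alpha - 1 <= q < 1/alpha.
    q = min(max(math.ceil(1.0 / alpha) - 1, 1), m - 1)
    num = q + math.sqrt(max(q * ((q + 1) * alpha - 1.0), 0.0))
    den = 1.0 + math.sqrt(max((m - 1) * (m * alpha - 1.0), 0.0))
    return m / (q * (q + 1)) * num / den

m = 100  # hypothetical dimension
alphas = [1.0 / m, 0.05, 0.5, 0.99]
ratios = [gap_ratio(m, alpha) for alpha in alphas]
for alpha, ratio in zip(alphas, ratios):
    print(f"alpha = {alpha:5.2f}   ratio = {ratio:.4f}")
```

For this choice of $m$ the ratio is essentially one at both ends of the range, stays above $1/\sqrt{2}$ at $\alpha = 1/2$, and drops well below one half at the intermediate value $\alpha = 0.05$.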

In Fig. 4, we plot the smallest eigenvalues of randomly generated $5 \times 5$ symmetric positive definite matrices. These matrices are normalized so that $\mathrm{Tr}(A^{-1}) = 1$ and the horizontal axis is $\alpha$. The Laguerre bound (2.10) and the upper bound (3.10) on the smallest eigenvalue are also shown in the graph. From the graph, we can confirm the optimality of the Laguerre bound, since it actually constitutes the lower boundary of the region where the smallest eigenvalues exist. We also see that the upper boundary is given by (3.10). Finally, it is clear that the Laguerre bound is tight when $\alpha$ is close to $1/m$ or $1$ and loose in the intermediate region.

Figure 4. The smallest eigenvalues of randomly generated 5×5 symmetric positive definite matrices as a function of α.
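An experiment of this kind can be imitated without plotting. Because both bounds depend only on the spectrum, the sketch below samples eigenvalues directly instead of generating matrices; the log-uniform sampling, the normalization to unit trace of the inverse, the convention used for $q$ and all names are assumptions of this sketch, since the text does not specify how the matrices were generated. It checks that every sampled smallest eigenvalue lies between the Laguerre bound and the upper bound (3.10).

```python
import math
import random

def normalized_bounds(eigs):
    """Scale the spectrum so that Tr(A^-1) = 1 and return (alpha, lower, lambda_m, upper)."""
    m = len(eigs)
    a = sum(1.0 / e for e in eigs)
    scaled = [e * a for e in eigs]              # now sum(1 / scaled) equals 1
    alpha = sum(1.0 / e ** 2 for e in scaled)    # Tr(A^-2), equal to alpha because a = 1
    lower = m / (1.0 + math.sqrt(max((m - 1) * (m * alpha - 1.0), 0.0)))  # (2.10)
    # Assumed convention: q is the integer with 1/alpha - 1 <= q < 1/alpha.
    q = min(max(math.ceil(1.0 / alpha) - 1, 1), m - 1)
    s = math.sqrt(max(q * ((q + 1) * alpha - 1.0), 0.0))
    upper = q * (q + 1) / (q + s)                # (3.10) with a = 1
    return alpha, lower, min(scaled), upper

random.seed(1)
samples = [normalized_bounds([math.exp(random.uniform(-3.0, 3.0)) for _ in range(5)])
           for _ in range(2000)]
violations = sum(1 for _, lo, lam, up in samples
                 if lam < lo - 1e-9 or lam > up + 1e-9)
print("samples:", len(samples), "violations:", violations)
```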

## 4. Conclusion

In this paper, we investigated the properties of lower bounds on the smallest eigenvalue of a symmetric positive definite matrix $A$ computed from $\mathrm{Tr}(A^{-1})$ and $\mathrm{Tr}(A^{-2})$. We studied two problems, namely, finding the optimal bound and evaluating its sharpness. As for the first question, we found that the Laguerre bound is the optimal one in terms of sharpness. As for the second question, we characterized the situation in which the gap becomes largest in terms of the eigenvalue distribution of $A$. Furthermore, we showed that the gap becomes smallest when $\mathrm{Tr}(A^{-2})/\{\mathrm{Tr}(A^{-1})\}^2$ approaches 1 or $1/m$. These results will help in designing efficient shift strategies for singular value computation methods such as the dqds algorithm and the mdLVs algorithm.

#### Acknowledgment

The author would like to thank Prof. Kinji Kimura, Dr. Yuji Nakatsukasa and Dr. Takumi Yamashita for fruitful discussion.

## References

• [1] K. Aishima, T. Matsuo, K. Murota and M. Sugihara: A survey on convergence theorems of the dqds algorithm for computing singular values. Journal of Math-for-Industry 2 (2010), 1–11.
• [2] G. Alefeld: On the convergence of Halley's method. American Mathematical Monthly 88 (1981), 530–536.
• [3] K. V. Fernando and B. N. Parlett: Accurate singular values and differential qd algorithms. Numer. Math. 67 (1994), 191–229.
• [4] G. H. Golub and C. F. van Loan: Matrix Computations, 4th Ed. Johns Hopkins Univ. Press, 2012.
• [5] A. Householder: The Numerical Treatment of a Single Nonlinear Equation. McGraw-Hill, 1970.
• [6] M. Iwasaki and Y. Nakamura: Accurate computation of singular values in terms of shifted integrable schemes. Japan J. Indust. Appl. Math. 23 (2006), 239–259.
• [7] C. R. Johnson: A Gersgorin-type lower bound for the smallest singular value. Linear Algebra Appl. 112 (1989), 1–7.
• [8] C. R. Johnson and T. Szulc: Further lower bounds for the smallest singular value. Linear Algebra Appl. 272 (1998), 169–179.
• [9] K. Kimura, T. Yamashita and Y. Nakamura: Conserved quantities of the discrete finite Toda equation and lower bounds of the minimal singular value of upper bidiagonal matrices. J. Phys. A: Math. Theor. 44 (2011), 285207 (12pp.).
• [10] U. von Matt: The orthogonal qd-algorithm. SIAM J. Sci. Comput. 18 (1997), 1163–1186.
• [11] T. Yamashita, K. Kimura and Y. Nakamura: Subtraction-free recurrence relations for lower bounds of the minimal singular value of an upper bidiagonal matrix. Journal of Math-for-Industry 4 (2012), 55–71.
• [12] T. Yamashita, K. Kimura, M. Takata and Y. Nakamura: An application of the Kato-Temple inequality on matrix eigenvalues to the dqds algorithm for singular values. JSIAM Lett. 5 (2013), 21–24.
• [13] T. Yamashita, K. Kimura and Y. Yamamoto: A new subtraction-free formula for lower bounds of the minimal singular value of an upper bidiagonal matrix. Numer. Algor. 69 (2015), 893–912.
• [14] J. H. Wilkinson: The Algebraic Eigenvalue Problem, Revised edition. Oxford University Press, 1988.