# Pole-swapping algorithms for alternating and palindromic eigenvalue problems

Pole-swapping algorithms are generalizations of bulge-chasing algorithms for the generalized eigenvalue problem. Structure-preserving pole-swapping algorithms for the palindromic and alternating eigenvalue problems, which arise in control theory, are derived. A refinement step that guarantees backward stability of the algorithms is included. This refinement can also be applied to bulge-chasing algorithms that had been introduced previously, thereby guaranteeing their backward stability in all cases.

## Authors

• 5 publications
• 5 publications
• 2 publications
08/07/2020

### Solving two-parameter eigenvalue problems using an alternating method

We present a new approach to compute selected eigenvalues and eigenvecto...
06/20/2019

### On pole-swapping algorithms for the eigenvalue problem

Pole-swapping algorithms, which are generalizations of the QZ algorithm ...
07/01/2019

### Nonlinearization of two-parameter eigenvalue problems

We investigate a technique to transform a linear two-parameter eigenvalu...
07/01/2019

### Nonlinearizing two-parameter eigenvalue problems

We investigate a technique to transform a linear two-parameter eigenvalu...
05/04/2021

### Backward Stability of Explicit External Deflation for the Symmetric Eigenvalue Problem

A thorough backward stability analysis of Hotelling's deflation, an expl...
01/03/2020

### Eigenvalue embedding of undamped piezoelectric structure system with no-spillover

In this paper, the eigenvalue embedding problem of the undamped piezoele...
09/06/2021

### Broken-FEEC approximations of Hodge Laplace problems

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Francis’s implicitly-shifted QR algorithm Fra61b ; Wat11 is still the standard tool for computing the eigenvalues of a small to medium-sized non-Hermitian matrix . Eigenvalue problems often arise naturally as generalized eigenvalue problems for a pencil , and for these the Moler-Stewart variant MolSte73 of Francis’s algorithm, commonly called the QZ algorithm, is used. The Francis and Moler-Stewart algorithms are prime examples of what we call bulge-chasing algorithms.

In recent years some generalizations of the QZ algorithm have been proposed, e.g. VanWat12g and, more generally, the rational QZ (RQZ) algorithm of Camps, Meerbergen, and Vandebril CaMeVa19a , which is the first example of what we call a pole-swapping algorithm. This work has been extended in various directions in Cam19 ; CaMaVaWa19 ; CaMeVa19b .

In this paper we extend in another direction, introducing structure-preserving pole-swapping algorithms for two classes of structured generalized eigenvalue problems that arise in optimal control theory Meh91 , namely palindromic and alternating eigenvalue problems. Kressner, Schröder, and Watkins KrScWa08 proposed structure-preserving bulge-chasing (QZ-like) algorithms for palindromic and alternating eigenvalue problems. We show that our structured pole-swapping algorithms are generalizations of these bulge-chasing algorithms. Our algorithms include a refinement step, which can also be incorporated into the algorithm of KrScWa08 , to ensure backward stability.

## 2 Basic definitions and facts

In this paper we will refer sometimes to a pencil and other times to a pair . Either way, we are talking about the same object. The pair is called palindromic (or -palindromic) if and are related by

 A∗=B. (1)

The pair is alternating (or even) if

 A∗=AandB∗=−B. (2)

These two structures are equivalent in principle, since the generalized Cayley transform maps a palindromic pair to an alternating pair and vice versa, since the underlying Möbius transformation is its own inverse.

Each of these structures exhibits a spectral symmetry. In the palindromic case, is an eigenvalue if and only if is. This is shown by writing down the equation , taking the conjugate transpose, and applying (1). Clearly if and only if is on the unit circle. Eigenvalues on the unit circle need not occur in pairs, but those off the unit circle must appear in pairs, one inside and one outside the unit circle.

In the alternating case, is an eigenvalue if and only if is, as can be shown in the same way as for the palindromic case. Since if and only if is on the imaginary axis, eigenvalues on the imaginary axis need not occur in pairs, but those off of the imaginary axis must occur in pairs, one on each side of the imaginary axis.

The spectral symmetry in alternating pencils is exactly the same as that possessed by Hamiltonian matrices. Recall that a matrix is called Hamiltonian if is Hermitian, where

 J=[Im−Im].

It is well known Meh91 that the continuous-time linear-quadratic control problem can be solved by computing the eigensystem of a Hamiltonian matrix. One can equally well formulate the problem as an eigenvalue problem for an alternating pencil. In fact, the Hamiltonian eigenvalue problem is clearly equivalent to the alternating eigenvalue problem

. Hamiltonian eigenvalue problems are also sometimes studied in the guise of Hamiltonian/skew-Hamiltonian pencils, but note that the Hamiltonian/skew-Hamiltonian pencil

is equivalent to the alternating pencil . Alternating pencils are discussed in various guises and contexts in MehWat01 ; MehWat02 ; ApMeWa02 ; MaMaMeMe06a ; MaMaMeMe06b , for example.

The spectral symmetry in palindromic pencils is the same as that possessed by symplectic matrices. The discrete-time linear-quadratic optimal control problem can be solved by computing the eigensystem of a symplectic matrix or pencil Meh91 , which can also be formulated as a palindromic eigenvalue problem MaMaMeMe06b ; MaMaMeMe09 ; KrScWa08 .

In KrScWa08 we considered pairs for which both and have anti-Hessenberg form:

shown here in the case . We will consider the same form here. We could study instead the equivalent pair , where is the flip or anti-identity matrix. and are both upper Hessenberg, but the anti-Hessenberg form is more convenient for the special structures that we are considering here.

Following Camps, Meerbergen, and Vandebril CaMeVa19a we associate poles with the anti-Hessenberg pair . For , …, the pole in position is the ratio . We assume that for each , , since otherwise it would be possible to reduce the problem immediately to two or more smaller eigenvalue problems. Thus every is well defined (but might equal ).

In the two structured cases that we are considering, the poles exhibit the same symmetry as the eigenvalues do. In the palindromic case we have , and it follows that

 σk=1/¯¯¯σn−k,k=1,…,n−1.

In the case of even , there is one unpaired pole , which must satisfy . In the alternating case we have

 σk=−¯¯¯σn−k,k=1,…,n−1.

If is even, there is one unpaired pole , which must lie on the imaginary axis.

## 3 Operations on anti-Hessenberg pairs

Introducing terminology that we have used in some of our recent work AuMaRoVaWa18 ; AuMaRoVaWa18g ; AuMaRoVaWa19 ; AuMaVaWa15 , we define a core transformation (or core

for short) to be a unitary matrix that acts only on two adjacent rows/columns, for example,

 Q3=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣11∗∗∗∗1⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦,

where the four asterisks form a unitary matrix. Givens rotations are examples of core transformations. Our core transformations always have subscripts that tell where the action is: acts on rows/columns and .

In CaMeVa19a two types of operations on upper Hessenberg pencils were introduced. These are unitary equivalence transformations, which we called moves of types I and II in CaMaVaWa19 . Obviously we can do the same sorts of moves on anti-Hessenberg pencils, but if we want to preserve the palindromic or alternating structure, we need to use special unitary equivalence transformations, namely congruences .

### Move of Type I

This move replaces the pole (located at position ) by any other value . At the same time, since the symmetry must be preserved, the pole is changed appropriately. In the palindromic case, , and it gets changed to . In the alternating case, , and it gets changed to .

To see how this is done, we at first ignore the need to preserve structure and consider how we would insert a pole at position

. Because of the anti-Hessenberg form, the vector

consists of zeros, except for the last two entries.111Here and in what follows, the notation is shorthand for , where and are any scalars satisfying . This allows us to include the case by taking . Therefore a core transformation , acting on rows and , can be constructed so that zeros out the entry in position , that is,

 Q∗n−1(A−ρB)e1=γen

for some nonzero . Now let . This new pencil has the pole in position , as desired, since . This is exactly the move of type I described in CaMaVaWa19 and earlier in CaMeVa19a , turned over for the anti-Hessenberg case.

But this transformation does not preserve the symmetry of the pencil. What is needed is a congruence: . The right multiplication by does not affect what happens in the lower left-hand corner of the pencil, so the pole in position is , as shown above. But the right multiplication by does affect the pole at position , changing it (by symmetry) to in the palindromic case and in the alternating case.

### Move of Type II

This move swaps two adjacent poles. If we delete the th row and column from the anti-Hessenberg pencil , we get an anti-triangular pencil , which we call the pole pencil because its eigenvalues are exactly the poles of . Swapping two adjacent poles in is the same as swapping two eigenvalues in , and there are well-known procedures for doing this BaiDem93 , CaMaVaWa19 , VanD81 , (Wat07, , §§ 4.8, 6.6).

Suppose we want to swap two adjacent poles and located at positions and . Temporarily ignoring symmetry, this can be accomplished by a transformation . All of the action is in the subpencil

 [0an−k,kan−k+1,k−1an−k+1,k]−λ[0bn−k,kbn−k+1,k−1bn−k+1,k]. (3)

This is the principal subpencil of the bulge pencil that contains the two poles and .

In order to preserve the symmetry we will also have to swap the poles and in positions and . The appropriate transformation is, by symmetry, . This can be done provided the two transformations do not interfere with each other, i.e. and act on non-overlapping columns. This is the case if or , that is, or . The total transformation is , where .

### A hint at where we are heading

In CaMeVa19a ; CaMaVaWa19 it was shown how moves of types I and II can be used to build algorithms for computing eigenvalues of an upper Hessenberg pencil. In the simplest case a shift is chosen, and a move of type I is used to insert it as a pole at the top of the pencil. Then a sequence of moves of type II is used to exchange the pole downward until it gets to the bottom of the pencil. Then it is removed from the bottom (replaced by some new pole ) by a move of type I. This procedure can be shown CaMeVa19a to be a generalization of (one iteration of) the QZ algorithm on a Hessenberg-triangular pencil.

In our current scenario the matrices are anti-Hessenberg, not Hessenberg, but that is a trivial difference. More importantly, in the moves described here, everything is doubled up for preservation of structure. If we introduce a pole at one end of the pencil, we must simultaneously introduce a pole ( or , depending on the structure) at the other end. Now if we want to move from one end of the pencil to the other by moves of type II, we must simultaneously move in the opposite direction. There comes a point in the middle where we have to swap and , so that they can continue their journey to the opposite end of the pencil. This requires a special symmetric version of the move of type II.

### Move of Type IIo

Suppose the dimension

is odd. Then there is an even number of poles

, …, . The two poles in the middle are and , where . These are the two eigenvalues of (3) with . In the interest of simplicity and non-proliferation of notation, we rename this subpencil

 A−λB=[α1α2α21]−λ[β1β2β21].

We are temporarily re-assigning the symbols and to stand for the submatrices that are our current focus. This little pole pencil has the same structure as the original pencil, either palindromic or alternating, but we will ignore the structure at first. The eigenvalues are and , and we would like to swap them. If we flip the rows and columns we get

 FAF−λFBF=[α21α2α1]−λ[β21β2β1],

which has the poles in desired positions but the “wrong” triangularity. We have to fix this. We find it convenient to work with the partially flipped form

 AF−λBF=[α1α21α2]−λ[β1β21β2],

which is easier to study because the matrices are triangular, not anti-triangular.

Next we set up and solve a Sylvester equation to diagonalize the pencil. Specifically, we will find unit lower triangular matrices

 X=[1x1]andY=[1y1]

such that

 (AF)X=Y(ˇAF)and(BF)X=Y(ˇBF), (4)

where and are anti-diagonal matrices with the same anti-diagonals as and , respectively. Writing the first of these equations out in detail, we have

 [α1α21α2][1x1]=[1y1][α1α2],

and similarly for the equation. This is a system of two linear equations in two unknowns

 α21+α2x=yα1,β21+β2x=yβ1

or

 [α1α2β1β2][−y−x]=[α21β21].

This has a unique solution if and only if , i.e. the poles are distinct.

Thus, assuming distinct poles, (4) has a unique solution, which we can easily and stably compute. Rewriting (4) we have the equivalence

 FY−1(A−λB)(FX)=F(ˇA−λˇB)F=[α2α1]−λ[β2β1], (5)

which (still) has the poles in the desired locations, but this equivalence is not unitary. To make a unitary equivalence we will introduce some QR decompositions. Since we are going to work with triangular matrices, we again remove the

from the left-hand side to obtain

 Y−1(A−λB)(FX)=(ˇA−λˇB)F=[α1α2]−λ[β1β2].

Let

 FX=QRandY=PS,

where and are unitary, and and are upper triangular with positive entries on the main diagonal. Then

 P∗(A−λB)Q=S(ˇAF−λˇBF)R−1=TA−λTB,

where and are upper triangular:

 TA=S(ˇAF)R−1=[s11α1r−111∗s22α2r−122],
 TB=S(ˇBF)R−1=[s11β1r−111∗s22β2r−122].

Since we want anti-triangular matrices, we now restore the on the left to obtain

 FP∗(A−λB)Q=[s22α2r−122s11α1r−111∗]−λ[ccs22β2r−122s11β1r−111∗]. (6)

This unitary equivalence gives the right anti-triangular form with the poles and in the desired order.

So far we have assumed no special relationship between and . To finish the story we must show that in our two structured cases (6) is a congruence, that is . In preparation for this we go back to (4) and take complex conjugates of both equations to obtain

 X∗FA∗=FˇA∗Y∗,X∗FB∗=FˇB∗Y∗.

Further simple manipulations yield

 (A∗F)(FY−∗F)=(FX−∗F)(ˇA∗F),(B∗F)(FY−∗F)=(FX−∗F)(ˇB∗F). (7)

Now consider what these equations look like in the alternating case , (which also implies and ). Clearly we have

 (AF)(FY−∗F)=(FX−∗F)(ˇAF),(BF)(FY−∗F)=(FX−∗F)(ˇBF). (8)

Now consider the palindromic case . Inserting this equation into (7), we again get (8), just as in the alternating case. Thus, in either case, (8) holds. Noting that and are both unit lower triangular, and comparing (8) with (4), we see that the pair is a solution of (4). By uniqueness of the solution we deduce that , and in particular . Writing this as and inserting the decompositions and , we obtain

 QR=PS−∗F=(PF)(FS−∗F).

By uniqueness of the QR decomposition,

 Q=PF.

Making this substitution into (6), we obtain

 Q∗(A−λB)Q=FTA−λFTB=^A−λ^B,

where

 ^A=[α2s22r−122α1s11r−111∗]and^B=[β2s22r−122β1s11r−111∗].

Of course, this computation has some redundancy due to the symmetry. In terms of the original (big) matrices and , the congruence is , where is a core transformation built from the unitary matrix .

We emphasize that this procedure succeeds if the two poles are distinct. Of course there is nothing to be gained by interchanging two poles that are equal. In our application below we will use a move of this type to interchange two shifts and . In the palindromic (resp. alternating) case, (resp. ), and the equation just means that does not lie on the unit circle (resp. imaginary axis).

### Refinement of a move of type IIo

After the move of type IIo, the resulting pole pencil has the form

in principle. In practice the numbers that are supposed to be zero will not be exactly zero because of roundoff errors in the computation. If those numbers are small enough (a modest multiple of the unit roundoff), they can simply be set to zero without compromising stability. If, on the other hand, they are not small enough, a correction step can make them smaller. Now let’s rewrite the pencil to take the nonzero entries into account. At the same time we will recycle the notation, leaving off the hats for simplicity. We have

 A−λB=[ϵα2α1α12]−λ[ηβ2β1β12],

where and are tiny but not small enough to be set to zero.

The correction step is explained in detail, and in greater generality, in Section 6. Here we provide a brief description. We look for tiny corrections and such that

 [1y1][ϵα2α1α12][1x1]=[0ˇα2ˇα1ˇα12],

and similarly for . This yields a pair of equations

 ϵ+yα1+α2x+yα12x=0,η+yβ1+β2x+yβ12x=0,

which can be simplified by deleting the insignificant quadratic terms to yield the linear equations

 [α1α2β1β2][yx]=−[ϵη].

Since the poles and are distinct, this system has a unique solution. It is easy to show that in both the alternating and the palindromic cases, , so the contemplated transformation is in fact a congruence. To make a unitary congruence we do a QR decomposition

 [1x1]=QR.

Because is tiny,

is close to the identity matrix. The orthogonal congruence by

is the desired correction:

where and are now normally small enough to be set to zero. In the unlikely event that they are not, the correction step can be repeated.

### Move of Type IIe

Now consider the case when is even. There is an odd number of poles , …, . The three poles in the middle are , , and , where . Recall that the pole is a special unpaired pole, while and are linked by the structure (palindromic or alternating). We need a move that swaps with , while leaving where it is.

All of the action takes place in a subpencil

Again we borrow the symbols and to denote the relevant submatrices. The subpencil inherits the palindromic or alternating structure of the big pencil, but we will proceed at first as if and were unrelated. The poles are , , and . We want to make a unitary congruence transformation that swaps the positions of and while leaving in the middle. We proceed just as in the case of a swap. We flip the rows and columns to get

 FAF−λFBF=⎡⎢⎣α31α32α3α21α2α1⎤⎥⎦−λ⎡⎢⎣β31β32β3β21β2β1⎤⎥⎦,

which has the poles in the desired locations but the “wrong” triangularity. As in the case, we find it convenient to work with partially flipped forms

Next we set up and solve some Sylvester equations to diagonalize the pencil. Specifically, we will find unit lower triangular

 X=⎡⎢⎣1x211x31x321⎤⎥⎦andY=⎡⎢⎣1y211y31y321⎤⎥⎦

such that

 (AF)X=Y(ˇAF)and(BF)X=Y(ˇBF), (9)

where and are anti-diagonal matrices with the same anti-diagonals as and , respectively. This is the analog of (4). Writing the first of these equations out in detail, we have

 ⎡⎢⎣α1α21α2α31α32α3⎤⎥⎦⎡⎢⎣1x211x31x321⎤⎥⎦=⎡⎢⎣1y211y31y321⎤⎥⎦⎡⎢⎣α1α2α3⎤⎥⎦,

and similarly for the equation. Altogether this is a system of six linear equations in the six unknowns , , , , , and , but fortunately it turns out to be three systems of two equations that are nearly independent of one another:

 [α1α2β1β2][−y21−x21]=[α21β21],

which has a unique solution if and only if ,

 [α2α3β2β3][−y32−x32]=[α32β32],

which has a unique solution if and only if , and

 [α1α3β1β3][−y31−x31]=[α31+α32x21β31+β32x21],

which has a unique solution if and only if . We conclude that there are unique unit lower triangular and such that (9) holds if and only if the poles are distinct. We can easily and stably compute and . Rewriting (9) we have the equivalence

 (YF)−1(A−λB)(FX)=F(ˇA−λˇB)F=⎡⎢⎣α3α2α1⎤⎥⎦−λ⎡⎢⎣β3β2β1⎤⎥⎦,

which is the analog of (5). From here on the argument is exactly the same as in the case, starting from (5), with obvious trivial modifications. In the end we get a congruence that makes the desired swap. In this case is not a core transformation, as its active part is .

We emphasize that this move succeeds if the three poles in question are all distinct. In our application below, the poles will be , , . In the palindromic case, is an unpaired pole that must lie on the unit circle, and . Thus if and only if and are not on the unit circle. Thus, if , then all three poles are automatically distinct. The same is true in the alternating case by a similar argument involving the imaginary axis instead of the unit circle.

### Refinement of a move of type IIe

We showed above that a move of type IIo can be refined if necessary, and the same is true of a move of type IIe. After the move of type IIe, the resulting pole pencil has the form

We have simplified the notation by leaving off the hats. The numbers and are tiny and would have been zero except for roundoff errors. If they are not small enough to be ignored, we must do a refinement step. The “upside down” notation used here reflects the fact that in the analysis above, we flipped the rows of the matrices. We could do the same thing here (flip the rows), but for this brief summary we will not bother. For details see Section 6.

For the correction step we look for matrices

 X=⎡⎢⎣1x211x31x321⎤⎥⎦andY=⎡⎢⎣1y12y131y231⎤⎥⎦

that set the tiny numbers to zero:

 Y(A−λB)X=⎡⎢⎣00ˇα30ˇα2ˇα23ˇα1ˇα12ˇα13⎤⎥⎦−λ⎡⎢ ⎢⎣00ˇβ30ˇβ2ˇβ23ˇβ1ˇβ12ˇβ13⎤⎥ ⎥⎦. (10)

Since the corrections to be made are tiny, we expect all of the the numbers and to be tiny. Writing out (10) in detail we get a system of six quadratic equations in six unknowns. The quadratic terms and a few others are negligible. Eliminating negligible terms we get six linear equations that have a unique solution as long as the poles are distinct. See Section 6 for details. In both alternating and palindromic cases, the transforming matrices and are related by , which simplifies the situation further. In either case, the equations are easily and stably solved.

Once has been computed, the decomposition supplies the needed unitary transforming matrix. The congruence yields the desired refinement.

###### Remark 1

The moves of types IIo and IIe are special cases of constructions presented in (KrScWa08, , §§4.1, 8.1), but we have taken a different approach here. In KrScWa08 the palindromic and alternating cases were considered separately. Here we have looked at the unstructured case first and shown how to do the swap using a unitary equivalence. Then we have shown that in both of our structured cases the equivalence is actually a congruence. We have also added a refinement step, which was not contemplated in KrScWa08 , to make the algorithm more robust.

## 4 Stability

We can construct a variety of algorithms from the moves. If each move is backward stable, then any algorithm built from moves must also be backward stable. We therefore take a moment to consider this question. Standard backward error analysis

Wil65 shows that moves of type I are backward stable. If the moves of type II are implemented as shown in CaMaVaWa19 , they never fail and are always backward stable.

Moves of types IIo and IIe require the solution of small linear systems that are nonsingular if and only if the poles involved in the swap are distinct. As we will see below, there are other good reasons (involving convergence rates) for keeping these poles distinct and preferably far apart. Assuming this is done, we can expect these moves to be stable. There is a natural stability test associated with these moves: check that the numbers that are supposed to be zero really are (almost) zero. For the event that they are not, we have described a refinement step that can be used to make them smaller. The refinement can be repeated if necessary, though this should be vary rare. Because of the refinement step, we can say for sure that moves of types IIo and IIe are backward stable, provided that we do not attempt to swap two equal poles.

## 5 Building an algorithm using the moves

First suppose our pair has odd dimension , and its poles are , , …, , , …, , , where , and (resp. ) in the palindromic (resp. alternating) case. One iteration of the most basic algorithm would proceed as follows. First a shift is chosen and inserted in place of by a move of type I. This move also inserts a shift in place of at the other end. A simple choice of would be the Rayleigh quotient shift . Then a move of type II is used to interchange with and with . Then another move of type II is used to interchange with , and so on. After such moves, the poles will be , …, , , , , …, , with and side by side in the middle. Then a move of type IIo can be used to swap them, provided that . Then additional moves of type II are used to push and further along, that is, is swapped with , then , and so on, while is swapped with , , etc. Once the shifts arrive at the edge of the pencil, they can be removed by a move of type I, which would replace them by new poles, which could be the original poles , , or they could be different. This completes one iteration of the basic algorithm.

The case of even dimension is the same, except that there is an extra unpaired pole in the middle. Type II operations push the shifts and toward the middle, as in the odd case, until the configuration of poles is , …, , , , …, . Then a move of type IIe is used to swap and while leaving fixed. This can be done provided that . Once this exchange has been made, the iteration is completed in the same way as in the odd case.

Repeated iterations of the basic algorithm with good choices of shifts and will cause the pencil to tend toward triangular form, exposing the eigenvalues on the anti-diagonal. Not all eigenvalues appear at once. Good shifts can (usually) cause and in just a few iterations; the convergence rate is typically quadratic. By symmetry we must also have and at the same rate. This exposes a pair of eigenvalues and at the ends. Then the problem can be deflated to size , and we can go after the next pair of eigenvalues, and so on. All of this is a consequence of the convergence theorem stated below. We will not present a detailed explanation because the arguments are the same as in the unstructured case.

### A convergence theorem

The mechanism that drives all variants of Francis’s algorithm, including the QZ algorithm, is nested subspace iteration with changes of coordinate system. See (Wat10, , p. 431), (Wat11, , p. 399), or (AuMaRoVaWa18, , Theorem 2.2.3). This is also true of our basic algorithm sketched above. We just need to take a few lines to set the scene. We make the (generically valid) assumption that none of the poles or shifts is exactly an eigenvalue of the pencil . We continue to cover the alternating and palindromic cases simultaneously. Each iteration begins with the choice of a shift and a companion shift ( in the palindromic case and in the alternating case). The result of the iteration is a new structured anti-Hessenberg pencil satisfying

 ^A−λ^B=Q∗(A−λB)Q. (11)

We need to define two nested sequences of subspaces. For , …, , define

 Ek={\rm span}{e1,…,ek},

where , …, are the standard basis vectors. Then define

 Qk=QEk,

the space spanned by the first columns of .

###### Theorem 5.1

A single step of the basic algorithm described above with shift effects nested subspace iterations

 Qk=(A−~ρB)−1(A−ρB)Ek,k=1,…,n−1.

The change of coordinate system (11) transforms back to .

This theorem makes no mention of convergence, but we call it a convergence theorem anyway. This result and ones like it can be used together with the convergence theory of subspace iteration to draw conclusions about the convergence of the algorithm, as explained in Wat07 ; Wat10 ; Wat11 and elsewhere.

###### Proof

We sketch the proof, relying on Theorem 5.2 of CaMaVaWa19 . That theorem applies to upper-Hessenberg pencils, so we can apply it to the flipped pencil . Notice that .

First suppose is odd and . According to (the “” part of) Theorem 5.2 of CaMaVaWa19 , the action on depends on the two moves of type II that take place at the “th position”, by which we mean the spot originally occupied by poles and . The basic algorithm inserts the shift , swapping it with , …, . The first swap at the th position is an exchange of with , which (see (CaMaVaWa19, , Theorem 5.2)) generates a factor . The only other swap at the th position happens when is swapped with , moving back to its original position. This introduces a factor . The action on is determined by the product of the factors, which is . Specifically, is transformed to . Since , we have , as claimed.

The case is the same, except that the order of the swaps is reversed. This does not change the outcome.

The case is different because this is the spot in the middle where only a single swap takes place, interchanging the shifts and . Using (CaMaVaWa19, , Theorem 5.2) again, we see that we just have a single factor , so the result is the same as in the other cases. This case can also be deduced directly from (CaMaVaWa19, , Theorem 4.3), which is a precursor of (CaMaVaWa19, , Theorem 5.2).

In the case of even , the argument is the same as in the odd case if or . The only question mark is in the cases and , which are affect by (and only by) the move of type IIe in the middle. Since this move, which affects three adjacent poles instead of two, is different from a standard move of type II, we must check what happens here. We leave it to the reader to verify that the proof of (CaMaVaWa19, , Theorem 4.3), which applies to moves of type II, is also valid for moves of type IIe. Once this has been checked, the proof is complete.

In our discussion of moves of type IIo and IIe we had to make the assumption in order to ensure that the moves are possible. Theorem 5.1 gives us another reason for this assumption: If , we have , and the iteration goes nowhere. The action of the shift traveling in one direction is exactly cancelled by the shift traveling in the opposite direction.

In the alternating case the requirement means that the shifts should not lie on the imaginary axis. It follows that this method will only be useful for finding eigenvalues that are not purely imaginary. This is not necessarily a weakness. For the continuous-time optimal control problems mentioned in the introduction Meh91 , mild assumptions guarantee that no eigenvalues are on the imaginary axis; exactly half are in the open left half plane and half are in the open right half plane. Our basic algorithm will have no problem computing all of these eigenvalues; they will be extracted in pairs.

In the palindromic case the requirement means that the shifts should not lie on the unit circle, so this method will only be useful for finding eigenvalues that have modulus different from 1. For the discrete-time control problems mentioned above, mild assumptions guarantee that no eigenvalues lie on the unit circle; exactly half are inside and half are outside. Our algorithm will easily compute all of these eigenvalues, and they will be extracted in pairs.

The bulge-chasing algorithm in KrScWa08 has the same () restriction. In fact we can show that our basic algorithm is a generalization of the algorithm in KrScWa08 .222In KrScWa08 we considered multi-shift bulge-chasing algorithms of arbitrary degree. Here we are considering only a single-shift algorithm, and this generalizes the single-shift version of the algorithm in KrScWa08 .

Let’s take a look at this algorithm, beginning with the palindromic case. The pencil is with assumed to be anti-Hessenberg, i.e.

The algorithm presented in KrScWa08 begins with a reduction step that introduces some zeros above the anti-diagonal, transforming to a partially triangular form

A bulge-chasing algorithm is then applied to this partially reduced form to expose the eigenvalues satisfying in pairs.

From our current vantage point we can see that the preliminary reduction step is just a process of introducing zero and infinite poles into the pencil by moves of types I and II. This step is necessary for the bulge-chasing algorithm in KrScWa08 , but it is not needed for the pole swapping algorithm discussed in this paper; we can go to work right away on the anti-Hessenberg pencil .

The algorithm in KrScWa08 for the alternating case is a little bit different. Starting with an alternating pencil in anti-Hessenberg form, this algorithm requires a preliminary step to transform to anti-triangular form, leaving anti-Hessenberg. Then a bulge-chasing algorithm is applied. The modified pencil has all poles equal to infinity. Again, from our new viewpoint, we can see that the preliminary reduction is nothing but a process of introducing infinite poles by moves of types I and II. This is necessary for the bulge-chasing algorithm in KrScWa08 , but it is not needed for our pole-swapping algorithm.

For a single-shift iteration of the algorithm in KrScWa08 we have a theorem just like Theorem 5.1. If we introduce a shift at one end, we automatically introduce a complementary shift ( or ) as always, at the other end. The setup is the same as for Theorem 5.1, and the new result looks like this:

###### Theorem 5.2

A single step of the algorithm in KrScWa08 with shift effects nested subspace iterations

 Qk=(A−~ρB)−1(A−ρB)Ek,k=1,…,n−1.

The change of coordinate system (11) transforms back to .

This is an immediate consequence of (Wat07, , Theorem 7.3.1). Note that Theorem 5.2 is identical to Theorem 5.1. The only difference is that the bulge-chasing algorithm requires a special Hessenberg-triangular form, as described immediately above. Since the action of the algorithms is the same, we deduce that the pole-swapping algorithm is a generalization of the bulge-chasing algorithm.

###### Remark 2

For clarity we have focused on the most basic possible pole-swapping algorithm for the palindromic and alternating problems. One can consider variants that introduce multiple shifts and draw the same conclusions. For more ideas see CaMaVaWa19 .

## 6 Justification of the refinement step

In Section 3 we briefly described refinement procedures that can be applied (occasionally) after moves of types IIo and IIe. Here we provide complete justifications for those procedures. The algorithm in KrScWa08 also has a move in the middle to which a refinement step could be applied. In that paper it was acknowledged that a failure might occasionally occur, but it was reported that in the course of the various tests, no failures were observed. However, since a failure might occur at any time, it would make sense to add the refinement step to the algorithm of KrScWa08 to make it more robust. In order to accommodate its use in the context of KrScWa08 , we have made our discussion of the refinement procedure more general than is strictly required for this paper.

First we consider the case of no middle pole(s), as in a move of type IIo. Suppose we have just swapped poles with other poles. After the swap we have a bulge pencil

 [E11A12A21A22]−λ[G11B12B21B22],

where the submatrices are . The matrices and would be zero if there were no roundoff errors, so and . We assume that the eigenvalues (the poles) of the subpencil are disjoint from those of . In the case of a move of type IIo we have , but we are now allowing larger to take into account the scenario of KrScWa08 .

If and are not small enough, we must do a refinement step. To this end we seek and such that

 [IYI][E11A12A21A22][IXI]=[0ˇA12ˇA21ˇA22],[IYI][G11B12B21B22][IXI]=[0ˇB12ˇB21ˇB22]. (12)

By straightforward computation we find that (12) holds if and only if the algebraic Riccati equations

 A12X+YA21+E11+YA22X=0B12X+YB21+G11+YB22X=0 (13)

hold. Since and are tiny, we expect that the corrections and will be tiny as well. Therefore the quadratic terms and in (13) should be negligible, and (13) should be well approximated by the Sylvester equations

 A12X+