The standard algorithm for computing the eigenvalues of a small to medium-sized non-Hermitian matrix is still Francis’s implicitly-shifted QR algorithm [15, 31]. In many applications eigenvalue problems arise naturally as generalized eigenvalue problems for a pencil , and for these problems the Moler-Stewart variant of Francis’s algorithm , commonly called the QZ algorithm, can be used. In this paper we may refer sometimes to a pencil and other times to a pair . Either way, we are talking about the same object.
A few years ago we published a generalization of the QZ algorithm . More recently an even more general algorithm, the rational QZ (RQZ) algorithm, was presented by Camps, Meerbergen, and Vandebril . This arose from the study of rational Arnoldi methods and is related to work of Berljafa and Güttel .
In this paper we discuss the RQZ algorithm and introduce several variants. We develop a new modular (and therefore more flexible) convergence theory that can be applied immediately to all variants.
A key component of the RQZ and related algorithms is a procedure that swaps two adjacent eigenvalues in a triangular matrix. We present an improved swapping routine and demonstrate its superiority by numerical experiments and a backward error analysis.
2 Hessenberg pairs
A matrix is in (upper) Hessenberg form if every entry below the first subdiagonal is zero. It is in proper Hessenberg form if every subdiagonal entry is nonzero, i.e. for , …, . A preliminary step for the algorithm is to reduce the pair to Hessenberg-triangular form. That is, is transformed by a unitary equivalence to a new pair for which is upper Hessenberg and is upper triangular. Notice that if is not properly Hessenberg, the eigenvalue problem can be split immediately into two or more independent subproblems. Thus we can always assume that we are dealing with a matrix in proper Hessenberg form.
In the new theory we deal with a more general class of Hessenberg pencils. The pair is called a Hessenberg pair if both and are Hessenberg matrices. If for some , we can immediately split the eigenvalue problem into two smaller problems. We therefore eliminate that case from further consideration. For reasons that will become apparent later, the ratios , , …, are called the poles of the Hessenberg pair . In the case , we have an infinite pole. The Hessenberg-triangular form is a special Hessenberg pair for which all of the poles are infinite.
Closely related to is the pole pencil obtained from by deleting the first row and last column. The pole pencil is upper triangular, and its eigenvalues are obviously the poles of .
Operations on Hessenberg pairs
for short) to be a unitary matrix that acts only on two adjacent rows/columns, for example,
where the four asterisks form a unitary matrix. Givens rotations are examples of core transformations. Our core transformations always have subscripts that tell where the action is: acts on rows/columns and .
Following  we introduce two types of operations, or moves, both of which manipulate the poles in the pencil. Let , …, denote the poles of the Hessenberg pair .
Changing a pole at the top or bottom. (Type I move)
We can change the pole to any value we want by applying a core transformation to the pencil on the left. Suppose we want to change to , say. Noting that only the first two entries of can be nonzero, we deduce that there is a such that the second entry of is zero. In other words, for some . If we then define and , then , which implies that . This means that is the new first pole of . The other poles remain fixed, as they are untouched by the transformation. (The core defined here is exactly the same as the transformation that is used to initiate an iteration of the standard QZ algorithm with shift .)
This operation fails only if , yielding . This happens exactly when the first columns of and are proportional. But this is not such a failure after all, as it exposes as an eigenvalue of the pencil and allows us to deflate to a smaller problem by deleting the first row and column.
In summary, if we want to replace pole by , we will either succeed in doing so or get a deflation of an eigenvalue of .
When we write something like here and elsewhere, this should be viewed as shorthand for where and are any scalars for which . As a practical matter this allows us to use modest sized and even when is very large, and in particular it allows us to implement the case by taking .
The pole at the bottom can also be replaced by any other pole, say , by a similar procedure. We want to transform the pencil to with
. Noting that the row vectorhas nonzero entries only in its last two positions, we see that there must be a core transformation that maps it to a multiple of , i.e. for some . This is the desired transformation, since it implies , which is equivalent to .
This fails only if , yielding . But again this is not really a failure at all, since it allows to be extracted as an eigenvalue and the problem to be deflated to a smaller one.
 The core transformation that replaces pole by satisfies
for some nonzero .
From our construction we have . Since is the first pole of the pair , we have for some nonzero . Therefore , where .
The insertion of the extra factor may seem mysterious. As we shall see later, this is just what is needed for a consistent convergence theory. In the product , the factor signals that the pole is entering the pencil, while the factor signals that the pole is leaving.
 The core transformation that replaces pole by satisfies
for some nonzero .
From our construction we have . Since is the last pole of the pair , we have for some nonzero . Therefore , where .
The arithmetic cost of a move of type I is just the cost of multiplying and by a single core transformation, or . If the cores are Givens rotations applied in the conventional way, the cost is about multiplications and additions, or flops. Different implementations could yield slightly different flop counts, but regardless of the details the cost will be .
Standard backward error analysis  shows that moves of type I are backward stable.
Interchanging two poles. (Type II move)
The second of the two allowed operations is to interchange two adjacent poles by a unitary equivalence . To understand this, consider the pole pencil obtained by discarding the first row and last column from . This pencil is upper triangular and has , …, as its eigenvalues. There are standard techniques , , [29, §§ 4.8, 6.6] for interchanging any two adjacent eigenvalues and . We will describe an improved method in Section 8. Each of these requires only an equivalence transformation by two core transformations and of dimension . We then enlarge these matrices by adjoining a row and column to the top of and the bottom of :
Then is the desired transformation.
If the swap is done as described by Van Dooren , the procedure always succeeds and is backward stable in the sense that the backward error incurred is a small multiple of , where is the unit roundoff. We have developed a new swapping procedure that has an improved backward error. To wit, the backward errors in and separately are small multiples of and , respectively. This makes a real difference in the performance of the algorithm in cases where and differ greatly in magnitude. In order not to interrupt the flow of the paper, we defer the description of the new swapping procedure to Section 8. Here and throughout the paper the norm symbol refers to either the vector 2-norm or matrix 2-norm, depending on the context.
The flop count for a move of type II is about the same as for a move of type I, namely if the core transformations
are implemented as Given rotations. In any event, the flop counts for moves of type I and II are about the same,
and each move costs
flops.111 This is the correct count for the case when only eigenvalues are being computed. If eigenvectors
or some deflating subspaces are wanted as well, the transforming matrices
This is the correct count for the case when only eigenvalues are being computed. If eigenvectors or some deflating subspaces are wanted as well, the transforming matricesand also need to be updated on each move. This adds about flops for a type I move and flops for a type II, but the total is still .
We have one type of move that is able to change a pole at one end or the other and another type that swaps poles in the middle. It is natural to ask whether we can devise a move that changes a pole in the middle. The answer seems to be no; at least we have the following result. Consider a transformation
where does not touch the first row and does not touch the last column. That is,
Under any such transformation the poles must remain invariant. This is so because the transformation (1) is equivalent to a transformation on the pole pencil. Since the poles of are the eigenvalues of the pole pencil, they must remain fixed.
Thus any transformation meant to change a pole must touch either the first row or the last column. That’s what the moves of type I do.
3 Building an Algorithm from the Pieces
Suppose we want to find the eigenvalues of some regular pair . As usual, there are two steps to the process. The first is a direct method that transforms to a condensed form, in our case a Hessenberg pencil. The second step is an iterative process that uncovers the eigenvalues of the condensed form.
First we need one more definition. A Hessenberg pair is called a proper Hessenberg pair if three conditions hold: (i) For all , , (ii) the first columns of and are not proportional, (iii) the last rows of and are not proportional. The first condition just says that for each , at least one of and is nonzero. If this condition is not satisfied, we can immediately reduce the pencil to two smaller pencils. If either of conditions (ii) and (iii) is not satisfied, we can also reduce the problem, as we know from the discussion of moves of type I. Therefore, we can always assume without loss of generality that we are working with a proper Hessenberg pair.
Reduction to a Hessenberg Pencil
Moler and Stewart  showed how to reduce to Hessenberg-triangular form by a direct method in flops. The reduction is also described in [16, 29, 30] and elsewhere. If the resulting pair is not proper, we can split it into smaller proper pairs, so let us assume it is proper. This is a Hessenberg pencil with all poles equal to . If the user is happy to start from this configuration, s/he can move directly to the iterative phase.
If the user wants to set certain prescribed poles , …, before beginning the iterations, that is also possible. One obvious procedure is to begin by introducing at the top of the pencil by a move of type I. Then can be swapped with each of the remaining infinite poles by moves of type II until it arrives at its desired position at the bottom. The total number of moves is . Then can be introduced at the top by a move of type I. It can then be swapped with each of the remaining infinite poles until it arrives at its desired position just above . The total number of moves for this step is . Then can be introduced, and so on. Eventually we get each of , …, into its desired position. The total number of moves for this phase is about , and the total flop count is .
One can equally well introduce the poles at the bottom and swap them upward, starting with , then , and so on. The amount of work is exactly the same, about moves. Better yet, one can take and introduce , …, (in reverse order) at the top and , …, at the bottom. This cuts the number of moves in half. However one does it, the cost is .
Camps, Meerbergen, and Vandebril  describe a procedure that introduces the poles during the reduction to Hessenberg form. They also present an example where a good choice of poles induces a deflation in the middle of the pencil.
The Iterative Phase (Basic Algorithm)
We suppose we have a proper Hessenberg pair with poles , …, . We now describe an iteration of the RQZ algorithm proposed in . We will call this the basic algorithm.
First a shift is chosen. Any of the usual shifting strategies can be employed here. The simplest is the Rayleigh-quotient shift . Then is introduced as a pole at the top of the pencil, replacing , by a move of type I. Next is swapped with by a move of type II. Then another move of type II is used to swap with , and so on. After moves of type II, arrives at the bottom of the pencil. The poles are now , …, , and . Finally a move of type I is used to remove the pole from the bottom, replacing it by a new pole . This completes the iteration. The user has complete flexibility in the choice of . One possibility is . Another, which might be called a Rayleigh-quotient pole, is .
The cost of one iteration of the basic algorithm is moves or flops. With any of the standard shifting strategies, e.g. Rayleigh-quotient shift, repeated iterations will normally cause rapid convergence of an eigenvalue at the bottom of the pencil. Typically and quadratically, leaving as an eigenvalue and allowing deflation of the problem. After deflations, all of the eigenvalues will have been found.
There are numerous variations on the basic algorithm. For example, it can be turned upside down. We can pick a shift, say , insert it at the bottom of the pencil, and chase it to the top. Since we can do this, then why not chase shifts in both directions at once? Some possibilities along these lines will be discussed in Section 6.
4 Convergence theory
In the convergence theorems in this paper we make the blanket (and generically valid) assumption that none of poles or shifts that are mentioned are eigenvalues of the pencil. We often find it convenient to assume that is nonsingular.
The mechanism that drives all variants of Francis’s algorithm is nested subspace iteration with changes of coordinate system [30, p. 431], [31, p. 399], [1, Thm 2.2.3]. As a specific example, let us consider a single step of the (single-shift) algorithm with shift applied to a Hessenberg-triangular pencil , yielding a new pencil with
First we define some nested sequences of subspaces. For , …, , define
where , …, are the standard basis vectors. Then define
Thus (resp. ) is the space spanned by the first columns of (resp. ).
A single step of the algorithm with shift effects nested subspace iterations
The change of coordinate system (2) transforms both and back to .
We call this a convergence theorem even though it makes no mention of convergence. Theorems like this can be used together with the convergence theory of subspace iteration to draw conclusions about the convergence of the algorithm, as explained in [29, 30, 31] and elsewhere.
Camps, Meerbergen, and Vandebril [13, Thm. 6.1] proved a result like Theorem 4 for the basic algorithm. The scenario is similar. The iteration begins with a proper Hessenberg pair with poles , …, , employs a shift , and ends with a new proper Hessenberg pair with poles , …, . The old and new pairs are related by a unitary equivalence transformation of the form (2).
A single step of the basic algorithm with shift , starting with a proper Hessenberg pair with poles , …, and ending with with poles , …, effects nested subspace iterations
The change of coordinate system (2) transforms both and back to .
This theorem was proved in , but we will also provide a proof based on our new theory later on. Comparing this with Theorem 4, we see that the inclusion of poles gives extra freedom that might be used to improve convergence.
Now consider Theorem 4 in the case when all of the poles are infinite. When , the operator becomes (when appropriately rescaled) . Similarly becomes . Noting that these operators are exactly the ones that appear in Theorem 4, we draw the following conclusion: The action of the basic algorithm in the case of infinite poles is exactly the same as that of the algorithm. This fact implies that the transforming matrices and for the two algorithms are essentially the same, as the reader can easily check.
It is therefore reasonable to view the basic algorithm (with arbitrary poles) as a generalization of the algorithm. However, there is an important difference in their implementation. The algorithm acts on proper Hessenberg-triangular pencils. It is a bulge-chasing algorithm. The initial equivalence transformation of each iteration creates a bulge in the Hessenberg-triangular form. The rest of the iteration consists of equivalence transformations that chase the bulge back and forth between and until it finally disappears off of the bottom of the pencil. At that point the Hessenberg-triangular form has been restored and the iteration is complete. The algorithm can also be implemented as a core-chasing algorithm, as is shown in  and , but the situation is the same: The Hessenberg-triangular form is disturbed at the beginning of the iteration and not restored until the very end.
Now let us contrast this with what happens in the basic algorithm (with infinite poles or otherwise). The basic algorithm operates on proper Hessenberg pairs, in which neither matrix is required to be triangular. Each iteration starts with a move of type I, performs a sequence of moves of type II, and ends with a move of type I. These moves do not disturb the Hessenberg form; it is preserved throughout. We don’t have to wait until the end of an iteration to get proper Hessenberg form. This implies that we can think of each move as a “mini iteration” and ask whether we can obtain a result like Theorem 4 or 4 for each individual move of type I or II. It turns out that we can.
Each move of either type is an equivalence transform of the form
The case denotes a move of type I, and we have . The case also denotes a type I move, and in this case . The cases , …, are of type II. Suppose has poles , …, . A move of type II interchanges poles and . For the moves of type I, in the case suppose the pole is replaced by a new pole , and in the case suppose is replaced by a new pole . With this notation we can cover both types of move by a single theorem.
As above we define sequences of nested subspaces and , where (resp. ) is the space spanned by the first columns of (resp. ). But note that, because and are core transformations, these spaces are mostly trivial in this setting: except when , and except when .
Using notation and terminology established directly above, the move
effects nested subspace iterations that are, however, mostly trivial. The nontrivial actions are
The change of coordinate system (3) transforms back to and back to .
The proof of Theorem 4 makes use of rational Krylov subspaces. Given and , the standard Krylov subspaces are defined by
Given an ordered set of poles , none in the spectrum of , the rational Krylov subspaces are defined by
and in general
Given a pair with nonsingular, we define rational Krylov subspaces
associated with the pair by
, …, . We have assumed for convenience that is nonsingular. See  for a definition of these spaces that does not require this assumption. We are using the symbol to denote several different types of Krylov subspaces. The meaning in each case is uniquely determined by the number and type of arguments.
We will make use of the following result, which is Theorem 5.6 in .
Let be a proper upper Hessenberg pair with poles . Let , as before. Then for , …, ,
Now consider . The transformation interchanges poles and , so the ordered pole set of is
Applying Proposition 4 to we have
Using and the abbreviation , we get
This last span is generated only from positive powers of , so the shifts are irrelevant. It is just the standard Krylov subspace . In particular, we can replace the shift by everywhere in the span without changing the space, so
Using Proposition 4 again, and noting that , we get the desired result .
In this argument we have assumed that exists. However, the result also holds for singular by a continuity argument.
Now consider the spaces . In the case we have . Substituting and solving for , we have . The ordered pole set for is , so for some nonzero . Similarly for some nonzero . Therefore . This proves that
For we have . Arguing just as we did for , we have
Using , and making the abbreviation , we have
obtained by noting that if and only if . We could equally well have derived the dual results first and then deduced Theorem 4. In that case we would use Proposition 2 to prove the case , and not use Proposition 2 at all. From Proposition 2 with we have immediately
the case of the dual result.
5 Using Theorem 4
In all of the convergence theorems of the previous section we have actions of the form and , where is a rational function, e.g. . In the following lemma the functions and can be any functions defined on the spectrum of the pencil , but in our applications they will always be rational. In this case, being defined on the spectrum of just means that none of the poles are eigenvalues.
Consider two successive changes of coordinate system
For , …, , if
where is the pointwise product of and . If