Bohemian Upper Hessenberg Matrices

09/27/2018 ∙ by Eunice Y. S. Chan, et al. ∙ Universidad de Alcalá ∙ Universidad de Cantabria ∙ Western University

We look at Bohemian matrices, specifically those with entries from {-1, 0, +1}. Furthermore, we specialize the matrices to be upper Hessenberg, with subdiagonal entries ±1. Many properties remain after these specializations, some of which surprised us. We find two recursive formulae for the characteristic polynomials of upper Hessenberg matrices. Focusing on only those matrices whose characteristic polynomials have maximal height allows us to explicitly identify these polynomials and give a lower bound on their height. This bound is exponential in the order of the matrix. We count stable matrices, normal matrices, and neutral matrices, and tabulate the results of our experiments. We prove a theorem about the only possible kinds of normal matrices amongst a specific family of Bohemian upper Hessenberg matrices.


1 Introduction

A matrix family is called Bohemian if its entries come from a fixed finite discrete (and hence bounded) set, usually integers. The name is a mnemonic for Bounded Height Matrix of Integers. Such populations arise in many applications (e.g. compressed sensing) and the properties of matrices selected “at random” from such families are of practical and mathematical interest. For example, Tao and Vu have shown that random matrices (more specifically, real symmetric random matrices in which the upper-triangular entries and the diagonal entries are independent) have simple spectrum [23]. An overview of some of our original interest in Bohemian matrices can be found in [16].

Bohemian families have been studied for a long time, although not under that name. For instance, Olga Taussky-Todd’s paper “Matrices of Rational Integers” [24] begins by saying

“This subject is very vast and very old. It includes all of the arithmetic theory of quadratic forms, as well as many of other classical subjects, such as latin squares and matrices with elements $+1$ or $-1$ which enter into Euler’s, Sylvester’s or Hadamard’s famous conjectures.”

The paper [19] by C. W. Gear is another instance. What is new here is the idea that these families are themselves interesting objects of study, and susceptible to brute-force computational experiments as well as to asymptotic analysis. These experiments have generated many conjectures, some of which we resolve in this paper. Others remain unsolved, and are listed on the Characteristic Polynomial Database [25]. Many of the conjectures have a number-theoretic or combinatorial flavour.

Typical computational puzzles arise on asking simple-looking questions such as “how many $6 \times 6$ matrices with the population $\{-1, 0, +1\}$ are singular?” (The population of a Bohemian family is the set of permissible entries.) The answer is not known as we write this, although we can give a probabilistic estimate (about $0.205$ after twenty million sample determinants, of which 4,103,732 were singular): brute computation seems futile because there are $3^{36} \approx 1.5 \times 10^{17}$ such matrices. We do know the answers up to size five by five: the number of $n$ by $n$ singular matrices with population $\{-1, 0, +1\}$ is, for $n = 1$, $2$, $3$, $4$, and $5$, just $1$, $33$, $7{,}875$, $15{,}099{,}201$, and $237{,}634{,}987{,}683$. This represents fractions of their numbers ($3^{n^2}$) of $0.333$, $0.407$, $0.400$, $0.351$, and $0.280$, respectively.
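The small cases of this counting question can be checked by direct enumeration. The following is a minimal sketch, not the authors' method: it assumes the population is $\{-1, 0, +1\}$, and the helper names `det` and `count_singular` are hypothetical. It uses exact integer arithmetic, so no rounding is involved; the cost grows like $3^{n^2}$, which is why this approach stops being practical almost immediately.

```python
from itertools import product

def det(m):
    """Exact integer determinant by Laplace expansion along the first row.

    Only sensible for very small matrices; `m` is a list of row lists.
    """
    n = len(m)
    if n == 0:
        return 1
    if n == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def count_singular(n, population=(-1, 0, 1)):
    """Count n-by-n matrices with entries from `population` and zero determinant."""
    count = 0
    for entries in product(population, repeat=n * n):
        rows = [list(entries[i * n:(i + 1) * n]) for i in range(n)]
        if det(rows) == 0:
            count += 1
    return count

# Print dimension, number of singular matrices, family size, and the fraction.
for n in (1, 2, 3):
    singular = count_singular(n)
    total = 3 ** (n * n)
    print(n, singular, total, round(singular / total, 3))
```

For $n = 2$ the result can be confirmed by hand: the determinant $ad - bc$ vanishes exactly when $ad = bc$, which happens for $5 \cdot 5 + 2 \cdot 2 + 2 \cdot 2 = 33$ of the $81$ matrices.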

Yet such matrix families are both useful and interesting. For instance, one may use discrete optimization over a family to look for improved growth factor bounds [20]. Matrices with the population $\{-1, 0, +1\}$ have minimal height (the height of a matrix $A$ is the largest absolute value of any entry in $A$) over all integer matrices; finding a matrix in this family which has a given polynomial as characteristic polynomial identifies a so-called “minimal height companion matrix”, which may confer numerical benefits.

Recently the study of eigenvalues of structured Bohemian matrices (e.g. tridiagonal, complex symmetric) has been undertaken and several puzzling features are seen resulting from extensive experimental computations. For instance, some of the images at bohemianmatrices.com/gallery show common features including “holes”.

Different matrix structures produce remarkably different pictures. One structure useful in eigenvalue computation is the upper Hessenberg matrix, which means a matrix $H$ such that $h_{i,j} = 0$ if $i > j + 1$. These arise naturally in eigenvalue computation because the QR iteration is cheaper for matrices in Hessenberg form. Results on the determinants of Hessenberg matrices can be found in [21].

Remark 1

On computing eigenvalues by first computing characteristic polynomials. Numerical analysts are familiar with the superior numerical stability of computing eigenvalues iteratively, usually by the QR algorithm or some variant, rather than first computing characteristic polynomials and then finding roots. As is well known, the latter approach is numerically unstable because polynomials are usually badly conditioned while eigenvalues are usually well conditioned. (This has been well known to the point of folklore since the work of Wilkinson. The well-conditioning of eigenvalues has only recently been quantified in some cases, but for instance the results of [3] do confirm the folklore.) Somewhat surprisingly, for several families of Bohemian matrices, characteristic polynomials become valuable again: first because the matrix dimensions are typically small or at most moderate, so the ill-conditioning does not matter much, and second because for some families (not all!) the number of distinct characteristic polynomials is vastly smaller than the number of matrices in the family. For instance, for the general five by five matrices with population $\{-1, 0, +1\}$, there are nearly one trillion such matrices, but fewer than two million characteristic polynomials. This compression is significant.

For other families of matrices, such as upper Hessenberg Toeplitz matrices, there is no compression at all because each matrix has a distinct characteristic polynomial. Circulant matrices fall between, having fewer characteristic polynomials but not vastly fewer. The lesson is that for some questions (though not others), prior computation of characteristic polynomials is valuable.

We begin our study in this paper by considering determinants of Bohemian upper Hessenberg matrices. We prove two recursive formulae for the characteristic polynomials of upper Hessenberg matrices. (We do not claim originality; recursion relations for upper Hessenberg determinants are known.) For another recursive formula we refer to [17]. During the course of our computations, we encountered “maximal polynomial height” characteristic polynomials when the matrices were not only upper Hessenberg, but Toeplitz (constant along diagonals); we have several results for such matrices, which will appear in [11]. Further restrictions to this class allowed identification of key results including explicit formulae for the characteristic polynomials of maximal height. In what follows, we lay out definitions and prove several facts of interest about characteristic polynomials and their respective height for these families.

In Figure 1 we see all eigenvalues of $6 \times 6$ upper Hessenberg matrices, subdiagonals fixed at $1$, with population $\{-1, 0, +1\}$. We denote this set of matrices $\mathcal{H}^{6 \times 6}_{\{0\}}(\{-1, 0, +1\})$. There are $3^{21} = 10{,}460{,}353{,}203$ such matrices. We see a wide octagonal shape. The width of the figure reflects that some matrices might have diagonals $-1$, while some have diagonals $0$, and others have diagonals $+1$. Of course mixed diagonals are also possible, but this should only tend to push things towards the centre.

This thinking motivates considering the subset of these matrices which has diagonal fixed at $0$. We denote this set of matrices $\mathcal{Z}^{6 \times 6}_{\{0\}}(\{-1, 0, +1\})$. There are substantially fewer such matrices, only $3^{15} = 14{,}348{,}907$ to be exact, and their eigenvalues are pictured in Figure 2, with certain zones enhanced. We see that, roughly speaking, Figure 1 is partially explained by saying that, along with other eigenvalues, it contains three copies of Figure 2 placed with centres at $-1$, at $0$, and at $1$.

In this paper we seek to explain some of the features of these pictures, and to learn some things about these families of Bohemian matrices.

Figure 1: The set of eigenvalues of all six by six upper Hessenberg matrices with entries from $\{-1, 0, +1\}$, and $s_i = 1$ for $1 \le i \le 5$. A more detailed image can be found at assets.bohemianmatrices.com/gallery/UH_6x6.png

Figure 2: The set of eigenvalues of all matrices in $\mathcal{Z}^{6 \times 6}_{\{0\}}(\{-1, 0, +1\})$; that is, six by six upper Hessenberg matrices with entries from $\{-1, 0, +1\}$, diagonal entries fixed as zero, and $s_i = 1$ for $1 \le i \le 5$. A more detailed image can be found at assets.bohemianmatrices.com/gallery/UH_0_Diag_6x6.png

2 Prior Work

Visible features of graphs of roots and eigenvalues from structured families of polynomials and matrices have been previously studied. One well-known polynomial whose roots produce interesting pictures is the Littlewood polynomial,

$$p(x) = \sum_{i=0}^{n} a_i x^i, \tag{1}$$

where $a_i \in \{-1, +1\}$. These polynomials have been studied in [1], [4], [5], and [6]. The image of their roots raises many questions, ranging from whether the set is (ultimately, as $n \to \infty$) a fractal and what the boundary of the set is, to questions about the holes in the image and its connection to various properties, such as degree and coefficients of the polynomial. Answers to some of these questions, particularly the ones involving the holes, have been shown to have some significance in number theory [2]. Roots of other polynomials have also been visualized; for more, see Christensen’s (https://jdc.math.uwo.ca/roots/) and Jörgenson’s (http://www.cecm.sfu.ca/~loki/Projects/Roots/) web pages.

Corless used a generalization of the Littlewood polynomial (to Lagrange bases). In his paper [12], he gave a new kind of companion matrix for polynomials expressed in a Lagrange basis. He used generalized Littlewood polynomials as test problems for the algorithm.

“The Bohemian Eigenvalue Project” was first presented as a poster [15] at the East Coast Computer Algebra Day (ECCAD) 2015. The poster focused on preliminary results and many of the questions raised when visualizing the distributions of Bohemian eigenvalues over the complex plane. In particular, the poster focused on “eigenvalue exclusion zones” (i.e. distinct regions within the domain of the eigenvalues where no eigenvalues exist), computational methods for visualizing eigenvalues, and some results on eigenvalue conditioning over distributions of random matrices.

In Chan’s Master’s thesis [7], she extended Piers W. Lawrence’s construction of the companion matrix for the Mandelbrot polynomials [14, 13] to other families of polynomials, mainly the Fibonacci-Mandelbrot polynomials and the Narayana-Mandelbrot polynomials. What is relevant here about this construction is that these matrices are upper Hessenberg and contain entries from a constrained set of numbers: , and therefore fall under the category of being Bohemian upper Hessenberg. Both the Fibonacci-Mandelbrot matrices and Narayana-Mandelbrot matrices are also Bohemian upper Hessenberg, but the set that the entries draw from is . At the time of submission for Chan’s Master’s thesis, the largest number of eigenvalues successfully computed (using a machine with 32 GB of memory) were , , and for the Mandelbrot, Fibonacci-Mandelbrot, and Narayana-Mandelbrot matrices, respectively. This makes the 16 Mandelbrot matrix the “largest” Bohemian matrix that we have solved at the time we write this paper.

These new constructions led Chan and Corless to a new kind of companion matrix for polynomials of the form . A first step towards this was first proved using the Schur complement in [8]. Knuth then suggested that Chan and Corless look at the Euclid polynomials [9], based on the Euclid numbers. It was the success of this construction that led to the realization that this construction is general, and gives a genuinely new kind of companion matrix. Similar to the previous three families of matrices, the Euclid matrices are also upper Hessenberg and Bohemian, as the entries are comprised from the set . In addition, an interesting property of these companion matrices is that their inverses are also Bohemian with the same population, a property which we call “the matrix family having rhapsody [10].”

As an extension of this generalization, Chan et al. [10] showed how to construct linearizations of matrix polynomials, particularly of the form , , (when , and , using a similar construction.

3 Notation

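In what follows, we present some results on upper Hessenberg Bohemian matrices of the form

$$H_n = \begin{bmatrix} h_{1,1} & h_{1,2} & h_{1,3} & \cdots & h_{1,n} \\ s_1 & h_{2,2} & h_{2,3} & \cdots & h_{2,n} \\ 0 & s_2 & h_{3,3} & \cdots & h_{3,n} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & s_{n-1} & h_{n,n} \end{bmatrix} \tag{2}$$

with $s_k = e^{i\theta_k}$, usually $s_k \in \{-1, +1\}$ (we do not allow zero subdiagonal entries, because that reduces the problem to smaller ones) and $h_{i,j} \in P$ for $1 \le i \le j \le n$. We denote the characteristic polynomial $P_n(z) = \det(zI - H_n)$.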

Definition 1

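The set of all $n \times n$ Bohemian upper Hessenberg matrices with upper triangle population $P$ and subdiagonal population from a discrete set of roots of unity, say $s \in \{e^{i\theta_k}\}$ where $\{\theta_k\}$ is some finite set of angles, is called $\mathcal{H}^{n \times n}_{\{\theta_k\}}(P)$. In particular, $\mathcal{H}^{n \times n}_{\{0\}}(P)$ is the set of all $n \times n$ Bohemian upper Hessenberg matrices with upper triangle entries from $P$ and subdiagonal entries equal to $1$, and $\mathcal{H}^{n \times n}_{\{\pi\}}(P)$ is the corresponding set when the subdiagonal entries are $-1$.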

It will often be true that the average value of a population will be zero. In that case, matrices with trace zero will be common. It is a useful oversimplification to look in that case at matrices whose diagonal is exactly zero.

Definition 2

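For a population $P$ such that $0 \in P$, let $\mathcal{Z}^{n \times n}_{\{\theta_k\}}(P)$ be the subset of $\mathcal{H}^{n \times n}_{\{\theta_k\}}(P)$ where the main diagonal entries are fixed at 0.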

4 Results of Experiments

The methods used for computing the characteristic polynomials and counting the number of eigenvalues presented in Tables 1–9 in this section will be discussed in detail in a forthcoming paper. Many of the smaller-dimension computations were done directly in Maple 2017; for instance, computation of the characteristic polynomials of all two million or so matrices in one of these families took about six hours on a Surface Pro. The greater number of higher-dimension matrices, or matrices with larger populations, required special techniques and larger and faster machines. Eigenvalue computations were also done in Matlab and in Python. The computed characteristic polynomials are available through the Characteristic Polynomial Database [25].

n   #matrices        #cpolys      #neutral polys   #neutral matrices
2   27               16           2                4
3   729              166          3                24
4   59,049           3,317        7                332
5   14,348,907       133,255      11               9,909
6   10,460,353,203   10,872,459   25               696,083

Table 1: Some properties of matrices in $\mathcal{H}^{n \times n}_{\{0\}}(\{-1, 0, +1\})$. The #matrices column reports the number of distinct matrices at each dimension. The #cpolys column reports the number of distinct characteristic polynomials at each dimension. The #neutral polys column reports the number of characteristic polynomials where all roots have zero real part. The #neutral matrices column reports the number of matrices where all eigenvalues have zero real part.

n   #matrices        #cpolys     #neutral polys   #neutral matrices
2   3                3           2                2
3   27               15          3                6
4   729              140         7                66
5   59,049           2,297       11               1,069
6   14,348,907       67,628      25               45,375
7   10,460,353,203   3,606,225   45               4,105,977

Table 2: Some properties of matrices in $\mathcal{Z}^{n \times n}_{\{0\}}(\{-1, 0, +1\})$. The #matrices column reports the number of distinct matrices at each dimension. The #cpolys column reports the number of distinct characteristic polynomials at each dimension. The #neutral polys column reports the number of characteristic polynomials where all roots have zero real part. The #neutral matrices column reports the number of matrices where all eigenvalues have zero real part.

n   multiplicity 1   2    3   4   5   6
2   5                1
3   35               0    1
4   431              5    0   1
5   9,497            9    3   0   1
6   363,143          51   5   1   0   1

Table 3: Number of distinct eigenvalues of various multiplicities of matrices in $\mathcal{Z}^{n \times n}_{\{0\}}(\{-1, 0, +1\})$. Most eigenvalues are simple. It turns out that every multiple eigenvalue also occurs as a simple eigenvalue for some other matrix. The only $n$-multiple eigenvalue of the class of $n$ by $n$ matrices is, of course, $0$.

n   #matrices     #cpolys     #distinct real   #neutral polys   #neutral matrices
2   8             6           6                1                1
3   64            28          25               1                1
4   1,024         197         219              1                1
5   32,768        2,235       3,264            1                1
6   2,097,152     39,768      75,045           1                1
7   268,435,456   1,140,848   2,694,199        1                1

Table 4: Some properties of matrices in $\mathcal{H}^{n \times n}_{\{0\}}(\{0, +1\})$. The #matrices column reports the number of distinct matrices at each dimension. The #cpolys column reports the number of distinct characteristic polynomials at each dimension. The #distinct real column reports the number of distinct real eigenvalues. The #neutral polys column reports the number of characteristic polynomials where all roots have zero real part (here only $z^n$). We conjecture that this is always so (and that there is only one matrix for that neutral polynomial). The #neutral matrices column reports the number of matrices where all eigenvalues have zero real part.

n   multiplicity 1   2    3   4   5   6
2   6                2
3   43               2    2
4   413              6    2   2
5   6,920            6    3   2   2
6   166,005          45   6   2   2   2

Table 5: Number of distinct eigenvalues of various multiplicities of matrices in $\mathcal{H}^{n \times n}_{\{0\}}(\{0, +1\})$. Note that in this class of matrices, diagonal entries of the matrix need not be zero.

n   #matrices     #cpolys      #stables   #neutral polys   #neutral matrices   #distinct real
2   8             6            1          1                2                   5
3   64            32           3          0                0                   29
4   1,024         289          14         1                6                   233
5   32,768        4,958        93         0                0                   7,363
6   2,097,152     162,059      992        2                430                 299,477
7   268,435,456   10,318,948              0                0

Table 6: Some properties of matrices from . The column #stables reports the number of characteristic polynomials with all roots in the left half plane; the corresponding number of matrices is , , , , and . Other columns are as in previous tables. Blank table entries represent unknowns.
n   multiplicity 1   2    3   4   5   6
2   9                1
3   65               0    0
4   689              5    0   0
5   20,565           3    0   0   0
6   887,539          59   9   1   1   1

Table 7: Number of distinct eigenvalues of various multiplicities of matrices from $\mathcal{H}^{n \times n}_{\{0\}}(\{-1, +1\})$. The diagonal entries are not zero.

Other questions than those answered in these tables can be asked of this data. For instance, one might be interested in the proportion of singular matrices. By asking which characteristic polynomials have zero constant coefficient, and counting the number of matrices that have that characteristic polynomial, one can answer such questions. In the case of six by six upper Hessenberg matrices with population , there are singular matrices, or about . Recall that for “random” six by six matrices, where the entries are chosen perhaps uniformly over some real interval, the probability of singularity is zero because such matrices come from a set of measure zero. Yet in applications, the probability of singular matrices is often nonzero because of structure. By looking at Bohemian matrices, we get some idea of the influence of structure for finite dimensions $n$.
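The bookkeeping described above (group matrices by characteristic polynomial, then read off the singular ones from the constant coefficient) can be sketched for very small dimensions. This is an illustrative sketch only, not the authors' Maple code: it assumes the family of Table 1 (upper triangle from $\{-1, 0, +1\}$, subdiagonal fixed at 1), it swaps in the Faddeev-LeVerrier algorithm to get exact integer coefficients, and the names `charpoly`, `hessenberg_family` and `tally` are hypothetical.

```python
from collections import Counter
from itertools import product

def charpoly(a):
    """Integer coefficients of det(zI - A), leading coefficient first.

    Uses the Faddeev-LeVerrier recursion; every division is exact for
    integer matrices, so integer arithmetic suffices.
    """
    n = len(a)
    coeffs = [1]
    m = [[int(i == j) for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        trace = sum(am[i][i] for i in range(n))
        c = (-trace) // k  # exact division
        coeffs.append(c)
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return tuple(coeffs)

def hessenberg_family(n, population=(-1, 0, 1), subdiag=1):
    """Yield every n-by-n upper Hessenberg matrix with the given upper-triangle
    population and a constant subdiagonal (an assumed convention)."""
    positions = [(i, j) for i in range(n) for j in range(i, n)]
    for values in product(population, repeat=len(positions)):
        h = [[0] * n for _ in range(n)]
        for i in range(n - 1):
            h[i + 1][i] = subdiag
        for (i, j), v in zip(positions, values):
            h[i][j] = v
        yield h

def tally(n):
    """Map each characteristic polynomial to the number of matrices having it."""
    return Counter(charpoly(h) for h in hessenberg_family(n))

# Print dimension, family size, distinct polynomials, singular count, fraction.
for n in (2, 3):
    counts = tally(n)
    total = sum(counts.values())
    # The constant coefficient is (-1)^n det(H), so it vanishes iff H is singular.
    singular = sum(c for poly, c in counts.items() if poly[-1] == 0)
    print(n, total, len(counts), singular, singular / total)
```

The family sizes and the numbers of distinct polynomials for $n = 2$ and $n = 3$ can be compared against the first two rows of Table 1. The enumeration grows like $3^{n(n+1)/2}$, so this direct approach is only a check for small $n$.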

5 Upper Hessenberg Matrices

We can make sense of some of those experiments by theoretical results and proofs. We begin with a recurrence relation for the characteristic polynomial for where . Later we will specialize the population to contain only zero and numbers of unit magnitude, usually .

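Theorem 1

$$P_n(z) = z P_{n-1}(z) - \sum_{k=1}^{n} \left( \prod_{j=k}^{n-1} s_j \right) h_{k,n} P_{k-1}(z) \tag{3}$$

where $s_j$ are the subdiagonal entries and $h_{k,n}$ the upper triangle entries of $H_n$, and $P_n(z) = \det(zI - H_n)$, with the convention that $P_0(z) = 1$ ($H_0 = [\,]$, the empty matrix).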
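A recurrence of this kind is easy to exercise numerically. The sketch below uses one standard form of the upper Hessenberg recurrence, written out in the docstring; the notation ($s_j$ for subdiagonal entries, $h_{k,m}$ for upper triangle entries, $P_0 = 1$) and the function name `hessenberg_charpoly` are assumptions made for illustration, and the indexing may differ from that of equation (3).

```python
def hessenberg_charpoly(h):
    """Coefficients (lowest degree first) of P_n(z) = det(zI - H), H upper Hessenberg.

    Assumed recurrence, in 1-based notation with s_j = H[j+1, j]:
        P_m(z) = z * P_{m-1}(z)
                 - sum_{k=1}^{m} (s_k * ... * s_{m-1}) * h_{k,m} * P_{k-1}(z),
        P_0(z) = 1,
    where the empty product (k = m) equals 1.
    """
    n = len(h)
    polys = [[1]]  # P_0
    for m in range(1, n + 1):
        new = [0] + polys[m - 1]  # z * P_{m-1}(z)
        weight = 1                # running product s_k ... s_{m-1}
        for k in range(m, 0, -1):
            if k < m:
                weight *= h[k][k - 1]          # multiply in s_k (0-based storage)
            factor = weight * h[k - 1][m - 1]  # (s_k ... s_{m-1}) * h_{k,m}
            for d, c in enumerate(polys[k - 1]):
                new[d] -= factor * c
        polys.append(new)
    return polys[n]

# Example: a 3-by-3 matrix with unit subdiagonal and entries from {-1, 0, +1}.
example = [[0, 1, 1],
           [1, 0, 1],
           [0, 1, 0]]
print(hessenberg_charpoly(example))
```

Each step reuses all earlier polynomials $P_0, \ldots, P_{m-1}$, so the whole computation costs $O(n^3)$ coefficient operations and never forms a determinant explicitly.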

Proof

We begin by proving the following equality:

(4)

for .

When the left side of equation (4) reduces to , and the right side reduces to .

Assume inductively that

(5)

for . Then

(6)
(7)
(8)
(9)

Next we prove the theorem. Performing Laplace expansion on the last row of we get

(10)
(11)
(12)
(13)
(14)
(15)

Theorem 2

Expanding as

(16)

we can express the coefficients recursively by

(17a)
(17b)
(17c)
(17d)

Proof

By Theorem 1

(18)

The first term can be written

(19)
(20)
(21)
(22)

and the second term

(23)

Therefore,

Proposition 1

All matrices in $\mathcal{H}^{n \times n}_{\{\theta_k\}}(P)$ are non-derogatory. (A non-derogatory matrix is a matrix for which its characteristic polynomial and minimal polynomial coincide, up to a factor of $\pm 1$.)

Proof

Let . Because is upper Hessenberg

(24)

for where are some functions of the entries of . Let

(25)

We find and therefore . Continuing recursively for from to 1 we find for and therefore (since for ) for . We have and hence . Thus, no non-zero polynomial of degree less than exists that satisfies . Therefore, the minimal degree non-zero polynomial that satisfies is the characteristic polynomial of .
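A closely related way to see the same fact, offered here as an illustration rather than as the argument of the proof, is through the Krylov vectors $e_1, He_1, \ldots, H^{n-1}e_1$. For an upper Hessenberg matrix with nonzero subdiagonal these form an upper triangular matrix whose diagonal entries are products of subdiagonal entries, so they are linearly independent and no polynomial of degree less than $n$ can annihilate $H$. The function name `krylov_matrix` is hypothetical.

```python
def krylov_matrix(h):
    """Return K = [e_1, H e_1, ..., H^(n-1) e_1] as a list of rows."""
    n = len(h)
    v = [1] + [0] * (n - 1)  # e_1
    columns = []
    for _ in range(n):
        columns.append(v)
        v = [sum(h[i][j] * v[j] for j in range(n)) for i in range(n)]
    # Transpose the list of columns into a list of rows.
    return [[columns[k][i] for k in range(n)] for i in range(n)]

example = [[1, -1, 0],
           [1, 0, 1],
           [0, 1, -1]]
for row in krylov_matrix(example):
    print(row)
```

In the example the matrix comes out upper triangular with ones on the diagonal, because both subdiagonal entries are 1.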

Definition 3

The characteristic height of a matrix is the height of its characteristic polynomial.

Remark 2

The height of a polynomial is in fact a norm (the infinity norm of the vector of coefficients).

Proposition 2

For any matrix $A$, $-A$ has the same characteristic height as $A$.
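A short derivation sketch of why negation preserves the characteristic height, under the assumption that the proposition compares $A$ with $-A$. The symbols $p_A$, $c_k$ and $n$ are introduced here for illustration only.

```latex
% Characteristic polynomial of -A in terms of that of A (A is n-by-n):
p_{-A}(z) = \det(zI + A) = (-1)^n \det\bigl((-z)I - A\bigr) = (-1)^n \, p_A(-z).
% Writing p_A(z) = \sum_{k=0}^{n} c_k z^k gives
p_{-A}(z) = \sum_{k=0}^{n} (-1)^{n-k} c_k z^k ,
% so every coefficient changes by at most a sign and \max_k |c_k| is unchanged.
```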

Proposition 3

The maximal characteristic height of occurs when for and .

Proof

Since and , and hence . Let . By Theorem 2

(26)
(27)

and

(28)
(29)

Since , and equations (27) and (29) are independent of and , all must be positive and the maximum characteristic height is attained.

Remark 3

When () and for all , attains maximal characteristic height. By Proposition 2, () and will also attain maximal characteristic height. Both of these cases correspond to upper Hessenberg matrices with a Toeplitz structure as we explore in further detail in the paper [11].

Definition 4

is invariant under multiplication by a fixed unit if ; that is, each entry of , say , is such that is also in . For instance, is invariant under multiplication by . Note that invariance with respect to implies invariance with respect to .

Theorem 3

Suppose and is invariant under multiplication by each and by . Then is similar to a matrix in , and similar to a matrix in .

Proof

We use induction. The case is vacuously upper Hessenberg, though it is

For , partition the matrix as

where for some . Then conjugate by

Clearly is in . By induction the proof is complete.

Remark 4

For clarity, consider the case :

(30)

where and . Then, the following similarity transforms reduce the problem to one in and one in .

(31)
(32)
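The similarity transforms in this remark can be illustrated in the real case. The sketch below assumes every subdiagonal entry is $\pm 1$ and conjugates by a diagonal matrix $D$ with entries $\pm 1$, chosen so that all subdiagonal entries become $1$. Because $D^{-1} = D$ here, each upper triangle entry changes by at most a sign, so a population that is invariant under multiplication by $-1$ is preserved. The function name `normalize_subdiagonal` is hypothetical.

```python
def normalize_subdiagonal(h):
    """Return D^{-1} H D with D = diag(d), d_1 = 1 and d_{i+1} = d_i * s_i.

    Assumes each subdiagonal entry s_i = h[i+1][i] is +1 or -1, so that
    d_i is +1 or -1 and D^{-1} = D.
    """
    n = len(h)
    d = [1]
    for i in range(n - 1):
        d.append(d[i] * h[i + 1][i])
    return [[d[i] * h[i][j] * d[j] for j in range(n)] for i in range(n)]

example = [[1, -1, 0],
           [-1, 0, 1],
           [0, -1, 1]]
for row in normalize_subdiagonal(example):
    print(row)
```

The diagonal is unchanged and the result is similar to the input, so the two matrices share a characteristic polynomial. Multiplying the result by $-1$ afterwards would instead give a matrix with subdiagonal $-1$, at the cost of negating the eigenvalues.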

6 Upper Hessenberg Toeplitz Matrices

Proposition 3 identifies matrices with maximal characteristic height. (We did not report the numbers of such matrices and polynomials that we found in our “results” section.) We noticed that they are Toeplitz matrices. This motivated our interest in upper Hessenberg Toeplitz matrices. We summarize some of the results of [11] here.

Definition 5

Consider matrices where for , and . We denote these matrices by and they have a Toeplitz structure.

Remark 5

The characteristic height of is maximal when for .

Proposition 4

Let be a closed and bounded set with , and . Let be upper Hessenberg Toeplitz with . If , attains maximal characteristic height for for all . If , attains maximal characteristic height for for even, and for odd.

Proposition 5

The maximum characteristic height grows at least exponentially in .

Conjecture 1

The maximum characteristic height approaches as for some constant where is the golden ratio.

Remark 6

This limit is illustrated in Figure 3, motivating this conjecture.

Figure 3: The points are for from 0 to 50000 where is the maximal characteristic height of (i.e. when , for example). The solid line is where is the golden ratio.

7 Zero Diagonal Upper Hessenberg Matrices

Theorem 4

Let for for some fixed positive integer and each . If is normal, i.e. , then for , is symmetric, -skew symmetric for some fixed or -skew circulant. These matrices ( symmetric/-skew symmetric, and -skew circulant matrices) are the only normal matrices in . (For , this is only ; for , the symmetric and circulant cases coalesce, so that there are only such matrices.)
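For real populations the statement can be explored by brute force in small dimensions. The sketch below is an experiment under stated assumptions, not the setting of the theorem in full generality: it takes the zero-diagonal family with upper triangle entries from $\{-1, 0, +1\}$ and subdiagonal fixed at $1$, and tests normality as $AA^T = A^TA$. The names `is_normal` and `normal_zero_diagonal` are hypothetical.

```python
from itertools import product

def is_normal(a):
    """Check A A^T == A^T A exactly for a real (integer) square matrix."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            by_rows = sum(a[i][k] * a[j][k] for k in range(n))  # (A A^T)[i][j]
            by_cols = sum(a[k][i] * a[k][j] for k in range(n))  # (A^T A)[i][j]
            if by_rows != by_cols:
                return False
    return True

def normal_zero_diagonal(n, population=(-1, 0, 1), subdiag=1):
    """List the normal matrices among zero-diagonal upper Hessenberg matrices
    with strictly-upper entries from `population` and constant subdiagonal."""
    positions = [(i, j) for i in range(n) for j in range(i + 1, n)]
    found = []
    for values in product(population, repeat=len(positions)):
        h = [[0] * n for _ in range(n)]
        for i in range(n - 1):
            h[i + 1][i] = subdiag
        for (i, j), v in zip(positions, values):
            h[i][j] = v
        if is_normal(h):
            found.append(h)
    return found

for n in (2, 3, 4):
    matrices = normal_zero_diagonal(n)
    print(n, len(matrices))
    for h in matrices:
        print("   ", h)
```

For $n = 3$ a hand computation gives four normal matrices: the symmetric one, the skew-symmetric one up to the fixed subdiagonal sign convention, and the two with a single corner entry $\pm 1$ in the first row, which matches the kinds of matrices the theorem names.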

Proof

To prove this theorem, we establish a sequence of lemmas. First, we partition . Put

(33)

where

(34)

and

(35)

Then the conditions of normality are

(36)

must equal

(37)

Lemma 1

The first row of contains exactly one nonzero element, say in position .

Proof
(38)

from the upper left corner. Since each nonzero element of has magnitude , exactly one entry must be nonzero.

Lemma 2

If is normal then and is -skew symmetric.

Proof

If is normal, then being equal to implies so that for some with . Then

(39)

and this says times the first row of is the first row of .

But the first row of is because is upper Hessenberg with zero diagonal. Thus the first row of is . Thus

(40)

and

(41)

is normal. Because is normal, and

(42)

we have or

must equal

The lower left block gives