# Hoffmann-Jørgensen Inequalities for Random Walks on the Cone of Positive Definite Matrices

We consider random walks on the cone of m × m positive definite matrices, where the underlying random matrices have orthogonally invariant distributions on the cone and the Riemannian metric is the measure of distance on the cone. By applying results of Khare and Rajaratnam (Ann. Probab., 45 (2017), 4101–4111), we obtain inequalities of Hoffmann-Jørgensen type for such random walks on the cone. In the case of the Wishart distribution W_m(a,I_m), with index parameter a and matrix parameter I_m, the identity matrix, we derive explicit and computable bounds for each term appearing in the Hoffmann-Jørgensen inequalities.

06/25/2019


## 1 Introduction

In this paper, we consider inequalities of Hoffmann-Jørgensen type for random walks on the cone of positive definite matrices. These inequalities are obtained by adapting results of Khare and Rajaratnam [19] obtained in the broader setting of metric semigroups. In studying these inequalities on matrix cones, we are motivated by the appearance of random samples of positive definite matrices in numerous fields, including: statistical inference on Riemannian manifolds [15, 21, 22], microwave engineering [31, p. 156 ff.], diffusion tensor imaging [11, 18], financial time series [5], wireless communication systems [30], polarimetric radar imaging [4], factor analysis [8], and goodness-of-fit testing [14].

The resulting generalized Hoffmann-Jørgensen inequalities on $P_m$, the cone of $m \times m$ positive definite (symmetric) matrices, are derived under the assumption that the underlying random matrices have an orthogonally invariant distribution on the cone. Specializing to the case of the Wishart distribution $W_m(a, I_m)$, with index parameter $a$ and identity matrix parameter $I_m$, we derive explicit and computable bounds for each term appearing in the Hoffmann-Jørgensen inequalities. Although we restrict our attention in this paper to $P_m$, our methods apply more generally to the cone of Hermitian positive definite matrices and to abstract symmetric cones [10, 12].

In Section 2, we provide notation and preliminaries for random walks on $P_m$ and for the natural action of $GL(m,\mathbb{R})$, the group of nonsingular $m \times m$ matrices, on the random walks. We provide in this section some necessary properties of the Riemannian metric $d_R$, which is perhaps the most prominent metric on $P_m$, and we show in Proposition 2.3 that the probability distribution of the distance between orthogonally invariant random matrices is preserved under orthogonally invariant random walks.

In Section 3 we derive from a result of Khare and Rajaratnam [19, Theorem A] a generalization, to the cone $P_m$, of the classical Hoffmann-Jørgensen inequalities. This result, which is stated in Theorem 3.1, provides bounds on the probability distribution of the distance between $S_n$, the $n$th “step” in an invariant random walk, and the initial matrix $I_m$, the identity matrix.

In order to apply a Hoffmann-Jørgensen inequality in a practical setting, it is necessary to calculate or obtain an upper bound for each term appearing in the inequality; this will be seen to be a non-trivial problem for the random walks considered here. In Section 4, we consider the special case in which the random walk is generated by random samples from the Wishart distribution $W_m(a, I_m)$, with index parameter $a$ and matrix parameter $I_m$, the identity matrix; we are particularly interested in Wishart-distributed random walks because of their appearance in the literature on diffusion tensor imaging, factor analysis, and financial volatility. Under the Wishart assumption, we obtain explicit, computable bounds for each term in the Hoffmann-Jørgensen inequalities in Theorem 3.1.

Finally, in Appendix A, we establish the submartingale properties of some random variables appearing in the generalized Hoffmann-Jørgensen inequalities and show that the corresponding Kolmogorov inequalities reduce to Markov’s inequalities for those variables.

## 2 Random matrices and random walks on Pm

We denote by $GL(m,\mathbb{R})$ the general linear group of all real, $m \times m$, nonsingular matrices. The group $GL(m,\mathbb{R})$ acts on $P_m$ via the correspondence $x \mapsto g x g'$, where $g \in GL(m,\mathbb{R})$, $x \in P_m$, and $g'$ denotes the transpose of $g$. It is elementary to verify that this correspondence is a group action, i.e., $(g_1 g_2) x (g_1 g_2)' = g_1 (g_2 x g_2') g_1'$ for all $g_1, g_2 \in GL(m,\mathbb{R})$ and $x \in P_m$.

The group action also is transitive: given $x_1, x_2 \in P_m$, there exists $g \in GL(m,\mathbb{R})$ such that $x_2 = g x_1 g'$; simply choose $g = x_2^{1/2} x_1^{-1/2}$. Alternatively, since $x_1, x_2 \in P_m$, there exist $g_1, g_2 \in GL(m,\mathbb{R})$ such that $x_1 = g_1 g_1'$ and $x_2 = g_2 g_2'$; then $g = g_2 g_1^{-1}$ satisfies $x_2 = g x_1 g'$. Denote by $I_m$ the identity matrix in $P_m$. Under this group action, the isotropy group of $I_m$ is $O(m)$, the maximal compact subgroup of orthogonal matrices. Thus, the homogeneous space $GL(m,\mathbb{R})/O(m)$ can be identified with $P_m$ through the correspondence $gO(m) \leftrightarrow g g'$ for all $g \in GL(m,\mathbb{R})$.
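As a numerical sanity check (ours, not part of the paper), the following NumPy sketch verifies the transitivity construction $g = x_2^{1/2} x_1^{-1/2}$, the group-action property, and the fact that orthogonal matrices fix $I_m$; the helper names `spd_sqrt` and `random_spd` are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4

def spd_sqrt(x):
    # Symmetric square root of a symmetric positive definite matrix.
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

def random_spd(rng, m):
    # Well-conditioned random positive definite matrix q diag(d) q'.
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return (q * rng.uniform(0.5, 2.0, m)) @ q.T

x1, x2 = random_spd(rng, m), random_spd(rng, m)

# Transitivity: g = x2^{1/2} x1^{-1/2} carries x1 to x2 under x -> g x g'.
g = spd_sqrt(x2) @ np.linalg.inv(spd_sqrt(x1))
assert np.allclose(g @ x1 @ g.T, x2)

# Group-action property: (g1 g2) x (g1 g2)' = g1 (g2 x g2') g1'.
g1, g2 = rng.standard_normal((m, m)), rng.standard_normal((m, m))
assert np.allclose((g1 @ g2) @ x1 @ (g1 @ g2).T, g1 @ (g2 @ x1 @ g2.T) @ g1.T)

# Isotropy of I_m: any orthogonal k satisfies k I_m k' = I_m.
q, _ = np.linalg.qr(rng.standard_normal((m, m)))
assert np.allclose(q @ np.eye(m) @ q.T, np.eye(m))
```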

Whenever two random entities $X$ and $Y$ have the same probability distribution, we will write $X \overset{\mathcal{L}}{=} Y$. Then a random matrix $X \in P_m$ is said to be orthogonally invariant if $kXk' \overset{\mathcal{L}}{=} X$ for any $k \in O(m)$. Examples of orthogonally invariant distributions are the well-known Wishart distribution [2, Chapter 7] and the multivariate beta distribution [2, p. 377]. Similarly, a function $f$ on $P_m$ is said to be orthogonally invariant if $f(kxk') = f(x)$ for all $k \in O(m)$ and $x \in P_m$.

If the random matrix $X = gg'$ is orthogonally invariant, where $g \in GL(m,\mathbb{R})$, then for any $k_1, k_2 \in O(m)$,

$$(k_1 g k_2)(k_1 g k_2)' = k_1 g k_2 k_2' g' k_1' = k_1 g g' k_1' = k_1 X k_1' \overset{\mathcal{L}}{=} X.$$

Therefore, for any orthogonally invariant random matrix $X$, we can identify $X$ with a corresponding $g \in GL(m,\mathbb{R})$ such that the distribution of $g$ is left- and right-invariant under $O(m)$.

Suppose that the random matrix $X$ is orthogonally invariant and $E(X)$ is finite. Since $kXk' \overset{\mathcal{L}}{=} X$ for all $k \in O(m)$ then $kE(X)k' = E(X)$. It follows from Schur’s Lemma [29, p. 315] that $E(X) = cI_m$ for some constant $c$.

Given orthogonally invariant random matrices $X_1$ and $X_2$, the convolution product, $X_1 \circ X_2$, is defined by convolving the corresponding random matrices $g_1, g_2 \in GL(m,\mathbb{R})$ and then identifying the outcome in $P_m$ via the natural map from $GL(m,\mathbb{R})$. Concretely, if $X_1 = g_1 g_1'$ and $X_2 = g_2 g_2'$ for orthogonally bi-invariant random matrices $g_1, g_2 \in GL(m,\mathbb{R})$, then $X_1 \circ X_2 = (g_1 g_2)(g_1 g_2)' = g_1 X_2 g_1'$.

To express the distribution of $X_1 \circ X_2$ in terms of a product of functions of $X_1$ and $X_2$, we recall that by polar coordinates on matrix space, $g_1 = X_1^{1/2}k$ for some $k \in O(m)$; hence $X_1 \circ X_2 = X_1^{1/2} k X_2 k' X_1^{1/2}$. Since $X_1$ and $X_2$ are orthogonally invariant, then replacing $X_i$ by $k'X_ik$, $i = 1, 2$, and using the identity $(k'Xk)^{1/2} = k'X^{1/2}k$ for any positive definite matrix $X$, we obtain

$$kX_1^{1/2}X_2X_1^{1/2}k' \overset{\mathcal{L}}{=} kk'X_1^{1/2}kk'X_2kk'X_1^{1/2}kk' = X_1^{1/2}X_2X_1^{1/2}.$$

Consequently,

$$X_1 \circ X_2 \overset{\mathcal{L}}{=} X_1^{1/2}X_2X_1^{1/2}. \tag{2.1}$$
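The representation (2.1) is easy to explore numerically. The following sketch (ours, illustrative only) implements the representative $X_1^{1/2}X_2X_1^{1/2}$ of $X_1 \circ X_2$ and checks two algebraic facts used repeatedly below: the result is again symmetric positive definite, and $X_1^{1/2}X_2X_1^{1/2}$ and $X_2^{1/2}X_1X_2^{1/2}$ have the same spectrum (both are similar to $X_1X_2$).

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4

def spd_sqrt(x):
    # Symmetric square root via the spectral decomposition.
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

def circ(x1, x2):
    # Representative of X1 ∘ X2 as in (2.1): X1^{1/2} X2 X1^{1/2}.
    s = spd_sqrt(x1)
    return s @ x2 @ s

def random_spd(rng, m):
    g = rng.standard_normal((m, m))
    return g @ g.T + 1e-3 * np.eye(m)

x1, x2 = random_spd(rng, m), random_spd(rng, m)
y12, y21 = circ(x1, x2), circ(x2, x1)

# X1^{1/2} X2 X1^{1/2} is again symmetric positive definite ...
assert np.allclose(y12, y12.T) and np.linalg.eigvalsh(y12).min() > 0
# ... and shares its spectrum with X2^{1/2} X1 X2^{1/2}.
assert np.allclose(np.linalg.eigvalsh(y12), np.linalg.eigvalsh(y21))
```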

Although the binary operation $(X_1, X_2) \mapsto X_1 \circ X_2$ is not associative, the following result shows that associativity holds in distribution for mutually independent, orthogonally invariant random matrices.

###### Lemma 2.1.

Suppose that the random matrices $X_1$, $X_2$, $X_3$ are mutually independent and orthogonally invariant. Then $X_1 \circ (X_2 \circ X_3) \overset{\mathcal{L}}{=} (X_1 \circ X_2) \circ X_3$.

Proof.  By (2.1),

$$X_1 \circ (X_2 \circ X_3) \overset{\mathcal{L}}{=} X_1^{1/2}(X_2 \circ X_3)X_1^{1/2} \overset{\mathcal{L}}{=} X_1^{1/2}X_2^{1/2}X_3X_2^{1/2}X_1^{1/2} = (X_2^{1/2}X_1^{1/2})'X_3(X_2^{1/2}X_1^{1/2}).$$

Since $X_2^{1/2}X_1^{1/2} \in GL(m,\mathbb{R})$ then, by polar coordinates on matrix space, there exists $k \in O(m)$ such that $X_2^{1/2}X_1^{1/2} = k(X_1^{1/2}X_2X_1^{1/2})^{1/2}$. Then,

$$\begin{aligned} X_1 \circ (X_2 \circ X_3) &\overset{\mathcal{L}}{=} (X_1^{1/2}X_2X_1^{1/2})^{1/2}k'X_3k(X_1^{1/2}X_2X_1^{1/2})^{1/2} \\ &\overset{\mathcal{L}}{=} (X_1^{1/2}X_2X_1^{1/2})^{1/2}X_3(X_1^{1/2}X_2X_1^{1/2})^{1/2} \\ &\overset{\mathcal{L}}{=} (X_1 \circ X_2)^{1/2}X_3(X_1 \circ X_2)^{1/2} \\ &\overset{\mathcal{L}}{=} (X_1 \circ X_2) \circ X_3. \end{aligned}$$

The proof now is complete.
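The polar-coordinates step in this proof can be checked numerically: writing $B = X_2^{1/2}X_1^{1/2}$, we have $B'B = X_1^{1/2}X_2X_1^{1/2}$, and the polar factor $k = B(B'B)^{-1/2}$ is orthogonal with $B = k(B'B)^{1/2}$. A small NumPy verification (ours, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4

def spd_sqrt(x):
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

def random_spd(rng, m):
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return (q * rng.uniform(0.5, 2.0, m)) @ q.T

x1, x2 = random_spd(rng, m), random_spd(rng, m)

b = spd_sqrt(x2) @ spd_sqrt(x1)                  # B = X2^{1/2} X1^{1/2}
p = spd_sqrt(spd_sqrt(x1) @ x2 @ spd_sqrt(x1))   # (B'B)^{1/2}
k = b @ np.linalg.inv(p)                         # polar factor of B

assert np.allclose(k @ k.T, np.eye(m))           # k is orthogonal
assert np.allclose(b, k @ p)                     # B = k (B'B)^{1/2} exactly
```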

###### Lemma 2.2.

Suppose that the random matrices $X_1$ and $X_2$ are independent and orthogonally invariant. Then $X_1 \circ X_2 \overset{\mathcal{L}}{=} X_2 \circ X_1$.

More generally, if the random matrices $X_1, \ldots, X_n$ are mutually independent and orthogonally invariant then $X_{\sigma(1)} \circ \cdots \circ X_{\sigma(n)} \overset{\mathcal{L}}{=} X_1 \circ \cdots \circ X_n$ for all permutations $\sigma \in \mathfrak{S}_n$, the symmetric group on $n$ symbols.

Proof.  Since $X_1^{1/2}X_2X_1^{1/2}$ and $X_2^{1/2}X_1X_2^{1/2}$ have the same spectrum then there exists $k \in O(m)$ such that $X_1^{1/2}X_2X_1^{1/2} = kX_2^{1/2}X_1X_2^{1/2}k'$; therefore,

$$X_1 \circ X_2 \overset{\mathcal{L}}{=} X_1^{1/2}X_2X_1^{1/2} = kX_2^{1/2}X_1X_2^{1/2}k' \overset{\mathcal{L}}{=} k(X_2 \circ X_1)k'.$$

Since the distributions of $X_1$ and $X_2$ are orthogonally invariant then it follows that

$$\begin{aligned} k(X_2 \circ X_1)k' &\overset{\mathcal{L}}{=} k(k'X_2k \circ k'X_1k)k' \\ &= k(k'X_2k)^{1/2}\,k'X_1k\,(k'X_2k)^{1/2}k' \\ &= kk'X_2^{1/2}kk'X_1kk'X_2^{1/2}kk' \\ &= X_2^{1/2}X_1X_2^{1/2} \\ &\overset{\mathcal{L}}{=} X_2 \circ X_1. \end{aligned}$$

Therefore, $X_1 \circ X_2 \overset{\mathcal{L}}{=} X_2 \circ X_1$.

Finally, the proof that the distribution of $X_1 \circ \cdots \circ X_n$ is invariant under permutation of the $X_j$ is obtained by using the property that each permutation $\sigma$ can be expressed as a product of transpositions, together with the previously proved commutativity and associativity properties.

For $A, B \in P_m$, denote by $\lambda_1(A^{-1}B), \ldots, \lambda_m(A^{-1}B)$ the eigenvalues of $A^{-1}B$. The Riemannian metric on $P_m$ is defined as

$$d_R(A,B) = \Big(\sum_{j=1}^m \big[\log \lambda_j(A^{-1}B)\big]^2\Big)^{1/2}.$$

The function $d_R$ is a genuine metric since, for all $A, B, C \in P_m$: $d_R(A,B) \ge 0$, with equality if and only if $A = B$; $d_R(A,B) = d_R(B,A)$; and

$$d_R(A,C) \le d_R(A,B) + d_R(B,C). \tag{2.2}$$

The quantity $d_R(A,B)$ is the distance along the shortest geodesic path, on the cone, starting at $A$ and ending at $B$ [17, 25, 28]. Since the matrices $A^{-1}B$ and $A^{-1/2}BA^{-1/2}$ have the same spectrum, then we also have an alternative formula,

$$d_R(A,B) = \Big(\sum_{j=1}^m \big[\log \lambda_j(A^{-1/2}BA^{-1/2})\big]^2\Big)^{1/2}, \tag{2.3}$$

from which it follows that $d_R$ is orthogonally bi-invariant: $d_R(kAk', kBk') = d_R(A,B)$ for all $k \in O(m)$ and $A, B \in P_m$.
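The metric properties above are convenient to verify numerically. The sketch below (ours, illustrative only) computes $d_R$ stably via the eigenvalues of $A^{-1/2}BA^{-1/2}$ as in (2.3), and checks the identity of indiscernibles, symmetry, the triangle inequality (2.2), and invariance under congruence $x \mapsto gxg'$, which contains the orthogonal bi-invariance.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 4

def random_spd(rng, m):
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return (q * rng.uniform(0.5, 2.0, m)) @ q.T

def d_R(a, b):
    # ( sum_j [log λ_j(a^{-1} b)]^2 )^{1/2}, computed via a^{-1/2} b a^{-1/2}.
    w, v = np.linalg.eigh(a)
    s = (v / np.sqrt(w)) @ v.T          # a^{-1/2}
    lam = np.linalg.eigvalsh(s @ b @ s)
    return np.sqrt(np.sum(np.log(lam) ** 2))

a, b, c = (random_spd(rng, m) for _ in range(3))

assert abs(d_R(a, a)) < 1e-10                       # d_R(a, a) = 0
assert abs(d_R(a, b) - d_R(b, a)) < 1e-10           # symmetry
assert d_R(a, c) <= d_R(a, b) + d_R(b, c) + 1e-10   # triangle inequality (2.2)

# Congruence invariance: d_R(g a g', g b g') = d_R(a, b) for nonsingular g.
g = rng.standard_normal((m, m))
assert abs(d_R(g @ a @ g.T, g @ b @ g.T) - d_R(a, b)) < 1e-7
```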

###### Proposition 2.3.

Let $A$, $B$, and $X$ be mutually independent, orthogonally invariant random matrices in $P_m$. Then

$$d_R(X \circ A, X \circ B) \overset{\mathcal{L}}{=} d_R(A,B) \overset{\mathcal{L}}{=} d_R(A \circ X, B \circ X).$$

Proof.  By (2.1), there exist $k_1, k_2 \in O(m)$ such that $X \circ A = k_1X^{1/2}AX^{1/2}k_1'$ and $X \circ B = k_2X^{1/2}BX^{1/2}k_2'$. Therefore

$$\begin{aligned} d_R(X \circ A, X \circ B) &= d_R(k_1X^{1/2}AX^{1/2}k_1',\, k_2X^{1/2}BX^{1/2}k_2') \\ &= d_R(kX^{1/2}AX^{1/2}k',\, X^{1/2}BX^{1/2}), \end{aligned}$$

where $k = k_2'k_1$. By polar coordinates on matrix space, there exists $k_3 \in O(m)$ such that $kX^{1/2} = X^{1/2}k_3$; hence $kX^{1/2}AX^{1/2}k' = X^{1/2}k_3Ak_3'X^{1/2}$. Therefore,

$$\begin{aligned} d_R(kX^{1/2}AX^{1/2}k',\, X^{1/2}BX^{1/2}) &= d_R(X^{1/2}k_3Ak_3'X^{1/2},\, X^{1/2}BX^{1/2}) \\ &\overset{\mathcal{L}}{=} d_R(X^{1/2}AX^{1/2},\, X^{1/2}BX^{1/2}) \\ &= d_R(A,B), \end{aligned}$$

where the equality in distribution follows by the orthogonal invariance of $A$ and the latter equality follows from (2.3).

Again by (2.1) and (2.3), we have

$$\begin{aligned} d_R(A \circ X, B \circ X) &\overset{\mathcal{L}}{=} d_R(A^{1/2}XA^{1/2},\, B^{1/2}XB^{1/2}) \\ &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big((A^{1/2}XA^{1/2})^{-1}B^{1/2}XB^{1/2}\big)\big]^2\Big)^{1/2}. \end{aligned}$$

Since $A^{1/2}XA^{1/2}$ and $X^{1/2}AX^{1/2}$ have the same spectrum then there exists $k_1 \in O(m)$ such that $A^{1/2}XA^{1/2} = k_1X^{1/2}AX^{1/2}k_1'$; and similarly, there exists $k_2 \in O(m)$ such that $B^{1/2}XB^{1/2} = k_2X^{1/2}BX^{1/2}k_2'$. Therefore,

$$\begin{aligned} d_R(A \circ X, B \circ X) &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big((k_1X^{1/2}AX^{1/2}k_1')^{-1}k_2X^{1/2}BX^{1/2}k_2'\big)\big]^2\Big)^{1/2} \\ &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big(k_1X^{-1/2}A^{-1}X^{-1/2}k_1'k_2X^{1/2}BX^{1/2}k_2'\big)\big]^2\Big)^{1/2} \\ &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big(kX^{-1/2}A^{-1}X^{-1/2}k'X^{1/2}BX^{1/2}\big)\big]^2\Big)^{1/2}, \end{aligned}$$

where $k = k_2'k_1$. By an earlier argument, there exists $k_3 \in O(m)$ such that $kX^{-1/2} = X^{-1/2}k_3$, and hence $kX^{-1/2}A^{-1}X^{-1/2}k' = X^{-1/2}k_3A^{-1}k_3'X^{-1/2}$. Therefore

$$\begin{aligned} d_R(A \circ X, B \circ X) &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big(X^{-1/2}k_3A^{-1}k_3'X^{-1/2}X^{1/2}BX^{1/2}\big)\big]^2\Big)^{1/2} \\ &= \Big(\sum_{j=1}^m \big[\log \lambda_j\big(k_3A^{-1}k_3'B\big)\big]^2\Big)^{1/2} \\ &\overset{\mathcal{L}}{=} \Big(\sum_{j=1}^m \big[\log \lambda_j\big(A^{-1}B\big)\big]^2\Big)^{1/2} \\ &= d_R(A,B), \end{aligned}$$

where the equality in distribution follows from the orthogonal invariance of $A$.

###### Remark 2.4.

Many metrics on the cone $P_m$ have been studied in the literature, cf. [24, 26], and it is possible to extend the results in this section to some of those metrics. We mention, in particular, the Thompson metric,

$$\begin{aligned} d_T(A,B) &:= \log\max\{\lambda_{\max}(A^{-1/2}BA^{-1/2}),\, \lambda_{\max}(B^{-1/2}AB^{-1/2})\} \\ &= \log\max\{\lambda_{\max}(A^{-1}B),\, 1/\lambda_{\min}(A^{-1}B)\}, \end{aligned}$$

$A, B \in P_m$, where $\lambda_{\max}(C)$ and $\lambda_{\min}(C)$ denote the largest and smallest eigenvalues, respectively, of $C \in P_m$ [17]. It can be shown that Proposition 2.3 holds for $d_T$, and also that

$$d_T(A,B) \le d_R(A,B) \le \sqrt{m}\, d_T(A,B),$$

so that each probabilistic inequality obtained in Section 3 for $d_R$ is equivalent to a corresponding inequality for $d_T$.
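The sandwich between $d_T$ and $d_R$ is simply the comparison between the sup-norm and the Euclidean norm of the vector $(\log\lambda_j(A^{-1}B))_j$, and is easy to confirm numerically; the sketch below (ours, illustrative only) does so for a random pair of matrices.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 5

def random_spd(rng, m):
    q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return (q * rng.uniform(0.2, 5.0, m)) @ q.T

def log_eigs(a, b):
    # The vector (log λ_j(a^{-1} b))_j, computed via a^{-1/2} b a^{-1/2}.
    w, v = np.linalg.eigh(a)
    s = (v / np.sqrt(w)) @ v.T
    return np.log(np.linalg.eigvalsh(s @ b @ s))

def d_R(a, b):
    return np.sqrt(np.sum(log_eigs(a, b) ** 2))   # Euclidean norm

def d_T(a, b):
    # Thompson metric: log max{λ_max(a^{-1}b), 1/λ_min(a^{-1}b)} (sup-norm)
    le = log_eigs(a, b)
    return max(le.max(), -le.min())

a, b = random_spd(rng, m), random_spd(rng, m)
dt, dr = d_T(a, b), d_R(a, b)
assert dt <= dr + 1e-10 and dr <= np.sqrt(m) * dt + 1e-10
```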

## 3 Hoffmann-Jørgensen inequalities on Pm

Let $\{X_j, j \ge 1\}$ be a sequence of mutually independent, and orthogonally invariant, random matrices in $P_m$. Define the partial products, $S_j = X_1 \circ X_2 \circ \cdots \circ X_j$, $j \ge 1$; equivalently,

$$S_1 = X_1, \qquad S_{j+1} = S_j \circ X_{j+1} \equiv S_j^{1/2}X_{j+1}S_j^{1/2}, \quad j \ge 1. \tag{3.1}$$

As we noted before, the binary operation $\circ$ is not commutative; however, by Lemma 2.2, $S_j \circ X_{j+1} \overset{\mathcal{L}}{=} X_{j+1} \circ S_j$ because the random matrices $S_j$ and $X_{j+1}$ are independent and orthogonally invariant. Therefore, we may apply the results derived by Khare and Rajaratnam [19] (cf. [20]) for commutative semigroups.

We now apply [19, Theorem A] with the underlying semigroup taken to be the collection of orthogonally invariant random matrices on $P_m$, endowed with the binary operation $\circ$ given in (2.1) and with the Riemannian metric $d_R$. For $n \ge 1$ define

$$M_n = \max\{d_R(I_m, X_1), \ldots, d_R(I_m, X_n)\} \tag{3.2}$$

and

$$U_n = \max\{d_R(I_m, S_1), \ldots, d_R(I_m, S_n)\} \tag{3.3}$$

and let $\delta_{ij}$ denote Kronecker’s delta. We also use the notation $\chi(\cdot)$, which equals $1$ if the enclosed statement is valid and $0$ if it is not.
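To make the quantities in (3.1)-(3.3) concrete, here is a small NumPy simulation (ours, illustrative only) of a walk with Wishart steps. We use the classical construction $X = GG'$, with $G$ an $m \times \mathrm{df}$ matrix of i.i.d. standard normals, which we assume corresponds to index parameter $a = \mathrm{df}/2$ in the parameterization of Section 4.

```python
import numpy as np

rng = np.random.default_rng(5)
m, a, n = 3, 2.0, 6   # dimension, (assumed) Wishart index, number of steps

def spd_sqrt(x):
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

def d_R(x, y):
    w, v = np.linalg.eigh(x)
    s = (v / np.sqrt(w)) @ v.T
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(s @ y @ s)) ** 2))

def wishart(rng, m, df):
    # Classical Wishart sample G G'; assumed to realize W_m(df/2, I_m).
    g = rng.standard_normal((m, df))
    return g @ g.T

I = np.eye(m)
xs = [wishart(rng, m, int(2 * a)) for _ in range(n)]

# Partial products (3.1): S_1 = X_1, S_{j+1} = S_j^{1/2} X_{j+1} S_j^{1/2}.
ss = [xs[0]]
for x in xs[1:]:
    r = spd_sqrt(ss[-1])
    ss.append(r @ x @ r)

M = [max(d_R(I, x) for x in xs[:j + 1]) for j in range(n)]   # (3.2)
U = [max(d_R(I, s) for s in ss[:j + 1]) for j in range(n)]   # (3.3)

assert all(np.isfinite(M)) and all(np.isfinite(U))
assert all(M[j] <= M[j + 1] for j in range(n - 1))  # M_n nondecreasing in n
assert all(U[j] <= U[j + 1] for j in range(n - 1))  # U_n nondecreasing in n
```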

###### Theorem 3.1.

([19, Theorem A])   Let $X_1, \ldots, X_n$ be independent, identically distributed (i.i.d.), orthogonally invariant random matrices in $P_m$. For $t_0, t_1, \ldots, t_l \ge 0$ and integers $n_1, \ldots, n_l \ge 1$, let $n_{\bullet} = n_1 + \cdots + n_l$, and define

$$I_0 = \Big\{1 \le i \le l : \big[P(U_n \le t_i)\big]^{n_i - \delta_{i1}} \le \frac{1}{n_i!}\Big\}. \tag{3.4}$$

If $n_{\bullet} \le n$ then

$$P\Big(U_n > (n_{\bullet}-1)t_0 + (2n_1-1)t_1 + 2\sum_{j=2}^{l} n_jt_j\Big) \le P(M_n > t_0) + \big[P(U_n \le t_1)\big]^{\chi(1 \notin I_0)} \cdot \prod_{j \in I_0}\big[P(U_n > t_j)\big]^{n_j} \cdot \prod_{j \notin I_0}\frac{1}{n_j!}\Big(\frac{P(U_n > t_j)}{P(U_n \le t_j)}\Big)^{n_j}. \tag{3.5}$$

Further, let $Y_{(1)} \le \cdots \le Y_{(n)}$ be the order statistics of $Y_j = d_R(I_m, X_j)$, $j = 1, \ldots, n$. Then the inequality (3.5) can be strengthened by replacing $P(M_n > t_0)$ by

$$P\Big(\sum_{j=n-n_{\bullet}+2}^{n} Y_{(j)} > (n_{\bullet}-1)t_0\Big).$$
###### Remark 3.2.

Note that (3.5) can also be written as

$$P\Big(U_n > (n_{\bullet}-1)t_0 + (2n_1-1)t_1 + 2\sum_{j=2}^{l} n_jt_j\Big) \le P(M_n > t_0) + \big[P(U_n \le t_1)\big]^{\chi(1 \notin I_0)} \cdot \frac{\prod_{j=1}^{l}\big[P(U_n > t_j)\big]^{n_j}}{\prod_{j \notin I_0} n_j!\,\big[P(U_n \le t_j)\big]^{n_j}}. \tag{3.6}$$
###### Example 3.3.

Similar to remarks in [19, p. 4104], if in Theorem 3.1 we set $l = 1$, $n_1 = 2$, and $t_1 = t$ then, by (3.4), $1 \in I_0$ if and only if $P(U_n \le t) \le \tfrac{1}{2}$; in either case, (3.5) leads to the inequality

$$P(U_n > t_0 + 3t) \le P(M_n > t_0) + \big[P(U_n > t)\big]^2,$$

which is analogous to classical Hoffmann-Jørgensen inequalities [19, p. 4101].
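The inequality in this example can be explored by Monte Carlo. The following sketch (ours, illustrative only; the dimension, degrees of freedom, thresholds, and seed are arbitrary choices) estimates both sides by simulating many short Wishart walks; the empirical frequencies only approximate the probabilities in the theorem.

```python
import numpy as np

rng = np.random.default_rng(6)
m, df, n, reps = 2, 3, 4, 2000   # dimension, Wishart df, walk length, replicates
t0, t = 3.0, 3.0                 # thresholds in the inequality

def spd_sqrt(x):
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

def d_R(x, y):
    w, v = np.linalg.eigh(x)
    s = (v / np.sqrt(w)) @ v.T
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(s @ y @ s)) ** 2))

I = np.eye(m)
cnt_lhs = cnt_m = cnt_u = 0
for _ in range(reps):
    s = None
    Mn = Un = 0.0
    for _ in range(n):
        g = rng.standard_normal((m, df))
        x = g @ g.T                       # one Wishart step
        if s is None:
            s = x                         # S_1 = X_1
        else:
            r = spd_sqrt(s)
            s = r @ x @ r                 # S_{j+1} = S_j^{1/2} X S_j^{1/2}
        Mn = max(Mn, d_R(I, x))
        Un = max(Un, d_R(I, s))
    cnt_lhs += Un > t0 + 3 * t
    cnt_m += Mn > t0
    cnt_u += Un > t

p_lhs = cnt_lhs / reps
p_rhs = cnt_m / reps + (cnt_u / reps) ** 2
print(p_lhs, "<=", p_rhs)
```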

We prove in Appendix A that, under a suitable integrability condition on the $X_j$, certain sequences derived from the random walk are martingales or submartingales. However, these martingale properties lead only to Markov’s inequalities for $M_n$ and $U_n$, and in any case a martingale approach cannot lead to explicit results for specific choices of the distribution of the $X_j$. Therefore we develop in Section 4 an approach that produces detailed inequalities for $M_n$ and $U_n$ in the Wishart case.

In the following result we adopt the convention that an empty product of matrices is the identity matrix.

###### Proposition 3.4.

Let $\{X_j, j \ge 1\}$ be a sequence of i.i.d., orthogonally invariant random matrices in $P_m$. Then for all $n \ge 1$,

$$S_{n+1} \overset{\mathcal{L}}{=} X_{n+1} \circ S_n \overset{\mathcal{L}}{=} X_{n+1}^{1/2}X_n^{1/2}\cdots X_2^{1/2}X_1X_2^{1/2}\cdots X_n^{1/2}X_{n+1}^{1/2}. \tag{3.7}$$

Proof.  First, it follows from Lemma 2.2 and induction on $n$ that $S_n$ is orthogonally invariant.

By Lemmas 2.1 and 2.2, the operation $\circ$ applied to independent, orthogonally invariant random matrices is commutative and associative in distribution. Since

$$\begin{aligned} S_{n+1} &= S_n \circ X_{n+1} \\ &= (S_{n-1} \circ X_n) \circ X_{n+1} \\ &= ((S_{n-2} \circ X_{n-1}) \circ X_n) \circ X_{n+1} \\ &\;\;\vdots \\ &= ((((X_1 \circ X_2) \circ X_3) \circ \cdots \circ X_{n-1}) \circ X_n) \circ X_{n+1}, \end{aligned}$$

then it follows that $S_{n+1} \overset{\mathcal{L}}{=} X_{n+1} \circ (X_1 \circ X_2 \circ \cdots \circ X_n)$. Therefore, $S_{n+1} \overset{\mathcal{L}}{=} X_{n+1} \circ S_n$. This proves the first equality (in distribution) in (3.7).

To prove the second equality in distribution in (3.7), we will use induction on $n$. Since $X_1$ and $X_2$ are i.i.d. and $S_1 = X_1$ then

$$S_2 = S_1 \circ X_2 = X_1 \circ X_2 \overset{\mathcal{L}}{=} X_2 \circ X_1 = X_2 \circ S_1 = X_2^{1/2}X_1X_2^{1/2}.$$

This establishes the second part of (3.7) for $n = 1$.

For the induction step, suppose that (3.7) holds for $n = j-1$, where $j \ge 2$. By (3.1) and the first equality (in distribution) in (3.7),

$$\begin{aligned} S_{j+1} &= S_j \circ X_{j+1} \overset{\mathcal{L}}{=} X_{j+1} \circ S_j \\ &\overset{\mathcal{L}}{=} X_{j+1} \circ \big(X_j^{1/2}X_{j-1}^{1/2}\cdots X_2^{1/2}X_1X_2^{1/2}\cdots X_{j-1}^{1/2}X_j^{1/2}\big) \\ &\overset{\mathcal{L}}{=} X_{j+1}^{1/2}X_j^{1/2}X_{j-1}^{1/2}\cdots X_2^{1/2}X_1X_2^{1/2}\cdots X_{j-1}^{1/2}X_j^{1/2}X_{j+1}^{1/2}. \end{aligned}$$

Therefore (3.7) holds for $n = j$. This completes the proof by induction.
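The two sides of (3.7) agree only in distribution, not pathwise, but their determinants do agree pathwise: both equal $\det X_1 \cdots \det X_{n+1}$. This gives a cheap consistency check of the recursion (3.1) against the symmetric product form (our sketch, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 3, 5

def spd_sqrt(x):
    w, v = np.linalg.eigh(x)
    return (v * np.sqrt(w)) @ v.T

xs = []
for _ in range(n):
    g = rng.standard_normal((m, m))
    xs.append(g @ g.T + 0.5 * np.eye(m))

# Recursion (3.1): S_{j+1} = S_j^{1/2} X_{j+1} S_j^{1/2}.
s = xs[0]
for x in xs[1:]:
    r = spd_sqrt(s)
    s = r @ x @ r

# Symmetric product: X_n^{1/2} ··· X_2^{1/2} X_1 X_2^{1/2} ··· X_n^{1/2}.
t = xs[0]
for x in xs[1:]:
    r = spd_sqrt(x)
    t = r @ t @ r

# Determinants agree pathwise with each other and with ∏ det X_j.
assert np.isclose(np.linalg.det(s), np.linalg.det(t), rtol=1e-8)
assert np.isclose(np.linalg.det(s),
                  np.prod([np.linalg.det(x) for x in xs]), rtol=1e-8)
```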

## 4 The random walk with Wishart matrices

In order for the inequalities (3.5) and (3.6) to be applied in practice, it will be necessary to obtain upper bounds for the terms $P(M_n > t_0)$, $P(U_n > t_j)$, and $[P(U_n \le t_j)]^{-1}$ appearing on the right-hand sides of those inequalities. In this section we obtain, for the case in which the random matrices $X_j$ are Wishart-distributed, upper bounds on each of those terms.

For $a > \tfrac{1}{2}(m-1)$, the multivariate gamma function is defined as

$$\Gamma_m(a) = \pi^{m(m-1)/4}\prod_{j=1}^{m}\Gamma\big(a - \tfrac{1}{2}(j-1)\big).$$
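This product formula is straightforward to implement, and can be cross-checked against SciPy's log multivariate gamma function (our sketch, illustrative only):

```python
import math
from scipy.special import multigammaln

def log_multigamma(a, m):
    # log Γ_m(a) = (m(m-1)/4) log π + Σ_{j=1}^m log Γ(a - (j-1)/2)
    return (m * (m - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - (j - 1) / 2.0) for j in range(1, m + 1))

# Agreement with scipy.special.multigammaln for several (a, m), a > (m-1)/2.
for a, m in [(2.0, 2), (3.7, 4), (10.0, 5)]:
    assert abs(log_multigamma(a, m) - multigammaln(a, m)) < 1e-10
```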

For $a > \tfrac{1}{2}(m-1)$, let $dx$ denote Lebesgue measure on $P_m$. A random matrix $X \in P_m$ is said to have the Wishart distribution, denoted $X \sim W_m(a, I_m)$, if the probability density function of $X$ relative to Lebesgue measure on the cone $P_m$ is

$$w(x) = \frac{1}{2^{ma}\Gamma_m(a)}(\det x)^{a - \frac{1}{2}(m+1)}\exp\big(-\tfrac{1}{2}\operatorname{tr} x\big), \qquad x \in P_m. \tag{4.1}$$
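The density (4.1) can be evaluated directly and compared with SciPy's Wishart density, under the correspondence (an assumption of ours, which the reader should verify) that $W_m(a, I_m)$ in this parameterization matches SciPy's convention with degrees of freedom $\mathrm{df} = 2a$ and scale $I_m$:

```python
import numpy as np
from scipy.special import multigammaln
from scipy.stats import wishart

m, a = 3, 2.5   # requires a > (m-1)/2

def w_density(x, a, m):
    # Density (4.1): (det x)^{a-(m+1)/2} exp(-tr x / 2) / (2^{ma} Γ_m(a)).
    log_c = m * a * np.log(2.0) + multigammaln(a, m)
    _, logdet = np.linalg.slogdet(x)
    return np.exp((a - (m + 1) / 2) * logdet - np.trace(x) / 2 - log_c)

rng = np.random.default_rng(7)
g = rng.standard_normal((m, m))
x = g @ g.T + np.eye(m)   # a positive definite evaluation point

# Assumed correspondence: W_m(a, I_m) <-> df = 2a, scale = I_m in SciPy.
ref = wishart(df=2 * a, scale=np.eye(m)).pdf(x)
assert np.isclose(w_density(x, a, m), ref, rtol=1e-8)
```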

Throughout this section, we suppose that $X_1, \ldots, X_n$ is a random sample from the Wishart distribution $W_m(a, I_m)$. Note that once we have obtained an upper bound $P(U_n > t) \le b$, with $b < 1$, then it follows that

$$\frac{1}{P(U_n \le t)} = \frac{1}{1 - P(U_n > t)} \le \frac{1}{1 - b};$$

so we only need upper bounds for probabilities of the form $P(M_n > t)$ and $P(U_n > t)$.

### 4.1 A bound on the distribution function of Mn

Define the normalizing constant

$$c_m = \frac{\pi^{m^2/2}}{2^{ma}\,m!\,\Gamma_m(a)\,\Gamma_m(m/2)}. \tag{4.2}$$

For $0 \le u < v$ and positive integers $i$ and $j$, define

$$\begin{aligned} F_{a+j}(u,v) &= \int_u^v t^{a+j-\frac{1}{2}(m+1)}\exp(-t/2)\,dt, \\ F_{a+i,a+j}(u,v) &= \int_u^v t^{a+j-\frac{1}{2}(m+1)}\exp(-t/2)\,F_{a+i}(u,t)\,dt, \end{aligned}$$

and

$$r_{ij}(u,v) = F_{a+i-1,a+j-1}(u,v) - F_{a+j-1,a+i-1}(u,v).$$
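These iterated integrals are easy to evaluate with adaptive quadrature. The sketch below (ours, illustrative only; the values of $a$, $m$, $u$, $v$ are arbitrary) implements $F$, the double integral, and $r_{ij}$, and checks two structural facts that follow directly from the definitions: $r_{ij} = -r_{ji}$ (so the diagonal entries vanish), and $F_{a+i,a+j} + F_{a+j,a+i} = F_{a+i}\,F_{a+j}$, since the two ordered double integrals tile the full square $[u,v]^2$.

```python
import math
from scipy.integrate import quad

a, m = 3.0, 3
u, v = 0.5, 6.0

def F(c, u, v):
    # F_c(u, v) = ∫_u^v t^{c - (m+1)/2} e^{-t/2} dt
    return quad(lambda t: t ** (c - (m + 1) / 2) * math.exp(-t / 2), u, v)[0]

def FF(ci, cj, u, v):
    # F_{ci,cj}(u, v) = ∫_u^v t^{cj - (m+1)/2} e^{-t/2} F_ci(u, t) dt
    return quad(lambda t: t ** (cj - (m + 1) / 2) * math.exp(-t / 2)
                * F(ci, u, t), u, v)[0]

def r(i, j):
    return (FF(a + i - 1, a + j - 1, u, v)
            - FF(a + j - 1, a + i - 1, u, v))

assert abs(r(1, 1)) < 1e-12              # diagonal entries vanish
assert abs(r(1, 2) + r(2, 1)) < 1e-10    # antisymmetry r_ij = -r_ji
# The two ordered integrals tile the square: FF(i,j) + FF(j,i) = F(i) F(j).
assert abs(FF(a, a + 1, u, v) + FF(a + 1, a, u, v)
           - F(a, u, v) * F(a + 1, u, v)) < 1e-5
```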

Define the $m \times m$ matrix with $(i,j)$th entry $r_{ij}(u,v)$; also, for $1 \le l \le m$, define the principal submatrices