The statistical analysis of probability measures on differentiable manifolds is a topic that has attracted much attention and developed considerably, owing to the wide range of applications that require the analysis of data which do not live in ordinary Euclidean spaces. There is great interest, for example, in processing datasets which lie in the space of symmetric positive definite matrices.
This space, which we denote by $\mathcal{P}_N$ for $N \times N$ matrices, can be equipped with a Riemannian metric, giving it the structure of a Riemannian homogeneous space of negative curvature. This Riemannian manifold has been thoroughly studied from many points of view, corresponding to different disciplines, such as harmonic analysis and number theory [1, 2], matrix analysis [3], and information geometry and multivariate statistics [4, 5].
This and other symmetric spaces have been studied endowed with metrics such as the Rao-Fisher metric or the log-Euclidean metric [6], in order to develop the tools to properly carry out statistical inference for data on such spaces. The landscape of applications is vast, as it includes, for example, medical imaging [7, 8], continuum mechanics [9], radar signal processing [10, 11] and computer vision [12, 13, 14, 15, 16], among many others.
In the present paper, the Riemannian metric will be the Rao-Fisher metric, also known as the affine-invariant metric, which is the subject of Section I-A. A specific probabilistic model to represent the statistical variability of data in non-Euclidean spaces, such as $\mathcal{P}_N$, has been developed by introducing the so-called Gaussian probability distributions on Riemannian symmetric spaces [17, 18, 19, 20].
The Riemannian Gaussian distribution depends on two parameters, $\bar{Y} \in \mathcal{P}_N$ and $\sigma > 0$. The form of its probability density function, generalising that of a Gaussian distribution on the Euclidean space $\mathbb{R}^N$, is given by

$$ p(Y \mid \bar{Y}, \sigma) = \frac{1}{Z(\sigma)}\, \exp\left( - \frac{d^{\,2}(Y, \bar{Y})}{2\sigma^2} \right), $$
where $d(Y, \bar{Y})$ is Rao's Riemannian distance, defined in (5) below, and the density is with respect to the Riemannian volume element of $\mathcal{P}_N$, denoted by $dv(Y)$. In comparison to a Gaussian distribution on $\mathbb{R}^N$, the normalising factor $Z(\sigma)$ plays the role of the factor $(2\pi\sigma^2)^{N/2}$.
These distributions also play a role in machine learning when the dataset is inherently structured [23, 24, 25].
In addition to the relationship between Riemannian Gaussian distributions and the concept of Riemannian centre of mass, and the potential for applications, discussed in [17, 18, 19, 20], we shall show here that one of the hallmarks of these distributions is their analytical tractability. In particular, we will show that the resulting distributions are of standard random matrix type, and that the analytical tools associated with the study of random matrices can be used effectively to further characterize the distributions [26, 27, 28].
More specifically, we shall use Pfaffian and determinant descriptions of the random matrix ensembles and the classical orthogonal polynomial method of random matrix theory [26, 27, 28]. We do this in a setting which includes skew-orthogonal polynomials, necessary to solve models which do not correspond to the case of Hermitian matrices, such as that of real symmetric matrices, which is central for statistical applications.
We now briefly summarize the derivation of the model for the case of real symmetric matrices, along the lines of [17, 18, 19, 20], to set up the context. Further details can be found in those references; from Section II onwards, we focus on the analytical characterization of the joint probability density functions using random matrix theory.
I-A The Rao-Fisher metric: distance, geodesics and volume
The Rao-Fisher metric on $\mathcal{P}_N$ is defined by the squared line element

$$ ds^2(Y) = \mathrm{tr}\left[ \left( Y^{-1}\, dY \right)^2 \right], $$

where $\mathrm{tr}$ denotes the matrix trace. If there is no ambiguity, this can also be written $ds^2$.
The length of a differentiable curve $\gamma : [0, 1] \to \mathcal{P}_N$ with $\gamma(0) = Y_1$ and $\gamma(1) = Y_2$ is

$$ L(\gamma) = \int_0^1 \sqrt{ \mathrm{tr}\left[ \left( \gamma^{-1}\, \dot{\gamma} \right)^2 \right] }\; dt, $$

where $\dot{\gamma} = d\gamma/dt$. Rao's distance $d(Y_1, Y_2)$ is the infimum of $L(\gamma)$ taken over all differentiable curves as above.
When equipped with the Rao-Fisher metric, the space $\mathcal{P}_N$ is a Riemannian manifold of negative sectional curvature [1, 2]. One implication of this property, since $\mathcal{P}_N$ is also complete and simply connected, is that the infimum of $L(\gamma)$ is realised by a unique curve $\gamma^*$, known as the geodesic connecting $Y_1$ and $Y_2$. The equation of this curve is

$$ \gamma^*(t) = Y_1^{1/2} \left( Y_1^{-1/2}\, Y_2\, Y_1^{-1/2} \right)^{t}\, Y_1^{1/2}, \qquad t \in [0, 1], $$

and the resulting closed form for Rao's distance is

$$ d(Y_1, Y_2) = \left( \mathrm{tr}\left[ \log^2\!\left( Y_1^{-1/2}\, Y_2\, Y_1^{-1/2} \right) \right] \right)^{1/2}. $$
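For concreteness, these closed-form expressions are straightforward to evaluate numerically. The following minimal Python sketch (with illustrative names of our own, not taken from the references) computes Rao's distance and the geodesic, and checks that the distance is additive along the geodesic.

import numpy as np
from scipy.linalg import logm, fractional_matrix_power

def rao_distance(Y1, Y2):
    # d(Y1, Y2) = Frobenius norm of log(Y1^{-1/2} Y2 Y1^{-1/2})
    A = fractional_matrix_power(Y1, -0.5)
    return np.linalg.norm(logm(A @ Y2 @ A), 'fro')

def geodesic(Y1, Y2, t):
    # gamma(t) = Y1^{1/2} (Y1^{-1/2} Y2 Y1^{-1/2})^t Y1^{1/2}
    S = fractional_matrix_power(Y1, 0.5)
    A = fractional_matrix_power(Y1, -0.5)
    return S @ fractional_matrix_power(A @ Y2 @ A, t) @ S

rng = np.random.default_rng(0)
B1, B2 = rng.standard_normal((2, 4, 4))
Y1, Y2 = B1 @ B1.T + 4 * np.eye(4), B2 @ B2.T + 4 * np.eye(4)
Ymid = geodesic(Y1, Y2, 0.5)
assert np.isclose(rao_distance(Y1, Ymid) + rao_distance(Ymid, Y2),
                  rao_distance(Y1, Y2))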
The Rao-Fisher metric measures length, and therefore provides a definition of volume. The Riemannian volume element associated to the Rao-Fisher metric is

$$ dv(Y) = \det(Y)^{-(N+1)/2} \prod_{i \le j} dY_{ij}, $$

where the indices $i \le j$ run over matrix entries.
In terms of the spectral decomposition $Y = U\, \mathrm{diag}(e^{r_1}, \ldots, e^{r_N})\, U^{\top}$, with $U$ orthogonal, the volume element factorises into a radial part, involving the factor $\prod_{i < j} 2\sinh\left( \frac{|r_i - r_j|}{2} \right)$, and an angular part $dU$, where $dU$ is the exterior product of the independent entries of the matrix of one-forms $U^{\top} dU$.
For a class function, the angular part can be integrated out (see [30, Page 71]).
Thus, for $\bar{Y} \in \mathcal{P}_N$ and $\sigma > 0$, let $p(Y \mid \bar{Y}, \sigma)$ be given by the density above. The normalization constant is then defined to be the integral

$$ Z(\sigma) = \int_{\mathcal{P}_N} \exp\left( - \frac{d^{\,2}(Y, \bar{Y})}{2\sigma^2} \right) dv(Y), $$

where $dv(Y)$ is the Riemannian volume element (6). For any $\bar{Y}$ and $\sigma$, the invariance of the distance and of the volume element allows one to set $\bar{Y} = I$, where $I$ is the identity matrix, so that $Z(\sigma)$ does not depend on $\bar{Y}$. Let $r_1, \ldots, r_N$ be the logarithms of the eigenvalues of $Y$, so that $d^{\,2}(Y, I) = \sum_i r_i^2$. Then, integrating out the angular degrees of freedom,

$$ Z(\sigma) \propto \int_{\mathbb{R}^N} \prod_{1 \le i < j \le N} 2\sinh\left( \frac{|r_i - r_j|}{2} \right) \prod_{i=1}^{N} e^{-r_i^2/(2\sigma^2)}\, dr_i, $$

with the overall constant defined in (9).
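As a rough numerical illustration (not the analytical route pursued below), the radial integral can be estimated by importance sampling on the Gaussian factor. The sketch below assumes the radial form just quoted, omits the overall constant fixed by (9), and the parameter beta anticipates the general weight of Section II (beta = 1 corresponds to the present case).

import numpy as np

def Z_radial_mc(N, beta, sigma, n_samples=200_000, seed=1):
    rng = np.random.default_rng(seed)
    # Draw the radial coordinates from the Gaussian factor N(0, sigma^2)...
    r = rng.normal(0.0, sigma, size=(n_samples, N))
    # ...and average the sinh interaction over the samples.
    i, j = np.triu_indices(N, k=1)
    vdm = np.prod(np.abs(2 * np.sinh((r[:, i] - r[:, j]) / 2)) ** beta, axis=1)
    return (sigma * np.sqrt(2 * np.pi)) ** N * vdm.mean()

print(Z_radial_mc(N=3, beta=1, sigma=0.5))   # grows quickly with sigma and N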
We show how to use random matrix tools, in particular the evaluation of Pfaffians, to compute $Z(\sigma)$ analytically for several values of $N$, and also in the large $N$ limit. This extends some of the results in [17, 18] and the other works discussing Gaussian distributions on Riemannian symmetric spaces, where the lowest-dimensional case is evaluated analytically and the rest is carried out numerically, with Monte Carlo methods. Before that, we note that the corresponding model for Hermitian matrices, also discussed in those references, which is characterized by an analogous expression but with the $\sinh$ term squared, is fully solvable with orthogonal polynomials. It has already been studied in the physics literature in detail, but we summarize and develop the results in the next Section, and we study the cases that are more difficult to treat analytically, such as (13), in Section III and in the Appendix. Finally, Section IV contains a discussion of these statistical distributions from the point of view of diffusion processes.
II Full solution of the case of Hermitian matrices
Let us define the function $\mathcal{Z}_N^{(\beta)}(\sigma)$ as

$$ \mathcal{Z}_N^{(\beta)}(\sigma) = C_{N, \beta} \int_{\mathbb{R}^N} \prod_{1 \le i < j \le N} \left| 2\sinh\left( \frac{x_i - x_j}{2} \right) \right|^{\beta} \prod_{i=1}^{N} e^{-x_i^2/(2\sigma^2)}\, dx_i, $$

that we call the partition function, adopting the language of random matrix theory. While the definition (14) makes sense for every $\beta > 0$, with a suitable choice of the constant $C_{N, \beta}$, we will be mainly interested in $\beta = 1, 2, 4$, corresponding to random matrix ensembles with orthogonal, unitary and symplectic symmetry, respectively. To avoid cluttering the formulas with overall numerical factors in what follows, we adopt a convenient normalization and fix $C_{N, \beta}$ accordingly, but we stress that other suitable choices exist, and the present discussion does not depend on such a choice. Note that the variance associated with the Gaussian weight under this normalization depends on $\beta$, taking different values for $\beta = 2$ and for $\beta = 1, 4$. For $\beta = 1$ in (14) we recover, up to the overall constant, the normalization integral $Z(\sigma)$ of Section I.
The crucial fact we want to exploit is that $\mathcal{Z}_N^{(2)}(\sigma)$ can be written in fully standard random matrix form, meaning in terms of a standard Vandermonde determinant. This was first done in the context of the study of vicious walkers (we detail further the connection with diffusion in the last Section) and, later on, it was also crucial for the study of certain gauge theories of topological type. One can, therefore, use all the power of the traditional random matrix tools, such as determinantal expressions and the method of orthogonal polynomials. Denoting $u_i = e^{x_i}$ and using

$$ 2\sinh\left( \frac{x_i - x_j}{2} \right) = e^{-(x_i + x_j)/2} \left( e^{x_i} - e^{x_j} \right), $$

we see that

$$ \mathcal{Z}_N^{(2)}(\sigma) = C_{N, 2} \int_{(0, \infty)^N} \prod_{1 \le i < j \le N} \left( u_i - u_j \right)^2 \prod_{i=1}^{N} u_i^{-N}\, e^{-\log^2 u_i/(2\sigma^2)}\, du_i. $$
Changing variables to complete the square, it follows that:

$$ \mathcal{Z}_N^{(2)}(\sigma) \propto \int_{(0, \infty)^N} \prod_{1 \le i < j \le N} \left( v_i - v_j \right)^2 \prod_{i=1}^{N} e^{-\log^2 v_i/(2\sigma^2)}\, dv_i, $$

where an explicitly known overall factor, depending on $\sigma$ and $N$ but not on the integration variables, has been absorbed. This is the partition function of the Stieltjes-Wigert random matrix model. Note that this model is of the standard random matrix form but with a weight function of log-normal type, $w(v) = e^{-\log^2 v/(2\sigma^2)}$. Random matrix theory gives formulas (for the case $\beta = 2$; the other cases are more complicated, as they require skew-orthogonal polynomials)
for any marginal of the joint probability density function of the eigenvalues, including the normalization constant, in terms of the polynomials orthogonal with respect to the log-normal weight above. These polynomials are the Stieltjes-Wigert polynomials, which are essentially a type of $q$-deformed Hermite polynomial.
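The change of variables leading to (19) can be checked pointwise. A short sketch, with illustrative names, verifying that the sinh interaction becomes an ordinary Vandermonde times the power-law factor that is then reabsorbed by completing the square:

import numpy as np

rng = np.random.default_rng(2)
N, beta, sigma = 4, 2, 0.7
x = rng.normal(size=N)
u = np.exp(x)

i, j = np.triu_indices(N, k=1)
lhs = np.prod(np.abs(2 * np.sinh((x[i] - x[j]) / 2)) ** beta)
# Each index appears in N-1 pairs, so collecting e^{-(x_i+x_j)/2} over all
# pairs gives each u_i the power -beta (N-1) / 2.
rhs = np.prod(np.abs(u[i] - u[j]) ** beta) * np.prod(u ** (-beta * (N - 1) / 2))
assert np.isclose(lhs, rhs)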
A Riemannian log-normal probability distribution was introduced in earlier work, but it is not of the random matrix type and is not comparable to (19), due to an extra factor in the Vandermonde part.
Another way to see this $q$-deformed structure at play is to define the $q$-parameter

$$ q = e^{-\sigma^2}, $$

and introduce the $q$-number

$$ [n]_q = \frac{q^{n/2} - q^{-n/2}}{q^{1/2} - q^{-1/2}}, $$

which reduces to an ordinary real number, $[n]_q \to n$, in the limit $q \to 1$. Then we have

$$ \mathcal{Z}_N^{(\beta)}(\sigma) \propto \left| q^{1/2} - q^{-1/2} \right|^{\frac{\beta N (N - 1)}{2}} \int_{\mathbb{R}^N} \prod_{1 \le i < j \le N} \left| [y_i - y_j]_q \right|^{\beta} \prod_{i=1}^{N} q^{y_i^2/2}\; dy_i, $$

where $y_i = x_i/\sigma^2$. This latter form shows that the partition function is, up to the explicitly known overall $\sigma$-dependent factor, the $q$-deformation of the Gaussian $\beta$-ensemble, for every $\beta$, with $q$ as in (20).
We study the partition function $\mathcal{Z}_N^{(\beta)}(\sigma)$. We discuss first the case $\beta = 2$ in the rest of the present Section. The other two cases, $\beta = 1$ and $\beta = 4$, are studied in Section III.
II-A Moments of the log-normal distribution and the Stieltjes-Wigert orthogonal polynomials
In what follows we exploit the rewriting (19), which allows us to directly apply standard results from random matrix theory. Define the inner product

$$ \langle f, g \rangle = \int_0^{\infty} f(x)\, g(x)\, e^{-\log^2 x/(2\sigma^2)}\, dx, $$

for analytic real functions $f$ and $g$. We keep the dependence on $\sigma$ implicit in the notation. The inner product of two monomials is

$$ \langle x^a, x^b \rangle = \sqrt{2\pi\sigma^2}\; e^{\frac{\sigma^2}{2} (a + b + 1)^2}. $$

(The same moments can be obtained working on the unit circle, if the weight function is instead Jacobi's third theta function, upon a formal replacement of variables. This leads to a unitary matrix model with analogous properties to the models discussed here; see [37, 38, 39, 40] for details.)
For the cases $\beta = 1$ and $\beta = 4$ we will need to define skew-symmetric products, but for $\beta = 2$, $\mathcal{Z}_N^{(2)}(\sigma)$ can be evaluated exactly, thanks to the Stieltjes-Wigert polynomials $P_n$, which are orthonormal with respect to the inner product defined in (23), that is,

$$ \langle P_n, P_m \rangle = \delta_{nm}. $$
This family of polynomials is given explicitly as a finite sum involving $q$-binomial coefficients, where the $q$-parameter is a fixed function of $\sigma$ (not to be confused with $q$ in (20)), and

$$ \begin{bmatrix} n \\ k \end{bmatrix}_q = \frac{(q; q)_n}{(q; q)_k\, (q; q)_{n-k}}, \qquad (q; q)_n = \prod_{j=1}^{n} \left( 1 - q^j \right), $$

is the $q$-binomial. Let us denote by $\kappa_n$ the leading coefficient, that is, $P_n(x) = \kappa_n x^n + \ldots$, and define the monic orthogonal polynomials $p_n = \kappa_n^{-1} P_n$, which satisfy the orthogonality relation

$$ \langle p_n, p_m \rangle = h_n\, \delta_{nm}, \qquad h_n = \kappa_n^{-2}. $$
In turn, the inner product of monomials in the determinant can be replaced by that of the monic polynomials $p_n$, so that the matrix of which we take the determinant becomes diagonal in this basis, and we obtain

$$ \mathcal{Z}_N^{(2)}(\sigma) \propto N! \prod_{k=0}^{N-1} h_k, $$

with $h_k$ defined in (27); the result can be expressed compactly in terms of the $q$-Gamma function

$$ \Gamma_q(x) = (1 - q)^{1 - x}\, \frac{(q; q)_{\infty}}{(q^x; q)_{\infty}}. $$
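As a sanity check of this determinantal evaluation, Andreief's identity states that the $N$-fold integral (19) equals $N!$ times the Hankel determinant of the moments (23). The sketch below (with our own shorthand $s = \sigma^2$, overall constants omitted) compares the Hankel determinant, built from the exact moments, against a direct Monte Carlo estimate.

import numpy as np
from math import factorial

s, N = 0.09, 3                         # s plays the role of sigma^2
m = lambda k: np.sqrt(2 * np.pi * s) * np.exp(s * (k + 1) ** 2 / 2)
hankel = np.array([[m(j + k) for k in range(N)] for j in range(N)])
exact = factorial(N) * np.linalg.det(hankel)

# Monte Carlo estimate of the same integral, sampling u_i = e^{y_i}, y ~ N(0, s).
rng = np.random.default_rng(3)
y = rng.normal(0.0, np.sqrt(s), size=(400_000, N))
u = np.exp(y)
i, j = np.triu_indices(N, k=1)
vdm2 = np.prod((u[:, i] - u[:, j]) ** 2, axis=1)
mc = (2 * np.pi * s) ** (N / 2) * np.mean(vdm2 * np.prod(u, axis=1))
print(exact, mc)                       # agreement within Monte Carlo error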
The dependence on $\sigma$ is better captured by $\log \mathcal{Z}_N^{(\beta)}(\sigma)$ rather than by $\mathcal{Z}_N^{(\beta)}(\sigma)$ itself. For this reason, while our discussion will focus on the latter, in the plots and numerical evaluations we will instead consider its logarithm.
II-B Eigenvalue density, $\beta = 2$
Let us introduce the eigenvalue density, which by definition is obtained by integrating the joint probability density over $N - 1$ of the eigenvalues:

$$ \rho_N(x_1) = \int_{\mathbb{R}^{N-1}} p(x_1, x_2, \ldots, x_N)\, dx_2 \cdots dx_N. $$
Clearly, from the invariance of the integral under permutation of the $x_i$'s, we can integrate over any $N - 1$ of the variables and obtain the same definition of $\rho_N$. Throughout this Subsection we consider $\beta = 2$. It is convenient to use the change of variables as in (19), and exploit the Stieltjes-Wigert polynomials. We can use the Christoffel-Darboux formula to rewrite the eigenvalue density of any random matrix ensemble as [27, Eq. (5.13)]

$$ \rho_N(x) = \frac{w(x)}{N\, h_{N-1}} \left[ p_N'(x)\, p_{N-1}(x) - p_{N-1}'(x)\, p_N(x) \right], $$

where $w$ is the weight function, $p_n$ are the monic orthogonal polynomials with respect to the weight $w$, and the prime means derivative with respect to $x$. In our case, $w$ is the log-normal weight and the $p_n$ are the monic Stieltjes-Wigert polynomials introduced in the previous Subsection, from which we find an explicit expression for the density in the Stieltjes-Wigert variables.
Undoing the change of variables and multiplying the last expression by the Gaussian prefactor coming from the weight function in the Christoffel-Darboux formula (35), we find that $\rho_N$ takes the form of a finite sum of Gaussian densities,

$$ \rho_N(x) = \sum_{k} c_k\; e^{-(x - x_k)^2/(2\sigma^2)}, $$

with the coefficients $c_k$ obtainable from the expressions above. The upshot is that $\rho_N$ is a sum of Gaussian distributions, centered at equally spaced points $x_k$, with spacing set by $\sigma^2$. It is also easy to check that the coefficients $c_k$ have alternating sign; thus the Gaussians centered at $x_k$ with odd $k$ interfere constructively with the other Gaussians at odd positions, and destructively with those at even positions. We therefore expect $\rho_N$ to be described by peaks with valleys among them. Moreover, the support of the eigenvalue density grows linearly with the product $N \sigma^2$. We plot the eigenvalue densities for various $N$ in Figure 3, and in the small and large $\sigma$ regimes in Figure 4.
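This structure is easy to reproduce numerically. The sketch below (again with the shorthand $s = \sigma^2$) builds the orthonormal polynomials for the log-normal weight from the exact moments, via a Cholesky factorization of the moment matrix, and evaluates the equivalent unsummed form of the Christoffel-Darboux kernel on the diagonal; the resulting density integrates to one.

import numpy as np

s, N = 0.05, 4                         # s plays the role of sigma^2
m = lambda k: np.sqrt(2 * np.pi * s) * np.exp(s * (k + 1) ** 2 / 2)
M = np.array([[m(j + k) for k in range(N)] for j in range(N)])  # moment matrix
C = np.linalg.inv(np.linalg.cholesky(M))  # rows: coefficients of orthonormal p_k

def rho(u):
    w = np.exp(-np.log(u) ** 2 / (2 * s))     # log-normal weight
    V = np.vander(u, N, increasing=True)      # monomials 1, u, ..., u^{N-1}
    P = V @ C.T                               # orthonormal p_0(u), ..., p_{N-1}(u)
    return w * np.sum(P ** 2, axis=1) / N

u = np.linspace(1e-3, 4.0, 2000)
print(np.sum(rho(u)) * (u[1] - u[0]))         # ~ 1: the density is normalized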
II-C Limits of $\mathcal{Z}_N^{(\beta)}(\sigma)$
Before delving into a more detailed analysis of $\mathcal{Z}_N^{(\beta)}(\sigma)$ for $\beta = 1, 4$, it is instructive to analyze various limits for generic $\beta$.
First, it is clear from (22) that sending $\sigma \to 0$ we recover the usual Gaussian $\beta$-ensemble, whose partition function is known exactly for every $\beta$, see [27, Eq. (1.160)]: Mehta's integral gives

$$ \frac{1}{(2\pi)^{N/2}} \int_{\mathbb{R}^N} \prod_{1 \le i < j \le N} \left| x_i - x_j \right|^{\beta} \prod_{i=1}^{N} e^{-x_i^2/2}\, dx_i = \prod_{j=1}^{N} \frac{\Gamma(1 + j\beta/2)}{\Gamma(1 + \beta/2)}. $$

In particular, with our normalization, the small $\sigma$ behaviour of $\mathcal{Z}_N^{(\beta)}(\sigma)$ follows from this formula for even $N$ (and a similar expression holds for odd $N$).
Another possible limit is the $\sigma \to \infty$ limit. We give the result taking first the large $\sigma$ limit and then approximating for large $N$. It is convenient to write $\mathcal{Z}_N^{(\beta)}(\sigma)$, up to its overall constant, as

$$ \mathcal{Z}_N^{(\beta)}(\sigma) \propto \int_{\mathbb{R}^N} \exp\left( - \frac{1}{2\sigma^2} \sum_{i=1}^{N} x_i^2 + \frac{\beta}{2} \sum_{1 \le i < j \le N} \left| x_i - x_j \right| \right) \prod_{i=1}^{N} dx_i, $$

where we have changed variables and approximated $\log\left| 2\sinh\left( \frac{x_i - x_j}{2} \right) \right| \approx \frac{|x_i - x_j|}{2}$ at large separations. Making the ansatz that $x_i$ grows as $\sigma^a$ for some $a > 0$, with a coefficient of order 1, we see that a saddle point exists in the large $\sigma$ limit if the two terms in the exponential in (42) are of the same order in $\sigma$. The first term goes as $\sigma^{2a - 2}$ and the second as $\sigma^{a}$, meaning that the large $\sigma$ limit requires $2a - 2 = a$, that is, $a = 2$. This implies that $\log \mathcal{Z}_N^{(\beta)}(\sigma)$ grows as $\sigma^2$ at leading order at large $\sigma$ and large $N$.
We can also foresee the form of the sub-leading correction. Indeed, it will come from integrating over fluctuations around the saddle point. The entries of the Hessian matrix from (42) are of order $1/\sigma^2$, and performing the Gaussian integrals and taking into account the coefficient we expect this sub-leading correction to be of order $\sigma^{N}$. Note, however, that (42) as it stands is not suitably written to study the saddle point equation. To get rid of the absolute value, we should restrict to the principal Weyl chamber $x_1 > x_2 > \cdots > x_N$ and look for a saddle point therein. We do not pursue this approach, and instead quote the result found in the literature, where a rigorous approach to this limit has been undertaken, based on work by Baxter. (The extension of Baxter's result to arbitrary $\beta$, as we need, is straightforward.) The statistics of extreme eigenvalues in this regime has been recently studied in [45, 46]. One gets:

$$ \log \mathcal{Z}_N^{(\beta)}(\sigma) = \frac{\beta^2 \sigma^2}{24}\, N \left( N^2 - 1 \right) + O(N \log \sigma), $$

up to normalization-dependent terms.
II-C3 Double-scaling limit
Another useful limit is the scaled large $N$ limit, with $N \to \infty$ and $\sigma \to 0$ keeping the product $N\sigma^2$ fixed. In this limit, the leading contribution to $\mathcal{Z}_N^{(\beta)}(\sigma)$ comes from the saddle point configuration, and one finds an explicit expression, (44), whose leading coefficient is an $N$-independent quantity, explicitly found by solving the saddle point equation. To obtain (44) we have used a convenient normalization, and the normalization (15) is recovered simply by shifting the variables in that case. In earlier work this scaled limit was studied and subdivided into three cases, according to the value of the fixed product being finite, infinite or zero. Below, in the last Section, we will discuss the relationship between the probability densities in [17, 18, 19] and diffusion processes, and how they appear in the study of Brownian motion on Weyl chambers of Lie groups.
III The cases of real symmetric and quaternion Hermitian matrices
According to the results in Subsection II-C, we can extract the behaviour of $\mathcal{Z}_N^{(1)}(\sigma)$ and $\mathcal{Z}_N^{(4)}(\sigma)$ in certain limits from the knowledge of $\mathcal{Z}_N^{(2)}(\sigma)$. Nevertheless, we can do more, and evaluate $\mathcal{Z}_N^{(1)}(\sigma)$ and $\mathcal{Z}_N^{(4)}(\sigma)$ exactly, using standard tools in random matrix theory.
III-A Skew-symmetric products and partition function
We discuss $\mathcal{Z}_N^{(1)}(\sigma)$, which is directly related to $Z(\sigma)$ through (16). In contrast to the case $\beta = 2$, the partition functions $\mathcal{Z}_N^{(1)}(\sigma)$ and $\mathcal{Z}_N^{(4)}(\sigma)$ do not correspond to determinants, but rather to Pfaffians. It is convenient to discuss separately the two cases with $N$ even or odd, starting with the former. Before that, we define the skew-symmetric products

$$ \langle f, g \rangle_1 = \int_0^{\infty}\!\! \int_0^{\infty} f(x)\, g(y)\, \mathrm{sgn}(y - x)\, w(x)\, w(y)\, dx\, dy, $$

where $\mathrm{sgn}$ is the sign function, and

$$ \langle f, g \rangle_4 = \int_0^{\infty} \left[ f(x)\, g'(x) - f'(x)\, g(x) \right] w(x)^2\, dx. $$

Note that the coefficient in the exponent of the weight function $w$ is as defined in (15).
Computing the skew-symmetric product of any pair of monomials gives

$$ \langle x^a, x^b \rangle_1 = 2\pi\sigma^2\; e^{\frac{\sigma^2}{2}\left[ (a + 1)^2 + (b + 1)^2 \right]}\; \mathrm{erf}\!\left( \frac{\sigma\, (b - a)}{2} \right), $$

where we have used the change of variables $x = e^{u}$, $y = e^{v}$. In (48), $\mathrm{erf}$ is the error function, and in the last line we have used the integration formula for the error function with a Gaussian weight. The error function is odd, $\mathrm{erf}(-z) = -\mathrm{erf}(z)$, whence both products are manifestly skew-symmetric.
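The closed form (48) can be checked against a direct numerical evaluation of the defining double integral; a sketch with illustrative parameter values, assuming the log-normal weight used throughout:

import numpy as np
from scipy import integrate
from scipy.special import erf

sigma, a, b = 0.6, 0, 1
w = lambda x: np.exp(-np.log(x) ** 2 / (2 * sigma ** 2))

num, _ = integrate.dblquad(
    lambda y, x: x ** a * y ** b * np.sign(y - x) * w(x) * w(y),
    1e-12, 30, 1e-12, 30)   # the weight is negligible beyond x ~ 30 here
closed = (2 * np.pi * sigma ** 2
          * np.exp(sigma ** 2 * ((a + 1) ** 2 + (b + 1) ** 2) / 2)
          * erf(sigma * (b - a) / 2))
print(num, closed)           # agree within quadrature error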
Assuming $N$ even, we use the de Bruijn identity to write

$$ \mathcal{Z}_N^{(1)}(\sigma) \propto \mathrm{Pf} \left[ \langle x^j, x^k \rangle_1 \right]_{j, k = 0}^{N - 1}, $$

while for $N$ odd a similar identity holds, with an extra row and column of plain moments appended to the skew-symmetric matrix. In both cases, we can expand the Pfaffian and obtain $\mathcal{Z}_N^{(1)}(\sigma)$, and hence $Z(\sigma)$, as a finite sum of terms, explicitly known from (48). The advantage of this approach is that the Pfaffians can be obtained explicitly for fixed $N$ with the aid of a computer algebra system. We provide numerical results at various $N$ and $\sigma$ in Tables I and II, computed in Mathematica using an efficient Pfaffian algorithm.
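Any Pfaffian routine will do for reproducing such numerics; the minimal Python sketch below uses the textbook expansion along the first row (adequate for small $N$; dedicated algorithms are more efficient) and checks it against the identity $\mathrm{Pf}(A)^2 = \det(A)$.

import numpy as np

def pfaffian(A):
    # Expansion along the first row:
    # Pf(A) = sum_{j >= 1} (-1)^{j-1} A[0, j] Pf(A with rows/cols 0, j removed).
    n = A.shape[0]
    if n == 0:
        return 1.0
    total = 0.0
    for j in range(1, n):
        rest = [k for k in range(n) if k not in (0, j)]
        total += (-1.0) ** (j - 1) * A[0, j] * pfaffian(A[np.ix_(rest, rest)])
    return total

rng = np.random.default_rng(4)
B = rng.standard_normal((6, 6))
A = B - B.T                      # random skew-symmetric matrix
assert np.isclose(pfaffian(A) ** 2, np.linalg.det(A))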
From $\mathcal{Z}_N^{(1)}(\sigma)$ we immediately obtain $Z(\sigma)$, which we show in Figure 5 as a function of $\sigma$ for various $N$, and in Figure 6 as a function of $N$ for various fixed values of $\sigma$. We also show the agreement of the numerical evaluations with the theoretical predictions of Subsection II-C for small and large $\sigma$ in Figure 7.
While the Pfaffian representation of $\mathcal{Z}_N^{(1)}(\sigma)$ is very useful for computational purposes, the simple example $N = 2$ can be easily solved by direct integration:

$$ \mathcal{Z}_2^{(1)}(\sigma) \propto \frac{1}{2} \int_0^{\infty}\!\! \int_0^{\infty} \left| v_1 - v_2 \right|\, e^{-\left( \log^2 v_1 + \log^2 v_2 \right)/(2\sigma^2)}\, dv_1\, dv_2. $$

We now observe that the integral of an error function with a Gaussian weight is again an error function and, after simplifications, we obtain

$$ \mathcal{Z}_2^{(1)}(\sigma) \propto 2\pi\sigma^2\; e^{\frac{5\sigma^2}{2}}\; \mathrm{erf}\!\left( \frac{\sigma}{2} \right). $$
This is a check of the result (49) for the smallest value of $N$. For higher values of $N$ a direct computation becomes more involved. However, from the Pfaffian representation, we infer that the result is always a finite sum of products of error functions, weighted by integer powers of $e^{\sigma^2/2}$. So, for example, de Bruijn's identity immediately gives, for $N = 4$,

$$ \mathcal{Z}_4^{(1)}(\sigma) \propto \langle x^0, x^1 \rangle_1 \langle x^2, x^3 \rangle_1 - \langle x^0, x^2 \rangle_1 \langle x^1, x^3 \rangle_1 + \langle x^0, x^3 \rangle_1 \langle x^1, x^2 \rangle_1, $$

with each skew product given explicitly by (48).
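Numerically, the $N = 2$ check is immediate: the two-fold integral above equals the single skew product $\langle x^0, x^1 \rangle_1$, in agreement with de Bruijn's identity. A sketch with the same illustrative conventions as before:

import numpy as np
from scipy import integrate
from scipy.special import erf

sigma = 0.6
w = lambda x: np.exp(-np.log(x) ** 2 / (2 * sigma ** 2))

lhs, _ = integrate.dblquad(lambda v2, v1: 0.5 * abs(v1 - v2) * w(v1) * w(v2),
                           1e-12, 30, 1e-12, 30)
rhs = 2 * np.pi * sigma ** 2 * np.exp(5 * sigma ** 2 / 2) * erf(sigma / 2)
print(lhs, rhs)                  # the two evaluations agree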