1. Introduction and overview
A multivariate normal distribution is determined by its covariance matrix $\Sigma$ and its mean vector $\mu$. So for a fixed $n$, the family $\mathcal{N}$ of $n$-variate normal distributions is a differentiable manifold which can be identified with the product of the space $\mathrm{Pos}(n,\mathbb{R})$ of positive definite symmetric $n\times n$-matrices by the vector space $\mathbb{R}^n$. For various statistical purposes, it is desirable to have a measure of distance between the elements of $\mathcal{N}$. Such a distance measure is provided by the Fisher metric on $\mathcal{N}$, which is a Riemannian metric that appears naturally in a certain statistical framework. We briefly review some properties of the Fisher metric on the normal distributions in Section 2.
Computing the distances on $\mathcal{N}$, however, turns out to be a non-trivial task. Even though explicit forms for the geodesics of the Fisher metric on $\mathcal{N}$ are known (due to Calvo and Oller), these only yield explicit formulas for the distance in particular cases. So Lovrič, Min-Oo and Ruh proposed the use of a different metric in which distances are easier to compute. They map $\mathcal{N}$ diffeomorphically onto the Riemannian symmetric space $\mathrm{Pos}_1(n+1,\mathbb{R})$. This map is not an isometry between the Fisher metric and the metric of the symmetric space, which we call the Killing metric, but nevertheless, the two metrics are quite similar in appearance. So it is reasonable to ask how different they really are.
In Section 3 we describe the geometry of $\mathcal{N}$ as a Riemannian homogeneous but non-symmetric space with the Fisher metric. In Theorem A we show that $\mathcal{N}$ is a bundle whose base is the cone of symmetric positive definite $n\times n$-matrices and whose fiber is $\mathbb{R}^n$. This also gives rise to two pointwise mutually orthogonal foliations, one with leaves isometric to $\mathrm{Pos}(n,\mathbb{R})$, the other with leaves isometric to $\mathbb{R}^n$.
To make a case for using the Killing metric as a sensible approximation for the Fisher metric, we compare the geometry of the Fisher metric and the geometry of the Killing metric in Section 4. We find that the Levi-Civita connection for the Fisher metric on the leaves is affinely equivalent to the Levi-Civita connection of the Killing metric. So unparameterized geodesics in these leaves are the same for the two metrics. In Theorem B, we show that Killing geodesics orthogonal to a leaf at some point are asymptotically geodesic in the Fisher metric, that is, their defect from being a Fisher geodesic tends to zero as their curve parameter tends to infinity. So we find that for two important classes of unparameterized geodesics, the Killing geodesics approximate or are identical to the corresponding Fisher geodesics. Though this is not an exhaustive comparison, it provides some justification for treating the Killing metric, which is easier to compute, as a good approximation of the Fisher metric.
Notations and conventions
Throughout, we will assume matrices to be real-valued. For a matrix $A$, we let $A^\top$ denote its transpose. We also write
. The identity matrix is denoted by $I$ or $I_n$. By $E_{ij}$ we denote the elementary matrix whose entry in row $i$, column $j$ is $1$, and all other entries are $0$. Its symmetrization is . The canonical basis vectors of $\mathbb{R}^n$ are denoted by $e_1,\ldots,e_n$.
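As usual, $\mathrm{GL}(n,\mathbb{R})$, $\mathrm{SL}(n,\mathbb{R})$, $\mathrm{O}(n)$ and $\mathrm{SO}(n)$ denote the general linear, special linear, and (special) orthogonal groups, respectively. The subgroup of $\mathrm{GL}(n,\mathbb{R})$ of matrices with positive determinant is denoted by $\mathrm{GL}^+(n,\mathbb{R})$. The affine group is the semidirect product
\[ \mathrm{Aff}(n,\mathbb{R}) = \mathrm{GL}(n,\mathbb{R}) \ltimes \mathbb{R}^n, \]
where the semidirect product is given by $(A,b)(A',b') = (AA',\, b + Ab')$ for $(A,b), (A',b') \in \mathrm{Aff}(n,\mathbb{R})$. We also write .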
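The semidirect product rule can be checked numerically. The following is a minimal sketch, assuming the standard convention $(A,b)(A',b') = (AA', b + Ab')$, under which a pair $(A,b)$ acts on $\mathbb{R}^n$ as the affine map $x \mapsto Ax + b$. The helper names `compose` and `apply_affine` are illustrative and do not come from the text.

```python
import numpy as np

def compose(g, h):
    """Product (A, b)(A', b') = (A A', b + A b') of two affine pairs."""
    (A, b), (A2, b2) = g, h
    return A @ A2, b + A @ b2

def apply_affine(g, x):
    """Apply the affine map x -> A x + b encoded by the pair g = (A, b)."""
    A, b = g
    return A @ x + b

rng = np.random.default_rng(0)
n = 3
g = (rng.normal(size=(n, n)), rng.normal(size=n))
h = (rng.normal(size=(n, n)), rng.normal(size=n))
x = rng.normal(size=n)

# The product is defined so that it matches composition of affine maps.
lhs = apply_affine(compose(g, h), x)
rhs = apply_affine(g, apply_affine(h, x))
print(np.allclose(lhs, rhs))
```

The identity holds for arbitrary square matrices; invertibility of $A$ is only needed for the pairs to form a group.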
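By $\mathrm{Sym}(n,\mathbb{R})$ we denote the set of symmetric $n\times n$-matrices,
\[ \mathrm{Sym}(n,\mathbb{R}) = \{ S \in \mathbb{R}^{n\times n} \mid S = S^\top \}. \]
We write for the corresponding subspaces of elements with trace $0$. The subset of diagonal matrices in $\mathrm{Sym}(n,\mathbb{R})$ is denoted by .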
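The set of positive definite symmetric matrices in $\mathrm{Sym}(n,\mathbb{R})$ is denoted by $\mathrm{Pos}(n,\mathbb{R})$,
\[ \mathrm{Pos}(n,\mathbb{R}) = \{ S \in \mathrm{Sym}(n,\mathbb{R}) \mid S > 0 \}. \]
Its subset of unimodular elements is
\[ \mathrm{Pos}_1(n,\mathbb{R}) = \{ S \in \mathrm{Pos}(n,\mathbb{R}) \mid \det(S) = 1 \}. \]
Recall that $\mathrm{Pos}(n,\mathbb{R}) = \mathrm{GL}(n,\mathbb{R})/\mathrm{O}(n)$ and $\mathrm{Pos}_1(n,\mathbb{R}) = \mathrm{SL}(n,\mathbb{R})/\mathrm{SO}(n)$.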
2. Some background on information geometry
In this section we briefly review the concepts from information geometry that we use in the following. We mainly follow the presentation of Amari and Nagaoka [1].
2.1. The Fisher metric and dual connections
Information geometry provides a framework to study a class of probability distributions $p(x;\theta)$ defined on a sample space $\Omega$ and determined by finitely many parameters $\theta = (\theta^1,\ldots,\theta^n)$, where we assume for simplicity that $p$ depends smoothly on $x$ and $\theta$. For example, the set of univariate normal distributions is parametrized by the mean $\mu$ and the variance $\sigma^2$.
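In general, the set $S$ of admissible values for $\theta$ can be viewed as an $n$-dimensional differentiable manifold, and we can define a positive semidefinite bilinear tensor $g$ on $S$ via
\[ g_{ij}(\theta) = \int \frac{\partial \log p(x;\theta)}{\partial \theta^i}\, \frac{\partial \log p(x;\theta)}{\partial \theta^j}\, p(x;\theta)\, dx. \]
In the following we assume that $g$ is positive definite everywhere, so that $(S,g)$ is a Riemannian manifold. Then $g$ is called the Fisher metric on $S$, and $(S,g)$ is called a statistical manifold.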
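As an illustration of this definition, the Fisher metric of the univariate normal family can be approximated numerically. The sketch below assumes the coordinates (mean, variance) and compares a quadrature of the expected product of score functions with the well-known closed form $\mathrm{diag}(1/\sigma^2,\, 1/(2\sigma^4))$. The function names, step size and grid parameters are illustrative choices, not part of the text.

```python
import numpy as np

def log_density(x, mu, var):
    """Log-density of the univariate normal with mean mu and variance var."""
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mu) ** 2 / (2.0 * var)

def fisher_matrix(mu, var, h=1e-5, half_width=12.0, num=200001):
    """Approximate g_ij = E[d_i log p * d_j log p] in the coordinates (mu, var)."""
    sd = np.sqrt(var)
    x = np.linspace(mu - half_width * sd, mu + half_width * sd, num)
    dx = x[1] - x[0]
    p = np.exp(log_density(x, mu, var))
    # Score functions via central finite differences in each parameter.
    score_mu = (log_density(x, mu + h, var) - log_density(x, mu - h, var)) / (2 * h)
    score_var = (log_density(x, mu, var + h) - log_density(x, mu, var - h)) / (2 * h)
    scores = np.stack([score_mu, score_var])
    # Riemann sum for the expectation of the outer product of the scores.
    return np.einsum("ik,jk,k->ij", scores, scores, p) * dx

mu, var = 0.3, 1.7
g_numeric = fisher_matrix(mu, var)
g_closed = np.diag([1.0 / var, 1.0 / (2.0 * var**2)])
print(g_numeric)
print(g_closed)
```

The off-diagonal entries vanish, so in these coordinates the mean and variance directions are orthogonal for the Fisher metric.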
In addition to the Fisher metric, there are two particular torsion-free affine connections defined on $S$, denoted by $\nabla^{(e)}$ and $\nabla^{(m)}$. These connections are dual to each other with respect to $g$, which means that for all vector fields $X, Y, Z$ on $S$,
\[ X\, g(Y,Z) = g(\nabla^{(e)}_X Y, Z) + g(Y, \nabla^{(m)}_X Z). \]
Moreover, the affine combination
\[ \tfrac{1}{2}\nabla^{(e)} + \tfrac{1}{2}\nabla^{(m)} \]
yields the Levi-Civita connection of the Fisher metric $g$.
The letters “e” and “m” stand for “exponential” and “mixture”, respectively, referring to two families of probability distributions in which these connections appear naturally. More generally, there is a whole family of affine connections $\nabla^{(\alpha)}$ with $\alpha\in\mathbb{R}$ associated to $g$, and $\nabla^{(1)} = \nabla^{(e)}$, $\nabla^{(-1)} = \nabla^{(m)}$. However, we are not concerned with values $\alpha \neq \pm 1$ here.
2.2. Exponential families
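An exponential family is a statistical manifold $S$ that consists of probability distributions of the form
\[ p(x;\theta) = \exp\Bigl( C(x) + \sum_{i=1}^n \theta^i F_i(x) - \psi(\theta) \Bigr) \]
for given functions $C, F_1,\ldots,F_n$ and $\psi$. The normalization of $p$ implies
\[ \psi(\theta) = \log \int \exp\Bigl( C(x) + \sum_{i=1}^n \theta^i F_i(x) \Bigr)\, dx. \]
The connections $\nabla^{(e)}$ and $\nabla^{(m)}$ are distinguished on an exponential family (see Amari and Nagaoka [1, Sections 2.3 and 3.3]).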
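Theorem 2.1. Let $S$ be an exponential family. Then $\nabla^{(e)}$ and $\nabla^{(m)}$ are flat torsion-free affine connections on $S$.
In fact, the $\theta^i$ form a flat coordinate system for $\nabla^{(e)}$ in the sense that $\nabla^{(e)}_{\partial_i}\partial_j = 0$, $i,j = 1,\ldots,n$, for the coordinate vector fields $\partial_i = \partial/\partial\theta^i$. The flat coordinate system $\eta_1,\ldots,\eta_n$ for $\nabla^{(m)}$ is obtained via a Legendre transform of $\psi$,
\[ \eta_i = \frac{\partial \psi}{\partial \theta^i}. \]
In the flat $\theta$-coordinates, the Fisher metric for an exponential family is given as the Hessian metric of $\psi$, that is,
\[ g_{ij} = \frac{\partial^2 \psi}{\partial\theta^i\, \partial\theta^j}. \]
We call $\psi$ the potential of the Fisher metric. The dual potential is given by $\varphi = \sum_i \theta^i \eta_i - \psi$, and in the flat $\eta$-coordinates, the inverse $(g^{ij})$ of $(g_{ij})$ is given as a Hessian metric
\[ g^{ij} = \frac{\partial^2 \varphi}{\partial\eta_i\, \partial\eta_j}. \]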
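The Legendre transform and the Hessian form of the metric can be made concrete for the univariate normal family. The sketch below assumes one common convention, namely $C = 0$, $F_1(x) = x$, $F_2(x) = x^2$, so that $\theta^1 = \mu/\sigma^2$, $\theta^2 = -1/(2\sigma^2)$ and $\psi(\theta) = -(\theta^1)^2/(4\theta^2) + \tfrac12\log(-\pi/\theta^2)$. Other sign or scaling conventions are possible, and the helper names are hypothetical. Under this convention the gradient of $\psi$ should return the expectations $(\mu,\, \mu^2+\sigma^2)$, and the Hessian of $\psi$ should be the covariance matrix of $(x, x^2)$.

```python
import numpy as np

def psi(theta):
    """Potential of the univariate normal family in natural coordinates."""
    t1, t2 = theta
    return -t1**2 / (4.0 * t2) + 0.5 * np.log(-np.pi / t2)

def gradient(f, theta, h=1e-5):
    """Central finite-difference gradient of f at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * h)
    return grad

def hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of f at theta."""
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

mu, var = 0.5, 2.0
theta = np.array([mu / var, -1.0 / (2.0 * var)])

# Legendre transform: eta_i = d psi / d theta^i.
eta_numeric = gradient(psi, theta)
eta_closed = np.array([mu, mu**2 + var])

# Hessian metric: g_ij = d^2 psi / (d theta^i d theta^j) = Cov(F_i, F_j).
hess_numeric = hessian(psi, theta)
cov_closed = np.array([[var, 2 * mu * var],
                       [2 * mu * var, 4 * mu**2 * var + 2 * var**2]])
print(eta_numeric, eta_closed)
print(hess_numeric)
```

In the $\theta$-coordinates the metric is no longer diagonal, which reflects the fact that the natural parameters mix mean and variance.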
Another important property of exponential families is the following (see Amari and Nagaoka [1, Theorem 2.5]).
Theorem 2.2. A submanifold $M$ of an exponential family $S$ is totally geodesic in $S$ with respect to $\nabla^{(e)}$ if and only if $M$ is an exponential family itself.
2.3. Normal distributions
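The most important exponential family is formed by the normal distributions. An $n$-variate normal distribution is determined by its covariance matrix $\Sigma$ and its mean $\mu$ by the following formula
\[ p(x;\Sigma,\mu) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp\Bigl( -\tfrac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu) \Bigr), \]
so the manifold we are considering is the space $\mathcal{N} = \mathrm{Pos}(n,\mathbb{R}) \times \mathbb{R}^n$. The flat coordinates for the connection $\nabla^{(e)}$ are , where
and the flat coordinates for the connection $\nabla^{(m)}$ are , where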
The potential in these coordinate systems is (compare (2.3))
3. Geometry of the family of normal distributions
In this section we take a closer look at the information geometry of the manifold $\mathcal{N}$. Note that $\mathcal{N} = \mathrm{Pos}(n,\mathbb{R}) \times \mathbb{R}^n$ as a product of manifolds.
3.1. Basic geometric properties of $\mathcal{N}$
If is the Fisher metric on , are two coordinate vector fields in the -directions, and are two coordinate vector fields in the -directions, then the metric tensor is
and the Levi-Civita connection is determined by
Note that the symmetry in these equations is due to the fact that we are looking at coordinate vector fields.
If and are coordinate vector fields in the - and -directions, respectively, then the curvature of the Fisher metric is determined by
We now consider the two foliations of into submanifolds of fixed or , respectively. For fixed , we will write
It follows from (3.1) that the two foliations determined by these submanifolds are orthogonal.
Recall that the second fundamental form of a submanifold of is the normal component of in for two vector fields tangent to . We let denote the coordinate vector field in direction , and we let denote the coordinate vector field in direction . We denote by the set enumerating the coordinates of and by the set enumerating the coordinates of , and set . When we refer to an index , it may mean either a single index from or an index pair from . Then the Christoffel symbols for the Levi-Civita connection are denoted by with .
Proposition 3.1. For any $\mu \in \mathbb{R}^n$ and with respect to the Fisher metric of $\mathcal{N}$, the submanifold $\mathrm{Pos}(n,\mathbb{R}) \times \{\mu\}$ is totally geodesic.
By (3.2), is tangent to for all . An arbitrary tangent vector field to can be written as , with . Then
This last expression is the induced covariant derivative on the submanifold , since the - and -directions are orthogonal everywhere. Hence the second fundamental form of vanishes, which means is totally geodesic. ∎
For any and with respect to the Fisher metric of , the submanifold is parallel. Also, the second fundamental form of satisfies
for all .
The second fundamental form of is given by
Denote by and the normal and induced connection for , respectively. By (3.2), is a flat connection on . Then the covariant derivative of is given by ()
where the last identity holds since is flat and come from affine coordinates. Hence, we have for all
In this expression, and due to equation in (3.2). These computations imply that , in other words that is parallel.
On the other hand, to compute we use (3.2),
where we have used the identification of basis vectors with their corresponding partial differential operators. ∎
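By the previous result, the submanifold $\{\Sigma\} \times \mathbb{R}^n$ is not totally geodesic. Hence $\mathcal{N}$ is not the Riemannian product of $\mathrm{Pos}(n,\mathbb{R})$ and $\mathbb{R}^n$, even though the two foliations are mutually orthogonal.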
3.2. $\mathcal{N}$ as a homogeneous space
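It is well-known that the affine group $\mathrm{Aff}(n,\mathbb{R})$ acts transitively on $\mathcal{N}$ by
\[ (A,b)\cdot(\Sigma,\mu) = (A\Sigma A^\top,\, A\mu + b), \tag{3.4} \]
where $(A,b) \in \mathrm{Aff}(n,\mathbb{R})$, $\Sigma \in \mathrm{Pos}(n,\mathbb{R})$, $\mu \in \mathbb{R}^n$. Furthermore, the action remains transitive when restricted to . The tangent space $T_{(\Sigma,\mu)}\mathcal{N}$ can be identified with the vector space $\mathrm{Sym}(n,\mathbb{R}) \oplus \mathbb{R}^n$. Given $(A,b) \in \mathrm{Aff}(n,\mathbb{R})$ and $(S,v) \in T_{(\Sigma,\mu)}\mathcal{N}$, the tangent action of $(A,b)$ is
\[ (A,b)\cdot(S,v) = (ASA^\top,\, Av). \]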
Thus we can identify
The affine group $\mathrm{Aff}(n,\mathbb{R})$ acts transitively and isometrically on $\mathcal{N}$ by (3.4). Moreover, the subgroup of $\mathrm{Aff}(n,\mathbb{R})$ consisting of the elements $(A,b)$ for which $A$ is lower triangular with positive diagonal entries acts simply transitively on $\mathcal{N}$.
This shows that the action is isometric.
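Note that $(A,b)\cdot(I,0) = (I,0)$ is equivalent to $AA^\top = I$, $b = 0$. So the stabilizer of $\mathrm{Aff}(n,\mathbb{R})$ at $(I,0)$ is $\mathrm{O}(n)$. From the Iwasawa decomposition it follows that the subgroup in the statement acts simply transitively. ∎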
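The isometric action and the simply transitive triangular subgroup can be illustrated numerically. The sketch below assumes the commonly used form of the Fisher metric on normal distributions, $g_{(\Sigma,\mu)}((S,v),(S,v)) = v^\top \Sigma^{-1} v + \tfrac12 \operatorname{tr}\bigl((\Sigma^{-1}S)^2\bigr)$, which may differ from the paper's (3.1) by normalization. It also assumes the action $(A,b)\cdot(\Sigma,\mu) = (A\Sigma A^\top, A\mu + b)$. All function names are illustrative.

```python
import numpy as np

def act(A, b, Sigma, mu):
    """Affine action on a point (Sigma, mu) of the manifold."""
    return A @ Sigma @ A.T, A @ mu + b

def tangent_act(A, S, v):
    """Induced action on a tangent vector (S, v)."""
    return A @ S @ A.T, A @ v

def fisher_norm_sq(Sigma, S, v):
    """Assumed Fisher norm: v^T Sigma^-1 v + 1/2 tr((Sigma^-1 S)^2)."""
    Sinv_S = np.linalg.solve(Sigma, S)
    return v @ np.linalg.solve(Sigma, v) + 0.5 * np.trace(Sinv_S @ Sinv_S)

rng = np.random.default_rng(1)
n = 3
M = rng.normal(size=(n, n))
Sigma = M @ M.T + n * np.eye(n)              # a positive definite matrix
mu = rng.normal(size=n)
B = rng.normal(size=(n, n))
S, v = (B + B.T) / 2, rng.normal(size=n)      # a tangent vector at (Sigma, mu)
A = rng.normal(size=(n, n)) + n * np.eye(n)   # a generic invertible matrix
b = rng.normal(size=n)

# Isometry: the norm of a tangent vector is unchanged by the action.
norm_before = fisher_norm_sq(Sigma, S, v)
Sigma2, mu2 = act(A, b, Sigma, mu)
S2, v2 = tangent_act(A, S, v)
norm_after = fisher_norm_sq(Sigma2, S2, v2)

# Simple transitivity: the Cholesky factor L (lower triangular, positive
# diagonal) together with the shift mu moves (I, 0) to (Sigma, mu).
L = np.linalg.cholesky(Sigma)
Sigma_back, mu_back = act(L, mu, np.eye(n), np.zeros(n))
print(norm_before, norm_after)
```

Uniqueness of the Cholesky factorization is what makes the triangular action free, in line with the Iwasawa decomposition argument in the proof.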
3.3. Geometry of
As a consequence of Proposition 3.1 and Theorem 2.2, the Fisher metric of the family of normal distributions with mean $\mu$ coincides with the restriction of the Fisher metric of $\mathcal{N}$ to $\mathrm{Pos}(n,\mathbb{R}) \times \{\mu\}$. Since all of these submanifolds are isometric, we may take $\mu = 0$ for convenience. In the following, we will make explicit how $\mathrm{Pos}(n,\mathbb{R}) \times \{0\}$ with its Fisher metric is isometric to a symmetric space with a suitably scaled Killing metric.
Consider the product of irreducible Riemannian symmetric spaces
where its Riemannian metric is the product of the metric , which is times the multiplication on , and the metric on given by . Let act on via
The -action on given above is by isometries.
The tangent action of at on is
Hence the action of is isometric. ∎
Now define a map
Note that for ,
So the map is -equivariant.
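Proposition 3.5. The Riemannian manifold $\mathrm{Pos}(n,\mathbb{R}) \times \{0\}$ with its Fisher metric is isometric to the product of the irreducible Riemannian symmetric spaces $\mathbb{R}$ and $\mathrm{Pos}_1(n,\mathbb{R})$. In particular, it is a Riemannian symmetric space.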
The map defined in (3.6) is the desired isometry. In fact, is -equivariant with respect to the isometric -actions on and , and since (where ), it is enough to show that is an isometry at . So let . The differential of at is
This shows that is an isometry and concludes the proof of the proposition. ∎
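Because the leaf of fixed mean is a symmetric space, distances inside it have a closed form. The sketch below assumes the normalization $\tfrac12 \operatorname{tr}\bigl((\Sigma^{-1} d\Sigma)^2\bigr)$ for the restricted Fisher metric, for which the distance between $\Sigma_1$ and $\Sigma_2$ within the leaf is $\bigl(\tfrac12 \sum_i \log^2 \lambda_i\bigr)^{1/2}$, where the $\lambda_i$ are the eigenvalues of $\Sigma_1^{-1}\Sigma_2$. The function name is hypothetical and the normalization may differ from the paper's by a constant factor.

```python
import numpy as np

def fisher_distance_fixed_mean(Sigma1, Sigma2):
    """Distance between two covariance matrices within a leaf of fixed mean."""
    L = np.linalg.cholesky(Sigma1)
    L_inv = np.linalg.inv(L)
    # L^-1 Sigma2 L^-T is symmetric and has the eigenvalues of Sigma1^-1 Sigma2.
    M = L_inv @ Sigma2 @ L_inv.T
    lam = np.linalg.eigvalsh(M)
    return np.sqrt(0.5 * np.sum(np.log(lam) ** 2))

n, t = 4, 0.7
# Along the ray Sigma = exp(t) I, all eigenvalues equal exp(t).
d_ray = fisher_distance_fixed_mean(np.eye(n), np.exp(t) * np.eye(n))

rng = np.random.default_rng(2)
X = rng.normal(size=(n, n))
Y = rng.normal(size=(n, n))
Sigma1 = X @ X.T + n * np.eye(n)
Sigma2 = Y @ Y.T + n * np.eye(n)
A = rng.normal(size=(n, n)) + n * np.eye(n)   # a generic invertible matrix

# The distance is invariant under the congruence Sigma -> A Sigma A^T.
d_before = fisher_distance_fixed_mean(Sigma1, Sigma2)
d_after = fisher_distance_fixed_mean(A @ Sigma1 @ A.T, A @ Sigma2 @ A.T)
print(d_ray, d_before, d_after)
```

The ray $\Sigma = e^t I$ corresponds to the $\mathbb{R}$-factor of the product, and its length grows linearly in $t$ with slope $\sqrt{n/2}$ under the assumed normalization.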
Let and let be a subgroup of such that . Let , denote the respective Lie algebras of , , and the Cartan involution. Since is a symmetric product by Proposition 3.5, and split as products and , , such that and are the symmetric Lie algebras associated to and , respectively (cf. Kobayashi & Nomizu [7, Section XI.5]). Since and is simple, by Helgason [6, Theorem V.4.1]. Hence
and clearly , so that . ∎
3.4. Bundle geometry and foliations on $\mathcal{N}$
Let $g$ denote the Fisher metric on $\mathcal{N}$. We can now describe the geometry of $\mathcal{N}$ in terms of Riemannian symmetric spaces.
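Theorem A. Consider the family $\mathcal{N}$ of $n$-variate normal distributions equipped with the Fisher metric $g$, given by (3.1). The following hold:
(1) $\mathcal{N}$ is a vector bundle
\[ \mathbb{R}^n \longrightarrow \mathcal{N} \longrightarrow \mathrm{Pos}(n,\mathbb{R}), \]
where the base $\mathrm{Pos}(n,\mathbb{R})$ is equipped with the Fisher metric and the fiber over $\Sigma$ is $\mathbb{R}^n$ with scalar product determined by $\Sigma^{-1}$.
(2) The base can be identified with the totally geodesic submanifold $\mathrm{Pos}(n,\mathbb{R}) \times \{\mu\}$ for any $\mu \in \mathbb{R}^n$, and it is isometric to a product of irreducible Riemannian symmetric spaces
\[ \mathbb{R} \times \mathrm{Pos}_1(n,\mathbb{R}), \]
with the metrics on the factors given in Proposition 3.5.
(3) The fiber over $\Sigma$ can be embedded as a parallel submanifold $\{\Sigma\} \times \mathbb{R}^n$ for any fixed $\Sigma$, and as such it is orthogonal at $(\Sigma,\mu)$ to the embedding of the base as $\mathrm{Pos}(n,\mathbb{R}) \times \{\mu\}$.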