1 Introduction
Principal Component Analysis (PCA) is a widely used dimensionality reduction technique in Science and Engineering. PCA however requires the input data to lie in a vector space. With the advent of new technologies and wide spread use of sophisticated feature extraction methods, manifoldvalued data have become ubiquitous in many fields including Computer Vision, Medical Imaging and Machine Learning. A nonlinear version of PCA, called Principal Geodesic Analysis (PGA), for data lying on Riemannian manifolds was introduced in [8].
Since the objective function of PGA is highly nonlinear and hard to solve in general, researchers proposed a linearized version of PGA [8]. Though this linearized PGA, hereafter referred to as PGA, is computationally efficient, it lacks accuracy for data with large spread/variance. In order to solve the objective function exactly, some researchers proposed to solve the original objective function (not the approximation) and called it exact PGA [24]. While exact PGA attempts to solve this complex nonlinear optimization problem, it is however computationally inefficient. Though it is not possible to efficiently and accurately solve this optimization problem for a general manifold, however, for manifolds with constant sectional curvature, we formulate an efficient and exact PGA algorithm, dubbed CCMEPGA. It is well known in geometry, by virtue of KillingHopf theorem, [4] that any nonzero constant curvature manifold is isomorphic to either the hypersphere () or the hyperbolic space (), hence in this work, we present the CCMEPGA formulation for () and (). Our formulation has several applications to Computer Vision and Statistics including directional data [20] and color spaces [18]. Several other applications of hyperbolic geometry are, shape analysis [29], electrical Impedence Tomography, Geoscience imaging [27], brain morphometry [28], catadiaoptric vision [3] etc.
In order to depict the effectiveness of our proposed CCMEPGA, we use the average projection error as defined in [24]. We also report the computational time comparison of the CCMEPGA with the PGA [8] and the exact PGA [24]. Several variants of PGA exist in literature and we briefly mention a few here. In [22], authors computed the principal geodesics without approximation only for a special Lie group, . Geodesic PCA (GPCA) [13, 12] solves a different optimization function namely, optimizing the projection error along geodesics. The authors in [12] minimize the projection error instead of maximizing variance in geodesic subspaces (defined later in the paper). GPCA does not use a linear approximation, but it is restricted to manifolds where a closed form expression for the geodesics exists. More recently, a probabilistic version of PGA called PPGA was presented in [30], which is a nonlinear version of PPCA [26]. None of these methods attempt to compute the solution to the exact PGA problem defined in [24].
The rest of the paper is organized as follows. In Section 2, we present the formulation of PGA. We also discuss the details of the linearized version of PGA [8] and exact PGA [24]. Our formulation of CCMEPGA is presented in Section 2. We present experimental results of CCMEPGA algorithm along with comparisons to exact PGA and PGA in Section 3. In addition to synthetic data experiments, we present the comparative performance of CCMEPGA on two real data applications. In Section 4, we present the formulation for reconstruction of data from principal directions and components in this nonlinear setting. Finally, in section 5, we draw conclusions.
2 Principal Geodesic Analysis
Principal Component Analysis (PCA) [16] is a well known and widely used statistical method for dimensionality reduction. Given a vector valued dataset, it returns a sequence of linear subspaces that maximize the variance of the projected data. The subspace is spanned by the principal vectors which are mutually orthogonal. PCA is well suited for vectorvalued data sets but not for manifoldvalued inputs. A decade ago, the nonlinear version called the Principal Geodesic Analysis (PGA) was developed to cope with manifoldvalued inputs [8]. In this section, first, we briefly describe this PGA algorithm, then, we show the key modification performed in [24] to arrive at what they termed as the exact PGA algorithm. We then motivate and present our approach which leads to an efficient and novel algorithm for exact PGA on constant curvature manifolds (CCMEPGA).
Let be a Riemannian manifold. Let’s suppose we are given a dataset, , where . Let us assume that the finite sample Fréchet mean [9] of the data set exists and be denoted by . Let be the space spanned by mutually orthogonal vectors (principal directions) , . Let be the geodesic subspace of , i.e., , where is the Riemannian exponential map (see [4] for definition). Then, the principal directions, are defined recursively by
(1)  
(2) 
where is the geodesic distance between and , is the point in closest to .
The PGA algorithm on is summarized in Alg. 1.
2.1 PGA and exact PGA
In Alg. 1 (lines ), as the projection operator is hard to compute, hence a common alternative is to locally linearize the manifold. This approach [8] maps all data points on the tangent space at , and as tangent plane is a vector space, one can use PCA to get the principal directions. This simple scheme is an approximation of the PGA. This motivated the researchers to ask the following question: Is it possible to do PGA, i.e., solve Eq. (1) without any linearization? The answer is yes. But, computation of the projection operator, , i.e., the closest point of in is computationally expensive. In [24], Sommer et al. give an alternative definition of PGA, i.e., they minimize the average squared reconstruction error, i.e., instead of in eqns. (1). They use an optimization scheme to compute this projection. Further, they termed their algorithm, exact PGA, as it does not require any linearization. However, their optimization scheme is in general computationally expensive and for a data set with large variance, convergence is not guaranteed. Hence, for large variance data, their exact PGA is still an approximation as it might not converge. This motivates us to formulate an accurate and computationally efficient exact PGA.
2.2 Efficient and accurate exact PGA
In this paper, we present an analytic expression for the projected point and design an effective way to project data points on to the codimension submanifold (as in 1, line ). An analytic expression is in general not possible to derive for arbitrary Riemannian manifolds. However, for constant curvature Riemannian manifolds, i.e., (positive constant curvature) and (negative constant curvature), we derive an analytic expression for the projected point and devise an efficient algorithm to project data points on to a codimension submanifold. Both these manifolds are quite commonly encountered in Computer Vision [10, 19, 3, 28, 29] as well as many other fields of Science and Engineering. The former more so than the latter. Eventhough, there are applications that can be naturally posed in hyperbolic spaces (e.g., color spaces in Vision [18], catadiaoptric images [3] etc.), their full potential has not yet been exploited in Computer Vision research as much as in the former case.
We first present some background material for the dimensional spherical and hyperbolic manifolds and then derive an analytical expression of the projected point.
2.2.1 Basic Riemannian Geometry of

Geodesic distance: The geodesic distance between is given by, .

Exponential Map: Given a vector , the Riemannian Exponential map on is defined as . The Exponential map gives the point which is located on the great circle along the direction defined by the tangent vector at a distance from .

Inverse Exponential Map: The tangent vector directed from to is given by, where, . Note that, the inverse exponential map is defined everywhere on the hypersphere except at the antipodal points.
2.2.2 Basic Riemannian Geometry of
The hyperbolic dimensional manifold can be embedded in using three different models. In this discussion, we have used the hyperboloid model [14]. In this model, is defined as , where the inner product on , denoted by is defined as .

Geodesic distance: The geodesic distance between is given by, .

Exponential Map: Given a vector , the Riemannian Exponential map on is defined as, .

Inverse Exponential Map: The tangent vector directed from to is given by where, .
For the rest of this paper, we consider the underlying manifold, , as a constant curvature Riemannian manifold, i.e., is diffeomorphic to either or , where [4]. Let , . Further, let be defined as the projection of on the geodesic submanifold defined by and . Now, we will derive a closed form expression for in the case of and .
2.3 Analytic expression for on
Theorem 1.
Let and . Then the projection of on the geodesic submanifold defined by and , i.e., is given by:
(3) 
Proof.
Consider the spherical triangle shown in Fig. 1, , where . Let, , and . Also, let , , . Clearly, since is the projected point , . So,
(4) 
Here, denotes the Euclidean inner product, where both and are viewed as points in , i.e., the ambient space. Note that, , as . From spherical trigonometry, we know that .
(5) 
Hence, using the Exponential map, we can show that is given by,
(6) 
∎
Analogously, we can derive the formula for on , .
Theorem 2.
Let and . Then the projection of on the geodesic submanifold defined by and , i.e., is given by:
(7) 
where,
.
Proof.
As before, consider the hyperbolic triangle shown in, , where . Let, , and . Also, let , , . Clearly, since is the projected point , . Then, is the angle between and . Hence,
(8) 
Then, from hyperbolic trigonometry, as , we get
(9) 
Note that, as the arc length between and is , hence, using Exponential map, we can show that .
∎
Given the closed form expression of the projected point, now we are in a position to devise an efficient projection algorithm (for line in Alg. 1), which is given in Alg. 2. Note that, using Alg. 2, data points on the current submanifold are projected to a submanifold of dimension one less, which is needed by the PGA algorithm in line 6. Also note that, in order to ensure existence and uniqueness of FM, we have restricted the data points to be within a geodesic ball of convexity radius [1].
Note that in order to descend to the codimension1 submanifolds, we use step1 and step2 instead of the optimization method used in the exact PGA algorithm of [24].
3 Experimental Results
In this section, we present experiments demonstrating the performance of CCMEPGA compared to PGA [8] and exact PGA [24]. We used the average projection error, defined in [24], as a measure of performance in our experiments. The average projection error is defined as follows. Let be data points on a manifold . Let be the mean of the data points. Let, be the first principal direction and . Then the error is defined as follows:
(10) 
where is the geodesic distance function on . We also present the computation time for each of the three algorithms. All the experimental results reported here were obtained on a desktop with a single 3.33 GHz Inteli7 CPU with 24 GB RAM.
3.1 Comparative performance of CCMEPGA on Synthetic data
In this section, we present the comparitive performance of CCMEPGA on several synthetic datasets. For each of the synthetic data, we have reported the average projection error and computation time for all three PGA algorithms in Table 1. All the four datasets are on and the Fréchet mean is at the ”north pole”. For all the datasets, samples are in the northern hemisphere to ensure that the Fréchet mean is unique. Data_1 and Data_2 are generated by taking samples along a geodesic with a slight perturbation. The last two datasets are constructed by drawing random samples on the northern hemisphere. In addition, data points from Data_1 are depicted in Fig. 2. The first principal direction is also shown (black for CCMEPGA, blue for PGA and red for exact PGA). Further, we also report the data variance for these synthetic datasets. By examining the results, it’s evident that for data with low variance, the significance of CCMEPGA in terms of projection error is marginal, while for high variance data, CCMEPGA yields significantly better accuracy. Also, CCMEPGA is computationally very fast in comparison to exact PGA. The results in Table 1 indicate that CCMEPGA outperforms exact PGA in terms of efficiency and accuracy. Although, the required time for PGA is less than that of CCMEPGA, in terms of accuracy, CCMEPGA dominates PGA.
Data  Var.  CCMEPGA  PGA  exact PGA  

avg. proj. err.  Time(s)  avg. proj. err.  Time(s)  avg. proj. err.  Time(s)  
Data_1  2.16  0.70  0.174  2.54e02  14853  
Data_2  0.95  0.27  0.59  0.59  84.38  
Data_3  7.1e03  0.19  0.55  0.55  16.87  
Data_4  5.9e02  0.33  0.37  0.37  71.84 
3.2 Comparative performance on pointset data ( example)
In this section, we depict the performance of the proposed CCMEPGA on 2D pointset data. The database is called GatorBait100 dataset [21]. This dataset consists of images of shapes of different fishes. From each of these images of size , we first extract boundary points, then we apply the Schrödinger distance transform [7] to map each of these point sets on a hypersphere (). Hence, this data consists of pointsets each of which lie on . As before, we have used the average projection error [24], to measure the performance of algorithms in the comparisons. Additionally, we report the computation time for each of these PGA algorithms. We used the code available online for exact PGA [23]. This online implementation is not scalable to large (even moderate) number of data points, and furhter requires the computation of the Hessian matrix in the optimization step, which is computationally expensive. Hence, for this real data application on the high dimensional hypersphere, we could not report the results for the exact PGA algorithm. Though one can use a Sparse matrix version of the exact PGA code, along with efficient parallelization to make the exact PGA algorithm suitable for moderately large data, we would like to point out that since our algorithm does not need such modifications, it clearly gives CCMEPGA an advantage over exact PGA from a computational efficiency perspective. In terms of accuracy, it can be clearly seen that CCMEPGA outperforms exact PGA from the results on synthetic datasets. Both average projection error and computational time on GatorBait100 dataset are reported in Table 2. This result demonstrates accuracy of CCMEPGA over the PGA algorithm with a marginal sacrifice in efficiency but significant gains in accuracy.
Method  avg. proj. error  Time(s) 

CCMEPGA  
PGA 
3.3 PGA on population of gaussian distributions ( example)
In this section, we propose a novel scheme to compute principal geodesic submanifolds for the manifold of gaussian densities. Here, we use concepts from information geometry presented in [2], specifically, the Fisher information matrix [17] to define a metric on this manifold [6]. Consider a normal density in an dimensional space, with parameters represented by . Then the entry of the Fisher matrix, denoted by , is defined as follows:
(11) 
For example, for a univariate normal density , the fisher information matrix is
(12) 
So, the metric is defined as follows:
(13) 
where
is the Fisher information matrix. Now, consider the parameter space for the univariate normal distributions. The parameter space is
, i.e., positive half space, which is the Hyperbolic space, modeled by the Poincaré halfplane, denoted by . We can define a bijection as . Hence, the univariate normal distributions can be parametrized by the dimensional hyperbolic space. Moreover, there exists a diffeomorphsim between and (the mapping is analogous to stereographic projection for ), thus, we can readily use the formulation in Section 2 to compute principal geodesic submanifold on the manifold of univariate normal distributions.Motivated by the above formulation, we ask the following question: Does there exist a similar relation for multivariate normal distributions? The answer is no in general. But if the multivariate distributions have diagonal covariance matrix, (i.e., independent uncorrelated variables in the mutivariate case), the above relation between and can be generalized. Consider an dimensional normal distribution parametrized by where and is a diagonal positive definite matrix (i.e., , if , else ). Then, analogous to the univariate normal distribution case, we can define a bijection as follows:
(14) 
Hence, we can use our formulation in Section 2 since there is a diffeomorphism between and . But, for general nondiagonal dimensional covariance matrix space, , the above formulation does not hold. This motivated us to go one step further to search for a parametrization of where we can use the above formulation. In [15], authors have used the Iwasawa coordinates to parametrize . Using the Iwasawa coordinates [25], we can get a onetoone mapping between and the product manifold of and , where is manifold of dimensional diagonal positive definite matrix and is the space of dimensional upper triangular matrices, which is isomorphic to . We have used the formulation in [25], as discussed below.
Let , then we can use Iwasawa decomposition to represent as a tuple . And repeating the following partial Iwasawa decomposition:
(15) 
where and . We get the following vectorized expression: . Note that as each of is , we can construct a positive definite diagonal matrix with diagonal entry being . And as each is in , we will arrange them columnwise to form a upper triangular matrix. Thus, for , we can use our formulation for the hyperboloid model of the hyperbolic space given in Section 2, and the standard PCA can be applied for .
We now use the above formulation to compute the principal geodesic submanifolds for a covariance descriptor representation of Brodatz texture dataset [5]. Similar to our previous experiment on pointset data, in this experiment, we report the average projection error and computation time. We adopt a similar procedure as in [11] to derive the covariance descriptors for the texture images in the Brodatz database. Each texture image is first partitioned into nonoverlapping blocks. Then, for each block, the covariance matrix () is summed over the blocks. Here, the covariance matrix is computed from the feature vector . We make the covariance matrix positive definite by adding a small positive diagonal matrix. Then, each image is represented as a normal distribution with zero mean and this computed covariance matrix. Then, we used the above formulation to map each normal distribution on to . The comparative results of CCMEPGA with PGA and exact PGA are presented in Table 3. The results clearly indicate efficiency and accuracy of CCMEPGA compared to PGA and exact PGA algorithms.
Method  avg. proj. error  Time(s) 

CCMEPGA  
PGA  
exact PGA 
4 Data Reconstruction from principal directions and coefficients
In this section, we discuss a recursive scheme to approximate an original data point with principal directions and coefficients. We discuss the reconstruction for data on , the reconstruction for data points on can be done in an analogous manner. Let be the data point and the principal vector. is the principal component of . Note that on , principal component of a data point is . Let be the approximated from the first principal components. Let be . Let be the parallel transported vector from to . Let, be the parallel transported version of to . We refer readers to Fig. 3 for a geometric interpretation.
Now, we will formulate a recursive scheme to reconstruct . Let us reconstruct the data using the first principal components. Then, the approximated point is the intersection of two geodesics defined by , and , . Let these two great circles be denoted by
(16)  
(17) 
At and , let . Since, and are mutually orthogonal, we get,
(18) 
Note that, as our goal is to solve for or to get , we need two equations. The second equation can be derived as follows:
This leads to,
(19) 
Then, by solving Eqs. (18) and (19) we get,
(20) 
where,
and
By solving the equation (20), we get
(21) 
where, is the signum function. Hence, . This completes the reconstruction algorithm. Our future efforts will be focused on using this reconstruction algorithm in a variety of applications mentioned earlier.
5 Conclusions
In this paper, we presented an efficient and accurate exactPGA algorithm for (nonzero) constant curvature manifolds, namely the hypersphere and the hyperbolic space . We presented analytic expressions for the shortest distance between the data point and the geodesic submanifolds, which is required in the PGA algorithm and in general involves solving a difficult optimization problem. Using these analytic expressions, we achieve a much more accurate and efficient solution for PGA on constant curvature manifolds, that are frequently encountered in Computer Vision, Medical Imaging and Machine Learning tasks. We presented comparison results on synthetic and real data sets demonstrating favorable performance of our algorithm in comparison to the stateoftheart.
References
 [1] B. Afsari. Riemannian L^p center of mass: Existence, uniqueness, and convexity. Proceedings of the American Mathematical Society, 139(2):655–673, 2011.
 [2] S.i. Amari and H. Nagaoka. Methods of information geometry, volume 191. American Mathematical Soc., 2007.
 [3] I. Bogdanova, X. Bresson, J.P. Thiran, and P. Vandergheynst. Scale space analysis and active contours for omnidirectional images. Image Processing, IEEE Transactions on, 16(7):1888–1901, 2007.
 [4] W. M. Boothby. An introduction to differentiable manifolds and Riemannian geometry, volume 120. Academic press, 1986.
 [5] P. Brodatz. Textures: a photographic album for artists and designers. Dover Pubns, 1966.
 [6] S. I. Costa, S. A. Santos, and J. E. Strapasson. Fisher information distance: a geometrical reading. Discrete Applied Mathematics, 2014.

[7]
Y. Deng, A. Rangarajan, S. Eisenschenk, and B. C. Vemuri.
A riemannian framework for matching point clouds represented by the
schrödinger distance transform.
In
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
, pages 3756–3761. IEEE, 2014.  [8] P. T. Fletcher, C. Lu, S. M. Pizer, and S. Joshi. Principal geodesic analysis for the study of nonlinear statistics of shape. Medical Imaging, IEEE Transactions on, 23(8):995–1005, 2004.
 [9] M. Fréchet. Les éléments aléatoires de nature quelconque dans un espace distancié. In Annales de l’institut Henri Poincaré, volume 10, pages 215–310. Presses universitaires de France, 1948.
 [10] R. Hartley, J. Trumpf, Y. Dai, and H. Li. Rotation averaging. International journal of computer vision, 103(3):267–305, 2013.
 [11] J. Ho, Y. Xie, and B. Vemuri. On a nonlinear generalization of sparse coding and dictionary learning. In Proceedings of The 30th International Conference on Machine Learning, pages 1480–1488, 2013.
 [12] S. Huckemann, T. Hotz, and A. Munk. Intrinsic shape analysis: Geodesic pca for riemannian manifolds modulo isometric lie group actions. Statistica Sinica, 2010.

[13]
S. Huckemann and H. Ziezold.
Principal component analysis for riemannian manifolds, with an
application to triangular shape spaces.
Advances in Applied Probability
, pages 299–319, 2006.  [14] B. Iversen. Hyperbolic geometry, volume 25. Cambridge University Press, 1992.
 [15] B. Jian and B. C. Vemuri. Metric learning using iwasawa decomposition. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, pages 1–6. IEEE, 2007.
 [16] I. Jolliffe. Principal component analysis. Wiley Online Library, 2002.

[17]
E. L. Lehmann and G. Casella.
Theory of point estimation
, volume 31. Springer Science & Business Media, 1998.  [18] R. Lenz, P. L. Carmona, and P. Meer. The hyperbolic geometry of illuminationinduced chromaticity changes. In Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–6. IEEE, 2007.
 [19] R. Lenz and G. Granlund. If i had a fisheye i would not need so (1, n), or is hyperbolic geometry useful in image processing. In Proc. of the SSAB Symposium, Uppsala, Sweden, pages 49–52, 1998.
 [20] K. Mardia and I. Dryden. Shape distributions for landmark data. Advances in Applied Probability, pages 742–755, 1989.
 [21] A. Rangarajan. https://www.cise.ufl.edu/~anand/publications.html.
 [22] S. Said, N. Courty, N. Le Bihan, and S. J. Sangwine. Exact principal geodesic analysis for data on so (3). In 15th European Signal Processing Conference (EUSIPCO2007), pages 1700–1705. EURASIP, 2007.
 [23] S. Sommer. https://github.com/nefan/smanifold, 2014.
 [24] S. Sommer, F. Lauze, S. Hauberg, and M. Nielsen. Manifold valued statistics, exact principal geodesic analysis and the effect of linear approximations. In Computer Vision–ECCV 2010, pages 43–56. Springer, 2010.
 [25] A. Terras. Harmonic analysis on symmetric spaces and applications. Springer, 1985.
 [26] M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3):611–622, 1999.
 [27] G. Uhlmann. Inverse Problems and Applications: Inside Out II, volume 60. Cambridge University Press, 2013.
 [28] Y. Wang, W. Dai, X. Gu, T. F. Chan, S.T. Yau, A. W. Toga, and P. M. Thompson. Teichmüller shape space theory and its application to brain morphometry. In Medical Image Computing and ComputerAssisted Intervention–MICCAI 2009, pages 133–140. Springer, 2009.
 [29] W. Zeng, D. Samaras, and X. D. Gu. Ricci flow for 3d shape analysis. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 32(4):662–677, 2010.
 [30] M. Zhang and P. T. Fletcher. Probabilistic principal geodesic analysis. In Advances in Neural Information Processing Systems, pages 1178–1186, 2013.
Comments
There are no comments yet.