Hermite Interpolation and data processing errors on Riemannian matrix manifolds

08/16/2019 ∙ by Ralf Zimmermann, et al. ∙ SDU 0

The main contribution of this paper is twofold: On the one hand, a general framework for performing Hermite interpolation on Riemannian manifolds is presented. The method is applicable, if algorithms for the associated Riemannian exponential and logarithm mappings are available. This includes many of the matrix manifolds that arise in practical Riemannian computing application such as data analysis and signal processing, computer vision and image processing, structured matrix optimization problems and model reduction. On the other hand, we expose a natural relation between data processing errors and the sectional curvature of the manifold in question. This provides general error bounds for manifold data processing methods that rely on Riemannian normal coordinates. Numerical experiments are conducted for the compact Stiefel manifold of rectangular column-orthogonal matrices. As use cases, we compute Hermite interpolation curves for orthogonal matrix factorizations such as the singular value decomposition and the QR-decomposition.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Given a data set that consists of locations , function values and derivatives , the (first-order) Hermite interpolation problem reads:

Find a polynomial of suitable degree such that

(1)

Local cubic Hermite interpolation is the special case of Hermite-interpolating a two-points data set on . Cubic Hermite interpolation is achieved by joining the local pieces on each sub-interval . By construction, the derivative at the end point of coincides with the derivative of the start point of so that the resulting curve is globally , [19, Remark 7.7].

In this paper, we address the Hermite interpolation problem for a function that takes values on a Riemannian manifold with tangent bundle . More precisely, consider a differentiable function

and a sample plan . Sampling of the function values and the derivatives of at the parameter instants produces a data set consisting of manifold locations

and velocity vectors

in the respective tangent spaces of at . The Hermite manifold interpolation problem is:

Find a curve of class such that

(2)

1.1 Original contributions

(1) We introduce a method to tackle problem (2) that is a direct analogue to Hermite interpolation in Euclidean spaces. The method has the following features:

  1. The approach works on arbitrary Riemannian manifolds, i.e., no special structure (Lie Group, homogeneous space, symmetric space,…) is required.
    In order to conduct practical computations, only algorithms for evaluating the Riemannian exponential map and the Riemannian logarithm map must be available.

  2. The computational effort, in particular, the number of Riemannian exp and log evaluations is lower than that of any other Hermite manifold interpolation method known to the author.

(2) In addition, we expose a natural relation between data processing errors and the sectional curvature of the manifold in question. This provides general error bounds for data processing methods (not limited to interpolation) that work via a back-and-forth mapping of data between the manifold and its tangent space, or, more precisely, data processing methods that rely on Riemannian normal coordinates.

For convenience, the exposition will focus on cubic polynomial Hermite interpolation. However, the techniques may be readily combined with any interpolation method that is linear in the sampled locations and derivative values. Apart from polynomial interpolation, this includes radial basis function approaches

[4] and gradient-enhanced Kriging [36].

As a use-case, we provide an explicit and efficient method for the cubic Hermite interpolation of column-orthogonal matrices, which form the so-called Stiefel manifold . Stiefel matrices arise in orthogonal matrix factorizations such as the singular value decomposition and the QR-decomposition.

1.2 Comparison with previous work

Interpolation problems with manifold-valued sample data and spline-related approaches have triggered an extensive amount of research work.

It is well-known that cubic splines in Euclidean spaces are acceleration-minimizing. This property allows for a generalization to Riemannian manifolds in form of a variational problem for the intrinsic, covariant acceleration of curves, whose solutions can be interpreted as generalized cubic polynomials on Riemannian manifolds. The variational approach to interpolation on manifolds has been investigated e.g. in [28, 12, 10, 33, 8, 31, 22], see also [29] and references therein. While the property of minimal mean-acceleration is certainly desirable in many a context, including automobile, aircraft and ship designs and digital animations, there is no conceptual reason to impose this condition when interpolating general smooth non-linear manifold-valued functions.

A related line of research is the generalization of Bézier curves and the De Casteljau-algorithm [6] to Riemannian manifolds [29, 23, 27, 1, 15, 32]. Bézier curves in Euclidean spaces are polynomial splines that rely on a number of so-called control points. A Bézier curve starts at the first control point and ends at the last control point, the starting velocity is tangent to the line between the first two-pair of control points; the velocity at the endpoint is tangent to the line between the penultimate and the last control point. This is illustrated in Fig. 1. The number of control points determines the degree of the polynomial spline. To obtain the value of a Bézier curve at time , a recursive sequence of straight-line convex combinations of two locations must be computed. The transition of this technique to Riemannian manifolds is via replacing the inherent straight lines with geodesics [29]. The start and end velocities of the resulting spline are proportional to the velocity vectors of the geodesics that connect the first two and the last two control points, respectively [29, Theorem 1].

Figure 1: A cubic Bézier curve based on four control points . The ‘inner’ control points may be used to prescribe tangent directions at and , which are interpolated.

Note that the actual applications and use cases featured in the work referenced above are almost exclusively on low-dimensional matrix manifolds like or .

A Hermite-type method that is specifically tailored for interpolation problems on the Grassmann manifold is sketched in [4, §3.7.4]. General Hermitian manifold interpolation has been considered explicitly in [20]. The idea is as follows: Given two points on a manifold and two tangent directions , the the authors of [20] approach the task to construct a connecting curve such that by constructing a “left” arc that starts at from with the prescribed velocity and a “right” arc that ends at at with the prescribed velocity . The two arcs are then blended to a single spline arc via a certain geometric convex combination. In Euclidean spaces, this would read , where is a suitable weight function. Because a general Riemannian manifold lacks a vector space structure, the challenge is to construct a manifold analogue of a convex combination and [20] proposes a method that works on compact, connected Lie groups with a bi-invariant metric.

This same idea of blending a left and a right arc has been followed up in [15]. Here, the Euclidean convex combination is replaced with a geodesics average . In combination, this constitutes a valid approach for solving (2) in arbitrary Riemannian manifolds.111In practice, the building arcs and may be taken to be the geodesics with the prescribed velocities in their respective start and end points.

It should be mentioned that none of the papers on Bézier curves referenced above tackle the Hermite interpolation problem explicitly. However, the Bézier approach can be turned into an Hermite method by choosing the control points such that the sampled start and terminal velocities are met. It is clear that this requires at least control points in each subinterval , see Fig. 1.

Interpolation problems on Stiefel Manifolds have been considered in [23], however with using quasi-geodesics rather than geodesics. The work [39] includes preliminary numerical experiments for interpolating orthogonal frames on the Stiefel manifold that relies the canonical Riemannian Stiefel logarithm [30, 37].

Remark: (Hermite) interpolation of curves on Riemannian manifolds, i.e., of manifold-valued functions must not be confused with (Hermite) interpolation of real-valued functions with domain of definition on a manifold, . The latter line of research is pursued, e.g., in [26] but is not considered here.

1.3 Organization

The paper is organized as follows: Starting from the classical Euclidean case, Section 2 introduces an elementary approach to Hermite interpolation on general Riemannian manifolds. Section 3 relates the data processing errors of calculations in Riemannian normal coordinates to the curvature of the manifold in question. In Section 4, the specifics of performing Hermite interpolation of column-orthogonal matrices are discussed and Section 5 illustrates the theory by means of numerical examples. Conclusions are given in Section 6.

1.4 Notational specifics

The -identity matrix is denoted by . If the dimension is clear, we will simply write . The -orthogonal group, i.e., the set of all square orthogonal matrices is denoted by

When we employ the QR-decomposition of a rectangular matrix , we implicitly assume that and refer to the ‘economy size’ QR-decomposition , with , .

The standard matrix exponential and the principal matrix logarithm are defined by

The latter is well-defined for matrices that have no eigenvalues on

.

For a Riemannian manifold , the geodesic that starts from with velocity is denoted by . The Riemannian exponential function at is

(3)

and maps a small star-shaped domain around the origin in diffeomorphically to a domain . The Riemannian logarithm at is

(4)

Recall that for a differentiable function , the differential at is a linear map between the tangent spaces

(5)

2 Hermite interpolation on Riemannian manifolds

In this section, we construct a quasi-cubic spline between two data points on a manifold with prescribed velocities and . To this end, we develop a manifold equivalent to the classical local cubic Hermite interpolation in Euclidean spaces [19, §7].

2.1 The Euclidean case

We start with a short recap of Hermite cubic space curve interpolation, where the following setting is of special interest to our considerations. Let be a real vector space and let be differentiable with , , and derivative data . When applied to vector-valued functions, the classical local cubic Hermite interpolating spline is the space curve that is obtained via a linear combination of the sampled data,222It is an elementary, yet often overlooked fact that for functions , component-wise polynomial interpolation of the coordinate functions is equivalent to interpolating the coefficients in a linear combination of the sampled data vectors.

(6)

For the reader’s convenience, the basic cubic Hermite polynomials coefficient polynomials are listed in Appendix A.

2.2 Transfer to the manifold setting

Let be a Riemannian manifold and consider a differentiable function

Suppose that and and assume further that , where is the injectivity radius of at . The latter condition ensures that the sample data lies within a domain, where the Riemannian normal coordinates are one-to-one, [13, p. 271].

Our approach is to express the interpolating curve in terms of normal coordinates centered at ,

Figure 2: Visualization of the Riemannian exponential map: The tangent velocity is mapped to the endpoint of the geodesic . The Riemannian distance equals the norm associated with the Riemannian metric on .

Hence, the task is transferred to constructing a curve such that the image curve under the exponential function solves the Hermite interpolation problem (2). Because is a vector space, we can utilize the ansatz of (6) but for ,

Here, are the normal coordinate images of the locations and . The tangent vectors play the role of the velocity vectors and must be chosen such that

(7)
(8)

Since the interpolating curve is expressed in normal coordinates centered at , condition (8) is readily fulfilled by selecting : According to the properties of the cubic Hermite coefficient functions , the Taylor expansion of around is . Therefore, up to first order, is a ray emerging from the origin with direction . Hence, the directional derivative of the exponential function is

The latter equation holds, because , [13, §3, Prop. 2.9].

The condition (7) is more challenging, because the Taylor expansion of around is and is not a ray emerging from the origin . As the differential is not the identity for , the computation of is more involved. In fact, it is related to the Jacobi fields on a Riemannian manifold, see [13, §5], [24, §10] and the upcoming Section 3. Yet, for our purposes, it is sufficient to determine the tangent vector such that

(9)

As long as the sample points and are not conjugate, we can make use of the fact that is a local diffeomorphism around , [24, Prop. 10.11]. Hence, under this assumption, (9) is equivalent to

Recall that is the given sample data. In summary, we have proved the following theorem. Let and let be a differentiable function on a Riemannian manifold . Suppose that

and assume that and are not conjugate along the geodesic that connects and . Set and

Then

with the cubic Hermite coefficient functions as in (28)– (31) is a differentiable curve that solves the Hermite interpolation problem (2).

Remark 1

Consider an Hermite sample set , . Then, by construction and in complete analogy to the Euclidean case, the composite curve

(10)

that combines the local quasi-cubic spline arcs of Theorem 2.2 is of class and solves the Hermite manifold interpolation problem (2).

Practical computation of

In cases, where an explicit formula for the Riemannian logarithm is at hand, the directional derivative can be directly computed. For general nonlinear manifolds , computing the differentials of the Riemannian exponential and logarithm is rather involved. According to (3), (4), (5), it holds

(11)
(12)

with the usual identification a linear space with its tangent space.

In order to evaluate , we can take any differentiable curve that satisfies and . Then,

(13)

An obvious choice is . The final equation for computing as required by (7) is

(14)

The composite map is in fact a transition function for the normal coordinate charts. It is defined on an open subset of a Hilbert space and maps to a Hilbert space, see [25, Fig. 1.6, p. 12] for an illustration. Hence, we can approximate the directional derivative via finite difference approaches:

(15)

2.3 Computational effort and preliminary comparison to other methods

Computationally, the most involved numerical operations are the evaluations of Riemannian - and -mappings. Therefore, as in [15], we measure the computational effort associated with the Hermite interpolation method as the number of such function evaluations.

Constructing a quasi-cubic Hermite interpolant as in Remark 1 requires on each subinterval

  • one Riemannian logarithm to compute ,

  • two Riemannian - and -evaluations for the central difference approximation of (15),

which results in a total of Riemannian -evaluations and Riemannian -evaluations for the whole composite curve. The data to represent the curve (10) can be precomputed and stored.

Evaluating a quasi-cubic Hermite interpolant at time requires a single Riemannian -evaluation.

As mentioned in the introduction, Bézier-like approaches may be used to tackle the the Hermite interpolation problem (2). This requires a cubic degree and at least four control points on each sub-interval to impose the derivative constraints, see Fig. 1. The most efficient of such methods in [15] requires Riemannian -evaluations for constructing the curve data. Evaluating the curve at time requires Riemannian -evaluations plus Riemannian -evaluation [15, Prop. 5.10].

3 Error propagation

The approach introduced in Section 2.2

follows the standard principle of (1) mapping the sampled data onto the tangent space, (2) performing data processing (in this case, interpolation) in the tangent space, (3) mapping the result back to the curved manifold. In this section, we perform a general qualitative analysis of the behavior of the actual errors on the manifold in question in relation to the data processing errors that accumulate in the tangent space. In particular, this allows to obtain error estimates for any manifold interpolation procedure based on the above standard principle and also applies to other data processing operations that subordinate to this pattern. In essence, the error propagation is related to the manifold’s curvature via a standard result from differential geometry on the spreading of geodesics

[13, Chapter 5, §2]. Let be a Riemannian manifold, let and consider tangent vectors , which are to be interpreted as exact datum and associated approximation. Write , , where it is understood that the norm is that of . Assume that . Let and let be the sectional curvature at with respect to the -plane .

If is the angle between and , then the Riemannian distance between the manifold locations and is

(16)

with the underlying assumption that all data is within the injectivity radius at .

Figure 3: A

Formally, it holds . However, the data is given in normal coordinates around and not around . The Riemannian exponential is a radial isometry (lengths of rays starting from the origin of the tangent space equal the lengths of the corresponding geodesics). Yet it is not an isometry so that , unless is flat. Therefore, we will estimate the distance against a component that corresponds to the length of a ray in and a circular segment in .

To this end, introduce an orthonormal basis for the plane via

The circular segment of the unit circle in the -plane that starts from and ends in can be parameterized via the curve

Let be the angle such that . This setup is illustrated in Figure 3, where the outer dashed circular arc indicates the unit circle and the solid circular arcs are the circles of radius and , respectively. By the triangle inequality,

(17)
(19)

Since the points and are on a ray that emerges from the origin in , the distance term in line (19) is exactly , see Figure 3. Note that . Hence, the distance term in line (19) is

A Taylor expansion of the transition function centered at gives

To arrive at the last line, was used, which follows from the standard result [13, §3, Prop. 2.9, p. 65] with the usual identification of , cf. (11), (12).

By construction, is a Jacobi field along the geodesic ray that starts from with unit velocity , see [13, §5, Prop. 2.7, p. 114]. Moreover, constitute an orthonormal basis of the -plane in . Therefore, the results [13, §5, Cor. 2.9, Cor. 2.10, p. 115] apply and give

In summary,

which establishes the theorem.

Remark 2
  1. The the approximation error in the tangent space can be related by elementary trigonometry to the angle . It holds , see Fig. 3. Moreover, . Thus,

    In regards of practical applications, it is safe to assume . Then, in terms of error , the distance estimate (16) reads

    (20)
  2. If we travel from to in the tangent space on the corresponding curves as in the proof of Theorem 3, i.e. first along the circular arc from to and than along the ray from to , then we cover a distance of . Comparing this with (16), we see that the same distances of the images of these on the manifold are (asymptotically) if features sectional curvatures. The underlying principle is the well-known effect that geodesics on positively curved spaces spread apart less than straight rays in a flat space, while they spread apart more on negatively curved spaces, see [13, §5, Remark 2.11, p. 115/116].

    From the numerical point of view, this means that data processing operations that work via the transition to the tangent space are rather well-behaved on manifolds of positive curvature, while the opposite holds on negatively curved manifolds. In Section 4, we will show an illustration of Theorem 3 on an interpolation problem on the compact Stiefel manifold.

With the help of Theorem 3

, explicit error bounds for manifold interpolation methods can be obtained. For example, cubic Hermite interpolation comes with a standard error bound

[19, Thm 7.16] that applies to the interpolant in the tangent space. This can be forwarded to a manifold error via Theorem 3.

4 Cubic Hermite interpolation of column-orthogonal matrices

The set of column-orthogonal matrices

is the compact homogeneous matrix manifold known as the (compact) Stiefel manifold. This section reviews the essential aspects of the numerical treatment of Stiefel manifolds. For more details, see [2, 14, 38].

The tangent space at a point can be thought of as the space of velocity vectors of differentiable curves on passing through :

For any matrix representative , the tangent space of at is

Every tangent vector may be written as

(21)

The dimension of both and is .

Each tangent space carries an inner product with corresponding norm . This is called the canonical metric on . It is derived from the quotient space representation that identifies two square orthogonal matrices in as the same point on , if their first columns coincide [14, §2.4]. Endowing each tangent space with this metric (that varies differentiably in ) turns into a Riemannian manifold. The associated sectional curvature is non-negative and is bounded by for all and all two-plans , [30, §5].

Given a start point and an initial velocity the Stiefel geodesic (and thus the Riemannian exponential) is

(22)

where

is the decomposition of the tangent velocity into its horizontal and vertical component with respect to the base point , [14]. Because is tangent, is skew. The Riemannian Stiefel logarithm can be computed with the algorithm of [37].

4.1 Differentiating the Stiefel exponential

In this section, we compute the derivative

(23)

This is important for two reasons.

  1. Differentiable gluing of interpolation curves. Consider a manifold data set , , where the Riemannian distance, say, of the sample points and and and exceeds the injectivity radius of at . Then, simple tangent space interpolation with mapping the data set to is not possible. A remedy is to split the data set at and to compute two interpolation curves, one for the sample set , and one for the sample set , . With the canonical method of tangent space interpolation, the curves have the expressions and , where for and for . Concatenating the curves , will result in a non-differentiable kink at the intersection location , where ends and starts. In order to avoid this, one can compute the derivative and use as an Hermitian derivative sample when constructing . For obtaining , a derivative of the form of (23) must be computed.

  2. Method validation. The cubic Hermite manifold interpolation method of Theorem 2.2 requires the computation of . As was mentioned in Section 2.2, the differential of the -mapping cannot be computed explicitly for general manifolds . In order to assess the numerical quality of a finite-differences approximation, we can first compute by (15) and then recompute (9)

    The numerical accuracy is assessed via the error

    (24)

    Again, a derivative of the form of (23) must be computed.

Now, let us address the derivative (23) for