Simplicial splines for representation of density functions

05/16/2019 · by Jitka Machalova, et al.

In the context of functional data analysis, probability density functions as non-negative functions are characterized by specific properties of scale invariance and relative scale, which make it possible to represent them with the unit integral constraint without loss of information. On the other hand, all these properties are a challenge when the densities need to be approximated with spline functions, including the construction of the respective B-spline basis. The Bayes space methodology of density functions enables expressing them as real functions in the standard L^2 space using the clr transformation. The resulting functions possess the zero integral constraint. This is the key to proposing a new B-spline basis holding the same property and, consequently, to building a new class of spline functions, called simplicial splines, which can approximate probability density functions in a consistent way. The paper also provides the construction of smoothing simplicial splines and a possible orthonormalization of the B-spline basis, which might be useful in some applications. Finally, a concise analysis is demonstrated with anthropometric data.




1 Introduction

Probability density functions are popularly known as non-negative functions with the unit integral constraint. This clearly inhibits their direct processing using standard methods of functional data analysis [18], including any approximation, since unconstrained functions are assumed there. In addition, density functions are characterized by deeper geometrical properties that need to be taken into account for any reliable analysis [6, 21, 22]. Specifically, in contrast to functions in the standard L^2 space, densities obey the scale invariance and relative scale properties [9]. Scale invariance means that not just the representation of a density with the unit integral constraint, but any of its positive multiples conveys the same information about the relative contributions of Borel sets to the whole probability mass. Relative scale can be explained directly with an example: the relative increase of a probability over a Borel set from 0.05 to 0.1 (a factor of 2) differs from the increase from 0.5 to 0.55 (a factor of 1.1), although the absolute differences are the same in both cases. If we restrict ourselves to a bounded support I = [a, b], which is mostly the case in practical applications [3, 9, 14, 15], density functions can be represented with respect to the Lebesgue reference measure using the Bayes space B^2(I) of functions with square-integrable logarithm [6, 22].
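Both properties are easy to verify numerically. The following sketch (our own illustration, not part of the paper; it uses an arbitrary discretized Gaussian kernel and a hand-rolled trapezoidal rule) checks the factor-of-2 versus factor-of-1.1 example above and the invariance of relative masses under positive rescaling:

```python
import numpy as np

def trapz(y, x):
    """Composite trapezoidal rule (kept explicit for portability)."""
    return float(np.dot((y[1:] + y[:-1]) / 2.0, np.diff(x)))

# Relative scale: equal absolute increases carry different relative information.
assert np.isclose(0.10 - 0.05, 0.55 - 0.50)   # same absolute difference
assert np.isclose(0.10 / 0.05, 2.0)           # but a factor of 2 ...
assert np.isclose(0.55 / 0.50, 1.1)           # ... versus a factor of 1.1

# Scale invariance: a density and any positive multiple of it assign the same
# relative mass to any Borel set, checked here on a discretized density.
t = np.linspace(-3.0, 3.0, 601)
f = np.exp(-t**2 / 2.0)                       # unnormalized Gaussian kernel
g = 7.3 * f                                   # arbitrary positive multiple
mask = t < 0.0
ratio_f = trapz(f[mask], t[mask]) / trapz(f, t)
ratio_g = trapz(g[mask], t[mask]) / trapz(g, t)
assert np.isclose(ratio_f, ratio_g)
```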

The Bayes space B^2(I) has the structure of a separable Hilbert space, which enables constructing an isometric isomorphism between B^2(I) and L^2(I), the L^2 space restricted to I. Accordingly, analogies of summing two functions and of multiplying a function by a real scalar in the space B^2(I), together with an inner product between two densities, are required. Given two absolutely integrable density functions f, g and a real number α, we denote by f ⊕ g and α ⊙ f the perturbation and powering operations, defined as

(f ⊕ g)(t) = f(t) g(t) / ∫_a^b f(s) g(s) ds,   (α ⊙ f)(t) = f(t)^α / ∫_a^b f(s)^α ds,   t ∈ I,
respectively. The resulting functions are readily seen to be probability density functions; note, though, that the unit integral constraint representation was chosen just for the sake of convenient interpretation. In [6] it is proven that B^2(I), endowed with the operations ⊕ and ⊙, is a vector space. Note that the neutral elements of perturbation and powering are the uniform density e(t) = 1/(b - a), t ∈ I, and the real number 1, respectively. The difference between two elements f, g ∈ B^2(I), denoted f ⊖ g, is obtained as perturbation of f with the reciprocal of g, i.e., f ⊖ g = f ⊕ [(-1) ⊙ g]. Finally, to complete the Hilbert space structure, the inner product is defined as

⟨f, g⟩_B = (1 / (2(b - a))) ∫_a^b ∫_a^b ln(f(t)/f(s)) ln(g(t)/g(s)) dt ds.
The form of the inner product clearly indicates that the relevant information in densities is contained in the (log-)ratios between their values at elements of the support I.
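On a discretized support, the operations above reduce to pointwise products and powers followed by renormalization. The sketch below assumes the standard Bayes-space definitions ((f ⊕ g)(t) proportional to f(t)g(t), (α ⊙ f)(t) proportional to f(t)^α); the grid, the two Gaussian-type densities, and the helper names are our own:

```python
import numpy as np

def trapz(y, x):
    return float(np.dot((y[1:] + y[:-1]) / 2.0, np.diff(x)))

t = np.linspace(0.0, 1.0, 501)

def normalize(f):
    """Unit-integral representative of a density (trapezoidal quadrature)."""
    return f / trapz(f, t)

def perturb(f, g):
    """Perturbation f (+) g: pointwise product, renormalized."""
    return normalize(f * g)

def power(alpha, f):
    """Powering alpha (.) f: pointwise power, renormalized."""
    return normalize(f**alpha)

f = normalize(np.exp(-(t - 0.3)**2 / 0.02))
g = normalize(np.exp(-(t - 0.6)**2 / 0.05))
u = normalize(np.ones_like(t))                      # uniform density (neutral element)

assert np.allclose(perturb(f, u), f)                # perturbing by the neutral element
assert np.allclose(perturb(f, power(-1.0, f)), u)   # f minus f gives the neutral element
```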

Density functions can be considered functional counterparts to compositional data, positive vectors carrying relative information [1, 17] that are driven by the Aitchison geometry [16]. In order to enable their statistical processing using standard multivariate methods in real space [5], the preferred strategy is to express them either in centered log-ratio coefficients [1] with respect to a generating system, or in log-ratio coordinates, preferably with respect to an orthonormal basis [7]. The latter coordinates (called also isometric log-ratio coordinates), as well as the clr coefficients, provide an isometry between the Aitchison geometry and the real Euclidean space. A similar strategy is used also for densities in the Bayes space [22]. An isometric isomorphism between B^2(I) and L^2(I) is represented by the centered log-ratio (clr) transformation [22, 14], defined for f ∈ B^2(I) as

clr(f)(t) = ln f(t) - (1 / (b - a)) ∫_a^b ln f(s) ds,   t ∈ I.
We remark that such an isometry allows computing operations and inner products among elements of B^2(I) in terms of their counterparts in L^2(I) among the clr transforms, i.e.

clr(f ⊕ g) = clr(f) + clr(g),   clr(α ⊙ f) = α · clr(f),   ⟨f, g⟩_B = ⟨clr(f), clr(g)⟩_2.
However, the clr transformation induces an additional constraint,

∫_a^b clr(f)(t) dt = 0,   (4)
that needs to be taken into account for computation and analysis on clr-transformed density functions. As the clr space is clearly a subspace of L^2(I), hereafter it is denoted as L^2_0(I). The inverse clr transformation is obtained as

clr^{-1}(z)(t) = exp(z(t)) / ∫_a^b exp(z(s)) ds,   t ∈ I,   (5)
again, as before, the denominator is used just to achieve the unit integral constraint representation of the resulting density (without loss of the relative information carried by the density function).
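Numerically, the clr transformation and its inverse amount to mean-centering the log-density and renormalizing an exponential, respectively. A minimal sketch, assuming the standard definition clr(f)(t) = ln f(t) - (1/(b - a)) ∫ ln f(s) ds on a discretization grid of our own choosing:

```python
import numpy as np

def trapz(y, x):
    return float(np.dot((y[1:] + y[:-1]) / 2.0, np.diff(x)))

a, b = 0.0, 1.0
t = np.linspace(a, b, 501)

def clr(f):
    """clr transform: log-density minus its mean value over (a, b)."""
    lf = np.log(f)
    return lf - trapz(lf, t) / (b - a)

def clr_inv(z):
    """Inverse clr: exponentiate, then renormalize to unit integral."""
    f = np.exp(z)
    return f / trapz(f, t)

f = np.exp(-(t - 0.4)**2 / 0.03)
f = f / trapz(f, t)                      # unit-integral representative

z = clr(f)
assert abs(trapz(z, t)) < 1e-10          # zero-integral constraint (4)
assert np.allclose(clr_inv(z), f)        # invertible up to rescaling
```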

According to [22] it is not necessary to restrict ourselves to the constrained clr space, because a basis in L^2_0(I) can be easily constructed. Specifically, starting from a basis in L^2(I) whose first element is a constant function, the remaining basis functions can be adapted to form a basis in L^2_0(I). Of course, also here an orthonormal basis is preferable, but this is not always achievable in applications. Nevertheless, if it is, a function can be projected orthogonally onto the space spanned, e.g., by the first few basis functions. This is done through the respective coefficients in the basis expansion


Functional data analysis relies strongly on approximation of the input functions using splines [18]. However, splines are mostly utilized purely as an approximation tool, without considering further methodological consequences. Because statistical processing of density functions requires the deeper geometrical background provided by the Bayes spaces, this should be reflected also in the respective spline representation, performed preferably in the clr space L^2_0(I). In [12] a first attempt at constructing a spline representation honoring the zero integral constraint (4) was made. The problem is that the B-splines forming the basis for the spline expansion in [12] come from L^2(I), but not from L^2_0(I). This paper presents an important step ahead: B-splines are constructed that form basis functions directly in the clr space L^2_0(I). Consequently, these B-splines can be expressed also directly in the Bayes space, and the spline representation can be formulated in terms of the Bayes space, which can be used for interpretation purposes; hereafter we refer to simplicial splines. Apart from the methodological advantages, using simplicial splines simplifies the construction and interpretation of spline coefficients, which can be considered coefficients of a (possibly orthonormal) basis.

The paper is organized as follows. In the next section the construction of a B-spline basis in L^2_0(I) is presented, together with a comparison to the spline functions introduced in [12]. Section 3 is devoted to smoothing splines in L^2_0(I), and Section 4 discusses the orthogonalization of the basis functions (which form, by construction, an oblique basis). Section 5 introduces a new class of splines reflecting the Bayes space methodology, the simplicial splines. Section 6 provides an application to anthropometric data, and the final Section 7 concludes.

2 Construction of a B-spline basis in L^2_0(I)

Because the clr transformation enables processing density functions in the standard L^2 space, just restricted according to the zero integral constraint (4), it is natural that also the construction of simplicial splines should start in L^2_0(I). Nevertheless, before doing so, some basic facts about the B-spline representation of splines are recalled; see [2, 4] for details. Let the sequence of knots

be given. -spline of degree (order ) is defined as

Consequently, -spline of degree , , , (order ) is defined by

Now let the new functions for , , be defined


i.e., more precisely for

and for


It is noteworthy that the functions defined above have properties similar to those of B-splines.

  1. They are piecewise polynomials of degree k. In particular, for k = 0 one obtains a piecewise constant function, for k = 1 a piecewise linear function (see Figure 1), and for k = 2 a piecewise quadratic function (see Figure 2). For further examples see Figures 3-5.

  2. It is evident that for the function and its derivatives up to order are all continuous.

  3. It is easy to check that for

    and of course

  4. From boundary conditions for -splines

    we have

  5. From the perspective of L^2_0(I), a crucial point is that the integral of each of these functions equals zero:


From these properties it is evident that these functions are B-splines with zero integral.

Figure 1: Linear ZB-spline with equidistant knots .
Figure 2: Quadratic ZB-spline with equidistant knots .
Figure 3: Cubic ZB-spline with equidistant knots .
Figure 4: Linear ZB-spline with nonequidistant knots .
Figure 5: Quadratic ZB-spline with nonequidistant knots .
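A convenient way to produce such zero-integral spline functions numerically, sketching the general idea rather than reproducing the exact definition (7)-(8), is to differentiate a B-spline of one degree higher: since a B-spline vanishing at both ends of the interval does so, its derivative integrates to zero by the fundamental theorem of calculus. The knot vector below is our own choice:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.interpolate import BSpline

# A B-spline of degree k+1 vanishing at both ends of [0, 4] has a derivative
# of degree k whose integral over [0, 4] is exactly zero.
k = 2
knots = np.array([0., 0., 0., 0., 1., 2., 3., 4., 4., 4., 4.])
coeffs = np.zeros(len(knots) - (k + 1) - 1)
coeffs[3] = 1.0                       # the cubic B-spline supported on [0, 4]
B = BSpline(knots, coeffs, k + 1)
Z = B.derivative()                    # piecewise quadratic with zero integral

x = np.linspace(0.0, 4.0, 2001)
assert abs(trapezoid(Z(x), x)) < 1e-6          # zero-integral property
assert np.max(np.abs(Z(x))) > 0.1              # but Z is not identically zero
```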

It is known that for the vector space of polynomial splines of degree , , defined on a finite interval with the sequence of knots , the dimension is

For the construction of all basis functions it is necessary to consider some additional knots. Without loss of generality we can add knots


Then every spline in has a unique representation


In [12], [20] splines with zero integral are studied, and a necessary and sufficient condition on the B-spline coefficients of such splines is given there. However, the standard B-splines were used, thus ignoring the constraint (4) in the construction of the B-spline basis.

Now, regarding the definition (7), we are able to use B-splines which have zero integral on I (denoted also as ZB-splines in the sequel). In the following, denotes the vector space of polynomial splines of degree k, defined on a finite interval with the sequence of knots and having zero integral on I. With respect to the zero integral condition it is clear that

for more details see [12], [20]. With the additional knots (10) we can construct functions

which are basis functions of the space . Then every spline has a unique representation


The reduction of the dimension of the spline space by one is a very natural consequence of the clr transformation of density functions. Note that this feature is present also for the clr coefficients of compositional data [1].
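The dimension drop by one can also be seen from an elementary construction (ours, not the paper's construction): normalizing each of the n standard B-splines to unit integral and taking consecutive differences yields n - 1 linearly independent splines with zero integral:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.interpolate import BSpline

# Degree-2 B-spline basis on [0, 3] with clamped knots and interior knots 1, 2,
# i.e. g = 2 interior knots, so dim = g + k + 1 = 5 basis functions.
k = 2
t = np.array([0., 0., 0., 1., 2., 3., 3., 3.])
n = len(t) - k - 1                              # = 5
x = np.linspace(0.0, 3.0, 3001)

basis = [BSpline(t, np.eye(n)[i], k) for i in range(n)]
ints = [trapezoid(B(x), x) for B in basis]

# Differences of consecutive normalized B-splines have zero integral; only
# n - 1 = g + k linearly independent such functions remain.
zb = [basis[i](x) / ints[i] - basis[i + 1](x) / ints[i + 1] for i in range(n - 1)]
for z in zb:
    assert abs(trapezoid(z, x)) < 1e-9
```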

Figure 6: Basis splines for the space .
Example 2.1.

We consider the knots . The task is to find a cubic spline with the given sequence of knots which has zero integral on the interval . It is evident that , . We consider the additional knots

The basis functions of the space are plotted in Figure 6. Every spline can be written as


Thus, e.g., for the given coefficients the resulting cubic spline with zero integral is plotted in Figure 7.

Figure 7: Cubic spline with given coefficients .
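In the same spirit (with our own knot sequence, since the example's knots and coefficients are not reproduced here), any spline on a clamped knot vector can be turned into a spline with zero integral by subtracting its mean value, because constants belong to the spline space:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.interpolate import BSpline

# Clamped cubic spline space on [0, 4]; centering by the mean value yields a
# spline with zero integral, since constants are themselves splines.
k = 3
t = np.array([0.] * 4 + [1., 2., 3.] + [4.] * 4)  # clamped cubic knot vector
n = len(t) - k - 1
rng = np.random.default_rng(42)
s = BSpline(t, rng.normal(size=n), k)             # an arbitrary cubic spline

x = np.linspace(0.0, 4.0, 4001)
z = s(x) - trapezoid(s(x), x) / 4.0               # subtract the mean value
assert abs(trapezoid(z, x)) < 1e-9
```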

Now let us compare splines from the spaces and . For this purpose we recall the following theorem, which was proven in [20].

Theorem 2.1.

For a spline , , the condition is fulfilled if and only if

Considering the above observations, we are now able to prove the following statement.

Theorem 2.2.

Every spline , is an element of the space and fulfills the condition .


Every spline can be written in matrix notation as

where is the vector of basis functions of the space and is the vector of the corresponding B-spline coefficients. Regarding Theorem 2.1 we know that

where and , .

With respect to (8) we are able to write the functions in matrix notation,

Then it is clear that





Therefore the spline , can be written in matrix notation as


where and . Furthermore, let us denote

Then and it is evident that . To fulfill the condition of the zero integral on the interval it is necessary and sufficient that

In order to prove this equality we can write



it can be easily seen that

and the proof is complete. ∎

Remark 1.

Formula (16) is very useful, because we can use the standard B-spline basis when working with splines honoring the zero integral constraint, which is very convenient from the computational point of view.

3 Smoothing splines in L^2_0(I)

In [12] the construction of smoothing splines in this space was studied, however using the standard B-spline basis functions . Now we are able to construct smoothing splines in this space with the new basis functions . For this purpose, let the data , , weights , , a sequence of knots , and a parameter be given. For arbitrary , our task is to find a spline which minimizes the functional

Note that the choice of the parameters and , where stands for the th derivative, affects the smoothness of the resulting spline. Let us denote , , and . Regarding the representation (12) and the matrix notation (16), the functional can be written as a quadratic function


see [12, 11, 10] for details. In fact the matrix


is positive definite, because , are basis functions. The upper triangular matrix has full row rank. is a diagonal matrix such that



Finally, stands for the collocation matrix, i.e.

Using the notation ,



it is possible to rewrite the quadratic function as


Our task is to find a spline which minimizes the functional ; in other words, we want to find the minimum of the function (20). Obviously, this function has exactly one minimum if and only if the matrix is positive definite (p.d.). From (19) it can be easily seen that

From the Schoenberg-Whitney theorem and its generalization, see [2] and [10], it is known that the matrix is of full column rank if and only if there exist with , , such that , . In this case, from the necessary and sufficient condition for a unique minimum of a quadratic function, i.e.

we get a system of linear equations and then the unique solution of this system is given by


Consequently, the resulting smoothing spline is obtained by formula

in matrix notation, using the standard B-splines, as

where the vector is given in (21).
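The smoothing procedure is a penalized least-squares problem of the familiar form: minimize the weighted residual sum of squares plus a roughness penalty on a derivative, leading to normal equations like (21). The sketch below implements this generic scheme with a standard cubic B-spline basis (the matrix names C for collocation and P for the penalty are ours, and the zero-integral adaptation of Sections 2-3 is omitted):

```python
import numpy as np
from scipy.interpolate import BSpline

# Minimize sum_i w_i (y_i - s(x_i))^2 + alpha * int s''(x)^2 dx over cubic
# splines s, via the normal equations (C^T W C + alpha P) b = C^T W y.
k = 3
t = np.concatenate(([0.0] * k, np.linspace(0.0, 1.0, 6), [1.0] * k))
n = len(t) - k - 1                                   # number of basis functions

rng = np.random.default_rng(7)
xs = np.sort(rng.uniform(0.0, 1.0, 40))              # data sites
ys = np.sin(2.0 * np.pi * xs) + 0.1 * rng.normal(size=40)
w = np.ones_like(xs)                                 # weights

basis = [BSpline(t, np.eye(n)[j], k) for j in range(n)]
C = np.array([[B(xi) for B in basis] for xi in xs])  # collocation matrix

# Penalty matrix P_jl ~ int B_j'' B_l'' dx via a simple fine-grid quadrature.
grid = np.linspace(0.0, 1.0, 2001)
D2 = np.array([B.derivative(2)(grid) for B in basis])
P = (D2 * (grid[1] - grid[0])) @ D2.T

alpha = 1e-4
W = np.diag(w)
b = np.linalg.solve(C.T @ W @ C + alpha * P, C.T @ W @ ys)
s = BSpline(t, b, k)                                 # the smoothing spline
assert float(np.mean((s(xs) - ys) ** 2)) < 0.05      # fits the noisy data well
```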

4 Orthogonalization of basis functions

A further step is to orthogonalize the basis

of the space , which is by construction oblique with respect to the metric of the space. For this purpose, the idea presented in [19] is used. We search for a linear transformation

such that

forms an orthogonal set of basis functions of the space , i.e.

Regarding the lemma presented in [19] and the notation used here, we can formulate the following statement.

Lemma 4.1.

An invertible transformation orthogonalizes the basis functions if and only if it satisfies the condition that

where represents the positive definite matrix

With respect to the definition of basis functions the matrix can be expressed as


where . The linear transformation is not unique and can be computed, for example, by the Cholesky decomposition. The basis functions

are orthogonal and have zero integral. The linear and quadratic B-splines with zero integral and their orthogonalization are plotted in Figures 8 and 9.

Figure 8: Linear ZB-splines with given knots (left), linear orthogonal ZB-splines (right).
Figure 9: Quadratic ZB-splines with given knots (left), quadratic orthogonal ZB-splines (right).

To sum up, a spline with zero integral can be constructed as a linear combination of orthogonal B-splines having zero integral in the form

On the other hand, the standard and well-known B-splines can be used to represent it in the matrix form

This formulation is very useful because it allows us to use existing B-spline codes in software such as R or Matlab, for example for the collocation matrix or for the computation of the integrals in (18).
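The orthogonalization step itself is generic linear algebra: compute the Gram matrix M of the oblique basis, factor M = L L^T by Cholesky, and apply L^{-1} to the basis functions. A small self-contained sketch with three hand-picked zero-integral functions (our own stand-ins, not B-splines):

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 2001)
dx = grid[1] - grid[0]

# Three linearly independent, deliberately non-orthogonal functions with
# zero integral on [0, 1] (hypothetical stand-ins for the oblique basis).
F = np.array([
    grid - 0.5,
    grid**2 - 1.0 / 3.0,
    np.cos(2.0 * np.pi * grid),
])

M = (F * dx) @ F.T                 # Gram matrix of the discrete inner product
L = np.linalg.cholesky(M)          # M = L L^T
G = np.linalg.solve(L, F)          # orthonormalized functions: rows of L^{-1} F

# With respect to the same discrete inner product, G is orthonormal.
assert np.allclose((G * dx) @ G.T, np.eye(3), atol=1e-8)
```

Note that the transformation is not unique: any orthogonal rotation of G stays orthonormal, which matches the remark above that, e.g., the Cholesky factor is just one convenient choice.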

5 Simplicial splines in the Bayes space

Construction of splines directly in L^2_0(I) has important practical consequences; however, it is also crucial from the theoretical perspective. Expressing the ZB-splines as functions in L^2_0(I) enables back-transforming them to the original Bayes space using (5). This results in simplicial B-splines (SB-splines), obtained from (8) as


Accordingly, for instance, the ZB-splines from Figures 1-5 can now be expressed directly in the Bayes space as SB-splines, see Figures 10-12. Note that the SB-splines fulfill the unit integral constraint, which is, however, not necessary for further considerations. As a consequence, it is immediate to define the vector space of simplicial polynomial splines of degree k, defined on a finite interval with the given sequence of knots. From the isomorphism between the corresponding spline spaces it holds that

Moreover, from the isometric properties of the clr transformation it follows that every simplicial spline can be uniquely represented as


The SB-splines forming the basis are, by the default setting (8), not orthogonal. Their orthogonalization can be done as described in Section 4, i.e., by employing the orthogonalizing transformation together with the back-transformation to the Bayes space.
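The back-transformation is, concretely, the inverse clr formula (5) applied to a zero-integral spline; the result is a strictly positive function with unit integral. A sketch with an arbitrary clamped cubic spline of our own:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.interpolate import BSpline

# A clr-space spline with zero integral maps, through the inverse clr
# transformation, to a strictly positive, unit-integral simplicial spline.
k = 3
t = np.array([0.] * 4 + [0.25, 0.5, 0.75] + [1.] * 4)
coef = np.array([0.5, -1.0, 0.8, 0.4, -0.6, 0.2, -0.3])
s = BSpline(t, coef, k)

x = np.linspace(0.0, 1.0, 2001)
z = s(x) - trapezoid(s(x), x)          # center to the zero-integral constraint
f = np.exp(z)
f = f / trapezoid(f, x)                # unit-integral Bayes-space representative

assert abs(trapezoid(z, x)) < 1e-10
assert np.all(f > 0.0) and abs(trapezoid(f, x) - 1.0) < 1e-10
```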

Figure 10: Linear ZB-spline (left) and linear SB-spline (right) with equidistant knots .
Figure 11: Quadratic ZB-spline (left) and quadratic SB-spline (right) with equidistant knots .
Figure 12: Quadratic ZB-spline (left) and quadratic SB-spline (right) with nonequidistant knots .

The resulting simplicial splines (with either an orthogonal or a non-orthogonal basis) can be used for the representation of densities directly in the Bayes space. This is an important step in the construction of methods of functional data analysis involving density functions, such as ANOVA modeling [22] or the SFPCA method introduced in [9] and demonstrated in the next section. In the latter case such splines were in fact already used in the construction of the procedure, although at that stage of development the authors were not aware of it. With simplicial B-splines one has a guarantee that methods are developed consistently in the Bayes space. Moreover, the possibility of having an orthogonal basis brings additional features resulting from the orthogonality of the finite-dimensional projection in combination with the approximation properties of spline functions.

As usual, simplicial splines can be tuned according to the concrete problem, with the advantage of their direct formulation in the Bayes space sense.

Example 5.1.

To illustrate smoothing of concrete data with a simplicial spline, 1000 values from the standard normal distribution were simulated, and the support was determined by the minimum and maximum simulated values,

. The data were collected in the form of a histogram, where the breakpoints are equidistantly spaced. The representative points of the histogram cells are denoted by asterisks in Figure 13 (left), and these points form the input data for the smoothing. The y-values stand for discretized relative contributions to the overall probability mass; therefore, the discrete clr transformation [1] is needed to obtain a real vector with the zero sum constraint (Figure 13, right). These data points were smoothed using the procedure from Section 3 and back-transformed to the original space. In the concrete setting , , , , ,