 # Simplicial splines for representation of density functions

In the context of functional data analysis, probability density functions as non-negative functions are characterized by specific properties of scale invariance and relative scale which enable to represent them with the unit integral constraint without loss of information. On the other hand, all these properties are a challenge when the densities need to be approximated with spline functions, including construction of the respective B-spline basis. The Bayes space methodology of density functions enables to express them as real functions in the standard L^2 space using the clr transformation. The resulting functions possess the zero integral constraint. This is a key to propose a new B-spline basis, holding the same property, and consequently to build a new class of spline functions, called simplicial splines, which can approximate probability density functions in a consistent way. The paper provides also construction of smoothing simplicial splines and possible orthonormalization of the B-spline basis which might be useful in some applications. Finally, a concise analysis is demonstrated with anthropometric data.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Probability density functions are popularly known as non-negative functions with the unit integral constraint. This clearly inhibits their direct processing using standard methods of functional data analysis  including any their approximation since unconstrained functions are assumed there. In addition, density functions are rather characterized by deeper geometrical properties that need to be taken into account for any reliable analysis [6, 21, 22]. Specifically, in contrast to functions in the standard space, densities obey the scale invariance and relative scale properties . Scale invariance means that not just the representation of densities with the unit integral constraint, but any its positive multiple conveys the same information about relative contributions of Borel sets on the whole probability mass. Relative scale can be explained directly with an example: the relative increase of a probability over a Borel set from 0.05 to 0.1 (2 multiple) differs from the increase 0.5 to 0.55 (1.1 multiple), although the absolute differences are the same in both cases. If we restrict to a bounded support that is mostly used in practical applications [3, 9, 14, 15], density functions can be represented with respect to Lebesgue reference measure using the Bayes space of functions with square-integrable logarithm [6, 22].

The Bayes space has structure of separable Hilbert space that enables to construct an isometric isomorphism between and , the space restricted to . Accordingly, analogies of summing two functions and multiplication of a function by a real scalar in the space together with an inner product between two densities are required. Given two absolutely integrable density functions and a real number we indicate with and the perturbation and powering operations, defined as

 (f⊕g)(x)=f(x)g(x)∫If(y)g(y)dy,(α⊙f)(x)=f(x)α∫If(y)αdy,x∈I, (1)

respectively. The resulting functions are readily seen to be probability density functions, though, note that the unit integral constraint representation was chosen just for the sake of convenient interpretation. In  it is proven that endowed with the operations

is a vector space. Note that the neutral elements of perturbation and powering are

, with (i.e., the uniform density), and 1, respectively. The difference between two elements , denoted by , is obtained as perturbation of with the reciprocal of , i.e., . Finally, to complete the Hilbert space structure, the inner product is defined as

 ⟨f,g⟩B=12η∫I∫Ilnf(x)f(y)lng(x)g(y)dxdy,f,g∈B2(I). (2)

Form of the inner product clearly indicates that the relevant information in densities is contained in (log-)ratios between elements from the support .

Density functions can be considered as functional counterparts to compositional data, positive vectors carrying relative information [1, 17] that are driven by the Aitchison geometry . In order to enable their statistical processing using standard multivariate methods in real space , the preferred strategy is to express them either in centered log-ratio coefficients  with respect to a generating system, or in logratio coordinates, preferably with respect to an orthonormal basis . The latter coordinates (called also isometric log-ratio coordinates), as well as the clr coefficients, provide isometry between the Aitchison geometry and the real Euclidean space. A similar strategy is used also for densities in the Bayes space . An isometric isomorphism between and is represented by the centered log-ratio (clr) transformation [22, 14], defined for as

 clr(f)(x)=fc(x)=lnf(x)−1η∫Ilnf(y)dy. (3)

We remark that such an isometry allows to compute operations and inner products among the elements in in terms of their counterpart in among the clr-transforms, i.e.

 clr(f⊕g)(x)=fc(x)+gc(x), clr(α⊙f)(x)=α⋅fc(x)

and

 ⟨f,g⟩B=⟨fc,gc⟩2=∫Ifc(x)gc(x)dx.

However, the clr transformation induces an additional constraint,

 ∫Iclr(f)(x)dx=∫Ilnf(x)dx−∫I1η∫Ilnf(y)dydx=0, (4)

that needs to be taken into account for computation and analysis on clr-transformed density functions. As the clr space is clearly a subspace of , hereafter it is denoted as . The inverse clr transformation is obtained as

 clr−1[fc(x)]=exp(fc(x))∫Iexp(fc(y))% dy; (5)

again as before, the denominator is used just to achieve the unit integral constraint representation of the resulting density (without loss of relative information, carried by the density function).

According to  it is not necessary to restrict ourselves to the constrained clr space, because a basis in can be easily constructed. Specifically, let is a basis in and assume that is a constant function, then form a basis in . Of course, also here an orthonormal basis is preferable, but it is not always possible in applications. Nevertheless, if this would be so, then a function can be projected orthogonally to the space spanned, e.g., by the first functions . This is done through the respective coefficients in the basis expansion

 f(x)=c1⊙φ1(x)⊕c2⊙φ2(x)⊕…⊕cr⊙φr(x)⊕…=∞⨁i=1ci⊙φi(x), x∈I. (6)

Functional data analysis relies strongly on approximation of the input functions using splines . However, splines are mostly utilized purely as an approximation tool, without considering further methodological consequences. Because statistical processing of density functions requires a deeper geometrical background, provided by the Bayes spaces, this should be followed also by the respective spline representation, performed preferably in the clr space . In  a first attempt of constructing a spline representation that would honor the zero integral constraint (4) was performed. The problem is that -splines that form basis for the spline expansion in  come from , but not from . This paper presents an important step ahead – such -splines are constructed that form basis functions in the clr space . Consequently, the -splines can be expressed also directly in and the spline representation formulated in terms of the Bayes space which can be used for interpretation purposes; hereafter we refer to simplicial splines. Apart from methodological advantages, using simplicial splines simplifies construction and interpretation of spline coefficients that can be considered as coefficients of a (possibly orthonormal) basis in .

The paper is organized as follows. In the next section the construction of -splines basis in is presented together with a comparison to spline functions introduced in . Section 3 is devoted to smoothing splines in and Section 4 discusses orthogonalization of basis functions (that form, by construction, an oblique basis). Section 5 introduces a new class of splines that reflect the Bayes spaces methodology, simplicial splines. Section 6 provides an application to anthropometric data and the final Section 7 concludes.

## 2 Construction of B-spline basis in L20(i)

Because the clr transformation enables to process density functions in the standard space, just restricted according to zero integral constraint (4), it is natural that also construction of simplicial splines should start in . Nevertheless, before doing so, some basic facts about -spline representation of splines are recalled, see [2, 4] for details. Let the sequence of knots

 Δλ:=λ0=a<λ1<…<λg

be given. -spline of degree (order ) is defined as

 B1i(x)={1ifx∈[λi,λi+1)0otherwise.

Consequently, -spline of degree , , , (order ) is defined by

 Bk+1i(x)=x−λiλi+k−λiBki(x)+λi+k+1−xλi+k+1−λi+1Bki+1(x).

Now let the new functions for , , be defined

 Zk+1i(x):=ddxBk+2i(x), (7)

i.e., more precisely for

 Z1i(x)={1ifx∈[λi,λi+1)−1ifx∈(λi+1,λi+2]

and for

 Zk+1i(x)=(k+1)(Bk+1i(x)λi+k+1−λi−Bk+1i+1(x)λi+k+2−λi+1). (8)

Noteworthy, functions have similar properties as -splines .

1. They are piecewise polynomials of degree . Particularly, is a piecewise constant polynomial, is a piecewise linear polynomial, see Figure 1, is a piecewise quadratic polynomial, see Figure 2. For other examples see Figures 3-5.

2. It is evident that for the function and its derivatives up to order are all continuous.

3. It is easy to check that for

 suppZk+1i(x)=suppBk+2i(x)=[λi,λi+k+2],

and of course

 Zk+1i(x)=0ifx∉[λi,λi+k+2].
4. From boundary conditions for -splines

 (Bk+2i(x))(l)∣∣∣x=λi=(Bk+2i(x))(l)∣∣∣x=λi+k+2=0l=0,⋯,k

we have

 (Zk+1i(x))(l)∣∣∣x=λi=(Zk+1i(x))(l)∣∣∣x=λi+k+2=0l=0,⋯,k−1.
5. From the perspective of a crucial point is that integral of equals to zero:

 +∞∫−∞Zk+1i(x)%dx=λi+k+2∫λiZk+1i(x)dx=λi+k+2∫λiddxBk+2i(x)dx==[Bk+2i(x)]λi+k+2λi=0. (9)

From these properties it is evident that functions are -splines with zero integral.

It is known that for the vector space of polynomial splines of degree , , defined on a finite interval with the sequence of knots , the dimension is

 dim(SΔλk[a,b])=g+k+1.

For the construction of all basis functions it is necessary to consider some additional knots. Without loss of generality we can add knots

 λ−k=⋯=λ−1=λ0=a,b=λg+1=λg+2=⋯=λg+k+1. (10)

Then every spline in has a unique representation

 sk(x)=g∑i=−kbiBk+1i(x). (11)

In ,  the splines with zero integral are studied. There is given the necessary and sufficient condition for -splines coefficients of these splines. However, typical -splines , thus ignoring the constraint (4) in for construction of the -spline basis, were used there.

Now, regarding the definition (7), we are able to use -splines which have zero integral on (denoted also as -splines in the sequel). In the following, denotes the vector space of polynomial splines of degree , defined on a finite interval with the sequence of knots and having zero integral on . With respect to the condition of the zero integral it is clear that

 dim(ZΔλk[a,b])=g+k,

for more details see , . With the additional knots (10) we can construct functions

 Zk+1−k(x),⋯,Zk+1g−1(x),

which are basis functions of the space . Then every spline has a unique representation

 sk(x)=g−1∑i=−kziZk+1i(x). (12)

Reduction of dimension for splines in by one is a very natural consequence of clr transformation of density functions. Note that this feature is present also for clr coefficients of compositional data .

###### Example 2.1.

We consider knots . The task is to find a cubic spline with the given sequence of knots and which has zero integral on the interval . It is evident that , . We consider the additional knots

 λ−3=λ−2=λ−1=λ0=a=0,20=b=λ5=λ6=λ7=λ8.

The basis functions of the space are plotted in Figure 6. Every spline can be written as

 s3(x)=3∑i=−3ziZ4i(x). (13)

Thus, e.g., for the cubic spline with zero integral is plotted in Figure 7.

Now let us compare splines from the spaces and . For this purpose we recall the following theorem, which was proven in .

###### Theorem 2.1.

For a spline , , the condition is fulfilled if and only if

By considering the above thoughts, we are now able to prove the following statement.

###### Theorem 2.2.

Every spline , is an element of the space and fulfills the condition .

###### Proof.

Every spline can be written in matrix notation as

 sk(x)=Bk+1(x)b,

where is the vector of basis functions of the space and is the vector of the corresponding -spline coefficients. Regarding Theorem 2.1 we know that

where and , .

With respect to (8) we are able to write the functions in matrix notation,

Then it is clear that

 (Zk+1−k(x),…,Zk+1g−1(x))=(Bk+1−k(x),…,Bk+1g(x))DK=Bk+1(x)DK,

where

 D=(k+1)diag(1λ1−λ−k,…,1λg+k+1−λg)=(k+1)diag(1l1,…,1lg+k+1) (14)

and

 K=⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝100⋯00−110⋯000−11⋯00⋮⋮⋱⋱⋮⋮000⋯−11000⋯0−1⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠∈Rg+k+1,g+k. (15)

Therefore the spline , can be written in matrix notation as

 sk(x)=Zk+1(x)z=Bk+1(x)DKz, (16)

where and . Furthermore, let us denote

 ˆb=DKz.

Then and it is evident that . To fulfill the condition of the zero integral on interval it is necessary and sufficient that

 l⊤ˆb=0.

In order to prove this equality we can write

 l⊤ˆb=l⊤DKz.

Because

 l⊤D=(k+1)(l1,…,lg+k+1)⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝1l1⋱1lg+k+1⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠=(k+1)(1,…,1)

and

 Kz=⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝z−k−z−k+z−k+1⋮−zg−2+zg−1−zg−1,⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠,

it can be easily seen that

and the proof is complete. ∎

###### Remark 1.

The formula (16) is very useful, because we can use the standard -spline basis for working with splines honoring the zero integral constraint, which is very convenient from computational point of view.

## 3 Smoothing spline in L20(i)

In  the construction of smoothing splines in the space was studied, however using standard -spline basis functions . Now we are able to construct smoothing splines in this space with new basis functions . For this purpose, let data , , weights , , sequence of knots , and a parameter be given. For arbitrary our task is to find a spline , which minimizes the functional

 Jl(sk)=(1−α)∫ba[s(l)k(x)]2% dx+αn∑i=1wi[yi−sk(xi)]2.

Note that the choice of parameter and , where stands for th derivation, affects smoothness of the resulting spline. Let us denote , , and . Regarding the representation (12) and matrix notation (16) the functional can be written as a quadratic function

 Jl(z) =(1−α)z⊤K⊤DS⊤lMklSlDKz++α[y−Bk+1(x)DKz]⊤W[y−Bk+1(x)DKz], (17)

see [12, 11, 10] for details. In fact the matrix

 Mkl=(mklij)gi,j=−k+l,withmklij=b∫aBk+1−li(x)Bk+1−lj(x)dx (18)

is positive definite, because , are basis functions. Upper triangular matrix has full row rank. is a diagonal matrix such that

 Dj=(k+1−j)diag(d−k+j,…,dg)

with

 di=1λi+k+1−j−λi,i=−k+j,…,g,

and

 Lj:=⎛⎜ ⎜⎝−11⋱⋱−11⎞⎟ ⎟⎠∈Rg+k+1−j,g+k+2−j.

Finally, stands for the collocation matrix, i.e.

 Bk+1(x)=(Bk+1i(xj))n,gj=1,i=−k.

Using the notation ,

 G:=U⊤[(1−α)Sl⊤MklSl+αB⊤k+1(x)WBk+1(x)]U (19)

and

 g:=αK⊤DB⊤k+1(x)Wy,

it is possible to rewrite the quadratic function as

 Jl(z)=z⊤Gz−2z⊤g+αy⊤Wy. (20)

Our task is to find a spline which minimizes the functional , in other words, we want to find a minimum of the function (20). It is obvious that this function has just one minimum if and only if the matrix is positive definite (p.d.). From (19) it can be easily seen that

 G is p.d.⇔Bk+1(x)is of full column rank.

From Schoenberg-Whitney theorem and its generalization, see  and , it is known that matrix is of full column rank if and only if there exists with , , such that , . In this case from the necessary and sufficient condition for a unique minimum of quadratic function, i.e.

 ∂Jl(z)∂z⊤=0,

we get a system of linear equations and then the unique solution of this system is given by

 z∗=G−1g. (21)

Consequently, the resulting smoothing spline is obtained by formula

 s∗k(x)=g−1∑i=−kz∗iZk+1i(x),

in matrix notation using standard -splines as

 s∗k(x)=Bk+1(x)DKz∗,

where the vector is given in (21).

## 4 Orthogonalization of basis functions

A further step is to orthogonalize the basis

 Zk+1(x)=(Zk+1−k(x),…,Zk+1g−1(x))⊤

of the space that is by construction obligue with respect to the space metric. For this purpose the idea presented in 

is used. We search for a linear transformation

such that

 Ok+1(x)=ΦZk+1(x)

forms an orthogonal set of basis functions of the space , i.e.

 b∫aOk+1(x)O⊤k+1(x)dx=I.

Regarding the lemma presented in  and notation used here we can formulate the following statement.

###### Lemma 4.1.

An invertible transformation orthogonalizes the basis functions if and only if it satisfies the condition that

 Φ⊤Φ=Σ−1,

where represents the positive definite matrix

 Σ=b∫aZk+1(x)Z⊤k+1(x)dx=⎛⎜⎝b∫aZk+1i(x)Zk+1j(x)dx⎞⎟⎠g−1i,j=−k.

With respect to the definition of basis functions the matrix can be expressed as

 Σ=K⊤DMDK, (22)

where . The linear transformation is not unique and can be computed for example by the Cholesky decomposition. The basis functions

 Ok+1(x)=ΦZk+1(x),Ok+1(x)=(Ok+1−k(x),…,Ok+1g−1(x))⊤

are orthogonal and have a zero integral. The linear and quadratic -splines with zero integral and their ortogonalization are plotted in Figures 8 and 9.

To sum up, the spline with zero integral can be constructed as a linear combination of orthogonal -splines having zero integral in a form

 sk(x)=g−1∑i=−kziOk+1i(x)=Ok+1(x)z.

On the other hand, the standard and well-known -splines can be used to represent in matrix form

 sk(x)=ΦBk+1(x)DKz.

This formulation seems to be very useful because it allows us to use existing -spline codes in software R or Matlab, for example, the collocation matrix or computation of integrals in (18).

## 5 Simplicial splines in the Bayes space

Construction of splines directly in has important practical consequences, however, it is crucial also from the theoretical perspective. Expressing -splines as functions in enables to back-transform them to the original Bayes space using (5). It results in simplicial -splines (-splines), obtained from (8) as

 ζk+1i(x)=exp[Zk+1i(x)]∫Iexp[Zk+1i(y)]dy,i=−k,…,g−1, k≥0. (23)

Accordingly, for instance -splines from Figures 1-5 can be now expressed directly in the Bayes space as -splines, see Figures 10-12. Note that -splines fulfill the unit integral constraint which is, however, not necessary for further considerations. As a consequence, it is immediate to define vector space of simplicial polynomial splines of degree , defined on a finite interval with the sequence of knots . From isomorphism between and it holds that

 dim(CΔλk[a,b])=g+k.

Moreover, from isometric properties of clr transformation it follows that every simplicial spline in can be uniquely represented as

 ξk(x)=g−1⨁i=−kzi⊙ζk+1i(x). (24)

-splines forming the basis are by the default setting (8) not orthogonal. Their orthogonalization can be done as described in Section 4, i.e. by employing with the back-transformation to .