Spatial Blind Source Separation in the Presence of a Drift

Multivariate measurements taken at different spatial locations occur frequently in practice. Proper analysis of such data needs to consider not only dependencies on-sight but also dependencies in and in-between variables as a function of spatial separation. Spatial Blind Source Separation (SBSS) is a recently developed unsupervised statistical tool that deals with such data by assuming that the observable data is formed by a linear latent variable model. In SBSS the latent variable is assumed to be constituted by weakly stationary random fields which are uncorrelated. Such a model is appealing as further analysis can be carried out on the marginal distributions of the latent variables, interpretations are straightforward as the model is assumed to be linear, and not all components of the latent field might be of interest which acts as a form of dimension reduction. The weakly stationarity assumption of SBSS implies that the mean of the data is constant for all sample locations, which might be too restricting in practical applications. Therefore, an adaptation of SBSS that uses scatter matrices based on differences was recently suggested in the literature. In our contribution we formalize these ideas, suggest an adapted SBSS method and show its usefulness on synthetic and real data.



There are no comments yet.


page 7

page 12


Blind source separation for non-stationary random fields

Regional data analysis is concerned with the analysis and modeling of me...

Spatial Blind Source Separation

Recently a blind source separation model was suggested for multivariate ...

Test of the Latent Dimension of a Spatial Blind Source Separation Model

We assume a spatial blind source separation model in which the observed ...

Time Series Source Separation with Slow Flows

In this paper, we show that slow feature analysis (SFA), a common time s...

Bayesian Non-Parametric Multi-Source Modelling Based Determined Blind Source Separation

This paper proposes a determined blind source separation method using Ba...

Visual Parameter Selection for Spatial Blind Source Separation

Multivariate measurements at irregularly-spaced points and their analysi...

ABACUS: Unsupervised Multivariate Change Detection via Bayesian Source Separation

Change detection involves segmenting sequential data such that observati...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Data that are collected at different spatial locations occur quite often in statistical modeling. Pollution measurements in cities, element concentration in mines or socioeconomic variables across countries are examples of measurements which show dependencies as a function of spatial separation. With advanced technology data collection is significantly improved which leads to higher numbers of sample locations as well as measured variables. Therefore, proper statistical tools for such multivariate spatial data need to consider spatial dependencies in and in-between variables of interest.

In a first exploratory step, multivariate spatial data might be analyzed by using the popular Principal Component Analysis (PCA), where variance maximizing orthogonal transformations of the data are found. The disadvantage of this method in the present context is obvious. Only variance is maximized which ignores the spatial second order dependencies completely, orthogonal transformations might be too restricting, and the result is dependent on the scale of the data.

[21, 2]

introduced Spatial Blind Source Separation (SBSS) as an adaptation of Blind Source Separation (BSS) for spatial data, which overcomes the significant drawbacks of PCA. BSS is a framework which originates from signal processing and assumes that the multivariate data at hand are formed by linear transformations (not necessary orthogonal) of unobserved variables. The aim of BSS is to estimate these latent variables which might show a clearer structure and reveal the driving processes of the data. BSS is well-established for many types of data, such as independent and identically distributed (iid) data, where it is denoted as Independent Component Analysis (ICA),

[20] time series data [24] or tensorial data [29]. For general overviews see also [7, 8]. SBSS, in detail described in Section 2

, assumes that the observed multivariate data are formed by (spatially) uncorrelated, weakly stationary latent random fields. The motivation behind this assumptions is clear. Firstly, as the entries of the latent random field are uncorrelated, univariate analysis can be carried out individually which discards demanding multivariate approaches,

[19]. Secondly, the latent random field is found by maximizing second order spatial dependence but interpretations are still straightforward as the identified transformations are linear. Lastly, only certain components might be of interest for further analysis or the domain expert which acts as a way of dimension reduction. SBSS finds the latent random field by jointly diagonalizing the covariance and so-called local covariance matrices that capture second-order spatial dependence.

The original assumption of SBSS that the entries of the latent field are second order stationary might be too restricting in practical considerations. In many situations it is more natural to consider a drift in the data which violates the non-constant mean assumption. [17] considered that case by replacing local covariance matrices with local difference matrices, which avoids the estimation of the mean and is practically more robust in the presence of a drift. Section 3 is devoted to the use of local difference matrices in SBSS, puts the ideas of [17] on a solid basis and introduces a new SBSS method with an adapted whitening step. The usefulness of this new method is validated on synthetic datasets in Section 4 and illustrated on a geochemical dataset in Section 5. Section 6 presents concluding remarks and an outlook for upcoming research.

2 Spatial Blind Source Separation

BSS for spatial data is a relatively new field in geostatistics. It was first introduced by [21] where the method was motivated by a geochemical application. [2] put SBSS on a sound theoretical basis, refined SBSS, and derived asymptotic properties for the estimators. Both publications are based on the following statistical model.

Definition 1 (SBSS model)

A -variate random field defined on a -dimensional spatial domain follows a spatial blind source separation model if it can be written as


for all , where is the full-rank deterministic mixing matrix, is the constant,

-dimensional, deterministic drift vector and

is the -variate latent random field which fulfills the following assumptions.

(SBSS 1):

for all ,

(SBSS 2):

for all and

(SBSS 3):

for all with , where is a diagonal matrix containing the stationary covariance functions of each entry of as diagonal elements.

Note that the SBSS model is a semi-parametric linear latent variable model as

is unobserved and only assumptions on the first two moments are stated. Specifically, the latent field

is constituted by uncorrelated weakly stationary univariate random fields. Furthermore, is only defined up to sign and order as still fulfills all assumptions stated above, here is a permutation matrix and is a diagonal matrix where each diagonal element is either or . In this context this is of minor importance as the sign and order might be determined by the context of the analysis or is not of interest at all. Note also that the scale is not identifiable for BSS in general, but in the model stated above it is fixed by assumption (SBSS 2) to be unity; this assumption will be given up in a subsequent section.

The aim of SBSS is to recover up to sign and permutations only based on one given realization of on sample locations by estimating the so-called unmixing matrix and the drift . Generally, the estimation of

is usually carried out in a two-step procedure by almost all BSS methods as follows. The singular value decomposition of the mixing matrix yields

which determines the covariance matrix of to be . As the covariance matrix is positive definite by assumption one can whiten by which equals by plugging in Equation (1). This suggests to firstly whiten

and then to find only an orthogonal matrix to recover

. More details and a proper mathematical derivation can be found in [16].

In order to find the orthogonal matrix after the whitening step, [21] and [2] introduced local covariance matrices (LCov) as a key tool for SBSS, which are defined as

LCov matrices are weighted averages of the covariance matrices between all possible pairs of given sample locations, where the weights are determined by the spatial kernel function . [21] suggested ball kernel functions which only consider locations that are separated by a maximum distance of . [2] additionally considered ring and Gauss kernel functions. The ring kernel function is written as , which considers locations that are separated by a minimum of and a maximum of . A smooth version of the ball kernel function is the Gauss kernel function where is the quantile of a standard Normal distribution. Generally, the kernel function can be of different shapes where for examples anisotropies can be modeled by accounting also for the direction of . LCov matrices are symmetric if the kernel function is symmetric, i.e. . Note that when , LCov matrices reduce to the usual covariance matrix, in the following denoted as .

Considering the SBSS model above, the LCov matrices evaluated on yield a diagonal matrix which motivates the following method seen in [2].

Definition 2

For a random field following the SBSS model in Definition 1 the unmixing matrix functional simultaneously diagonalizes the covariance matrix and one local covariance matrix for a given kernel function such that

where is a diagonal matrix with decreasingly ordered diagonal elements.

For the above method the exact diagonalizer can be found by solving the generalized eigenproblem, or equivalently in a two step fashion by firstly whitening the data as and then performing an eigendecomposition of . Note that the ordering of the diagonal elements of fixes the order of the components of the latent field. As only one specific LCov matrix is utilized this method is very sensitive to its specific spatial kernel function choice. One workaround is suggested by [2] leading to the following definition.

Definition 3

Consider a random field following the SBSS model in Definition 1. The whitened version of is defined by . For a given set of kernel functions , is the orthogonal joint diagonalization matrix which maximizes

Then, the unmixing matrix functional is given by .

In Definition 3, denotes the diagonal matrix where the diagonal elements are the ones of the matrix , and denotes the Frobenius matrix norm. For a given realization of on sample locations the matrices usually do not commute, therefore, algorithms that approximately jointly diagonalize these matrices have to be used. We use [5] which is based on iterative applications of Givens rotations. In contrast to Definition 2 the method of Definition 3 is more robust to the choice of spatial kernel functions but relies on a more sophisticated joint diagonalization algorithm.

The above two methods are formulated for the case of the SBSS model where the drift is assumed to be constant for all sample locations. Practically, when considering data with a present non-constant drift function, the performance of the SBSS methods above can be heavily downgraded when the population moments are replaced by their sample counterparts, especially when considering that is estimated by the sample mean for all . This effect might be reduced by relying on differences, which is usually done in geostatistics where the variogram is favored over the covariance matrix. We will adapt this idea for SBSS in the following.

3 Local Difference Matrices for SBSS

Relying on differences is common practice in geostatistics where the variogram is the central quantity of structural analysis, much favored over the spatial covariance matrix, see textbooks such as [6]. Differences are also beneficial for iid or time series data. In the latter case often differences are used to stabilize the mean, or remove seasonality effects. A popular model that is based on differences is the Autoregressive Integrated Moving Average (ARIMA) model, in which differences are modeled as an Autoregressive Moving Average (ARMA) model, see for example [14]. For the iid case [22] emphasized that only scatter matrices which can be expressed in terms of differences possess the property that they are in every distributional case diagonal when the random vector is formed by statistically independent entries. The covariance matrix can be written in terms of differences. Furthermore, they showed theoretically that robust scatter matrices have this property as well when evaluated on differences, and practically that statistical methods such as independent component analysis (ICA) or observational regression greatly benefit from the use of differences. Therefore, we adapt local covariance matrices and introduce the following definition of local difference (LDiff) matrices, as firstly mentioned in [17].

Again, is the spatial kernel function which determines the locality of the weighted average of differences, and the same rules apply as for LCov matrices. Note that when is chosen to be a slightly adapted version of the ring kernel, then LDiff matrices yield the typical semivariogram. Practically, for a given realization of on sample locations, the key difference between LCov and LDiff matrices in this context is that LCov matrices rely on the sample average for the drift estimate, whereas LDiff matrices do not. Theoretically, when the random field has a non-constant drift function the bias is of the form for LDiff matrices. This bias is expected to be small when the kernel function is designed in such a way that the locations and are nearby and under the assumption that the drift appears locally constant and would for example change only smoothly. Therefore, we suggest to replace the LCov matrix by a LDiff matrix in Definition 2. It would also be possible to adapt Definition 3 in such a way, but as usually the set of kernel functions is formed by spatial kernel functions with increasing parameters, the bias of LDiff matrices will be considerably higher. Therefore, we only consider the adaptation of Definition 2.

Definition 4

For a random field following the SBSS model in Definition 1 the unmixing matrix functional simultaneously diagonalizes the covariance matrix and one local difference matrix for a given kernel function such that

Where is a diagonal matrix with increasingly ordered diagonal elements.

Again, can be found by solving the generalized eigenproblem or by the two step algorithm discussed above. Note that in the above definition the whitening step relies on the use of which still shows the disadvantages discussed above. Therefore, we suggest the following adaptation.

3.1 Adaptation of the Whitening Step

To robustify the whitening step in the presence of a drift we suggest to replace the matrix by some LDiff matrix. A similar procedure was already formulated by [23] in the context of ICA, where the unmixing matrix is found by simultaneous diagonalization of two scatter matrices and where both not necessarily need to be the covariance matrix. Similarly, this idea was also developed for time series BSS in [3, 11]

where the whitening step is carried out by using a linear combination of symmetrized autocorrelation matrices to be robust in the presence of additive white noise.

As stated above under Assumption (SBSS 2), is an orthogonal matrix, where . This ensures that only an orthogonal matrix needs to be found when recovering after the data is whitened with respect to . Assumption (SBSS 2) can be replaced by assuming that which leads to being orthogonal as well when . In that case the whitening step is carried out with respect to and it is again ensured that only an orthogonal matrix needs to be found when recovering . This comes with the advantage that also in the whitening step the local covariance matrix is replaced by a local difference matrix at the cost of the fixed scale of . In a similar context this procedure is denoted as Robust Orthogonalization [11]. This outline leads to the following definition.

Definition 5

Consider two spatial kernel functions and . For a random field following the SBSS model in Definition 1 where Assumption (SBSS 2) is replaced by (SBSS : . The unmixing matrix functional simultaneously diagonalizes the corresponding two local difference matrices such that

Where is a diagonal matrix with increasingly ordered diagonal elements.

As shown in [7], Theorem 4.2, the mixing matrix can be found when is symmetric positive definite and

is symmetric and the generalized eigenvalues (diagonal elements of

) are distinct. It is easy to see that LDiff matrices are symmetric and positive definite in the context of Definition 1 when considering ball, ring and Gauss kernel functions with strictly positive parameters. Again as in Definition 2 the unmixing matrix can be alternatively found in a two step fashion where the whitening is now carried out by . Note that for this adapted whitening procedure is not necessarily diagonal but is diagonal but not equal to . Therefore, in this adaptation the convenience of fixing the scale of is given up for the sake of practical more robust whitening with respect to a present drift. In practical considerations might be standardized by scaling each component to unit variance. We also want to emphasize that the whitening step in Definition 2 and 3 can be adapted in a similar way by replacing by for some spatial kernel function , but in contrast to the above outline is not necessarily positive definite which might be overcome by the approach described in [3].

3.2 Comments on the Drift

For data that is generated by the model of Definition 1, the latent random field can be recovered by up to sign, order (and also scale when considering the context of Definition 5). But often one finds in practical considerations that the data at hand shows a non-constant drift which violates the SBSS model assumption of a constant mean. In such a situation we suggest to use the methods of Definition 4 or 5 for the reasons outlined above, compute and handle the present drift in one of the following forms based on the aim of further analysis of the data. All the following relies on the fact that consists of uncorrelated entries, which is the great advantage of the SBSS framework.

If one is interested in predicting at some unobserved location , each entry of can be treated individually. Universal Kriging would be the natural choice as it is designed for the presence of a drift, but also any other prediction tool might be used, see textbooks such as [6]. After predicting each entry of individually, the vector of predictions can be formed and transformed by to obtain a predictions of the original field . This discards the use of multivariate prediction tools in favor of univariate ones, which reduces the complexity of modeling significantly. This procedure was already investigated and described in detail for the constant drift case in [19].

If the aim of further analysis is to recover the latent field without any drift, each entry of the present drift of can be estimated individually, for example by a univariate Kriging estimator. As before the vector of predictions can be formed and subtracted from , which results in an practical estimation of the latent random field without drift. Additionally, the predicted drift can be back-transformed by to obtain an estimation of the original drift of . As before the great advantage is that this procedure replaces the use of one multivariate model by univariate ones.

3.3 SBSS for Intrinsic Stationary Random Fields

In the following we discuss an adaptation of the SBSS model in Definition 1 for which the procedure of Definition 5 is the natural choice. We adapt the SBSS model (Definition 1) by assuming that is formed by uncorrelated intrinsic stationary random fields. More formally this yields to replace assumptions (SBSS 2) and (SBSS 3) in the following way.

(SBSS ):

for all where is a diagonal matrix with strictly positive diagonal entries, and

(SBSS ):

for all with , where is diagonal holding two times the variogram for each entry of on its diagonal.

For this model the estimation of can be highly corrupted by the fact that the spatial covariance function for intrinsic stationary random fields is a function of the sample locations themselves and not of the distance between them. In contrast, the estimation of LDiff matrices is not corrupted as the variograms are still only dependent on the distance between sample locations. Therefore, the only choice for estimating the unmixing matrix which is consistent with the adapted SBSS model is the one from Definition 5, something we will also confirm by the simulations in the following section.

Figure 1: Example sample locations for the uniform pattern for a domain of size

(left) and for the skew pattern for a domain of size

(right). The circles of radii 1, 2 and 3 represent the kernel function choices for SBSS.

4 Simulations

In this part we validate the performance of our above introduced estimators by carrying out experiments on synthetic datasets. We use the statistical software R 3.6.1 ([25]) with the help of the packages JADE ([15]), SpatialBSS ([18]) and RandomFields ([27]).

We consider the case of and domains of the form with denoted as where the sample locations are either following a uniform or a skewed design. The distributions for the entries of sample locations are and for the uniform setting and and for the skewed setting, where and

denote the uniform and beta distributions, respectively. A set of sample locations for a given domain is formed by sampling

iid samples of the former distributions which are then multiplied by . Figure 1 depicts one example for the uniform pattern on the left panel and one skewed pattern example on the right panel. The considered random field models which are simulated on the sample location patterns are discussed in the subsequent chapters.

Figure 2: Mátern covariance functions for the entries of the latent random field . The parameters are chosen to be .

For a given random field we estimate the unmixing matrix by the following five BSS methods. The first two estimators are SBSS with LCov matrices from Definition 2 using a ring kernel with the parameter (LCov Ball), and Definition 3 using three ring kernels with parameters (LCov Ring). Furthermore, we use two SBSS estimators with LDiff matrices, namely the one from Definition 4 using a ring kernel with the parameter (LDiff Ball), and the one from Definition 5 where is the ring kernel function with and is also a ring kernel function with (wLDiff Ring). Lastly, we use the well known forth order blind identification (FOBI) algorithm [4] which is an ICA method designed for iid data and therefore does not utilize spatial dependence information at all.

We evaluate the quality of the unmixing matrix estimation by computing the minimum distance index (MDI) [12, 13] which is defined by

Here, is the unmixing matrix estimate, and is the set of all matrices of the form , where is a permutation matrix, is a diagonal matrix with positive diagonal entries and is a sign change matrix. As discussed above, should equal up to scale, sign and order which are exactly the indeterminacies captured by the set . Loosely, the MDI measures the deviation of from in respect of the indeterminacies, where it takes values between , indicating a perfect estimate, and , indicating a poor estimate. Note that the MDI only depends on which is always equal to the unmixing matrix evaluated for the case of for affine equivariant BSS methods, see for example [2] for details. Hence, the results are equal independently of the specific form of , which favors as a convenient choice for simulations. In the following we present results for two different random field models.

Figure 3: Drifts for one simulation replicate of drift Models 2-4 on a domain of size .

4.1 Weakly Stationary Latent Field with External Drift

In this simulation, we consider the random field model of Equation (1) where we chose and . The latent random field is formed by three centered Gaussian random fields, where the second order dependence is determined by the well-known second order stationary Mátern covariance function (see for example [6]) defined by

Here, and are the variance, shape and range parameters respectively. denotes the modified Bessel function of second kind. The parameters are chosen to be which are illustrated in Figure 2. For the -dimensional drift function we consider the following four settings.

Drift 1:

For this case we choose for all sample locations. Therefore, this model reduces to the constant drift case which was examined in detail in [2, 21].

Drift 2:

This drift is radial-symmetric in its nature, specifically, it is formed by for where is one artificial sample location sampled uniformly inside the domain at hand. Hence, for a domain of size the entries of the artificial location are drawn from . The constants are chosen to be . Theoretically, the drift ranges between for a domain up to for a domain.

Drift 3:

Here, we consider a linear drift in the first direction of the sample locations. For , equals where the constants are . is the first entry of the sample locations and is the maximum value of all first entries for the given set of sample locations. Therefore, the trend ranges between for all different domain sizes.

Drift 4:

In the same fashion as in Drift 2 we sample three artificial sample locations inside the domain at hand. These artificial locations define three cluster centers, a point belongs to the cluster where the Euclidean distance to the cluster center is minimal. For each cluster the corresponding drift

is a sample from the uniform distribution

. Consequently, the drift for this model lies in the interval . Overall, this setting is constituted by an observable random field that is weakly stationary in each cluster of sample locations, which often denoted as a block-stationary model.

One example for the different drift settings described above is depicted in Figure 3. Note that the on-sight variance for each entry of

equals one, moreover, the latent field is Gaussian distributed. Therefore, over 99% of the values of

are in the interval which highlights the impact of the ranges of the different drift settings.

The average MDI values based on 2000 repetitions for the uniform coordinate pattern are presented in Figure 4. Overall, for all considered drift settings the SBSS methods that are based on local difference matrices show much better performance. This is especially surprising for Drift 1 as it follows the original SBSS model. The adapted whitening step seems to be of minor influence with Drift 4 being an exception, in this setting the performance of LDiff Ball seems to stay constant even if the sample size increases. Similarly, the performance difference between SBSS that simultaneously or jointly diagonalize LCov matrices is not as significant. Figure 5 depicts the average MDI for the skewed sample location pattern. The qualitative meaning is very similar to the uniform setting with two minor exceptions. Firstly, the overall MDI is slightly higher for all settings and estimators, this might be explained by the fact that sample locations are only dense on the left part of the domain. Therefore, the effective sample size decreases when local covariance or local difference matrices are estimated, as certain sample locations do not have neighbors that are captured by the spatial kernel functions. Secondly, SBSS methods that are based on LDiff matrices do not suffer such a significant reduction of performance as the ones using LCov matrices.

Figure 4: Average MDI based on 2000 iterations for the uniform sample location pattern where the observed random field is formed by a weakly stationary latent field and four different additive drift functions.
Figure 5: Average MDI based on 2000 iterations for the skewed sample location pattern where the observed random field is formed by a weakly stationary latent field and four different additive drift functions.

4.2 Intrinsic Stationary Latent Field with Constant Drift

In this simulation we again consider the SBSS model of Equation (1) where , and . In contrast to Definition 1 the latent random field is now formed by uncorrelated intrinsic stationary random fields, which is an example for the outline in Section 3.3. Specifically, we chose the second order dependence of the entries of to follow the fractional Brownian field covariance function (see for example [9]) defined by

In this equation, is denoted as the Hurst parameter. The fractional Brownian field is a well-known intrinsic stationary but not weakly stationary random field model. We choose the Hurst parameters for the latent field to be , where Figure 6 depicts one example.

Figure 6: One example for the latent random field which follows the fractional Brownian field model on a domain with uniform sample location pattern. The Hurst parameters equal 0.3, 0.5 and 0.8 for , and respectively.

The average MDI values based on 2000 simulation iterations for the uniform and skewed sample location pattern are presented in Figure 7. The SBSS methods that only rely on the use of LCov matrices show a very poor performance. This is expected as the spatial covariance of depends on the actual sample location, which in turn leads to the fact that proper estimation of LCov matrices is impossible. In contrast, the SBSS methods relying on LDiff matrices still perform well, because, the variance of the difference processes of the random field at hand still depends only on the distance between sample locations. Clearly, the method with the adapted whitening step shows the overall best performance as also the whitening step is not corrupted by the intrinsic stationary nature of the observed random field .

Figure 7: Average MDI based on 2000 repetitions for the simulation with intrinsic stationary latent fields.

5 Data Example

In the following we illustrate the usefulness of our proposed adaptation of SBSS on a dataset derived from the GEMAS project [26], which was also considered in [17]. The dataset is freely available in the R package robCompositions ([28]) and it is formed by concentration measurements in mg/kg of 18 elements (Al, Ba, Ca, Cr, Fe, K, Mg, Mn, Na, Nb, P, Si, Sr, Ti, V, Y, Zn, Zr) at 2107 samples of agricultural soils in Europe, with locations given in latitude and longitude coordinates.

As it is common practice in regional geochemistry we respect the relative information of the data by using principles of compositional data analysis [1]. Specifically, we carry out a three step analysis. Firstly, we transform the dataset into centered log-ratio (clr) coordinates. Clr coordinates are easy to interpret but have the disadvantage that each observation adds up to zero, which in turn results in the fact that for example the covariance matrix is not invertible, something which is needed in BSS. Therefore, we transform the data into isometric log-ratio (ilr) coordinates by using pivot coordinates, which is only an orthogonal transformation of the clr data, this orthogonal matrix is usually denoted as contrast matrix ([10]). As ilr coordinates are indeed of full rank, all SBSS methods can be carried out in that space and the so-called combined loadings matrix consists of the matrix product of the contrast matrix and the estimated unmixing matrix. Interpretations of the results are carried out with the combined loadings matrix in terms of the clr coordinates. This procedure is detailed in [21], where the original SBSS is applied in the context of regional chemistry. The ilr transformation reduces the dimension of the data from to .

It was already shown in [17] that the norms of the difference between the global average and the measurements on-site are not constant throughout the spatial domain, which suggests to use LDiff matrices in favor of LCov matrices. In contrast to [17] where the method of Definition 4 is utilized, we analyze the data with the adapted whitening step seen in Definition 5. Specifically, we chose and to be ring kernel functions with the parameters and respectively.

Figure 8: First (left) and second (right) entry of the estimated latent random field with the method of Definition 5. Where, and are ring kernel functions with parameters and . Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.
clr(Al) -0.30 0.20 clr(Nb) 0.00 -0.02
clr(Ba) -0.06 -0.02 clr(P) 0.04 0.09
clr(Ca) 0.02 -0.08 clr(Si) 0.17 0.01
clr(Cr) 0.05 0.09 clr(Sr) -0.03 -0.06
clr(Fe) -0.04 -0.23 clr(Ti) -0.05 -0.07
clr(K) 0.11 -0.20 clr(V) 0.11 0.14
clr(Mg) -0.08 0.03 clr(Y) -0.10 -0.02
clr(Mn) -0.01 0.02 clr(Zn) 0.06 0.06
clr(Na) 0.01 0.12 clr(Zr) 0.08 -0.04
Table 1: Combined loadings for the first () and second () entry of the latent random field.

Figure 8 presents the first two entries of the estimated latent field and Table 1 shows the corresponding first two rows of the combined loadings matrix. The first entry is mainly driven by Aluminum and Silicium based on the high absolute values of the combined loadings. Therefore, in the glacial sediments of northern central Europe this indicates that Silicium is a dominant element, whereas Aluminum is more dominant in the Southern areas, which is confirmed by the original clr maps. For the second component it is interesting to see that the combined loadings for Aluminum and Vanadium as well as Iron and Potassium show roughly the same absolute values with different signs. Accordingly, Iron and Potassium are dominating in the geochemical composition of the soils in southern and eastern Spain, where the soils are formed differently than in the remaining part of the country and in Portugal. This dominance is also visible in southern Italy, in the Baltic countries and in parts of the Ukraine. In contrast, dominance of Aluminum-Vanadium is observed in soils along the Alps, in parts of the Balkan Peninsula and the Carpathian Mountains, but also in the western part of the UK and in Norway. A thorough interpretation of the results might be carried out by domains experts.

6 Discussion and Conclusions

In this paper we recalled SBSS, which is an unsupervised statistical tool that finds uncorrelated latent fields given a multivariate observed random field. This methodology offers one way to deal with multivariate spatial data in such a way that univariate methods can be used, a very appealing approach as multivariate spatial data is generally demanding to model. For the original SBSS method this useful feature was already investigated for interpolation tasks in

[19]. However, the original SBSS methods rely on the strong assumption that the drift is constant for the whole considered spatial domain, which is a very restrictive assumption. We introduced a new scatter matrix which measures second order spatial dependence based on differences, and argued that replacing local covariance matrices with local difference matrices yields great advantages when a drift is present in the data. We also adapted the whitening step of the SBSS methods by such a replacement, confirmed our outline in simulations and showed the usefulness of our new adaptations on a geochemical dataset derived from the GEMAS project.

The original motivation for the adapted whitening step seen in [3] is in the context of BSS for time series data with additive white noise. Such a model can also be considered for the spatial case, in which the drift would be replaced by a centered -variate white noise process with in Equation (1). This might be viewed as an external nugget effect. In such a case considering , ball or Gauss spatial kernel functions equals , but for a ring kernel function the additive term would disappear. Therefore, proper estimation of would be carried out with an adapted whitening step were only ring kernel functions are used. The problem here would be that local covariance functions with ring kernels are not necessarily positive definite, therefore a more sophisticated algorithm from [3] has to be adapted, which we plan for future research. Note that for this case LDiff matrices would be useless as they always carry on-sight covariance terms independent of the used spatial kernel function.


The work of CM and KN was supported by the Austrian Science Fund P31881-N32.


  • [1] Aitchison, J.: The Statistical Analysis of Compositional Data. Blackburn Press (2003)
  • [2] Bachoc, F., Genton, M.G., Nordhausen, K., Ruiz-Gazen, A., Virta, J.: Spatial blind source separation. Biometrika 107, 627–646 (2020)
  • [3] Belouchrani, A., Cichocki, A.: A robust whitening procedure in blind source separation context. Electronics Letters 36, 2050–2051 (2000)
  • [4] Cardoso, J.: Source separation using higher order moments. In: International Conference on Acoustics, Speech, and Signal Processing,, pp. 2109–2112 vol.4 (1989). DOI 10.1109/ICASSP.1989.266878
  • [5] Cardoso, J.F., Souloumiac, A.: Jacobi angles for simultaneous diagonalization. SIAM Journal on Matrix Analysis and Applications 17(1), 161–164 (1996). DOI 10.1137/S0895479893259546
  • [6] Chiles, J.P., Delfiner, P.: Geostatistics: modeling spatial uncertainty. Wiley, New York (1999)
  • [7] Cichocki, A., Amari, S.i.: Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications. John Wiley & Sons, Inc., USA (2002)
  • [8] Comon, P., Jutten, C.: Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, Amsterdam (2010)
  • [9] Dobrić, V., Ojeda, F.M.: Fractional brownian fields, duality, and martingales. Lecture Notes-Monograph Series 51, 77–95 (2006)
  • [10] Filzmoser, P., Hron, K., Templ, M.: Applied Compositional Data Analysis. With Worked Examples in R. Springer International Publishing. Springer Nature Switzerland AG, Cham, Switzerland (2018). Springer Series in Statistics
  • [11] Georgiev, P., Cichocki, A.: Blind source separation via symmetric eigenvalue decomposition. In: Proceedings of the Sixth International Symposium on Signal Processing and its Applications (Cat.No.01EX467), vol. 1, pp. 17–20 (2001). DOI 10.1109/ISSPA.2001.949764
  • [12]

    Ilmonen, P., Nordhausen, K., Oja, H., Ollila, E.: A new performance index for ICA: Properties, computation and asymptotic analysis.

    In: V. Vigneron, V. Zarzoso, E. Moreau, R. Gribonval, E. Vincent (eds.) Latent Variable Analysis and Signal Separation, pp. 229–236. Springer (2010)
  • [13] Lietzén, N., Virta, J., Nordhausen, K., Ilmonen, P.: Minimum distance index for bss, generalization, interpretation and asymptotics. Austrian Journal of Statistics 49(4), 57–68 (2020). DOI 10.17713/ajs.v49i4.1130
  • [14] Luceño, A., Peña, D.: Autoregressive Integrated Moving Average (ARIMA) Modeling. American Cancer Society (2008). DOI
  • [15] Miettinen, J., Nordhausen, K., Taskinen, S.: Blind source separation based on joint diagonalization in R: The packages JADE and BSSasymp. Journal of Statistical Software 76(2), 1–31 (2017). DOI 10.18637/jss.v076.i02
  • [16] Miettinen, J., Taskinen, S., Nordhausen, K., Oja, H.: Fourth moments and independent component analysis. Statist. Sci. 30(3), 372–390 (2015). DOI 10.1214/15-STS520. URL
  • [17] Muehlmann, C., Filzmoser, P., Nordhausen, K.: Local difference matrices for spatial blind source separation. In: To appear in “Proceedings of the 3rd Conference of the Arabian Journal of Geosciences”. Springer (2020)
  • [18] Muehlmann, C., Nordhausen, K., Virta, J.: SpatialBSS: Blind Source Separation for Multivariate Spatial Data (2020). URL R package version 0.9-0
  • [19]

    Muehlmann, C., Nordhausen, K., Yi, M.: On cokriging, neural networks, and spatial blind source separation for multivariate spatial prediction.

    IEEE Geoscience and Remote Sensing Letters pp. 1–5 (2020). DOI 10.1109/LGRS.2020.3011549
  • [20] Nordhausen, K., Oja, H.: Independent component analysis: A statistical perspective. WIREs Computational Statistics 10(5), e1440 (2018). DOI
  • [21] Nordhausen, K., Oja, H., Filzmoser, P., Reimann, C.: Blind source separation for spatial compositional data. Mathematical Geosciences 47(7), 753–770 (2015)
  • [22] Nordhausen, K., Tyler, D.E.: A cautionary note on robust covariance plug-in methods. Biometrika 102(3), 573–588 (2015). DOI 10.1093/biomet/asv022
  • [23] Oja, H., Sirkiä, S., Eriksson, J.: Scatter matrices and independent component analysis. Austrian Journal of Statistics 35, 175–189 (2016). DOI 10.17713/ajs.v35i2&3.364
  • [24] Pan, Y., Matilainen, M., Taskinen, S., Nordhausen, K.: A review of second-order blind identification methods. WIREs Computational Statistics p. e1550 (2021). DOI
  • [25] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2019). URL
  • [26] Reimann, C., Birke, M., Demetriades, A., Filzmoser, P., O’Connor, P. (eds.): Chemistry of Europe’s Agricultural Soils, Part A. Schweizerbart Science Publishers (2014)
  • [27] Schlather, M., Malinowski, A., Menck, P.J., Oesting, M., Strokorb, K.: Analysis, simulation and prediction of multivariate random fields with package RandomFields. Journal of Statistical Software 63(8), 1–25 (2015). URL
  • [28] Templ, M., Hron, K., Filzmoser, P.: robCompositions: an R-package for robust statistical analysis of compositional data. John Wiley and Sons (2011)
  • [29]

    Virta, J., Li, B., Nordhausen, K., Oja, H.: Independent component analysis for tensor-valued data.

    Journal of Multivariate Analysis

    162, 172–192 (2017).