1. Introduction
Models are commonly used in water resources research to investigate the dominant hydrological processes and the quantity and quality of water resources in welldefined surface or subsurface catchments. Various modeling approaches exist, ranging from blackbox models [33, 32, 27], i. e., transferring an input signal to a desired output signal, over lumped parameter models (greybox) [13, 36] to distributed processbased models [39, 15, 23, 41]. Given their ability to represent the physical characteristics of a catchment in detail, distributed processbased models are usually the first choice in water resources research. In the particular case of karst aquifers, however, acquiring the relevant data for these models is challenging due to the heterogeneous nature of karstic systems and their mostly unknown subsurface drainage systems [48]. For this reason, lumped processbased models are commonly accepted modeling approaches in karst water resources research [28, 26, 21]
. The parameters of such lumped modeling approaches are typically not directly measurable in the field and need to be estimated in the framework of model calibration
[20]. This leads to a decisive tradeoff: on the one hand, lumped models based on a low number of calibration parameters, e. g., 4 to 6, are less prone to model result nonuniqueness [25, 1], i. e., different parameter combinations lead to the same result. However, the representation of the dominant hydrological processes in karst systems may be too simple and not sufficiently represented by this low number of parameters [19]. In contrast, by including more calibration parameters to better represent relevant processes in the model structure, such as the effect of land use changes on spring discharges, the outputs may become nonunique, which can reduce the prediction accuracy of the model [18]. To tackle the challenge of applying lumped parameter models with a highdimensional parameter space for karst hydrological research studies, there is a need to perform comprehensive parameter studies to avoid model overparametrization and to reduce model parameter and output uncertainties.With the rise of computational power in the last two decades, Bayesian inverse problems have become a popular part of comprehensive parameter studies for complex realworld models (e. g., [29, 30, 43, 46]
). In contrast to classical inverse problems, whose formulation is often illposed, it allows for a wellposed and solvable formulation of parameter estimation problems and aims at finding a posterior distribution on the model parameters incorporating the information of measured, noisy data. The posterior, incorporating uncertainty in the parameters, is a conditional distribution and therefore is accessible through Bayes’ theorem. This formula consists of the product of the likelihood function and the prior, a distribution assumed on the parameters modeling prior knowledge of the physics and the model. A common objective is to construct samples from the posterior distribution which can be achieved by the application of
Markov chain Monte Carlo(MCMC) algorithms. MCMC methods, however, can be computationally expensive for highdimensional problems, since the Markov chain has to explore the whole space to find regions of high probability. There has been much effort in developing algorithms that are more efficient and reduce computational expense. One strategy is to use adaptive sampling methods as shown in e. g.,
[16, 47, 35]. Another approach is to reduce the effective dimension of the Bayesian inverse problem (e. g., [4, 6, 11, 12]). One technique for dimension reduction is the active subspace method presented in [10], [8], and [40], which seeks orthogonal directions that are dominant and drive the update from the prior to the posterior distribution. The subspace, spanned by the dominant directions, leads to a new coordinate system a lowdimensional Markov chain can move in. Chains in lower dimensions have preferable properties concerning autocorrelation times which makes them, as producers of posterior samples, more efficient. As a side effect, the active subspace can be exploited also to calculate global sensitivity metrics for individual parameters as shown in [9]. The active subspace method has previously been successful in reducing the effective dimension of parameter spaces for efficient Bayesian inversion, e. g., for a complex subsurface process [44].There has been little effort in karst hydrological research to investigate the lowdimensionality of a corresponding parameter estimation problem. Therefore, in this work we use the active subspace method to investigate the parameter relationships for a lumped karst aquifer model with a highdimensional parameter space. The model we use is the LuKARS model, which was recently developed by [3] to perform land use change impact studies in karstic environments. We hypothesize that it is possible to reduce the dimensions of the parameter space in LuKARS and, by that, to better constrain the parameter ranges of the most sensitive model parameters. Better constrained parameter ranges can reduce the model parameter and model result uncertainties.
This article is organized as follows. Section 2 provides a brief introduction into the study area and the structure of the LuKARS model. In Section 3, we explain Bayesian inversion, how we exploit active subspaces for it, the construction of global sensitivity values and the concrete setting for our application. The computational results are presented in Section 4, followed by a comprehensive discussion in Section 5 and a summary in Section 6.
2. Case study
2.1. Kerschbaum spring recharge area
The karst spring that we investigate in the present study is the Kerschbaum spring located about 10km south of the city of Waidhofen a.d. Ybbs (Austria) (Fig 1a and b). Its recharge area was delimited in a former study by [17] and comprises about 2.5 km. This prealpine catchment is part of the easternmost foothills of the Northern Calcareous Alps with the lowest elevation of 435 m at the Kerschbaum spring and a maximum elevation of 868 m on the summit of the mountain Glashüttenberg. The climate of the study area can be described as warmmoderate, with an annual mean temperature of 8 C and an annual mean precipitation of 1379 mm, both determined from daily measuring data recorded at the Hinterlug weather station between 1981 and 2014. As shown in Fig. 1b, forests represent the dominant land cover in the study area with beeches as primary tree species. Moreover, parts of the recharge area are used for dolomite mining.
From a geological point of view, the entire recharge area of the Kerschbaum spring is dominated by a lithologic sequence of Triassic dolostones (Fig. 1c). Apart from the absence of significant sinkholes in the regarded recharge area, leading to the fact that diffuse infiltration plays a key role for groundwater recharge, [17] also provided evidence for a deep karstified aquifer system with a wellconnected drainage system through fractures and conduits in the Kerschbaum spring aquifer. It is important to note that the Kerschbaum spring represents the most important source for the freshwater supply of the city and the surroundings of Waidhofen and is thus of particular interest for water resources research studies [3].
2.2. The LuKARS model
The LuKARS model was recently proposed by [3] with the aim to investigate the hydrological effects of land use changes in karst systems. LuKARS therefore considers the dominant hydrotopes in a defined recharge area, i. e., areas characterized by homogeneous soil and land cover properties, as distinct spatial units. The sum of the individual hydrotope responses to a given input signal (e. g., precipitation) plus the contribution of a shared linear baseflow storage is then the total spring discharge that should be modeled at a catchment’s outlet. The LuKARS model for the Kerschbaum spring in Waidhofen a.d. Ybbs was set up in [3] and includes four dominant hydrotopes in the considered recharge area. The hydrotopes were mapped and described during prior investigations by [31]. Their distribution in the study area is shown in Fig. 1d. Hydrotopes 13 have beeches as dominant tree species; however, they differ in terms of their individual soil characteristics and spatial shares. While the first hydrotope (denoted by Hyd 1) covers 13% of the recharge area and is characterized by shallow soils with mostly coarsegrained soil particles, hydrotope 3 (denoted by Hyd 3), in contrast, covers 27% of the catchment and is defined by deeper and fine textured soils. Hydrotope 2 (denoted by Hyd 2) has the largest spatial share in the Kerschbaum spring recharge area (56%) and represents a transition hydrotope between Hyd 1 and Hyd 3 with moderate soil thicknesses and coarse to finetextured soils. Hydrotope Q (denoted by Hyd Q) characterizes the dolomite quarries, which covered about 4% of space in the recharge area during the model period which is 20062008 in this study.
From a hydrological point of view, the areas of the dolomite quarries are drained by surface runoff and do not contribute to the Kerschbaum spring discharge. Thus, Hyd Q is excluded from model calibration and will not be mentioned hereafter. More details about the LuKARS model, i. e., a description of the equations used in LuKARS and the relevant parameters, are provided in Appendix A. In the following, we use an index to denote specifications for Hyd .
Each hydrotope is modeled as an independent bucket that has three different discharge components. The first, representing quickflow () occurring via preferential flow paths (e. g., conduits), is described by a nonlinear hysteresis function that is activated once a defined storage threshold () is reached and stops after the storage value falling below a predefined minimum storage value (). The second and third discharge components are both implemented by a linear discharge function and represent the discharge to a shared baseflow storage () as well as secondary spring discharge (), i. e., a discharge component that transfers water out of the catchment and does not contribute to the spring discharge. All together, seven parameters need to be calibrated for the implementation of each single hydrotope. These are the discharge parameter and the dimensionless exponent for , the storage thresholds for the quickflow activation () and (), parameter as the discharge coefficient of and, finally, and
as the discharge coefficient and the activation level for
, respectively. Given the different physical characteristics of all defined hydrotopes, the parameters of one hydrotope need to follow some constraints with respect to the parameters used for the implementation of other hydrotopes. From a practical point of view, this means that a hydrotope with shallow and coarsegrained soils (e. g., Hyd 1) needs to have a lower storage capacity and higher discharge coefficient as compared to a hydrotope with deep and finetextured soils (e. g., Hyd 3). For the particular case of the three hydrotopes in the Kerschbaum spring recharge area, the parameter constraints are given as follows:(1) 
[3] manually calibrated the LuKARS model for the Kerschbaum spring recharge area. Based on this trialanderror calibration, it was possible to reliably determine possible ranges of all model parameters. These are shown in Table 1 and will be used as prior parameter intervals for the presented study in a Bayesian setting.
No.  Parameter  Lower bound  Upper bound  Unit 

md  
mm  
mm  
–  
m mmd  
m mmd  
mm  
md  
mm  
mm  
–  
m mmd  
m mmd  
mm  
md  
mm  
mm  
–  
m mmd  
m mmd  
mm 
3. Parameter inference
3.1. Active subspaces
The active subspace method, introduced in [10], [8], and [40], identifies dominant directions in the domain of a multivariate, scalarvalued function . In other words, we seek directions on which varies more than on other directions, on average. Consider a function of the form for each , where , , is a rectangular matrix. Such functions are called ridge functions; see [38]
. Take a vector
from the null space of , i. e., , and compute(2) 
This equation shows that is constant along the null space of meaning that the dimensional function is actually intrinsically dimensional. In practice, the goal is relaxed to finding approximations and such that it holds that . Thus, we hope to find directions along which varies only slightly.
In following, let
be continuously differentiable such that its partial derivatives are squareintegrable with respect to a given probability density function
. We study a matrix which is the averaged outer product of the gradient of with itself, i. e.,(3) 
Note that is symmetric and positive semidefinite. For some vector , compute
(4) 
Thus, displays the averaged variation of along
. This quantity is maximized (in the set of unit vectors) by the normalized eigenvector
ofcorresponding to the largest eigenvalue
and gives(5) 
Since small eigenvalues mean a small variation in the direction of corresponding eigenvectors, this observation suggests to compute an orthogonal eigendecomposition , where contains the eigenvectors and contains corresponding eigenvalues in decreasing order. The symmetry of allows us to choose giving an orthonormal basis of . The eigenvalues and eigenvectors can be exploited to span a lowerdimensional space along which is dominantly varying. We can decide to split after the th column and to neglect the space spanned by , i. e.,
(6) 
such that and . We write
(7) 
where is called the active variable and the inactive variable. The span of , i. e., , is called the active subspace (of ). In other words, the coordinate system is transformed to a new orthogonal basis given by the eigenvectors. The new axes corresponding to the eigenvector with the largest eigenvalue is aligned to the direction of maximum averaged variation of in the original coordinate system.
The matrix is generally not available exactly in practice and must be approximated. [7], [24] and [34] proposed and analyzed a Monte Carlo approximation, i. e.,
(8) 
where , . The recommended number of samples
required to get a sufficiently accurate estimate of eigenvalues and eigenvectors is heuristically given by
(9) 
for a socalled sampling factor . The factor denotes the number of eigenvalues/eigenvectors to be estimated accurately. The heuristic is motivated in [7]
by results from random matrix theory in
[45]. Since we compute only an approximation/perturbation of the exact matrix , eigenvalues and eigenvectors are also only available in perturbed versions, i. e.,(10) 
Perturbed active and inactive variables are denoted by and , respectively.
We additionally need a function defined on the lowdimensional (perturbed) active subspace approximating as a ridge function, i. e.,
(11) 
for each . It is known that the best approximation in an sense is the conditional expectation conditioned on the active variable , i. e.,
(12) 
The conditional probability density function is defined in the usual way, see e. g., [2, Section 20 and 33].
In Section 4, we make use of a cheap response surface to gained by a polynomial regression approach since evaluating , or even a Monte Carlo approximation of it, can get costly due to additional evaluations of required. This surrogate is constructed according to instructions described in Algorithm 1.
Assume samples , according to and corresponding function values , are given.

Compute samples in the active subspace by
(13) 
Find a regression surface for pairs such that
(14) 
Get a lowdimensional approximation of at by computing
(15)
3.2. Global sensitivity analysis with active subspace
[9] show that it is possible to get global sensitivity values from the active subspace. Since the expensive computations for building the matrix are already done, no further huge computational costs are needed. By “global” we mean that the sensitivities, assigned to each parameter individually, are averaged quantities. In particular, the matrix from Eq. 3, which will be exploited to compute global sensitivities, is constructed with gradients of the function of interest at different locations weighted with a given probability density . For our application, the function and the density are taken to be the data misfit and the prior density from the Bayesian context described in Subsection 3.3.
The vector of sensitivities , in which the th component displays the (global) sensitivity of w.r.t. , is in [9] computed via
(16) 
Here, we will set . Thus, we can write more compactly
(17) 
where is the vector of eigenvalues and denotes pointwise multiplication. [9] show that the sensitivity values gained are comparable in practical situations with more familiar metrics like variancebased sensitivities, also known as Sobol indices, introduced in [42]. However, in theory it is possible to construct functions where the two metrics provide different results.
Similarly to the estimated quantities in previous sections, we will only have an estimate available due to the finite approximation of . In general, it is hard to give strict bounds for the number of samples required to get a sufficiently accurate approximation to . Hence, we use as many samples as were shown to be sufficient in [9].
3.3. Bayesian inversion
The aim of Bayesian inversion is to interrogate a posteriorprobability distribution on the space of parameters , , that incorporates uncertainty in the estimated parameters due to noise in the experimental data. [43] gives a rigorous mathematical framework for Bayesian inverse problems, even in infinitedimensional parameter spaces. The starting point in Bayesian inversion is a prior probability distribution that serves as a first guess on the distribution of the parameters without any incorporation of data. This choice is often driven by intuition or expert knowledge. Mathematically speaking, we seek a distribution on conditioned on the observation of specific measured data. This leads directly to the wellknown Bayes’ theorem.
Data , , is here modeled as
(18) 
where is additive Gaussian noise, modeling measurement errors, with mean zero and covariance matrix and is called the parametertoobservation map. This map is composed of a forward operator , displaying, e. g., the solution to a PDE, and an observation operator , being, e. g., a linear functional on the PDE solution space , i. e., . By Bayes’ theorem, we can define the posterior density as
(19) 
where is a normalizing constant to get a proper probability density with unit mass. The likelihood denotes the probability that a parameter is explaining the data corrupted by noise. In this context, i. e., assuming additive Gaussian noise, the likelihood is given by
(20) 
with the data misfit function and .
The posterior density is often intractable since its evaluation requires the solution of a potentially computationally intense problem hidden in the forward operator
. The situation becomes even worse if the inverse problem is stated in a highdimensional parameter space. A common way to interrogate an expensive posterior distribution is to construct samples distributed according to the posterior. However, many sampling techniques suffer from the curse of dimensionality. Wellknown sampling approaches comprise, but are by no means complete, e. g., Markov chain Monte Carlo (MCMC),
Sequential Monte Carlo (SMC), Importance Sampling and combinations of them.In this work, we use a MetropolisHastings algorithm from [22] which belongs to the class of MCMC methods. The algorithm constructs a discrete Markov chain whose components are taken as samples and are stationarily distributed according to the desired distribution which is the posterior here. The samples are naturally correlated which is a drawback compared to other sampling techniques that produce independent samples. However, advantages of this algorithm are the absence of restricting assumptions and the fact that it does not suffer from the curse of dimensionality as badly as other samplers. Nevertheless, MCMC methods can have deteriorating behavior in higher dimensions because the number of steps needed to get a sufficiently small correlation between two samples can be rather large. Since the forward operator is evaluated in every step in the MetropolisHastings algorithm, the standard usage of the algorithm can get computationally expensive, especially if is costly.
In this manuscript, we run a standard MetropolisHastings algorithm in low dimensions. For finding lowdimensional structure in our problem, we apply the active subspace method, described in the previous subsection, which allows to find dominant directions in a parameter space that drive the update from the prior to the posterior distribution in Bayesian inverse problems. Additionally, it provides a cheap surrogate of the data misfit function in the lowdimensional space.
3.4. MCMC in the active subspace
For Bayesian inversion, the function of interest that we aim to approximate with a lowdimensional approximation is the data misfit function from Eq. 20, i. e.,
(21) 
The gradient of needed for the computation of is
(22) 
where denotes the Jacobian matrix of the parametertoobservation operator .
Not only the perturbed active subspace, but also the cheap surrogate for , given by
(23) 
can be exploited for an accelerated MCMC algorithm in lower dimensions producing posterior samples for a Bayesian inverse problem as shown in [11]. By acceleration, we mean that the mixing behavior of the resulting Markov chains constructed by a standard MetropolisHastings algorithm can improve in lower dimensions. As a consequence, the computational effort to produce a certain amount of effective samples is reduced.
In a first step, we compute samples of the posterior distribution defined on the lowdimensional subspace, called active posterior samples. In order to evaluate the (approximate) posterior density in the active subspace given by
(24) 
where is the response surface approximating , we need an approximation to
denoting the marginal prior density on the perturbed active variable. The marginal prior density is in general not analytically available and has to be estimated, e. g., with kernel density estimation (KDE). Note that for the MetropolisHastings algorithm it is only important to know the density up to a constant. The algorithmic details are given in Algorithm
2.Assume a state is given at step . Let be a symmetric proposal density function and denote the surrogate on with . Furthermore, suppose a density estimating is given. Then, one step of the algorithm is:

Propose a candidate .

Compute the acceptance probability with
(25) 
Draw a uniform sample .

Accept/reject according to .
The active posterior samples are naturally correlated. Nevertheless, the autocorrelation time is much lower compared to higher dimensional Markov chains. The socalled effective sample size (ESS) displaying estimation quality of a sequence of samples can be computed by a formula from [5],
(26) 
The th component of a sample is denoted by and the number of samples available by . The expression describes the autocorrelation between the th component of samples with lag . The maximum lag regarded is given by . We determine the final effective sample size with
(27) 
Our final goal is to construct samples of the posterior distribution in the original dimensional space. Eq. 7 suggests to sample components for each gained from Algorithm 2. Since it is generally not trivial to sample from the conditional distribution of given , , we have to run another MetropolisHastings algorithm. Note that for this sampling, we only need to sample from because is proportional to that distribution. We again compute effective samples of given an effective sample to finally translate them to posterior samples in the original fulldimensional parameter space.
3.5. Parameter setting
Before we describe the results of the parameter study in the next section, we discuss the calibrated parameters and the notation. As mentioned in Section 2, there are three hydrotopes with 7 variable parameters each. These parameters are called physical parameters in the following. All of the 7 parameters have the same physical meaning for each hydrotope.
The dependence of the parameters, that is caused by the constraints in Eq. 1 and spread across all hydrotopes, needs to be circumvented since the application of active subspaces in Bayesian inverse problems prefers independently distributed and normalized parameters. This is one reason why we introduce artificial parameters which we want to call calibration parameters in the following. Another reason is that the values are calibrated on a log scale. Therefore, we define
(28) 
for each , . For parameters in hydrotopes 2 and 3, we write
(29) 
or
(30) 
depending on whether the parameter follows a decreasing or, respectively, increasing behavior (see Eq. 1). The fixed values and denote the lower and upper bound for each calibration parameter. The bounds are computed to match the bounds defined for the physical parameters given in Table 1. The parameters are the newly introduced calibration parameters. Additionally, we have
(31) 
completing the set of calibration parameters with , .
Finally, all calibration parameters need to be normalized, i. e., they are mapped from their corresponding interval to . The calibration parameters denoting a difference between values of two succeeding hydrotopes, i. e., parameters from above, are already normalized by construction. However, the normalized parameters from the first hydrotope are denoted by a bar over their name, e. g., by . Summarizing, the vector of all normalized calibration parameters is
(32)  
4. Results
In the following, we take Gaussian noise at a level of for the measured spring discharge which translates to , , where is the covariance matrix. The noise level of the experimental data was kindly provided by the water works owner of the city Waidhofen a.d. Ybbs.
For the computation of from Eq. 8, we use gradient samples of , although only about 250 samples would be necessary to estimate the first eigenvectors sufficiently accurately according to Eq. 9 with a pessimistic sampling factor . The reason for choosing this rather large number is to make sure that the global sensitivity values, for which such a heuristic does not exist, are also estimated accurately. The gradient was approximated by central finite differences. Using seven cores of type Intel(R) Xeon(R) E5 at 3 GHz each, the required forward runs need about 4.3 hours since it required 2.5 seconds for a single run of the model.
The resulting eigenvalues and first four eigenvectors are plotted in Fig. 2 and Fig. 3 a)d), respectively. Fig. 2 shows the spectral decay on a logarithmic scale. Gaps after the first and fourth eigenvalue suggest the existence of one and fourdimensional active subspaces. Fig. 3 a)d) shows eigenvectors with components colored according to the hydrotope they are supposed to model. It shows that parameters 5, 12, and 19, having large contributions in the first three eigenvectors, take a dominant role. All of these parameters involve the value for each hydrotope. Moreover, we can observe a ranking between the corresponding hydrotopes of these parameters, with decreasing order of the values from Hyd 2, Hyd 1 to Hyd 3 in the first eigenvector. A different pattern is observed for the contributions of parameters related to , represented by parameters 1, 8, and 15 for Hyd 1, Hyd 2, and Hyd 3, respectively. These parameters appear in eigenvectors 2 to 4 and show a different ranking as compared to the parameters, with a decreasing contribution from Hyd 1, Hyd 2 to Hyd 3. It is important to highlight that parameter 15 only shows a small contribution in eigenvector 3 and 4. A third important parameter group is related to , here represented by parameters 6, 13, and 20. They appear in eigenvectors 2 to 4 and have comparable contributions in Hyd 1 and Hyd 2 (parameters 6 and 13) but only a minor contribution in Hyd 4 (see eigenvector 4). Interestingly, for hydrotope 1 and 2 the same parameters show up with a similar shape in eigenvector .
It is important to emphasize that the resulting eigenvectors for the Bayesian data misfit would look very similar to ones gained from using the NashSutcliffe model efficiency coefficient (NSE), which is a more common misfit function in hydrology, introduced in [37]. This can be explained by the fact that the NSE is just a translated and scaled Bayesian data misfit function with a specific covariance matrix modeling noise. These differences do not change dominant directions.
With the eigenvalue/vector plot, we have already gained some insight in the parameter sensitivities. Fig. 3 e) shows the global sensitivities of the data misfit function normalized to . The most sensitive parameter is , but also and have their contributions since they show up in and , respectively, with nonnegligible corresponding eigenvalues. At the same time, parameters 6, 13, and 20, involving values in the hydrotopes, show sensitive in the first eigenvectors. Parameters displaying
values have small contributions but only in hydrotope 1 and 2. All other parameters do not show much sensitivity since their components contribute only to eigenvectors having eigenvalues that are orders of magnitudes smaller than the first four. As a consequence, we expect that the more sensitive parameters change their distribution (and also joint distributions) from the prior to the posterior during Bayesian inference.
For Bayesian inversion with the active subspace, we use the calibration parameters from Eq. 32
and assume a uniform distribution for them, i. e.,
. The prior intervals for the physical parameters are given in Table 1. We decide for a 4D subspace and compute a 4th order polynomial to get the response surface of by Algorithm 1. Since polynomials of 4th order already have degrees of freedom, we compute another 20,000 samples of following the prior to avoid overfitting. Nevertheless, the 1,000 samples from the computation of would have been enough to get the same score which is . This score, also called coefficient of determination, is a statistical measure for the goodness of a fit and reflects the percent of variance explained. If predictions of a regression match perfectly well with the data points, the score becomes 1. In contrast, it can become less than zero, if the predicted values are worse than choosing the constant mean value of the data. In this regard, our score indicates that our surrogate is a sufficiently well behaved fit.The MetropolisHastings algorithm described in Algorithm 2 is used to construct a Markov chain giving (correlated) posterior samples in the active subspace. Its proposal variance was adjusted to in order to give an acceptance rate of . The resulting distribution on the active variables, attained from the effective samples, is displayed in Fig. 4. Note that the scales of the upper and lower row of plots are quite different. This due to the fact that if we did not change the lower scale, the lower histograms would basically become thin lines displaying no information about the variance of the distribution. However, we see that the active variables are substantially informed which is exactly what we hoped to achieve. Also note that the first plot in the upper row, displaying the 1D marginal prior distribution of the first active variable , is almost a classical (rectangular) uniform distribution. This is caused by the large contribution of only one parameter in which is in this case. The more parameters contribute to an active variable, the more its marginal prior distribution differs from a rectangle, see for example.
The samples in the inactive subspace are computed as described in Subsection 3.4 and composed with active posterior samples to give posterior samples in the original space. Resulting 1D marginal statistics of the physical parameters are given in Table 2 (left). As expected, the calibration parameters with significant components in the active subspace are highly informed. The first calibration parameter () having a small but not negligible contribution according to the sensitivity values in Fig. 3 e) is already only mildly informed. The other parameters do not change or only very little because of the choice of the active subspace.
No.  Phys. par.  Mean  Std. 

Phys. par.  Cor. coef.  

0.89  
0.77  
0.70  
0.70  
0.66  
0.64  
0.63  
0.59  
0.57  
0.56  
0.52 
Left: Posterior means and standard deviations of physical parameters. The informed parameters are highlighted in bold. Right: Highest 2D correlations for physical parameters.
Additionally, Table 2 (right) displays the highest resulting twodimensional posterior correlation coefficients of the physical parameters in LuKARS. They are consistent with the components of corresponding calibration parameters that show up in the 4D active subspace. The largest correlation occurs between and . A large correlation coefficient of is also found between the respective storage values of Hyd 2 and Hyd 3, and , as well as between and .
As a first verification of the posterior’s value, we approximate its pushforward distribution through the parametertoobservation operator . We computed 1,000 samples of the distribution , where denotes the posterior distribution with Lebesgue density . Fig. 5
a) shows the 95% quantile band around the data with 5% additive Gaussian noise assumed together with a 75% quantile band around the median of the posterior’s pushforward distribution. More loosely speaking, the plot shows that around 75% of all random parameters drawn from the posterior will lie within the inherent uncertainty of the observed discharge. The uncertainty in the dynamics around the measured discharges in the Kerschbaum spring is matched well by the uncertainty of the pushforward posterior distribution which confirms the choice of a 4D subspace. Since we started with an uninformed prior (a uniform distribution), we can not expect to end up with a pushforward posterior much more certain than the uncertainty in the experiments. At this point, we would like to emphasize that it is possible to get a reasonably good approximation of the posterior by considering only 4 directions in the space of 21 parameters. In this manner, particularly regarding the low flow conditions and the recession limbs of the peak discharges, we can observe a variation of only up to
5l/s which is roughly the variation due to experimental noise. Also, the mean and the median of the pushforward distribution give results agreeing with the data which, in addition, supports the decision for a 4D subspace.Interestingly, assuming a noise level of 10% and taking only a 1D subspace also leads to usable results. However, note that the score of was only about in this case. Although we see that the assumptions are rather unrealistic and the approximation quality of is too bad, it is worth noting that already the first eigenvector contains some information about the posterior meaning that the mean and median of the corresponding pushforward posterior give reasonable discharge values in comparison with the data as shown in Fig. 5 b).
5. Discussion
This section is devoted to give a more hydrological interpretation to the results showed in the previous section. The observed spring discharge is modeled as the sum of the relative contribution of each hydrotope. Moreover, the LuKARS model of the Kerschbaum spring has fast responding buckets (i.e., hydrotopes that quickly deliver water to the karst spring after precipitation events, e.g., Hyd 1) and slow responding hydrotopes (i.e., hydrotopes, which slowly deliver water to the spring after precipitation events, e.g., Hyd 3).
Parameters 5, 12, and 19 show the largest contributions in the first 4 eigenvectors in Fig. 3 a)d). These parameters represent the physical parameters, which delimit the flow contributions from the hydrotopes to the linear baseflow storage. As derived in [3], the baseflow storage exhibits a relatively constant discharge behavior with a small temporal variability and its discharge coefficient was not changed within the presented research study. Since the outflow from the baseflow storage is controlled by its variable storage () and its constant discharge coefficient (), the hydrotope discharge coefficients for the groundwater recharge () also affect the baseflow discharge and its temporal dynamics since they control . Given that was not included as a calibration parameter, the parameters are responsible to maintain the baseflow contribution as derived by [3] and are most informed in the first eigenvector when applying the active subspace method.
Although parameters 5, 12, and 19 have the same physical interpretation, we can observe that they display different values for the different hydrotopes. This is due to the fact that different hydrotopes cover areas which are different in extension (Hyd 1  13%, Hyd 2  56% and Hyd 3  27%). Therefore, the interpretation of the most important parameters occurring in an eigenvector, should both consider the physical meaning of the parameter and the relative contribution of each single hydrotope to the total spring discharge, which is highly affected by the relative area covered by the hydrotope. In this specific case, parameter 12, associated with Hyd 2, displays the largest value since it covers the largest area in the Kerschbaum spring catchment and thus has a significant contribution to the total spring discharge. Parameter 5 has the second largest value although Hyd 1 ranks as third in terms of coverage area. This is explained by the fact that Hyd 1 provides the most dynamic and variable discharge behavior of all hydrotopes. Hence, the discharge contribution from Hyd 1 is essential to reproduce the discharge dynamics observed in the Kerschbaum spring. Hyd 3 has the smallest contribution in eigenvector 1, which can be explained by its more constant and less variable discharge behavior as compared to Hyd 1 and its smaller spatial share as compared to Hyd 2. Hence, although it has a larger area covered than Hyd 2, parameter 19 is less dominant than parameter 5.
Although parameters 1 and 8 (involving values of Hyd 1 and Hyd 2) do not show up in eigenvector 1, their contribution to eigenvectors 2 to 4 is worth discussing. These parameters follow a different ranking as compared to the parameters, suggesting a larger sensitivity of parameter 1 from Hyd 1 as compared to parameter 8 from Hyd 2. Since the parameters constrain the quickflow dynamics originating from each hydrotope, we argue that this ranking is the result of the different hydrological behaviors each hydrotope is supposed to simulate. Considering that Hyd 1, which shows the most dynamic behavior in response to precipitation or melt events, has a large contribution to the temporal variability of the discharge in the Kerschbaum spring, the importance of adequately representing the quickflow dynamics from Hyd 1 can be regarded as more important than the relative space covered by each hydrotope. These global sensitivity metrics were gained by the active subspace method. However, we want to stress that it provides more information than just sensitivities w.r.t. single parameters as typically common methods for sensitivity analysis do (e. g., Sobol indices). It finds directions in a highdimensional space along which a function is changing more, on average, than along other directions. For an inference process, this means that linear combinations of parameters, who get informed by data, are identified. This fact is exploited in the present study by the application of an adjusted MCMC algorithm in lower dimensions saving computational costs for sampling the Bayesian posterior distribution.
It is interesting to observe that the posterior means of the informed physical model parameters (, , see Table 2) are close to corresponding calibrated parameters found by [3]. Moreover, the standard deviations of the parameters in the posterior distribution are smaller as compared to the standard deviations found for the posterior distributions of all other physical parameter. This provides evidence that we can use the components of the first eigenvectors derived from the active subspace method to show which parameters get individually updated from the prior to the posterior distribution. Moreover, in comparison to the parameters obtained by manual calibration, we obtain an uncertainty related to any model parameter as well as a mathematicallysound choice of the parameters.
We argue that the high positive correlation coefficient between and results from the dependence of the overall model output on the quickflow dynamics of Hyd 1 [3]. The dynamics of the quickflow depend strongly on the difference between and (see Eq. 37) and thus results in a high correlation coefficient between both storage thresholds. The quickflow variations of Hyd 2 and Hyd 3 are less pronounced than in Hyd 1 resulting in a lower, but still considerable correlation coefficient between and as well as between and . The positive correlations between the discharge coefficients of Hyd 2, in particular between and , and as well as between and , highlight the strong interdependence of all discharge components that originate from Hyd 2. The strongest correlation (0.66) is between the discharge coefficient of the quickflow () and the discharge coefficient of the secondary spring discharge (). This means that, if we increase and not , the quickflow contribution increases disproportionately and the total simulated spring discharge would overestimate the observed peak discharges. The same relationship holds for the strong correlation (0.59) between the discharge coefficient of the quickflow () and the discharge parameter of the groundwater recharge () as well as between the and (0.52). These correlations confirm the fact that if we increase the discharge coefficient of one discharge component in a certain hydrotope, we need to simultaneously increase all other discharge coefficients in the same hydrotope to get a similar model output. If the other coefficients were not changed accordingly, we would disproportionately increase one discharge component (e. g., quickflow from Hyd 2) relative to others (e. g., or ); so, the hydrotope would show a different hydrological behavior. This highlights that the parameter dependencies within each hydrotope individually help to maintain the hydrological behavior that is typical for each hydrotope. The reason why only the discharge parameters of Hyd 2 show high correlation coefficients is that Hyd 2 covers more than 50% of the Kerschbaum spring recharge area and, thus, has the highest contributions to the total spring discharge. The correlations between various discharge coefficients of different hydrotopes, e. g., and , are interpreted as a consequence of the parameter constraints introduced in Eq. 1.
6. Summary
This manuscript shows results from a parameter study of a karst aquifer model for the Kerschbaum spring recharge area. The model uses 21 parameters to simulate the discharge behavior of the Kerschbaum karst spring in Waidhofen a.d. Ybbs. The study consists of a parameter inference in the Bayesian sense and a (global) sensitivity analysis. Since these problems have a nontrivial dimension, we first check for lowdimensional structure, if present, hidden in the inference process and exploit the socalled active subspace method for this. Additionally, without further expensive computations, we are then able to derive suitable sensitivity metrics.
It seems that the inference process is indeed intrinsically lowdimensional. Although the LuKARS model for the Kerschbaum spring has 21 calibration parameters, given the parameter constraints in Eq. 1, we find its dominant parameters and obtain wellconstrained values for them by means of Bayesian inversion in the identified active subspace. In particular, we decide to reduce the Bayesian inverse problem from a 21D to a 4D problem which is verified by showing that the pushforward distribution of the approximated posterior has a promising similarity with the uncertainty in the data. The 1D and 2D posterior statistics, which differ a lot from corresponding prior statistics for dominant parameters, are computed to quantify uncertainty in the inference caused by measurement errors in the data.
Eventually, the active subspace method shows again to be valuable for Bayesian inference and sensitivity analysis in complex highdimensional real world problems. The results are, however, rather useful from a computational perspective. The indepth validation of the model with further sensitivity analyses, more interesting from a hydrological perspective, and a discussion of the consequences for the community are out of scope and, hence, not part of this study, but will follow in future research.
Appendix A Model equations
In LuKARS, the following balance equation is solved for each individual hydrotope:
(33) 
is the water level [L] in hydrotope , [T] indicates the time and is a hydrotopespecific sink and source term in form of a mass balance of precipitation, snow melt, evapotranspiration and interception. [LT] summarizes all flow terms that do not contribute to the discharge at an investigated karst spring, i. e., secondary spring discharge and overland flow. [LT] represents the discharge from hydrotope to a linear baseflow storage, considered as groundwater recharge. [LT] is a hydrotopespecific quickflow component through preferential flow paths (e. g., subsurface conduits) with a direct connection to the spring outlet. [L] is the space covered by a respective hydrotope.
The following balance equation is solved for the baseflow storage:
(34) 
is the water level [L] in the baseflow storage and [LT] integrates the flows from all hydrotopes to the baseflow storage. [LT] indicates water flow from the storage B to the spring and simulates the matrix contribution from the saturated zone to the spring discharge. The variable [L] is the space of the entire recharge area. The discretized forms of 33 and 34, as shown in 35 and 36, are solved for each time step :
(35) 
(36) 
The discharge terms are computed as:
(37) 
(38) 
(39) 
(40) 
[L] and [L] are the upper and lower storage thresholds of hydrotope . [L] represents a hydrotopespecific activation level for . [LT] and [LT] are the specific discharge parameters for [LT] and [LT]. [LT] indicates the specific discharge parameter for the quickflow and [L] is the mean distance of hydrotope to the adjacent spring, allowing to account for the relative location and distribution of hydrotope in a specific recharge area. The ratio between and is the hydrotope discharge coefficient. Then, the dimensionless connectivity/activation indicator is defined as:
(41) 
(42) 
Acknowledgments
This collaborative research is a result of the UNMIX project (UNcertainties due to boundary conditions in predicting MIXing in groundwater), which is funded by the International Graduate School for Science and Engineering (IGSSE) of the Technical University of Munich. Additional financial support for BW, SM, and MTP was provided by the German Research Foundation (DFG, Project WO 671/111). GC also acknowledges the support of the Stiftungsfonds für Umweltökonomie und Nachhaltigkeit GmbH (SUN). The authors further thank the water works Waidhofen a.d. Ybbs for providing the relevant spatial and time series data. Data used to generate the figures and tables are available at https://bitbucket.org/mparente/2019karstunmixdata/.
References
 [1] Keith Beven. A manifesto for the equifinality thesis. Journal of Hydrology, 320(12):18–36, 2006.
 [2] Patrick Billingsley. Probability and Measure. John Wiley & Sons, 1995.
 [3] Daniel Bittner, Tahoora Sheikhy Narany, Bernhard Kohl, Markus Disse, and Gabriele Chiogna. Modeling the hydrological impact of land use change in a dolomitedominated karst system. Journal of Hydrology, 567:267–279, 2018.
 [4] Scott E Boyce and William WG Yeh. Parameterindependent model reduction of transient groundwater flow models: Application to inverse problems. Advances in water resources, 69:168–180, 2014.
 [5] Steve Brooks, Andrew Gelman, Galin Jones, and XiaoLi Meng. Handbook of Markov Chain Monte Carlo. CRC press, 2011.
 [6] T. BuiThanh and M. Girolami. Solving largescale PDEconstrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo. Inverse Problems, 30(11):114014, 23, 2014.
 [7] Paul Constantine and David Gleich. Computing active subspaces with Monte Carlo. arXiv preprint arXiv:1408.0545, 2014.
 [8] Paul G. Constantine. Active Subspaces, volume 2 of SIAM Spotlights. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2015. Emerging Ideas for Dimension Reduction in Parameter Studies.
 [9] Paul G Constantine and Paul Diaz. Global sensitivity metrics from active subspaces. Reliability Engineering & System Safety, 162:1–13, 2017.
 [10] Paul G. Constantine, Eric Dow, and Qiqi Wang. Active subspace methods in theory and practice: applications to kriging surfaces. SIAM J. Sci. Comput., 36(4):A1500–A1524, 2014.
 [11] Paul G. Constantine, Carson Kent, and Tan BuiThanh. Accelerating Markov chain Monte Carlo with Active Subspaces. SIAM J. Sci. Comput., 38(5):A2779–A2805, 2016.
 [12] Tiangang Cui, Kody J. H. Law, and Youssef M. Marzouk. Dimensionindependent likelihoodinformed MCMC. J. Comput. Phys., 304:109–137, 2016.
 [13] Perrine Fleury, Bernard Ladouche, Yann Conroux, Hervé Jourde, and Nathalie Dörfliger. Modelling the hydrologic functions of a karst aquifer under active water management–the lez spring. Journal of Hydrology, 365(34):235–243, 2009.
 [14] GBA. Geologische Bundesländerkarten, Geologische Bundesanstalt Österreich. https://gisgba.geologie.ac.at/arcgis/services/image/AT_GBA_GK100_200/ImageServer/WMSServer?request=GetCapabilities&, 2018. April 18th, 2018.
 [15] M Giese, T Reimann, V BaillyComte, JC Maréchal, M Sauter, and T Geyer. Turbulent and laminar flow in karst conduits under unsteady flow conditions: Interpretation of pumping tests by discrete conduitcontinuum modeling. Water Resources Research, 54(3):1918–1933, 2018.
 [16] Heikki Haario, Marko Laine, Antonietta Mira, and Eero Saksman. DRAM: Efficient adaptive MCMC. Stat. Comput., 16(4):339–354, 2006.
 [17] P. Hacker. Hydrologischhydrogeologische Untersuchungen im Bereich des Glashüttenberges zur Frage des engeren Schutzgebietes für die KerschbaumerQuelle. ARC Seibersdorf research GmbH, 2003.
 [18] A Hartmann, N Goldscheider, T Wagener, J Lange, and M Weiler. Karst water resources in a changing world: Review of hydrological modeling approaches. Reviews of Geophysics, 52(3):218–242, 2014.
 [19] A Hartmann, T Wagener, A Rimmer, J Lange, H Brielmann, and M Weiler. Testing the realism of model structures to identify karst system processes using water quality and quantity signatures. Water Resources Research, 49(6):3345–3358, 2013.
 [20] Andreas Hartmann, Juan Antonio Barberá, and Bartolomé Andreo. On the value of water quality observations for karst model parameterization. Hydrol. Earth Syst. Sci., 21:5971–5985, 2017.
 [21] Andreas Hartmann, Markus Weiler, Thorsten Wagener, Jens Lange, M Kralik, F Humer, N Mizyed, A Rimmer, JA Barberá, B Andreo, et al. Processbased karst modelling to relate hydrodynamic and hydrochemical characteristics to system properties. Hydrology and earth system sciences, 17(8):3305–3321, 2013.
 [22] W Keith Hastings. Monte Carlo sampling methods using Markov chains and their applications. 1970.
 [23] Wesley R. Henson, Rob de Rooij, and Wendy Graham. What makes a firstmagnitude spring?  global sensitivity analysis of a speleogenesis model to gain insight into karst network and spring genesis. Water Resources Research, 2018.
 [24] John T Holodnak, Ilse CF Ipsen, and Ralph C Smith. A Probabilistic Subspace Bound with Application to Active Subspaces. arXiv preprint arXiv:1801.00682, 2018.
 [25] AJ Jakeman and GM Hornberger. How much complexity is warranted in a rainfallrunoff model? Water resources research, 29(8):2637–2649, 1993.
 [26] Hervé Jourde, Naomi Mazzilli, Nicolas Lecoq, Bruno Arfib, and Dominique Bertin. Karstmod: A generic modular reservoir model dedicated to spring discharge modeling and hydrodynamic analysis in karst. In Hydrogeological and Environmental Investigations in Karst Systems, pages 339–344. Springer, 2015.
 [27] Damir Jukić and Vesna DenićJukić. Estimating parameters of groundwater recharge model in frequency domain: Karst springs jadro and žrnovnica. Hydrological Processes: An International Journal, 22(23):4532–4542, 2008.
 [28] Damir Jukić and Vesna DenićJukić. Groundwater balance estimation in karst by using a conceptual rainfall–runoff model. Journal of Hydrology, 373(34):302–315, 2009.
 [29] Jari Kaipio and Erkki Somersalo. Statistical and Computational Inverse Problems, volume 160. Springer Science & Business Media, 2006.
 [30] Peter K Kitanidis and Jonghyun Lee. Principal component geostatistical approach for largedimensional inverse problems. Water resources research, 50(7):5428–5443, 2014.
 [31] R. Koeck and E. Hochbichler. Das WaldHydrotopmodell als WSMSWerkzeug im Quellenschongebiet der Stadt Waidhofen/ybbs. Report in the course of the CCWaterS project, 2012.
 [32] D Labat, R Ababou, and A Mangin. Linear and nonlinear input/output models for karstic springflow and flood prediction at different time scales. Stochastic environmental research and risk assessment, 13(5):337–364, 1999.
 [33] David Labat, Rachid Ababou, and A Mangin. Rainfall–runoff relations for karstic springs. part i: convolution and spectral analyses. Journal of Hydrology, 238(34):123–148, 2000.
 [34] Rémi Lam, Olivier Zahm, Youssef Marzouk, and Karen Willcox. Multifidelity Dimension Reduction via Active Subspaces. arXiv preprint arXiv:1809.05567, 2018.
 [35] Steven A Mattis and Barbara Wohlmuth. Goaloriented adaptive surrogate construction for stochastic inversion. Computer Methods in Applied Mechanics and Engineering, 339:36–60, 2018.
 [36] Naomi Mazzilli, Vincent Guinot, Hervé Jourde, Nicolas Lecoq, David Labat, B Arfib, Cécile Baudement, Charles Danquigny, Lucie Dal Soglio, and Dominique Bertin. Karstmod: a modelling platform for rainfalldischarge analysis and modelling dedicated to karst systems. Environmental Modelling & Software, 2017.
 [37] J Eamonn Nash and Jonh V Sutcliffe. River flow forecasting through conceptual models part i—a discussion of principles. Journal of Hydrology, 10(3):282–290, 1970.
 [38] Allan Pinkus. Ridge functions, volume 205. Cambridge University Press, 2015.
 [39] Thomas Reimann and Melissa E Hill. Modflowcfp: A new conduit flow process for modflow–2005. Groundwater, 47(3):321–325, 2009.
 [40] Trent Michael Russi. Uncertainty Quantification with Experimental Data and Complex System Models. PhD thesis, University of California, Berkeley, 2010.
 [41] Martin Sauter, Tobias Geyer, Attila Kovács, and Georg Teutsch. Modellierung der Hydraulik von Karstgrundwasserleitern–eine Übersicht. Grundwasser, 11(3):143–156, 2006.
 [42] Ilya M Sobol. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and computers in simulation, 55(13):271–280, 2001.
 [43] A. M. Stuart. Inverse problems: A Bayesian perspective. Acta Numerica, 19:451–559, 2010.
 [44] Mario Teixeira Parente, Steven Mattis, Shubhangi Gupta, Christian Deusner, and Barbara Wohlmuth. Efficient parameter estimation for a methane hydrate model with active subspaces. Computational Geosciences, 2018.
 [45] Joel A Tropp. Userfriendly tail bounds for sums of random matrices. Foundations of computational mathematics, 12(4):389–434, 2012.
 [46] Jasper A Vrugt, Philip H Stauffer, Th Wöhling, Bruce A Robinson, and Velimir V Vesselinov. Inverse modeling of subsurface flow and transport properties: A review with new developments. Vadose Zone Journal, 7(2):843–864, 2008.
 [47] Jasper A Vrugt, CJF Ter Braak, CGH Diks, Bruce A Robinson, James M Hyman, and Dave Higdon. Accelerating Markov chain Monte Carlo simulation by differential evolution with selfadaptive randomized subspace sampling. International Journal of Nonlinear Sciences and Numerical Simulation, 10(3):273–290, 2009.
 [48] Zexuan Xu, Nicolas Massei, Ingrid Padilla, Andrew Hartmann, and Bill Hu. Characterization, modeling, and remediation of karst in a changing environment. Environmental earth sciences, 77(12):476, 2018.