Analysis of sloppiness in model simulations: unveiling parameter uncertainty when mathematical models are fitted to data

This work introduces a Bayesian approach to assess the sensitivity of model outputs to changes in parameter values, constrained by the combination of prior beliefs and data. This novel approach identifies stiff parameter combinations that strongly affect the quality of the model-data fit while simultaneously revealing which of these key parameter combinations are informed primarily by the data or are also substantively influenced by the priors. We focus on the very common context in complex systems where the amount and quality of data are low compared to the number of model parameters to be collectively estimated, and showcase the benefits of our technique for applications in biochemistry, ecology, and cardiac electrophysiology. We also show how stiff parameter combinations, once identified, uncover controlling mechanisms underlying the system being modeled and inform which of the model parameters need to be prioritized in future experiments for improved parameter inference from collective model-data fitting.


Introduction

A single biological cell is itself a complex system, as is an organism made up of such cells, as is an ecosystem of those organisms interacting with one another. Despite the diversity of systems composing our world, many of these share similar structural and functional features that can be unraveled through computer simulation [1, 2, 3, 4]. As a result, modeling and simulation have become increasingly important for understanding and predicting the underlying behavior of systems across all scales [3, 5, 6], from molecules [7] and cells [8, 9] to engineered processes [10] and astrophysical phenomena [11]. Continuous advances in model descriptions of reality, together with model calibration to experimental data, have improved the fidelity of computer experiments and made them much more predictive [2, 1]. However, the cost of this fidelity is an increase in the number of model parameters [12, 5], and a greater risk that these parameters are difficult or impossible to identify uniquely [13, 14, 15, 3]. Unsurprisingly, a significant amount of uncertainty in parameter values often remains after even very successful model-data calibration [16, 17, 18].

Sensitivity analysis and uncertainty quantification comprise a whole field dedicated to learning how model behavior is controlled by model parameters [19, 20, 21]. These techniques can be used to assess the sensitivity of the model-data fit to changes in parameter values either in a local sense, around a single point (i.e., the set of best-fit parameter values), or in a global sense, across all plausible parameter values consistent with the available data [16, 13, 22]. An alternative approach is Bayesian inference [23, 24], an increasingly used modeling technique that accounts for collective parameter uncertainty constrained by the combination of both data and prior beliefs [6, 18, 8, 25, 26, 9]. However, regardless of the approach taken to characterize the effects of changes in parameter values on model outputs, critical model parameters are often considered individually in terms of their impact on model behavior [20]. Sensitivity analysis typically considers the derivatives of model outputs with respect to the parameters [16, 19, 22], while a Bayesian posterior is analyzed predominantly in terms of its marginal distributions [18, 8]. When combinations of model parameters are considered, it is largely in terms of crude numerical scores [21, 20, 22]. Unfortunately, model parameters that are not strongly constrained by the data are often assumed not to have a strong influence on model predictions, even though in many systems certain combinations of seemingly unconstrained model parameters are more narrowly constrained by the data than any of the individual model parameters [15, 14, 17, 13].

In fact, model parameters can act together or against each other, and often must be understood in terms of their combinations [16, 15]. Parameter combinations that significantly influence model predictions, called stiff eigenparameters, essentially act as emergent “control knobs” for the model: predictions are possible without precise knowledge of individual parameter values, as it is these stiff eigenparameters that are tightly constrained by the data [17, 27]. Conversely, the model-data fit may also be relatively insensitive to some other parameter combinations, called sloppy eigenparameters [14, 15], which are hence poorly constrained by the data [28, 29, 17]. Recently, efforts have been made to unravel these connections among parameters through the expanding literature on model sloppiness [14, 30, 31, 32, 33]. Methods to analyze model sloppiness seek to expose the sensitivities of the model-data fit to changes in sets of parameter values by characterizing the topography of the surface describing how the model-data fit depends on the model parameters in the vicinity of the best-fit parameter values [16, 17, 34]. However, thus far such methods have primarily focused on the field of systems biology, where there is little prior knowledge of parameter values [13, 29, 27], and so the sensitivities of the model-data fit to changes in parameter values remain to be considered in the context where prior information is also available (e.g., from experts or previous studies) to inform parameter values [13, 35, 36, 37, 38].

In this work, we propose a new comprehensive approach to characterize local and global sensitivities of the model-data fit to changes in parameter values. This is achieved by bringing a Bayesian inference perspective [23, 24] to the analysis of sloppiness, which consequently leads to the robust identification of the stiff eigenparameters. In this way, the analysis of sloppiness gains the ability to incorporate prior information and to look beyond the curvature at a single point (i.e., the set of best-fit parameter values) in an uncertainty-informed way. Meanwhile, Bayesian inference gains a tool to identify well-constrained combinations of parameters that can be otherwise hidden when considering the uncertainty in individual model parameters, which is critical when the number of parameters to be estimated is large.

As part of our comprehensive approach, we introduce two Bayesian definitions for the sensitivity matrix that underlies the analysis of model sloppiness, calculated using the posterior samples generated by Bayesian inference. The first definition uses the covariance of the posterior samples to inform parameter space curvature in a non-localized manner [14, 39], with ties to traditional dimensionality reduction [40]. This approach has appeared in works analyzing model sloppiness, but only in the context of uninformative priors [14, 16, 41, 34]. Considering it here in the Bayesian context with informative priors, we identify the need for our second approach, which uses the dimension reduction idea from Cui et al. [42] to conveniently separate the effect of any prior information from that of the data.
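As a concrete illustration of the first definition, the sketch below (our hypothetical reading, not the paper's exact Methods) takes the sensitivity matrix to be the inverse covariance of posterior samples of the log-parameters and eigendecomposes it, so that directions with large eigenvalues are stiff:

```python
import numpy as np

def sensitivity_from_posterior(log_theta_samples):
    """Posterior-covariance sensitivity matrix (sketch of the first
    Bayesian definition). Stiff directions of the model-data fit
    correspond to large eigenvalues of the inverse covariance of the
    log-parameter samples. Returns eigenvalues (stiffest first) and the
    matching eigenvectors (columns), i.e. the candidate eigenparameters."""
    cov = np.cov(np.asarray(log_theta_samples), rowvar=False)
    vals, vecs = np.linalg.eigh(np.linalg.inv(cov))
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Toy check: samples tight along (1, 1)/sqrt(2), loose along (1, -1)/sqrt(2)
rng = np.random.default_rng(0)
s = rng.normal(0.0, 0.01, 20000)   # stiff coordinate
d = rng.normal(0.0, 1.0, 20000)    # sloppy coordinate
samples = np.column_stack([(s + d) / np.sqrt(2), (s - d) / np.sqrt(2)])
eigvals, eigvecs = sensitivity_from_posterior(samples)
```

On the toy samples, the leading eigenvector recovers the stiff direction (1, 1)/sqrt(2), i.e. the product of the two (log-)parameters, with an eigenvalue several orders of magnitude above the second.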

Using this novel approach to analyzing sloppiness, we illustrate how to identify the combinations of parameters driving model behavior in applications beyond systems biology, and in a manner that separately acknowledges each source of available information (e.g., expert knowledge [36]). First, we highlight the advantages of our approach using the well-known Michaelis–Menten model of enzyme kinetics [43]. We then apply it to a well-studied ecosystem network from Australia (a relatively data-poor system) [44], and to a model for the action potential of heart cells (characterized by complex dynamical behavior) [45]. In these latter two applications, the analysis of sloppiness reveals different aspects of the interaction between model and data that are otherwise hidden from the individual techniques we bring together here. Finally, we illustrate how stiff eigenparameters, once identified, can be used to design future experiments to improve parameter inference from collective model-data fitting and to identify controlling mechanisms underlying the systems being modeled.

Results

Motivating example: the Michaelis–Menten kinetics

Critical parameter combinations are readily identified by analysis of sloppiness

The ubiquitous Michaelis–Menten model of biochemistry [43] is a perfect example with which to demonstrate both the benefits of understanding parameter dependence through the lens of model sloppiness and the value of bringing a Bayesian approach to the topic. This model describes the dependence of an enzyme-catalyzed reaction rate on substrate concentration as [46, 47]

 υ = k_cat [E_T] [S] / (K_M + [S]) = k_cat [E_T] / (1 + K_M/[S]), (1)

where parameters k_cat and [E_T] together dictate the maximum rate of reaction (υ_max = k_cat [E_T]), while K_M controls the substrate concentrations at which saturation effects become significant [48].

From the right-hand side of Eq. 1, it is already clear that there are two rate-limiting regimes: one in which the reaction rate simplifies to zero-order kinetics with respect to substrate at high [S], and another in which it simplifies to first-order kinetics at low [S] [43, 47]. To illustrate our methods, we thus consider two noisy synthetic datasets representing these two well-known rate-limiting regimes – the first dataset (A) consists of five measurements obtained beyond the saturation point, while the second dataset (B) consists of five measurements obtained before saturation has any apparent impact on the model behavior (Fig. 1). Both datasets fail to capture the full behavior represented by Eq. 1, and thus suitably highlight the well-known parameter identifiability issues in this model [29, 48].
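A minimal sketch of the two synthetic regimes follows, using hypothetical reference parameter values and noise levels (the paper's actual values appear in its figures and supplement):

```python
import numpy as np

def mm_rate(S, k_cat, E_T, K_M):
    """Michaelis-Menten rate, Eq. 1: v = k_cat * E_T * S / (K_M + S)."""
    return k_cat * E_T * S / (K_M + S)

rng = np.random.default_rng(1)
k_cat, E_T, K_M = 2.0, 1.0, 0.5   # hypothetical reference values

# Dataset A: five measurements well past saturation (S >> K_M),
# where v approaches the plateau k_cat * E_T
S_A = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
v_A = mm_rate(S_A, k_cat, E_T, K_M) + rng.normal(0.0, 0.05, 5)

# Dataset B: five measurements before saturation matters (S << K_M),
# where v is approximately linear in S with slope k_cat * E_T / K_M
S_B = np.array([0.01, 0.02, 0.03, 0.04, 0.05])
v_B = mm_rate(S_B, k_cat, E_T, K_M) + rng.normal(0.0, 0.002, 5)
```

Dataset A only pins down the plateau k_cat * E_T, while dataset B only pins down the initial slope k_cat * E_T / K_M, matching the two rate-limiting regimes described above.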

In dataset A, measurements only inform the reaction rate at saturation, υ_max = k_cat [E_T], and so nothing can be learned about parameter K_M. While this tendency could also be identified by traditional sensitivity analysis [19] or by inspecting the posterior variance for this parameter obtained from Bayesian inference [49, 48], approaches for model sloppiness go a step further. By identifying key directions in the space of the log-parameters, as encoded by the eigenvectors and eigenvalues of a sensitivity matrix (see Methods), model sloppiness identifies that dataset A only informs the product of the remaining two parameters in Eq. 1, k_cat [E_T]. Regardless of whether a traditional definition or either of our new Bayesian definitions of the sensitivity matrix is taken (see Methods), a single eigenvalue dominates, with the product k_cat [E_T] being the corresponding eigenparameter (Table 1, Scenario 1). This is, however, not visible in the individual parameter marginals when Bayesian inference is used to calibrate the model to data, even in this simple problem (Fig. S1).
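The point that the product, but not its individual factors, is constrained by dataset A can be illustrated with mock posterior samples (purely illustrative numbers, not the paper's posterior):

```python
import numpy as np

# Hypothetical posterior samples mimicking the dataset-A fit: the product
# k_cat * [E_T] is tightly constrained while the individual factors are not.
rng = np.random.default_rng(2)
n = 10000
log_prod = rng.normal(np.log(2.0), 0.01, n)   # log(k_cat * E_T): stiff
split = rng.normal(0.0, 1.0, n)               # how the product splits: sloppy
log_kcat = (log_prod + split) / 2.0
log_ET = (log_prod - split) / 2.0

print(np.std(log_kcat), np.std(log_ET))   # wide individual marginals
print(np.std(log_kcat + log_ET))          # narrow stiff combination
```

The individual marginals each look uninformative, yet the sum of the log-parameters (i.e., the log of the product) has a spread two orders of magnitude smaller, which is exactly what the stiff eigenparameter detects and the marginals hide.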

Analogously, model sloppiness successfully identifies the parameter combination governing the rate of reaction in the non-saturating regime (Fig. 1, Dataset B). Given that this dataset is taken at low substrate concentration ([S] << K_M), Eq. (1) reduces to a linear dependence on [S], and the coefficient k_cat [E_T] / K_M is the dominant eigenparameter (Table 1, Scenario 3), which uncovers the nature of the poor parameter identifiability in this model. However, in this scenario, as well as in the second scenario for dataset A (Table 1), we choose informative priors that cause our two Bayesian approaches to model sloppiness to lead to different dominant eigenparameters. In the following section, we explore the information provided by these approaches, which take into account both the prior and the data to inform the model parameters.

A Bayesian perspective reveals whether stiff parameter combinations are informed by the data or are influenced by the prior

Often, values for model parameters are meaningfully constrained by known feasible ranges or by expert information [35, 38, 36], which can potentially change both the most plausible set of values for the parameters and the nature of the new information provided by the data. To demonstrate how our Bayesian approach to analyzing model sloppiness addresses this, we consider different scenarios where the reaction rate data (Fig. 1) are now coupled with prior information, and thus highlight how the stiffest eigenparameters obtained using our two new definitions of the sensitivity matrix together reveal whether values of model parameters are informed by the data or are influenced by the prior. We first fit Eq. 1 to dataset A, considering a multivariate log-normal prior distribution for the model parameters that sets the value of one parameter (K_M) far away from its reference value (Fig. S2). As a result, the posterior correctly concentrates around the reference parameter values used to generate the data (Fig. 1A, first and second panels), except for the poorly specified parameter (K_M), which the prior prevents from doing so (Fig. 1A, third panel). Here, the prior and posterior distributions for parameter K_M are approximately equivalent (overlapping), reflecting that the data collected at saturation are uninformative about the value of this parameter. However, when the curvature of the posterior is examined via its inverse covariance matrix, this parameter emerges as the stiffest eigenparameter (Table 1, Scenario 2). Thus, because the prior and posterior distributions for parameter K_M overlap (Fig. 1A, third panel), this method reveals that the information already contained in the prior is dominating that provided by the data.

To learn how informative the data are about the model parameters while simultaneously acknowledging any prior information, we use the likelihood-informed subspace (LIS) method. This approach works by transforming away the effects of the prior on the curvature of parameter space [42, 50], leaving only the effects of the data via the likelihood (further details in the Methods). By doing so, the LIS method produces a sensitivity matrix that identifies the region in parameter space where the informativity of the data prevails over that of the prior information [42, 50]. For example, with an informative prior imposed on parameter K_M in this scenario, the LIS method recognizes that no additional information is gained about this parameter from dataset A through the model-data fitting process, and so it returns the same dominant eigenparameter, k_cat [E_T] (Fig. 1A, fourth panel), as the methods that ignore the prior altogether (Table 1, Scenario 2). A natural question is then: what does the LIS method provide that is not already given by a standard analysis of sloppiness? The key benefit is that if prior information does change the most plausible (prior-informed) region of parameter space, and the model behaves differently in this region, the LIS method will identify the directions in parameter space where the data are most “informative” relative to the prior, as we discuss next.
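A common formulation of the LIS idea (our sketch following the general construction in Cui et al.; the paper's exact sensitivity matrix may differ) eigendecomposes the prior-preconditioned data-misfit Hessian:

```python
import numpy as np

def lis_directions(H_misfit, prior_cov):
    """Sketch of the likelihood-informed subspace (LIS) construction.

    Eigendecompose the prior-preconditioned data-misfit Hessian
    L^T H L, where prior_cov = L L^T. Eigenvalues much larger than one
    flag directions in which the data overwhelm the prior; the matching
    parameter-space directions (columns of the returned basis) span the
    likelihood-informed subspace."""
    L = np.linalg.cholesky(prior_cov)
    vals, vecs = np.linalg.eigh(L.T @ H_misfit @ L)
    order = np.argsort(vals)[::-1]
    return vals[order], L @ vecs[:, order]
```

With an uninformative (wide) prior, the preconditioning has little effect and the LIS directions agree with the standard likelihood-only analysis; with an informative prior, the preconditioning discounts directions already fixed by the prior, which is exactly the separation exploited in the text.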

In Scenario 3 (Table 1), we fit Eq. 1 to dataset B considering a combination of uniform and log-normal prior distributions that specify the value of one parameter well and another badly (Fig. S3). Given that dataset B only constrains the value of the parameter combination k_cat [E_T] / K_M (Fig. 1B, fourth panel), the extreme values of the parameters selected by unconstrained least-squares fitting (Fig. 1B, MLE in the second and third panels) highlight the importance of specifying plausible ranges for parameters via a Bayesian prior. As for Bayesian inference, the posterior distribution simply fixes the value of the remaining parameter (Fig. 1B, first panel) to a value that constrains the eigenparameter k_cat [E_T] / K_M well (Fig. 1B, fourth panel). Similar to Scenario 2, model sloppiness as implied by the posterior covariance method selects one of the parameters strongly specified by the prior (Fig. 1B, third panel) as the stiffest eigenparameter (Table 1). In this scenario, the LIS method instead identifies that dataset B acts only to fix the value of the remaining parameter, and selects it as the dominant eigenparameter. That is, in contrast to the standard analysis of sloppiness, which only considers the likelihood surface, the LIS method uncovers what new information is provided by the data when there is prior parameter knowledge. Thus, our Bayesian methods together clarify whether the model parameters (or eigenparameters) are informed by the data or are significantly influenced by the prior beliefs.

Case study 1: Ecosystem network

A global perspective to analyzing sloppiness reveals true informativity of the data

Unlike the simple motivating example, in which two well-known rate-limiting regimes readily unveiled the controlling eigenparameters (Fig. 2, fourth panels), for much larger models, inferring the parameter combinations to which the model-data fit is more or less sensitive can be difficult from a simple model inspection. To illustrate this, as a more complex case study from ecology, we use a well-known four-species ecosystem network model [44] that includes two threat species (foxes and rabbits), one threatened species (native mammals), and a basal species (pasture), as depicted in Fig. 2A. This ecosystem model consists of four discrete-time equations (based on ordinary differential equations) and eight constitutive equations (Table S1), whose twenty parameter point estimates (Table S2) were inferred from several studies at two semi-arid locations in Australia [44, 51]. Here, we thus seek to illustrate key benefits of our Bayesian analysis of sloppiness for data-poor systems, characterized by a low quality and amount of observed data due to practical limitations [18, 1, 14, 2]. To do so, we first calibrate the ecosystem network model (Table S1) to noisy synthetic time-series data via least-squares minimization and also via Bayesian inference (Fig. 2B), with the latter model-data fitting technique considering a multivariate log-normal prior distribution for the model parameters (Fig. S5). After the model-data calibration, model predictions (Fig. 2B) based on a model ensemble (shaded regions) considering all plausible parameter values (Fig. S5) enclose both the simulated noisy data (symbols) and the true ecosystem dynamic behavior (dashed profiles). They also enclose predictions based on two sets of best-fit parameter values (dotted profiles) obtained by starting the least-squares optimization at two different initial parameter values. Further, the parameter marginals (Fig. S5) enclose these two separate point estimates and also illustrate that most of the model parameters are poorly constrained by the data.

In addition to quantifying parameter uncertainty, a global perspective on the problem of fitting models to data can benefit the inference of the critical parameter combinations that control the quality of the model-data fit. For example, while local changes in the topography of the surface described by the likelihood function in the vicinity of the two sets of best-fit parameter values (Fig. S5) mislead the inference of stiff eigenparameters through the standard analysis of sloppiness (cf. Table 2, where the standard sensitivity matrices are evaluated at the two different sets of best-fit values), our Bayesian methods fully characterize the structure of this surface by considering all plausible parameter values, informed by the combination of both data and prior beliefs. In this way, differences between the dominant eigenparameters from our two Bayesian sensitivity matrices (Table 2) also demonstrate that the prior is influencing the most plausible region of parameter space, which implies that the surfaces described by the posterior distribution (Eq. 9) and the likelihood function (Eq. 15) differ both locally and globally.

Analysis of sloppiness brings new insights to Bayesian parameter inference

Combining model sloppiness together with Bayesian inference reveals critical parameter combinations that can otherwise be lost when only considering the uncertainty in individual model parameters through Bayesian inference. After the fit of the ecological model to data, for example, the parameter marginals (Fig. S5) illustrate that only a few of the model parameters are well constrained by the data, which suggests that these parameters have a strong influence on the quality of the model-data fit. Instead, our Bayesian analysis of sloppiness identifies that it is in fact combinations of parameters that are the most constrained by the available data (Fig. 4). The prior distribution appears to be only weakly informing the three stiffest eigenparameters (Table 2), since the first eigenparameter from the posterior covariance method also corresponds to the third eigenparameter from the LIS method, while the quotient and product of the second and third eigenparameters from the posterior covariance method approximate the first and second eigenparameters from the LIS method, respectively.

For this system, the identified stiff eigenparameters (Fig. 4) do not appear together in single terms within the model (Table S1). However, ratios of parameters arise as part of the dominant eigenparameters (Table 2) because these parameters appear in separate terms with opposite sign in the model (Table S1). As a result, there is a compensation effect between the values of these parameters that has two key implications for the model predictions. Firstly, the model-data fit is highly informative for characterizing the growth dynamics of rabbits, threatened mammals, and foxes, which is likely to significantly affect the predicted animal species abundances. Secondly, the analysis of sloppiness reveals that, by measuring either the maximum rate of decrease or the maximum rate of increase of the threatened mammal density (the same applies to rabbits and foxes), collective model-data fitting will inform the value of the other parameter to a similar extent, as we discuss in the next section.

Bayesian analysis of sloppiness readily informs future experimental design

Our Bayesian analysis of sloppiness unveils hidden parameter interdependencies that can help design future experiments for improved parameter inference. For example, given that the posterior covariance method reveals that a ratio of parameters is the stiffest eigenparameter (Table 2), this ratio also indicates that the two parameters involved are approximately linearly related (Fig. 4A). An analogous tendency is seen for the stiffest eigenparameter from the LIS method (Table 2), with one parameter and a combination of the others instead being inversely related (Fig. 4B). Additionally, many samples from the posterior distributions are seen to lead to the same value of the log-likelihood function (no apparent color change across the posterior distribution samples in Fig. 5), with the two sets of best-fit parameter values and the reference (true) values falling within the corresponding posterior distribution sample. This tendency indicates that every value of the model parameter (or combination of parameters) on one side of the relation has a corresponding constrained estimate for the parameter (or combination of parameters) on the other side of the relation. Similar tendencies are seen for the remaining eigenparameters (Fig. S7).

In addition to identifying compensation effects between subsets of parameters (Figs. 5 and S7) that lead to similar model outputs (Fig. 2B), the analysis of sloppiness also reveals that prioritizing improvement of the estimates of any of the parameters (or parameter combinations) on one side of the proportionality relationship will immediately improve estimation of the parameters (or parameter combinations) on the other side. As an example of this, we considered a multivariate log-normal prior distribution (Fig. S8) that is very informative for three of the parameters when fitting the ecosystem network model to data. These prior conditions act as improved estimates for those parameters, obtained either from expert elicitation or from parameter-specific experiments (e.g., spotlight counts [44]). After the model-data fit (Fig. S9), the parameters on the other side of the relations are also found to be more narrowly constrained: the percentage coefficients of variation of their posterior distributions are much lower when the more informative prior distribution is specified than when a vague multivariate log-normal prior distribution is instead specified (Fig. S5). Thus, combining Bayesian inference together with the analysis of sloppiness reveals parameter interdependencies that can be strategically exploited to efficiently improve individual parameter inference using less additional data than might otherwise be expected.
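The coefficient-of-variation summary used above can be computed directly from posterior samples; a minimal sketch:

```python
import numpy as np

def percent_cv(samples):
    """Percentage coefficient of variation, 100 * std / mean, computed
    per parameter from posterior samples (rows are posterior draws)."""
    samples = np.asarray(samples, dtype=float)
    return 100.0 * samples.std(axis=0) / samples.mean(axis=0)
```

Comparing `percent_cv` of each marginal under the vague prior versus the informative prior quantifies how much the interdependency tightens the parameters on the other side of the relation.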

Case study 2: Cardiac Electrophysiology

Key controlling mechanisms for complex systems are uncovered by analysis of sloppiness

The previous section considered an ecological model as an example of a system characterized by a moderately large number of parameters and poor access to data. A separate class of systems are those for which data are more readily available, but for which the dynamics that produce the data manifest complex sensitivities to their controlling parameters. For these systems, the challenge is often how to summarize these nonlinear dynamics in a meaningful, actionable way, and so the stiff eigenparameters identified by analyzing model sloppiness have recognizable potential. However, so far model sloppiness has primarily been considered for models characterized by large numbers of fundamental interactions, such as the Michaelis–Menten kinetics that describe the cell signalling network analyzed in the foundational work of Brown et al. [14, 15]. Here, we seek to demonstrate the usefulness and purpose of stiff eigenparameters in systems where the constituent dynamics themselves, and not only their interactions, are complex and unwieldy.

As an example of such a system, we consider the Beeler–Reuter (BR) model [45], which describes the action potential (AP) of a cardiac ventricular myocyte, the pattern of highly regulated ion flow that creates the depolarization and subsequent repolarization governing the heartbeat. This cardiac cell model consists of eight nonlinear ordinary differential equations, six constitutive equations (Table S3), and nine parameters (Table S4). Although it is an older model, the BR model captures many of the most important electrophysiological features of the ventricular AP [52], and interest remains regarding its sensitivity to changes in its parameters [26, 9]. Cardiac AP models are critical for mechanistically understanding arrhythmia [53], and the issue of parameter variability is fundamental to understanding the differential effects of antiarrhythmic treatments within a population [54] or the cardiotoxicity of other pharmacological agents [55].

The AP is summarized by the time course of a cell’s transmembrane potential in response to stimulation, and can be recorded by an electronic measurement device at good temporal resolution and without much noise (e.g., synthetic data in Fig. 5A). The complexity in these models rests with the way multiple ion channels, each with its own set of time-adaptive, nonlinear voltage-gated dynamics, combine additively to determine the total ion flow that produces the AP (Table S3). The most commonly varied parameters are the relative levels of expression for these different ion channels [56], and so rather than describing fundamental quantities such as rates of production or destruction, model parameters in this context describe the extent to which a variety of complex and strongly nonlinear dynamics contribute to the system behavior.

For this system, Bayesian model-data fitting produces an ensemble of plausible values for the model parameters that recapture the data extremely well (Fig. 5A). Most of the individual parameters are well constrained by the AP data (as seen from their marginal distributions, Fig. S10), although none emerges as substantially more important than all the others. Analyzing model sloppiness to consider parameters in terms of their combinations, however, reveals that a single combination of parameters is the primary driver of the AP dynamics. This eigenparameter's corresponding eigenvalue eclipses the values of the others (Fig. S11), and accordingly, its value is extremely well specified by the population of plausible parameter values (Fig. 5B). This eigenparameter and its relative importance are identified by both the standard and Bayesian approaches to model sloppiness, owing to the use of a relatively uninformative prior and the fact that the data are highly informative about the model parameters. Unlike in the Michaelis–Menten kinetics or the ecosystem network model examples, here all approaches to model sloppiness are similarly suitable due to the strong informativeness of the data relative to the prior.

The key eigenparameter has a natural electrophysiological interpretation. Two of its parameters describe the relative strengths of the primary outward and inward (i.e., counteracting) currents active during the plateau and repolarization phases that compose the bulk of the AP (Fig. 5A), and so they appear in the eigenparameter as a ratio. A third parameter contributes to the eigenparameter to a lesser extent and appears as a product with the outward-current parameter, owing to their shared role in describing the strengths of the outward potassium currents that drive repolarization. The three currents associated with these three model parameters (Table S3) exhibit nonlinear dynamics (Fig. S12). Thus, it is surprising how well the primary actions of these three currents can be summarized by a simple product of parameters raised to fixed exponents, whose value strongly dictates whether or not the model output reproduces the data (Fig. 7).
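A small helper can render such an eigenparameter as a product with exponents, given an eigenvector in log-parameter space (the parameter names used below are illustrative placeholders, not the BR model's actual eigenparameter):

```python
def eigenparameter_string(names, weights, tol=0.1):
    """Render a stiff eigenvector in log-parameter space as a product of
    parameters raised to its components, dropping near-zero entries.

    A weight w on log(p) corresponds to a factor p^w in the
    eigenparameter; entries with |w| <= tol are treated as negligible."""
    terms = [f"{n}^{w:.2f}" for n, w in zip(names, weights) if abs(w) > tol]
    return " * ".join(terms)

# Illustrative use: a ratio of two conductance-like parameters with a
# negligible third component (hypothetical names and weights)
label = eigenparameter_string(["g_K", "g_Na", "g_s"], [0.70, -0.70, 0.05])
```

Reading eigenvectors this way is what turns an abstract stiff direction into an interpretable quantity such as a ratio or product of current strengths.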

Analysis of model sloppiness naturally uncovers this result by revealing the precise way in which the three currents (Table S3) act together, and thus highlights the importance of their balance by assigning a much higher eigenvalue to their eigenparameter than to any other. Without considering the curvature in the space of the log-parameters, however, this relationship is not easily observed. The precise relationship between the three parameters remains hidden from view in a standard Bayesian bivariate analysis [26, 9], and even when directly plotting the values of posterior samples for these three parameters against one another (Fig. S13). Such a relationship is also not obvious from the model definition, where none of the three parameters appears as a product or quotient with another, nor do the coefficients of their addition correspond to the exponents found in the governing eigenparameter.

Analysis of sloppiness uncovers knowledge limitations in mathematical models fitted to data

As also observed in the ecological application (Fig. 5), the existence of one or more strong eigenparameters introduces a pronounced structure to the space of plausible parameter sets (Fig. 8). The nature of the eigenparameter implies a strong linear interdependency between the combination of parameters and parameter , as seen in the posterior samples found by the Bayesian inference (Fig. 8). Identifying these critical structures introduced by the model-data fitting process is key to understanding the information provided by the data on the model parameters. Cardiac electrophysiology is a particularly important example, as parameter identifiability is a well-established issue for AP models [9, 6]. Thus, owing to sloppiness in parameter calibration such as that discovered and quantified here, even perfect AP data (Fig. 5A) can imply multiple different calibrations (Figs. 6B and S14), with consequences that then emerge under pathological conditions or in response to drug treatments [57].

As in many other disciplines, in cardiac electrophysiology it can be difficult to design further experiments and/or to target experiments to learn specific parameter values. To this end, our comprehensive approach to model sloppiness uncovers deficiencies in the available information through the identification of the critical eigenparameters. For example, once these critical eigenparameters are identified, the model can be used to simulate scenarios considering extreme system conditions (Fig. S14) that are theoretically still plausible given current data (Fig. 6B). In fact, this concept may be even more applicable where a model’s computational runtime limits the feasibility of Bayesian inference. Even when a posterior of plausible parameter value sets cannot realistically be generated, standard analysis of sloppiness can still quickly identify poorly informed directions in parameter space. In this way, simulations can be carried out along directions of poor knowledge (e.g., perpendicular to the linear relationship depicted in Fig. 8) to test the robustness of the conclusions of simulation studies against the uncertainty that remains in the parameters after the model is calibrated to data.

Indeed, ever-present knowledge limitations about parameter values in cardiac electrophysiology have motivated studies in which virtual populations consisting of many models with varying parameter values are used to explore how populations as a whole, characterized by variable data, respond to different treatments or conditions [54, 55]. This has included a Bayesian methodology for calibrating such populations [8]. Our new Bayesian framework of model sloppiness, which provides a more global sense of parameter space curvature in the plausible region defined by the given data (Fig. 8), could in fact be applied to the “posteriors” of such population-calibration processes, and thus provide a unique way to identify the combinations of parameters that are constrained (or not constrained) by the process of calibrating these models to data exhibiting variability.

Discussion

Recognizing the influence of prior information on the quality of the model-data fit

The use of informative priors has been shown to help constrain model parameters when mathematical models are fitted to data in many Bayesian inference applications [35, 36, 37, 58, 59, 8, 26]. Despite this advantage, the usage of uninformative priors has predominated in the context of analyzing model sloppiness [14, 15, 16]. In such a context, vague uniform priors, spanning several orders of magnitude, have been used to prevent the model-data fitting algorithm from becoming insensitive to parameter changes [13, 14, 17], rather than reflecting their true purpose: accounting for pre-existing knowledge about the parameter values [36, 38, 12]. Here, we introduced how to account for informative priors when analyzing model sloppiness, with our example results illustrating how this new approach identifies the relative effect of informative priors on the quality of the model-data fit. Specifically, the LIS method (matrix ) was shown to reveal directions in parameter space where the posterior differs most strongly from the prior (Fig. 1B), while the posterior covariance method (matrix ) was shown to reveal directions in parameter space that are strongly informed by the combination of both data and priors (Figs. 1A and 4). Additionally, our Bayesian analysis of sloppiness (matrices and ) was shown to provide equivalent results to those based on earlier approaches (matrices and ) when uninformative (vague) priors are used in the implementation of Bayesian inference (Table 1) and when the data are very informative for the model parameters (Figs. 6 and S11). Consequently, we have demonstrated that our Bayesian approach to analyzing sloppiness complements earlier approaches [14, 15] in that the effects of prior beliefs on the quality of the model-data fit can be segregated when all methods are used together. This then clarifies which of the model parameters (or parameter combinations) are predominantly informed by the data or the prior.

In the motivating Michaelis-Menten kinetics example and the cardiac electrophysiological application, we specifically showed that if stiff eigenparameters obtained from all methods (matrices or , , and ) are similar, priors are weakly informative and so stiff eigenparameters are largely constrained by the data (Table 1, Scenario 1 and Fig. S11). We also showed in the motivating example that if stiff eigenparameters obtained from the LIS method (matrix ) are similar to those obtained from the standard method (matrices or ) but different from those obtained from the posterior covariance method (matrix ), then critical parameter combinations associated with the posterior covariance method (matrix ) are significantly influenced by the priors (Table 1, Scenario 2). Finally, we showed in the same motivating example that if stiff eigenparameters obtained from the standard method (matrices or ) differ from those obtained from the Bayesian methods (matrices and ), then priors may also be influencing the quality of the model-data fit. Under such conditions, stiff eigenparameters obtained from the LIS method (matrix ) are informed by the data relative to the prior, and those from the posterior covariance method (matrix ) are mostly constrained by the prior (Table 1, Scenario 3). (A topographical interpretation of these findings is also provided in the next section.) In this way, we also demonstrated that our methods are well-suited not only to applications where there is little prior knowledge about the parameter values [14, 15, 29, 16] but also to those where prior beliefs can be confidently included as part of the model-data fitting process [35, 36, 58, 59, 8, 6].

Characterizing the topography of the surface described by the model-data fit

Our work significantly adds to the literature on sensitivity analysis [19, 22], which in the context of models fitted to data largely focuses on “locally” investigating the curvature of the surface described by the likelihood function [29, 14, 13, 17, 16] around the best-fit parameter values (MLE). Thus, a key contribution of our Bayesian approach to analyzing sloppiness is that it accounts for changes in the curvature of this surface “globally” by considering potentially plausible model parameter sets at a finite distance away from the best-fit parameter values [23, 8, 26]. In addition, our implementation of Bayesian inference advances upon earlier such implementations for analyzing model sloppiness [14, 15]. In these earlier works, a type of Markov Chain Monte Carlo (MCMC) algorithm, with uninformative priors and started at the set of best-fit parameter values (MLE), was used to characterize the likelihood surface in the vicinity of the pre-identified MLE [34, 16]. However, we have shown here that different MLEs can lead to completely different locations on the surface describing the likelihood function (Figs. 5 and S7), which can mislead inference of the stiff eigenparameters (Table 2). Instead, Bayesian inference as implemented here does not require a set of best-fit parameter values to characterize the surface that describes the quality of the model-data fit, with the added value that our Bayesian sensitivity matrices acknowledge any effect of prior information on the most plausible region in parameter space (Figs. 2, 4 and 5B).

In our example results, we specifically illustrated that comparison of stiff eigenparameters obtained from both the standard (matrices or ) and Bayesian (matrices or ) methods can reveal whether the topography of the surface described by the likelihood function is globally and locally similar, as well as whether such a surface is similar to that described by the posterior distribution. For example, if stiff eigenparameters obtained from all methods (matrices or , and ) are similar, the shapes of the surfaces described by the likelihood function and the posterior distribution are locally and globally similar (Table 1, Scenario 1 and Fig. 5B). Instead, if stiff eigenparameters obtained from the standard methods (matrices or ) are similar to those obtained from the LIS method (matrix ) but differ from those obtained from the posterior covariance method (matrix ), the shape of the surface described by the likelihood function is locally and globally similar but different from that described by the posterior distribution (Table 1, Scenario 2 and Table 2, with matrix or evaluated at ). Alternatively, if stiff eigenparameters obtained from all methods (matrices or , and ) are different, the shape of the surface described by the likelihood function is locally and globally different, and also different from the surface described by the posterior distribution (Table 1, Scenario 3 and Table 2, with matrix or evaluated at ). We note, however, that while differences between the shapes of the surfaces describing the posterior distribution and the likelihood function are associated with effects of priors on the quality of the model-data fit, identifying whether the likelihood is locally and globally similar is crucial when multiple (but very different) parameter sets lead to similar values of the likelihood function (Fig. 5). This situation is likely to occur when there are limited data to inform the model parameters [60, 61], for which our Bayesian sensitivity matrices remain thoroughly informed by both the data and the prior.

Improving parameter identifiability by designing experiments based on identified parameter interdependencies

Careful experimental design can improve ambiguous parameter inferences or even biased model predictions [29, 16, 41]. In the context of analyzing model sloppiness, much effort has been devoted to studying how parameter identifiability is affected by increasing the quality and quantity of the data used to fit the model [31, 62, 32, 61]. For example, Apgar et al. [32] carefully designed complementary experiments that constrained parameter values in the model of Brown et al. [15, 14]. To achieve this, they modified some of the model inputs to create different synthetic datasets that were more informative for some of the model parameters than others, but when used together, all model parameters could be constrained within of their true values. However, these computational experiments still required considerably more data than those typically obtained in practice [29, 16]. Instead, this work showed that the identification of critical parameter interdependencies may more efficiently improve parameter inference when prior knowledge about related (interdependent) model parameters is strategically improved through expert elicitation or new experiments (Fig. S8).

We also showed in the cardiac electrophysiological application that if experiments are designed to modify parameter values as well as the values of the stiff eigenparameters, these new experiments are likely to provide new information about the system (Fig. 6A). However, if experiments are designed to change parameter values while keeping the values of the stiff eigenparameters approximately constant, these new experiments are unlikely to provide new information about the system (Fig. 6B). We note that if the design of parameter-specific experiments is not practically possible [31, 32, 61], the posterior covariance matrix (inverse of matrix ) can still be used to measure the increase in parameter identifiability obtained by increasing the quantity and quality of data, an approach recently used in optimal Bayesian experimental design [63].

Recognizing knowledge limitations in mathematical models fitted to data

Regardless of how good a mathematical model is, its predictions are only as useful as its known limitations [59, 2, 64]. Here, by recognizing knowledge limitations in mathematical models fitted to data, our work also adds to the literature on model development and simulation [3, 2, 10, 4]. For example, the identified stiff parameter combinations in the cardiac electrophysiological application uncovered a hidden controlling mechanism of the system (Fig. 5B) that dictated the success or failure of the model output in accurately reproducing the experimental data (Fig. 7). In practical applications, identifying this type of model behavior would inform which of the model parameters are important for model reduction [28, 27] or need to be prioritized in future experimental designs [32, 31, 41]. Furthermore, the implementation of Bayesian inference for the model-data calibration brings the added benefit of quantifying the uncertainty in both parameter values (Figs. S1A-S3A, S5, S8 and S10) and model predictions (Figs. 2B, 5A, S1B-S3B and S9). As such, this work constitutes an example of how this advanced model-data fitting technique can be exploited to reveal the hidden geometry of parameter uncertainty and its effects on model predictions: a topic of growing interest within many fields of science [2, 5, 3, 19] that has thus far been hindered by concerns about system complexity and limited data accessibility [59, 2, 64].

Materials and Methods

To assist with the subsequent mathematical description, we first summarize how sloppiness of a model is analyzed in its standard, non-Bayesian, form. Then, we describe how it can be analyzed via our Bayesian framework. Finally, we describe the procedure followed in the Results to identify the stiff eigenparameters according to both standard and Bayesian approaches to analyzing model sloppiness.

Standard (non-Bayesian) approach to analyzing sloppiness

The standard approach to analyzing sloppiness involves three key steps [14, 15]:

1. estimating the best-fit parameter values by fitting the model to data;

2. calculating the sensitivity matrix evaluated at the best-fit parameter values; and

3. identifying the eigenparameters that are more or less sensitive to the model-data fit through eigendecomposition of the sensitivity matrix.

We detail these three steps for analyzing sloppiness using the standard approach as follows.

Step 1. Estimating values of the model parameters:

Let us assume that a model parameterized by a vector of dimension has been proposed to explain a dataset that consists of independent observations , so represents the th observation in this dataset, . It is commonly expected that the model does not fit the data perfectly; to account for this expectation, the measurement error in the data is often assumed to follow a Gaussian distribution, with mean of zero and standard deviation for each observation [14, 32, 18]. If measurement errors are assumed to be Gaussian, the likelihood that a model with parameters fits the data takes the form:

L(y_{\mathrm{obs}}|\theta) = \prod_{k=1}^{N_{\mathrm{obs}}} \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left[-\frac{1}{2}\left(\frac{y_{\mathrm{obs},k}-y_{\mathrm{model},k}(\theta)}{\sigma_k}\right)^2\right], \quad (2)

where is the model’s prediction of an equivalent noiseless observation for given parameters . The values of the model parameter vector that maximize the likelihood function are altogether called the maximum likelihood estimator (MLE). The MLE is often referred to as the set of “best-fit parameter values” and is here denoted as [31]. We note, however, that a standard least-squares regression may be cast as maximizing a Gaussian likelihood by introducing [16, 14]:

C(\theta) = -\log L(y_{\mathrm{obs}}|\theta) = \frac{N_{\mathrm{obs}}}{2}\log(2\pi) + \frac{1}{2}\sum_{k=1}^{N_{\mathrm{obs}}}\log(\sigma_k^2) + \frac{1}{2}\sum_{k=1}^{N_{\mathrm{obs}}}\left(\frac{y_{\mathrm{obs},k}-y_{\mathrm{model},k}(\theta)}{\sigma_k}\right)^2, \quad (3)

where is the cost function. The first two terms in Eq. 3 are independent of the parameter values, so . Thus, minimizing the cost function in Eq. 3 to find the best-fit parameter values is equivalent to maximizing the Gaussian likelihood function in Eq. 2 to find the MLE.
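As a minimal sketch of this equivalence (the linear toy model, data, and all variable names below are hypothetical, not from this work), the cost function of Eq. 3 can be evaluated directly and its minimizer compared against the closed-form MLE:

```python
import numpy as np

def gaussian_cost(theta, y_obs, y_model, sigma):
    """Cost function of Eq. 3: the negative log of the Gaussian likelihood."""
    r = (y_obs - y_model(theta)) / sigma          # standardized residuals
    return (0.5 * y_obs.size * np.log(2.0 * np.pi)
            + np.sum(np.log(sigma))               # equals (1/2) sum log(sigma_k^2)
            + 0.5 * np.sum(r ** 2))

# Hypothetical linear toy model y = theta * x with known noise level.
x = np.linspace(1.0, 5.0, 20)
sigma = np.full(x.size, 0.1)
rng = np.random.default_rng(0)
y_obs = 2.0 * x + rng.normal(0.0, 0.1, size=x.size)
y_model = lambda theta: theta[0] * x

# For this linear model the MLE has a closed form (ordinary least squares),
# and it minimizes the cost function above.
theta_mle = np.array([np.sum(x * y_obs) / np.sum(x ** 2)])
```

Because the first two terms of Eq. 3 do not depend on the parameters, any perturbation away from the MLE increases the cost and, equivalently, decreases the likelihood.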

Step 2. Calculating the sensitivity matrix:

The standard approach to analyzing sloppiness obtains the sensitivity matrix by investigating how the cost function in Eq. 3 varies with respect to the parameter vector in the vicinity of the maximum likelihood estimator . To do so, this matrix is obtained by a Taylor expansion of around the best-fit parameter values while differentiating with respect to the logarithm of the parameters , which yields [16, 15, 32]:

C(\theta) \approx C(\theta^\star) + \nabla C(\theta^\star)\cdot(\log\theta-\log\theta^\star) + \frac{1}{2}(\log\theta-\log\theta^\star)^\top H\,(\log\theta-\log\theta^\star), \quad (4)

where the gradient of the cost function is zero at the best-fit parameter values by definition so that the sensitivity of the model fit to changes in parameter values is characterized by the Hessian matrix defined in Eq. 4, whose elements are given by [14, 15]:

H_{ij} = -\left.\frac{\partial^2 \log L(y_{\mathrm{obs}}|\theta)}{\partial\log\theta_i\,\partial\log\theta_j}\right|_{\theta=\theta^\star}, \quad (5)

with and both taking integer values ranging from to . Thus, the Hessian matrix describes the quadratic behavior of the cost function infinitesimally close to the point , and it is therefore considered here as one of the matrices that could be used as the sensitivity matrix for analyzing model sloppiness [14, 32, 16]. However, since evaluating second-order derivatives can be computationally expensive, the sensitivity matrix can also be approximated by the Levenberg-Marquardt Hessian at a much lower computational cost, following [14, 34, 16]:

L_{ij} = \left.\sum_{k=1}^{N_{\mathrm{obs}}} \frac{\partial r_k}{\partial\log\theta_i}\frac{\partial r_k}{\partial\log\theta_j}\right|_{\theta=\theta^\star_{\mathrm{MLE}}}, \quad (6)

where the residual error for the -th observation is calculated via , and the first derivatives in Eq. 6 can be evaluated by first-order finite differences or by integrating sensitivity equations for ordinary differential equation (ODE) models [13, 31]. The Levenberg-Marquardt Hessian is equal to the observation information matrix evaluated at the MLE, which itself is a sample-based version of the Fisher information matrix (FIM) [27, 34], whose connections with information theory have been well considered elsewhere [27, 28, 16, 3]. The Levenberg-Marquardt Hessian thus constitutes a more computationally convenient sensitivity matrix for analyzing sloppiness, although, like the Hessian matrix, it only considers the curvature of the likelihood surface infinitesimally close to the MLE.
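Under the same Gaussian assumptions, Eq. 6 can be sketched numerically via central finite differences in log-parameter space; the exponential-decay toy model below is hypothetical, not one of the models considered in this work:

```python
import numpy as np

def lm_hessian(theta_star, residuals, eps=1e-5):
    """Levenberg-Marquardt Hessian of Eq. 6: J^T J, where J holds derivatives
    of the standardized residuals with respect to the log-parameters,
    estimated here by central finite differences."""
    log_t = np.log(theta_star)
    n_res = residuals(theta_star).size
    J = np.empty((n_res, theta_star.size))
    for i in range(theta_star.size):
        hi, lo = log_t.copy(), log_t.copy()
        hi[i] += eps
        lo[i] -= eps
        J[:, i] = (residuals(np.exp(hi)) - residuals(np.exp(lo))) / (2.0 * eps)
    return J.T @ J

# Hypothetical toy model y = k1 * exp(-k2 * x), evaluated at its true parameters.
x = np.linspace(0.0, 1.0, 10)
sigma = 0.1
theta_true = np.array([2.0, 3.0])
y_obs = theta_true[0] * np.exp(-theta_true[1] * x)    # noiseless, for simplicity

def residuals(theta):
    return (y_obs - theta[0] * np.exp(-theta[1] * x)) / sigma

L_matrix = lm_hessian(theta_true, residuals)
eigvals, eigvecs = np.linalg.eigh(L_matrix)           # stiff/sloppy spectrum
```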

Step 3. Identifying the eigenparameters that are more or less sensitive to the model-data fit:

To identify the stiff eigenparameters, eigenvalues and eigenvectors are obtained via eigendecomposition of the sensitivity matrix, or via singular value decomposition if numerical stability is an issue [14]. The eigenvectors are mutually orthogonal, so eigenparameters can be conveniently expressed as linear combinations of the natural logarithms of the model parameters, following [15]

\alpha_n = \sum_{j=1}^{N_\theta} (v_n)_j \log\theta_j, \quad (7)

where is the th element of the - eigenvector of the sensitivity matrix. Thus, each eigenparameter can be simply represented as the product and/or quotient of bare model parameters raised to an index given by the elements of eigenvector , by rewriting Eq. 7 as [15]

\hat{\theta}_n := \exp(\alpha_n) = \prod_{j=1}^{N_\theta} \theta_j^{(v_n)_j}, \quad (8)

with stiff eigenparameters associated with the largest eigenvalues and sloppy (soft) eigenparameters associated with the smallest eigenvalues. The magnitude of each element of the eigenvector in Eq. 8 therefore indicates the relative contribution of bare parameter to eigenparameter . If eigenvectors are normalized, each takes values between - and inclusive, so that all are products of bare parameters having exponents with magnitudes that do not exceed unity. Any factors in Eq. 8 having relatively low magnitudes for (e.g., ) contribute little to the eigenparameter’s value, so these small factors can be practically excluded from the product [14]. Hence, each of the eigenparameters may depend strongly on only a few bare parameters that may be importantly related to each other.
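The eigendecomposition step (Eqs. 7 and 8) can be sketched as follows; the two-parameter sensitivity matrix is hypothetical, constructed so that the product of the two parameters is stiff while their ratio is sloppy:

```python
import numpy as np

def eigenparameters(S, names, tol=0.1):
    """Eigendecompose a sensitivity matrix S (defined in log-parameter space)
    and express each eigenparameter as a product of bare parameters raised to
    the elements of its eigenvector (Eq. 8); exponents with magnitude below
    `tol` are dropped as negligible."""
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]          # stiffest (largest eigenvalue) first
    results = []
    for n in order:
        v = eigvecs[:, n]
        combo = " * ".join(f"{names[j]}^({v[j]:+.2f})"
                           for j in range(len(names)) if abs(v[j]) >= tol)
        results.append((eigvals[n], combo))
    return results

# Hypothetical 2-parameter matrix: the direction log(k1) + log(k2) has
# eigenvalue 2000 (stiff), while log(k1) - log(k2) has eigenvalue 1 (sloppy).
S = np.array([[1000.5, 999.5],
              [999.5, 1000.5]])
for value, combo in eigenparameters(S, ["k1", "k2"]):
    print(f"eigenvalue {value:10.1f}: {combo}")
```

Here the stiff eigenparameter is (up to normalization) the product of the two parameters, illustrating how a tightly constrained combination emerges even when neither bare parameter dominates.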

Bayesian approach to analyzing sloppiness

In the context of model-data calibration, Bayesian inference provides a coherent statistical framework to estimate probability distributions for model parameters, constrained by the combination of data and prior beliefs [23, 36]. Thus, if model-data calibration is recast as a Bayesian inference problem, the final estimate for the probability distribution of parameters , based on all of the data , is called the posterior distribution . To apply Bayesian inference, we require definition of both a likelihood function and a prior distribution . An example of the former was defined in Eq. 2 (i.e., a Gaussian likelihood), while the latter represents the initial beliefs about the parameter values, which are often based on earlier studies or, in their absence, on expert knowledge [36, 65]. Once both the likelihood function and the prior distribution are defined, Bayes’ theorem is used to obtain the posterior distribution, following [36]

\pi(\theta|y_{\mathrm{obs}}) \propto L(y_{\mathrm{obs}}|\theta)\, p(\theta). \quad (9)

Here, is often difficult to calculate directly and therefore several methods have been developed for the posterior computation, including Markov Chain Monte Carlo (MCMC) sampling [66], Sequential Monte Carlo (SMC) sampling [67], Approximate Bayesian Computation (ABC) [68], Variational Bayesian Inference [69], Laplace Approximation [70], and many others. For the purposes of this section, we simply assume that the posterior has been successfully sampled, and thus we hereafter discuss practical aspects of computing the sensitivity matrix within a Bayesian framework. Thus, analogous to the standard approach to analyzing sloppiness, the Bayesian approach consists of three steps:

1. obtaining an estimate of the posterior distribution by calibrating the model to data;

2. calculating a Bayesian-based sensitivity matrix from the posterior distribution; and

3. identifying the eigenparameters that are more or less sensitive to the model-data fit through eigendecomposition of the Bayesian-based sensitivity matrix.

Exact details of the first step above depend on the posterior-computation method chosen while the third step is the same as the third step of the standard approach. Thus, we focus here on the second step, for which we introduce two different methods to calculate the sensitivity matrix. These are described as follows.
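As a minimal sketch of the first step, the snippet below samples a toy one-dimensional posterior with a random-walk Metropolis-Hastings algorithm; note that this work instead uses an SMC sampler, and the Gaussian likelihood and prior below are purely illustrative:

```python
import numpy as np

def metropolis(log_posterior, theta0, n_steps, step_size, rng):
    """Minimal random-walk Metropolis-Hastings sampler (a sketch of posterior
    computation only; many refinements exist, and SMC is used in this work)."""
    theta = np.array(theta0, dtype=float)
    lp = log_posterior(theta)
    samples = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step_size * rng.normal(size=theta.size)
        lp_prop = log_posterior(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:    # accept w.p. min(1, ratio)
            theta, lp = proposal, lp_prop
        samples[i] = theta
    return samples

# Hypothetical 1D posterior: Gaussian likelihood centered at 2 (sd 0.5)
# combined with a vague Gaussian prior (sd 10), so the posterior mean ~ 2.
log_post = lambda t: (-0.5 * np.sum(((t - 2.0) / 0.5) ** 2)
                      - 0.5 * np.sum((t / 10.0) ** 2))
rng = np.random.default_rng(0)
chain = metropolis(log_post, [0.0], 20000, 0.5, rng)
```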

i. Posterior covariance method:

The posterior covariance method is based on the application of Principal Component Analysis (PCA) [40]. This technique uses eigendecomposition of a sensitivity matrix (a covariance matrix) to reduce the dimensionality of large datasets, thus identifying the dataset components that account for the largest amount of variance [71, 72]. In our context, the dataset of interest is a Bayesian ensemble of plausible parameter values, which we obtain from the posterior distribution for the parameters. Thus, if PCA is applied to this specific dataset, the eigenvectors and eigenvalues of the posterior covariance matrix inform the variability of the model-data fit to changes in parameter values. However, given that we seek to identify the eigenparameters that are well-constrained by the available data (i.e., those that have less variability), we instead calculate the sensitivity matrix as the PCA Hessian matrix that is based on the inverse of the posterior covariance matrix [14],

P = \Sigma^{-1}, \quad (10)

where the matrix is calculated in terms of the natural logarithms of model parameters , with this transformation required in Eq. 8 to characterize stiff/sloppy eigenparameters as products or quotients of the bare model parameters. Here, eigendecomposition of the PCA Hessian matrix identifies which eigenparameters are more or less constrained by the combination of both data and prior beliefs. Specifically, eigenvectors of matrix with large eigenvalues indicate stiff eigenparameters, while eigenvectors with small eigenvalues indicate sloppy eigenparameters.

We note that if Monte Carlo methods such as MCMC sampling [66], SMC sampling [67] or ABC [68] are used to approximate the posterior as a set of equally weighted samples , the required posterior covariance matrix , calculated with respect to the natural logarithms of parameters , can be estimated using the sample covariance matrix ,

\Sigma \approx \hat{\Sigma} = \frac{1}{M-1}\sum_{m=1}^{M}\left(\log\theta_m - \overline{\log\theta}\right)\left(\log\theta_m - \overline{\log\theta}\right)^\top, \quad (11)

where is the estimated posterior mean for the natural logarithm of the parameters. If Monte Carlo methods are overly computationally expensive, fast approximate methods such as Variational Bayesian Inference or the Laplace Approximation [70] can be used as an alternative, although these methods provide a rapid, albeit possibly biased, estimate of the posterior covariance matrix [70].
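A sketch of the posterior covariance method under these Monte Carlo assumptions, applied to a synthetic "posterior" built so that the product of two parameters is tightly constrained while their ratio is not (all values are illustrative):

```python
import numpy as np

def pca_hessian(posterior_samples):
    """PCA Hessian of Eq. 10: the inverse of the sample covariance (Eq. 11)
    of the natural logarithms of equally weighted posterior samples;
    `posterior_samples` has shape (M, N_theta)."""
    log_s = np.log(posterior_samples)
    dev = log_s - log_s.mean(axis=0)
    Sigma_hat = dev.T @ dev / (len(log_s) - 1)
    return np.linalg.inv(Sigma_hat)

# Synthetic posterior: log(k1) + log(k2) is tightly constrained (sd 0.01),
# while log(k1) - log(k2) is poorly constrained (sd 1).
rng = np.random.default_rng(1)
stiff = rng.normal(0.0, 0.01, 5000)      # log k1 + log k2
sloppy = rng.normal(0.0, 1.0, 5000)      # log k1 - log k2
log_theta = np.column_stack([(stiff + sloppy) / 2.0, (stiff - sloppy) / 2.0])
P = pca_hessian(np.exp(log_theta))
eigvals, eigvecs = np.linalg.eigh(P)     # largest eigenvalue = stiffest direction
```

The stiff eigenvector recovered here points along the sum of the log-parameters, i.e., the eigenparameter is the product of the two bare parameters.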

ii. Likelihood-informed subspace method:

The likelihood-informed subspace method proposed here has its origins in the Bayesian parameter reduction literature; specifically, from the work of Cui et al. [42] who developed a method for Bayesian inverse problems that identifies the directions in parameter space where the data are most “informative” relative to the prior. The motivation for Cui et al. [42] was to develop an approximate but accelerated MCMC algorithm that samples over a lower-dimensional subspace, called the likelihood-informed subspace (LIS), to avoid sampling from directions of prior variability that the likelihood does not inform [50]. The LIS is constructed on the idea that the Hessian of the log-likelihood can be compared to the prior covariance to then identify directions in parameter space along which the posterior distribution differs strongly from the prior, i.e., directions that are likelihood-informed [73]. Thus, we adapt here the approach used by Cui et al. [42] to construct the LIS to define a sensitivity matrix in our context.

Our goal is to make the sensitivity matrix depend primarily on the data and eliminate effects of the prior distribution. To achieve this, we firstly assume that the covariance matrix of the prior distribution for the logarithms of parameters is known, and that this matrix can be Cholesky factored to a lower triangular matrix such that . Then, by following Cui et al. [42], we define the prior-preconditioned Hessian matrix as:

\Psi(\theta) = L_p^\top H(\theta)\, L_p, \quad (12)

for any arbitrary parameter vector . Here, elements of are given by Eq. 5, evaluated at an arbitrary parameter vector . We note that Cui et al. [42] used a multivariate Gaussian prior to define the prior-preconditioned Hessian matrix in Eq. 12, which is needed in that context to approximate the posterior distribution as the product of a lower-dimensional posterior defined on the LIS and the prior distribution marginalized onto a complementary subspace [50, 73]. However, given that our purpose is to identify the directions that are data-informed, and not to approximate a posterior distribution, the LIS definition is not restricted to multivariate Gaussian priors in our application. Thus, we obtain an expression for the LIS, used here to define the LIS-based sensitivity matrix , by integrating over the prior-preconditioned Hessian matrix with respect to the posterior, which yields [42]

G = \int_\theta \Psi(\theta)\, \pi(\theta|y_{\mathrm{obs}})\, d\theta. \quad (13)

Practically, given that Eq. 13 involves an integral over , if the posterior is approximated by a Monte Carlo method (e.g., MCMC, SMC or ABC) as a set of equally weighted samples , the LIS-based sensitivity matrix can instead be estimated as

G \approx \frac{1}{M}\sum_{m=1}^{M}\Psi(\theta_m), \quad (14)

where each is calculated via Eq. 12 with the Hessian matrix of the negative log-likelihood function evaluated at each posterior sample via Eq. 5. Further, we note that these matrices are all left-multiplied by and right-multiplied by , with the resulting matrices averaged to obtain the sensitivity matrix . As a result, eigendecomposition of this prior-informed sensitivity matrix can reveal which eigenparameters are strongly informed by the data relative to the prior, i.e., directions in parameter logspace where the posterior differs most strongly from the prior [73].
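The construction of Eqs. 12 and 14 can be sketched as below; a constant Hessian and a diagonal prior covariance are assumed purely so that the effect of prior preconditioning is easy to verify by hand:

```python
import numpy as np

def lis_sensitivity_matrix(posterior_samples, neg_loglik_hessian, L_p):
    """LIS-based sensitivity matrix of Eq. 14: the prior-preconditioned Hessian
    of Eq. 12 averaged over equally weighted posterior samples.
    `neg_loglik_hessian(theta)` returns the Hessian of the negative
    log-likelihood in log-parameter space, and L_p is the lower-triangular
    Cholesky factor of the prior covariance of the log-parameters."""
    G = np.zeros((L_p.shape[0], L_p.shape[0]))
    for theta in posterior_samples:
        G += L_p.T @ neg_loglik_hessian(theta) @ L_p
    return G / len(posterior_samples)

# Hypothetical constant Hessian: the first parameter looks strongly
# data-constrained (100 vs 0.25), but its prior is already tight (variance
# 0.01 vs 4), so after preconditioning neither direction dominates.
H = np.diag([100.0, 0.25])
Sigma_prior = np.diag([0.01, 4.0])         # prior covariance of log-parameters
L_p = np.linalg.cholesky(Sigma_prior)
dummy_samples = np.ones((10, 2))           # posterior samples are irrelevant here
G = lis_sensitivity_matrix(dummy_samples, lambda theta: H, L_p)
```

In this contrived case G works out to the identity matrix: relative to what the prior already specifies, the data are equally informative about both directions.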

Demonstrating how to analyze model sloppiness using examples of models fitted to synthetic data

In this section, we describe the six-step procedure used to analyze model sloppiness in the examples discussed in the Results. This six-step procedure incorporates both approaches discussed above, i.e., the local sensitivity analysis around the best-fit parameter values (Standard approach) and the global sensitivity analysis considering all plausible parameter values consistent with the available data (Bayesian approach). Each step of the procedure describes specific details of the examples considered in the Results.

i. Defining the model form:

We consider models of the form where is a vector of input conditions (e.g., representing the spatial or temporal location at which the model is evaluated, and/or the external conditions that alter the model output), is a vector of model parameters, and is a vector of model outputs. Here, and are the number of model inputs and outputs, respectively.

ii. Generating synthetic data to fit the model:

We generate measurement data for the motivating example and ecological application by adding heteroscedastic noise with variance proportional to the observation, which follows a truncated normal distribution with mean , standard deviation and lower truncation bound of zero on each of the synthetic observations [18]. Here, is the vector containing the reference (true) values for the model parameters, is a user-defined measurement error ranging between , and noise is added to the model output associated with the set of input conditions. Alternatively, we generate measurement data for the cardiac electrophysiological application by adding homoscedastic noise, which follows a normal distribution with mean and constant standard deviation [9]. In each example, the measurement error and sampling frequency (number of measurements) are chosen according to typical experimental conditions.
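A sketch of the heteroscedastic noise model described above, assuming strictly positive noiseless model outputs; the function name and toy values are illustrative:

```python
import numpy as np
from scipy.stats import truncnorm

def add_heteroscedastic_noise(y_true, cv, rng):
    """Add truncated-normal noise with standard deviation proportional to each
    (positive) observation and a lower truncation bound of zero; `cv` plays
    the role of the user-defined measurement error."""
    sd = cv * y_true
    a = (0.0 - y_true) / sd                # standardized lower truncation bound
    return truncnorm.rvs(a, np.inf, loc=y_true, scale=sd, random_state=rng)

# Hypothetical noiseless model outputs with a 10% measurement error.
y_true = np.linspace(0.5, 5.0, 25)
rng = np.random.default_rng(2)
y_obs = add_heteroscedastic_noise(y_true, 0.10, rng)
```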

iii. Defining the vector of unknown model parameters and their prior distributions:

We define the vector of unknown model parameters consisting of: (i) the model parameters, (ii) model initial conditions (only considered in the ecological model), and (iii) the measurement error or standard deviation , according to the type of noise added to the synthetic data. Then, we specify prior distributions for the parameters using either positive uniform or multivariate log-normal probability distributions [9, 26, 59, 38, 49], as follows.

In the Michaelis-Menten kinetics example, three different joint prior distributions are used for the three parameters , and of the model and measurement error . The first joint prior consists of a uniform prior for each parameter; the second joint prior consists of multivariate log-normal priors for all parameters, with the prior of parameter being badly specified; and the third joint prior consists of a uniform prior for , a badly specified log-normal prior for , and well-specified log-normal priors for and . All joint priors assume independence between the model parameters and measurement error, so . Here, a badly specified prior for the - parameter means that this parameter’s true value has a very small (but non-zero) probability before consideration of the data. Alternatively, a well-specified prior means that the true parameter value has a high probability before consideration of the data.
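The distinction between badly and well-specified priors can be made concrete with log-normal densities evaluated at an assumed true parameter value (the numbers below are hypothetical, not those of the Michaelis-Menten example):

```python
import numpy as np
from scipy.stats import lognorm

true_value = 2.0                           # hypothetical true parameter value

# Well-specified prior: median at the true value on the original scale.
well = lognorm(s=0.5, scale=true_value)
# Badly specified prior: median ~55x too large, so the true value has very
# small (but non-zero) prior density.
badly = lognorm(s=0.5, scale=true_value * np.exp(4.0))

print(well.pdf(true_value), badly.pdf(true_value))
```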

In the ecological application, two different joint prior distributions are used for the twenty parameters of the ecosystem network model (Table S2) and the measurement error . The first joint prior distribution is chosen to be a product of vague log-normal distributions for each parameter, so that this joint prior possesses zero covariance. In contrast, the second joint prior distribution is chosen to be a product of well-specified log-normal distributions for parameters , and and vague log-normal distributions for each of the remaining parameters, including the measurement error . As discussed in the Results, well-specified priors for parameters , and are chosen based on the stiff eigenparameters identified from the analysis of sloppiness for the case considering vague log-normal distributions for each parameter.

In the cardiac electrophysiological application, a well-specified multivariate log-normal prior distribution is used for the nine parameters of the Beeler–Reuter (BR) model (Table S4) and the standard deviation . This joint prior distribution is centered at the reference parameter values and assumes zero covariance between the nine parameters and the standard deviation . Stimulation conditions , and , membrane capacitance , and initial conditions , , , , , , and are set to their reference values (Table S4), and are not estimated via our model-data fitting techniques.
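As an illustration of how such priors can be encoded, the sketch below evaluates the log-density of an independent multivariate log-normal prior centred at reference parameter values. The names and numbers (a three-parameter example with arbitrarily chosen log-space variances) are hypothetical stand-ins for the case-study settings.

```python
import numpy as np
from scipy.stats import multivariate_normal

theta_ref = np.array([1.0, 0.5, 2.0])      # hypothetical reference parameter values
log_mu = np.log(theta_ref)                 # prior centred at the reference values
log_cov = np.diag(np.full(3, 0.25**2))     # zero covariance between parameters

def log_prior(theta):
    """Multivariate log-normal prior: normal in log(theta), diagonal covariance."""
    theta = np.asarray(theta, dtype=float)
    if np.any(theta <= 0):
        return -np.inf                     # log-normal support is strictly positive
    # Normal log-density in log-space, minus the Jacobian term sum(log theta)
    return (multivariate_normal.logpdf(np.log(theta), mean=log_mu, cov=log_cov)
            - np.sum(np.log(theta)))
```

A vague prior corresponds to inflating the diagonal entries of log_cov; a badly specified prior corresponds to shifting log_mu well away from the true log-parameter values.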

iv. Calibrating the model to data:

We use two approaches to fit each example model to data, the first being maximum likelihood estimation (MLE) and the second being Bayesian inference. To implement these two approaches, we conveniently rewrite the Gaussian likelihood function defined in Eq. 2 as

L(\mathbf{y}_{\mathrm{obs}} \mid \boldsymbol{\theta}) = \prod_{j=1}^{N_y} \prod_{i=1}^{N_x} \frac{1}{\sqrt{2\pi}\,\sigma_{y_j}(x_i)} \exp\left[-\frac{1}{2}\left(\frac{y_{\mathrm{obs},j}(x_i) - y_{\mathrm{model},j}(x_i,\boldsymbol{\theta})}{\sigma_{y_j}(x_i)}\right)^{2}\right], \qquad (15)

where when heteroscedastic noise is used to generate the synthetic data, while when homoscedastic noise is used instead. Then, we use this Gaussian likelihood function and the specified prior distributions (Step iii) to approximate the joint posterior distribution via Bayes' theorem (Eq. 9), implementing the SMC sampling algorithm adapted from Adams et al. [18, 59]. In our implementation of this posterior sampling algorithm, we use a sample size of , a Metropolis-Hastings acceptance fraction of , and an effective sample size reduction target of . These settings were sufficient for reproducible sampler performance: results did not vary across independent runs of the sampling algorithm using a smaller sample size of and a larger effective sample size reduction target of (results not shown).
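In log form, Eq. 15 can be evaluated as in the minimal sketch below, where sigma may be a vector (heteroscedastic case) or a scalar (homoscedastic case); working with the log-likelihood avoids numerical underflow of the product.

```python
import numpy as np

def log_likelihood(y_obs, y_model, sigma):
    """Log of Eq. 15: independent Gaussian errors.
    sigma may be an array (heteroscedastic) or a scalar (homoscedastic)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), y_obs.shape)
    z = (y_obs - y_model) / sigma          # standardized residuals
    return np.sum(-0.5 * z**2 - np.log(np.sqrt(2.0 * np.pi) * sigma))
```

The same function serves both the MLE cost (its negative) and the likelihood term inside Bayes' theorem for the SMC sampler.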

Once the joint posterior probability distributions are obtained for each example, we estimate the best-fit parameter values (maximum likelihood estimator) by minimizing the cost function with given by Eq. 15, using the posterior mean as the initial guess to start the optimization. Here, the sets of best-fit parameter values , and are only used to calculate the sensitivity matrices ( or ) via the standard approach to analyzing sloppiness, while the already obtained prior and posterior distributions are used to calculate the sensitivity matrices ( and ) based on the Bayesian approach.
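This refinement step might be sketched as follows, assuming (purely for illustration) a simple Michaelis-Menten response with hypothetical parameter values and noise-free observations; the actual implementation minimizes the cost function derived from Eq. 15 for each case study.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical Michaelis-Menten model evaluated at five substrate concentrations
S = np.array([0.5, 1.0, 2.0, 4.0, 8.0])

def mm(theta, S):
    v_max, k_m = theta
    return v_max * S / (k_m + S)

theta_true = np.array([2.0, 1.5])
y_obs = mm(theta_true, S)               # noise-free data for this illustration

def neg_log_likelihood(theta, sigma=0.1):
    r = (y_obs - mm(theta, S)) / sigma  # standardized residuals
    return 0.5 * np.sum(r**2) + len(S) * np.log(np.sqrt(2.0 * np.pi) * sigma)

# Stand-in for the SMC posterior mean, used as the optimizer's starting point
posterior_mean = np.array([1.8, 1.2])
fit = minimize(neg_log_likelihood, posterior_mean, method="Nelder-Mead")
theta_best = fit.x
```

Starting the optimizer at the posterior mean places the initial guess in a high-probability region, which helps avoid convergence to spurious local minima.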

v. Calculating the sensitivity matrix:

Eqs. 5, 6, 10 and 14 are used to calculate the Hessian , the Levenberg-Marquardt Hessian , the PCA Hessian , and the likelihood-informed subspace , respectively. Each of these matrices acts as a sensitivity matrix for the purpose of analyzing sloppiness. In the absence of analytical derivatives, we use central finite differences [74] to approximate first- and second-order derivatives of the log-likelihood function with respect to the logarithm of the parameters, with a step size where is a small scalar between and . Rows and columns of each sensitivity matrix characterizing the sensitivity of the model-data fit to the measurement error and standard deviation are not calculated, as eigenparameters containing and are either trivial or difficult to interpret. As a result, the dimension of the square symmetric sensitivity matrices , , , and obtained here is equal to the number of model parameters, excluding the measurement error or standard deviation ; i.e., we obtain sensitivity matrices in the Michaelis-Menten kinetics example, in the ecological application, and in the cardiac electrophysiological application.
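A generic central-difference approximation of such a Hessian might look like the sketch below; here f stands for the log-likelihood (or any scalar objective), and eps plays the role of the small scalar in the step size mentioned above.

```python
import numpy as np

def hessian_log_params(f, theta, eps=1e-4):
    """Central finite-difference Hessian of f with respect to log(theta)."""
    phi = np.log(np.asarray(theta, dtype=float))
    n = phi.size
    g = lambda p: f(np.exp(p))           # objective as a function of log-parameters
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.zeros(n), np.zeros(n)
            e_i[i], e_j[j] = eps, eps
            # Four-point central-difference stencil for d2g / dphi_i dphi_j
            H[i, j] = (g(phi + e_i + e_j) - g(phi + e_i - e_j)
                       - g(phi - e_i + e_j) + g(phi - e_i - e_j)) / (4.0 * eps**2)
    return 0.5 * (H + H.T)               # symmetrize to damp rounding error
```

Differentiating with respect to log-parameters makes the resulting sensitivities dimensionless, which is what allows eigenvalues of different parameter combinations to be compared on a common scale.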

vi. Identifying stiff eigenparameters:

Eigenvalues and eigenvectors of the sensitivity matrices , , and are calculated via singular value decomposition [14]. Then, eigenparameters are obtained via Eq. 8, in which we consider the contribution of parameter to eigenparameter only when element of the normalized eigenvector satisfies (see “Standard (non-Bayesian) approach to analyzing sloppiness”). We also rescale the exponents of the bare parameters associated with each eigenparameter , so that the magnitude of the largest/smallest index for every eigenvector is . Here, eigenvalues are ordered from largest to smallest, so that the corresponding eigenparameters are also ordered from stiffest to sloppiest.
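The decomposition, thresholding, and rescaling steps can be sketched as below. For these square symmetric matrices, NumPy's symmetric eigendecomposition is equivalent (up to eigenvector signs) to the singular value decomposition used here; the 0.1 cutoff on eigenvector entries is an illustrative placeholder, not the value used in the paper.

```python
import numpy as np

def eigenparameters(H, cutoff=0.1):
    """Decompose a symmetric sensitivity matrix: order modes from stiff to sloppy,
    zero out small eigenvector entries, and rescale so the largest exponent is 1."""
    eigvals, eigvecs = np.linalg.eigh(H)         # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder: largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep only entries whose magnitude clears the (hypothetical) cutoff
    exponents = np.where(np.abs(eigvecs) >= cutoff, eigvecs, 0.0)
    exponents = exponents / np.abs(exponents).max(axis=0)  # largest |exponent| -> 1
    return eigvals, exponents

vals, expo = eigenparameters(np.array([[2.0, 0.0], [0.0, 0.5]]))
```

Each column of the returned exponent matrix then defines one eigenparameter as a product of bare parameters raised to those exponents.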

Trade-offs of locally and globally analyzing model sloppiness

To summarize these methods, we have introduced a comprehensive approach for obtaining, both locally and globally, the key quantity for analyzing model sloppiness – the sensitivity matrix (, , , and ). This approach accurately estimates uncertainty in parameter values, constrained by the combination of prior information and data, with the key benefit of robustly identifying the relative effect of this prior information on the inference of critical parameter combinations that control the quality of the model-data fit. This is a key achievement of this work, as it extends the application of the analysis of sloppiness beyond systems where there is little prior knowledge about the model parameter values [14, 13] to those where prior information is more readily available [38, 35, 48, 59], and thus can be confidently incorporated as part of the Bayesian model-data fitting process to constrain parameter values [8, 26, 9, 18].

In the computational implementation of this approach, the local (standard) methods for analyzing sloppiness (matrices or ) were found to be computationally inexpensive in comparison to the Bayesian methods (matrices and ). Nevertheless, since the local analysis of sloppiness considers a single point estimate in parameter space (i.e., the best-fit parameter values), this local approach can only accurately quantify the model's sensitivity to parameter changes when the likelihood function maximum (or cost function minimum) is well defined [61, 34]. Unfortunately, if the surface described by the likelihood function is relatively complicated (e.g., with ridges), this method can mislead inference of stiff eigenparameters (see, for example, Table 2) [16]. Despite this limitation, local analysis of sloppiness can still be very useful in model-data fitting applications where computationally expensive models make implementation of Bayesian inference impractical.

Alternatively, the global (Bayesian) analysis of sloppiness looks beyond the curvature of the likelihood function surface at a single point, instead fully exploring the topography of this surface by using an ensemble of plausible parameter values to characterize the sensitivity of the model-data fit to changes in parameter values. As part of the global approach, we introduced the posterior covariance method (matrix ), inspired by the application of PCA [40], which assesses how informative the data are about the critical parameter combinations while accounting for (including) any prior information about parameter values. This method does not require approximating gradients of the log-likelihood (Eq. 10), so it is computationally inexpensive once the posterior distribution is obtained via Bayesian inference. However, since the posterior covariance method (matrix ) assumes that the posterior structure is well captured by a covariance matrix of the logarithm of the parameters , it is also restricted to applications where the posterior distribution for is approximately multivariate normal. Despite this limitation, the posterior covariance method has the added benefit of being readily applicable to all kinds of statistical models, even those with computationally intractable likelihood functions for which “likelihood-free” Bayesian methods, such as Approximate Bayesian Computation [68] and Bayesian Synthetic Likelihood [75], are prevalent.
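A minimal sketch of the posterior covariance method, under the assumption (as for a Gaussian posterior) that the sensitivity matrix is taken as the inverse covariance of the log-parameters estimated from posterior samples, so that stiff directions (small posterior variance) correspond to large eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_sensitivity_matrix(posterior_samples):
    """Posterior covariance method (sketch): invert the sample covariance of the
    log-parameters, so directions of least posterior variance become stiffest."""
    log_samples = np.log(np.asarray(posterior_samples, dtype=float))
    cov = np.cov(log_samples, rowvar=False)
    return np.linalg.inv(cov)

# Synthetic "posterior": parameter 1 tightly constrained, parameter 2 weakly so
samples = np.exp(rng.normal([0.0, 0.0], [0.1, 1.0], size=(20000, 2)))
H_pca = pca_sensitivity_matrix(samples)
```

Because only posterior samples are required, the same computation applies unchanged when those samples come from likelihood-free methods such as Approximate Bayesian Computation.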

As part of the global approach to analyzing sloppiness, we also introduced the likelihood-informed subspace (LIS) method (matrix ), inspired by a Bayesian dimension reduction technique [42], which assesses how informative the data are about the critical parameter combinations while instead acknowledging and excluding any prior information. The LIS method involves approximating second-order derivatives (Eq. 14), and thus differentiation of computationally expensive models via finite differences [74] can make its implementation costly; a condition under which automatic differentiation may instead be preferred [76, 34, 30]. Despite this limitation, the LIS method does not assume a given shape for the posterior distribution, and thus has the added benefit of being readily applicable to systems with non-Gaussian posterior distributions.
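Schematically, and following the construction of Cui et al. [42], the LIS matrix can be assembled by averaging Gauss-Newton Hessians of the data misfit evaluated at posterior samples and preconditioning the average with the prior covariance. The sketch below assumes those per-sample Hessians are already available (e.g., via finite differences [74]); all names are illustrative.

```python
import numpy as np

def lis_matrix(gn_hessians, prior_cov):
    """LIS sketch: prior-preconditioned, posterior-averaged Gauss-Newton Hessian
    of the data misfit, whose leading eigenvectors span the data-informed subspace."""
    H_bar = np.mean(gn_hessians, axis=0)   # expectation over posterior samples
    L = np.linalg.cholesky(prior_cov)      # prior_cov = L @ L.T
    return L.T @ H_bar @ L                 # precondition with the prior covariance

# Two hypothetical per-sample misfit Hessians; identity prior covariance
H_lis = lis_matrix([np.diag([4.0, 0.0]), np.diag([2.0, 0.0])], np.eye(2))
```

Large eigenvalues of the resulting matrix flag directions where the data dominate the prior; eigenvalues near zero flag directions constrained only by the prior.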

Hence, our comprehensive approach to analyzing model sloppiness comprises a suitable set of tools to aid understanding of many of nature’s systems, ranging from a single cell in the human body [8, 9] and the myriad of microorganisms found almost everywhere [3, 5, 6] to large ecosystem networks [18, 2] and beyond [11, 10], through the simultaneous use of experimental data, mathematical models, and computer simulation.

References

• [1] A. Ma’ayan, Journal of The Royal Society Interface 14, 20170391 (2017).
• [2] W. L. Geary, et al., Nature Ecology & Evolution (2020).
• [3] A. F. Villaverde, J. R. Banga, Journal of The Royal Society Interface 11, 20130505 (2014).
• [4] A. B. Shiflet, G. W. Shiflet, Introduction to computational science: Modeling and simulation for the sciences (Princeton University Press, Princeton, N.J., 2006).
• [5] N. Mouquet, et al., Journal of Applied Ecology 52, 1293 (2015).
• [6] C. C. Drovandi, A. N. Pettitt, Biometrics 67, 225 (2011).
• [7] T. Schlick, Molecular modeling and simulation: an interdisciplinary guide, vol. 2 (Springer, 2010).
• [8] B. A. J. Lawson, et al., Science Advances 4, e1701676 (2018).
• [9] R. H. Johnstone, et al., Journal of Molecular and Cellular Cardiology 96, 49 (2016).
• [10] K. Velten, Mathematical modeling and simulation : introduction for scientists and engineers (Wiley-VCH, Weinheim Germany, 2009).
• [11] M. Sundberg, Science, Technology, & Human Values 37, 64 (2010).
• [12] L. Geris, D. Gomez-Cabrero, eds., Uncertainty in Biology: A Computational Modeling Approach (Springer International Publishing, Switzerland, 2016).
• [13] R. N. Gutenkunst, et al., PLoS Computational Biology 3, e189 (2007).
• [14] K. S. Brown, J. P. Sethna, Physical Review E 68, 021904 (2003).
• [15] K. S. Brown, et al., Physical Biology 1, 184 (2004).
• [16] L. Geris, D. Gomez-Cabrero, Uncertainty in Biology: A Computational Modeling Approach (Springer International Publishing, 2016).
• [17] M. K. Transtrum, B. B. Machta, J. P. Sethna, Physical Review E 83, 36701 (2011).
• [18] M. P. Adams, et al., Ecology Letters 23, 607 (2020).
• [19] S. Marino, I. B. Hogue, C. J. Ray, D. E. Kirschner, Journal of Theoretical Biology 254, 178 (2008).
• [20] A. Saltelli, T. H. Andres, T. Homma, Computational Statistics & Data Analysis 15, 211 (1993).
• [21] I. M. Sobol’, Mathematics and Computers in Simulation 55, 271 (2001).
• [22] A. Saltelli, et al., Global sensitivity analysis : the primer (John Wiley & Sons, Ltd, 2008).
• [23] M. Girolami, Theoretical Computer Science 408, 4 (2008).
• [24] A. Gelman, et al., Bayesian data analysis (Chapman and Hall/CRC, New York, 2013), third edn.
• [25] D. Luengo, L. Martino, M. Bugallo, V. Elvira, S. Särkkä, EURASIP Journal on Advances in Signal Processing 2020, 25 (2020).
• [26] C. C. Drovandi, et al., Journal of The Royal Society Interface 13, 20160214 (2016).
• [27] M. K. Transtrum, et al., The Journal of Chemical Physics 143, 010901 (2015).
• [28] M. K. Transtrum, P. Qiu, Physical Review Letters 113, 098701 (2014).
• [29] A. White, et al., PLOS Computational Biology 12, e1005227 (2016).
• [30] M. K. Transtrum, A. T. Sarić, A. M. Stanković, IEEE Transactions on Power Systems 32, 2243 (2017).
• [31] D. R. Hagen, J. K. White, B. Tidor, Interface Focus 3, 20130008 (2013).
• [32] J. F. Apgar, D. K. Witmer, F. M. White, B. Tidor, Molecular BioSystems 6, 1890 (2010).
• [33] E. Dufresne, H. A. Harrington, D. V. Raman, Journal of Algebraic Statistics 9, 30 (2018).
• [34] R. N. Gutenkunst, F. P. Casey, J. J. Waterfall, C. R. Myers, J. P. Sethna, Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference., Annals of the New York Academy of Sciences (Blackwell Publishing Inc., 2007), pp. 203–211.
• [35] P. P.-Y. Wu, M. J. Caley, G. A. Kendrick, K. McMahon, K. Mengersen, Applied Statistics 67, 417 (2018).
• [36] S. L. Choy, R. O’Leary, K. Mengersen, Ecology 90, 265 (2009).
• [37] M. Bode, et al., Methods in Ecology and Evolution 8, 1012 (2017).
• [38] C. M. Baker, et al., Ecological Applications 29, e01811 (2019).
• [39] T. J. Rothenberg, Econometrica 39, 577 (1971).
• [40] H. Hotelling, Journal of Educational Psychology 24, 417 (1933).
• [41] F. P. Casey, et al., IET Systems Biology 1, 190 (2007).
• [42] T. Cui, J. Martin, Y. M. Marzouk, A. Solonen, A. Spantini, Inverse Problems 30, 114015 (2014).
• [43] L. Michaelis, M. Menten, Biochem. Z 49, 333 (1913).
• [44] R. P. Pech, G. M. Hood, Journal of Applied Ecology 35, 434 (1998).
• [45] G. W. Beeler, H. Reuter, The Journal of Physiology 268, 177 (1977).
• [46] A. Cornish-Bowden, Perspectives in Science 4, 3 (2015).
• [47] G. E. Briggs, J. B. Haldane, The Biochemical Journal 19, 338 (1925).
• [48] B. Choi, G. A. Rempala, J. K. Kim, Scientific Reports 7, 17018 (2017).
• [49] J. M. Tomczak, E. Wȩglarz-Tomczak, FEBS Letters 593, 2742 (2019).
• [50] T. Cui, Y. Marzouk, K. Willcox, Journal of Computational Physics 315, 363 (2016).
• [51] R. P. Pech, A. R. E. Sinclair, A. E. Newsome, P. C. Catling, Oecologia 89, 102 (1992).
• [52] T. Krogh-Madsen, D. J. Christini, eds., Modeling and Simulating Cardiac Electrical Activity (IOP Publishing, 2020).
• [53] X. Zhou, A. Bueno-Orovio, B. Rodriguez, Current Opinion in Physiology 1, 95 (2018).
• [54] O. J. Britton, et al., Proceedings of the National Academy of Sciences 110, E2098 (2013).
• [55] E. Passini, et al., Frontiers in Physiology 8, 668 (2017).
• [56] A. Muszkiewicz, et al., Progress in Biophysics and Molecular Biology 120, 115 (2016).
• [57] M. Zaniboni, I. Riva, F. Cacciani, M. Groppi, Mathematical Biosciences 228, 56 (2010).
• [58] C. M. Baker, A. Gordon, M. Bode, Conservation Biology 31, 376 (2017).
• [59] M. P. Adams, et al., Environmental Modelling & Software 130, 104717 (2020).
• [60] S. J. Gershman, Journal of Mathematical Psychology 71, 1 (2016).
• [61] M. K. Transtrum, P. Qiu, BMC Bioinformatics 13, 181 (2012).
• [62] C. Tönsing, J. Timmer, C. Kreutz, Physical Review E 90, 23303 (2014).
• [63] S. Kleinegesse, C. Drovandi, M. U. Gutmann, Bayesian Analysis pp. 1 – 30 (2021).
• [64] E. J. Milner-Gulland, K. Shea, Journal of Applied Ecology 54, 2063 (2017).
• [65] K. M. Banner, K. M. Irvine, T. J. Rodhouse, Methods in Ecology and Evolution 11, 882 (2020).
• [66] C. P. Robert, G. Casella, Monte Carlo Statistical Methods (Springer-Verlag, New York, 1999).
• [67] A. Doucet, S. Godsill, C. Andrieu, Statistics and Computing 10, 197 (2000).
• [68] S. A. Sisson, Y. Fan, M. A. Beaumont, Handbook of Approximate Bayesian Computation (Chapman & Hall / CRC Press, Boca Raton, Florida, 2018).
• [69] M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, L. K. Saul, Machine Learning 37, 183 (1999).
• [70] L. Tierney, J. B. Kadane, Journal of the American Statistical Association 81, 82 (1986).
• [71] I. T. Jolliffe, J. Cadima, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374, 20150202 (2016).
• [72] A. N. Gorban, B. Kégl, D. C. Wunsch, A. Zinovyev, Principal Manifolds for Data Visualization and Dimension Reduction (Springer Berlin Heidelberg, Berlin, Heidelberg, 2008).
• [73] T. Cui, K. J. H. Law, Y. M. Marzouk, Journal of Computational Physics 304, 109 (2016).
• [74] K. J. Beers, Numerical Methods for Chemical Engineering: Applications in MATLAB (Cambridge University Press, Cambridge, 2006).
• [75] L. F. Price, C. C. Drovandi, A. Lee, D. J. Nott, Journal of Computational and Graphical Statistics 27, 1 (2018).
• [76] H. M. Bücker, G. Corliss, P. Hovland, U. Naumann, B. Norris, Automatic Differentiation: Applications, Theory, and Implementations, vol. 50 (Springer-Verlag Berlin Heidelberg, 2006).

Acknowledgments

Funding: This work has been supported by the Australian Research Council (ARC) Linkage Grant No LP160100496 (EM-M), the ARC Discovery Early Career Researcher Award No DE200100683 (MPA), and the National Science Foundation (NSF) Grant MCB No 1715342 (KSB).

Author contributions: All authors contributed to the conceptualization, original draft, and review & editing of the manuscript. GMM-B, BAJL, CD, KB and MPA contributed to the design of numerical experiments. GMM-B and BAJL performed numerical experiments. All authors contributed to the analysis of results.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials (see Adams et al. [18, 59] for additional information on the posterior simulation algorithm used here). Additional data associated with this paper may be requested from the authors.