Bayesian kernel machine causal mediation analysis

11/26/2018 ∙ by Katrina L. Devick, et al. ∙ Harvard University

Exposure to complex mixtures is a real-world scenario. As such, it is important to understand the mechanisms through which a mixture operates in order to reduce the burden of disease. Currently, there are few methods in the causal mediation analysis literature to estimate the direct and indirect effects of an exposure mixture on an outcome operating through an intermediate (mediator) variable. This paper presents new statistical methodology to estimate the natural direct effect (NDE), natural indirect effect (NIE), and controlled direct effects (CDEs) of a potentially complex mixture exposure on an outcome through a mediator variable. We implement Bayesian kernel machine regression (BKMR) to allow for all possible interactions and nonlinear effects of the co-exposures on the mediator, and the co-exposures and mediator on the outcome. From the posterior predictive distributions of the mediator and the outcome, we simulate counterfactual outcomes to obtain posterior samples, estimates, and credible intervals (CI) of the NDE, NIE, and CDE. We perform a simulation study that shows when the exposure-mediator and exposure-mediator-outcome relationships are complex, our proposed Bayesian kernel machine regression – causal mediation analysis (BKMR–CMA) performs better than current mediation methods. We apply our methodology to quantify the contribution of birth length as a mediator between in utero co-exposure to arsenic, manganese, and lead, and children's neurodevelopment, in a prospective birth cohort in rural Bangladesh. We found a negative association between co-exposure to lead, arsenic, and manganese and neurodevelopment, a negative association between exposure to this metal mixture and birth length, and evidence that birth length mediates the effect of co-exposure to lead, arsenic, and manganese on children's neurodevelopment.


1 Introduction

The ability to identify mechanisms through which a complex exposure profile operates is critical for the development of policy protective of public health. As such, the National Institute of Environmental Health Sciences (NIEHS) has prioritized the development of statistical methods that quantify the effect of environmental mixtures on health outcomes (Carlin et al., 2013). With this increased priority, new methodology is needed to measure these effects and minimize the burden of disease.

Exposure to complex mixtures is representative of real-world scenarios. Since elements of a mixture can exhibit complex interactions, it is important to consider the whole mixture when evaluating the relationship between a mixture and a health outcome (Wright et al., 2006; Claus Henn et al., 2012, 2014). Once a relationship between a mixture and outcome is established, questions arise regarding the pathways through which the mixture operates.

One approach to quantify operating mechanisms is causal mediation analysis (Pearl, 2001; VanderWeele and Vansteelandt, 2009, 2010; Valeri and VanderWeele, 2013). Causal mediation analysis decomposes the total effect (TE) of an exposure on an outcome into the pathway that operates indirectly through an intermediate (mediator) variable and the pathway that is independent of the intermediate variable, that is, the pathway that operates directly from the exposure to the outcome. Researchers’ understanding of the pathways operating through an intermediate variable is crucial for policy recommendations to reduce the harmful impact of environmental mixtures on health outcomes. Few methods exist to estimate mediation effects when the exposure of interest is a mixture. If the mediator variable has a linear effect on the outcome, a direct extension of the closed form solutions derived by VanderWeele and Vansteelandt (2009) and Valeri and VanderWeele (2013) is applicable. In the presence of a nonlinear effect of the mediator on the outcome, the algorithm presented by Imai et al. (2010) can be used to estimate the natural direct effect (NDE), natural indirect effect (NIE), and controlled direct effects (CDEs) of a mixture on an outcome operating through a mediator, by predicting counterfactuals. However, both of these methods assume no model misspecification. Thus, all interactions between the individual elements of the mixture, interactions between the elements of the exposure mixture and the mediator, and any nonlinearities need to be included in the models for the mediator and outcome to obtain valid inference. As the dimension of a multi-dimensional exposure increases, it becomes exponentially more difficult to use current methods to obtain unbiased estimates of the mediated effects. To our knowledge, no other methods currently exist to estimate the NDE, NIE, and CDE of a potentially complex exposure mixture on an outcome through a mediator variable.

In this paper, we present a novel method to estimate the NDE, NIE, and CDE for a potentially complex mixture of exposures on an outcome operating through an intermediate variable. We allow for highly complex exposure-mediator and exposure-response functions using Bayesian kernel machine regression (BKMR). BKMR has been shown to perform well compared to other kernel machine approaches (Bobb et al., 2015). We use BKMR to model the mediator and outcome since BKMR allows for all possible nonlinearities and interactions between the mixture elements, and between the mixture and the mediator, without a priori specification. We predict counterfactuals using the posterior predictive distributions of the mediator and the outcome and present an algorithm for estimation of mediation effects.

We apply this method to model data from a prospective birth cohort in Bangladesh. Arsenic, manganese, and lead are known neurotoxicants (Bressler et al., 1999; Clarkson, 1987; Polańska et al., 2013; Vahter, 2008; Zoni and Lucchini, 2013) that are abundant in the Bangladeshi environment, including in drinking water (Kile et al., 2009). The relationship between exposure to arsenic, manganese, and lead and child neurodevelopment has been shown to be complex (Wasserman et al., 2004, 2006, 2007, 2008; Wright et al., 2006; Claus Henn et al., 2010, 2012, 2014; Hamadani et al., 2011; Valeri et al., 2017). In our data application, we estimate the NDE and NIE to shed light on the relationship between this metal mixture and child neurodevelopment operating through in utero growth, specifically birth length. We also estimate the CDE of the metal mixture on neurodevelopment at different quantiles of birth length to assess whether birth growth measures can potentially block some of the harmful effects of the metal mixture on neurodevelopment.

2 Materials and Methods

2.1 Bayesian kernel machine regression

We first review BKMR as presented by Bobb et al. (2015) as a framework to estimate the effect of a complex mixture on a health outcome. For each subject $i = 1, \ldots, n$, we assume:

$$Y_i = h(\mathbf{z}_i) + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \epsilon_i \qquad (1)$$

where $Y_i$ is a continuous health outcome, $\mathbf{z}_i = (z_{i1}, \ldots, z_{iL})^{\top}$ is a vector of $L$ exposure variables (e.g. metals), $\mathbf{x}_i$ is a vector of potential confounders, and $\epsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Model (1) relates the outcome to the exposure mixture through a flexible function, $h(\cdot)$, which accommodates nonlinearity and/or interaction among the mixture components. Because of this complexity, identifying a set of basis functions to represent $h(\cdot)$ can be difficult; we therefore employ a kernel machine representation (Cristianini and Shawe-Taylor, 2000).

The unknown function $h(\cdot)$ can be specified in two ways. One can either use basis functions or a positive-definite kernel function $K(\cdot, \cdot)$ to identify $h(\cdot)$. Mercer’s theorem (Cristianini and Shawe-Taylor, 2000) established that the kernel function $K(\cdot, \cdot)$ implicitly specifies a unique function space spanned by a particular set of orthogonal basis functions, under regularity conditions. Therefore, any $h(\cdot)$ in this function space can be represented by a set of basis functions or by the dual representation kernel function $K(\cdot, \cdot)$. Liu et al. (2007) showed that model (1) can be expressed as the mixed model (2):

$$Y_i \mid h_i \sim N(h_i + \mathbf{x}_i^{\top}\boldsymbol{\beta},\ \sigma^2), \qquad \mathbf{h} = (h_1, \ldots, h_n)^{\top} \sim N(\mathbf{0},\ \tau \mathbf{K}) \qquad (2)$$

where $h_i = h(\mathbf{z}_i)$ and K, the kernel matrix, has $(i,j)$-th element $K(\mathbf{z}_i, \mathbf{z}_j)$.

The kernel function uses a metric of similarity to establish how close exposure profiles $\mathbf{z}_i$ and $\mathbf{z}_j$ are for subjects $i$ and $j$. We will focus on the Gaussian kernel, which uses Euclidean distance as a means to quantify this similarity. Under the Gaussian kernel, we assume $\text{cor}(h_i, h_j) = \exp\{-\frac{1}{\rho}\sum_{l=1}^{L}(z_{il} - z_{jl})^2\}$, where $\rho$ is a tuning parameter that regulates the smoothness of the dose-response function. Intuitively, this assumption means subjects with similar exposure profiles ($\mathbf{z}_i$ close to $\mathbf{z}_j$) will have more similar risks ($h_i$ will be close to $h_j$).
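As a minimal sketch of how a Gaussian kernel turns exposure profiles into a similarity matrix, the Python snippet below builds a kernel matrix for three toy profiles. It assumes the kernel has the form exp(-||z_i - z_j||^2 / rho); the function name `gaussian_kernel`, the toy profiles, and the value of `rho` are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(Z, rho):
    """Kernel matrix with entries exp(-||z_i - z_j||^2 / rho) (assumed form)."""
    Z = np.asarray(Z, dtype=float)
    sq_norms = np.sum(Z**2, axis=1)
    # Squared Euclidean distances between all pairs of rows
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / rho)

# Toy exposure profiles: subjects 0 and 1 are similar, subject 2 is far away
Z = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]])
K = gaussian_kernel(Z, rho=1.0)
```

In this toy example the entry for the two similar subjects is close to 1, while the entries involving the distant subject are close to 0, which is the sense in which similar exposure profiles imply similar values of the exposure-response function.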

To fit (1), we assume a flat prior on the coefficients for the confounding variables, $\boldsymbol{\beta} \sim 1$, and assume $\sigma^{-2} \sim \text{Gamma}(a_\sigma, b_\sigma)$, where we set both the shape parameter $a_\sigma$ and the scale parameter $b_\sigma$ to 0.001. For convenience, we parameterize BKMR model (1) with $\lambda = \tau\sigma^{-2}$, where we assume a Gamma prior distribution for $\lambda$ with mean $\mu_\lambda$ and variance $\sigma_\lambda^2$. We assume a uniform distribution $\text{Unif}(a_\rho, b_\rho)$ with lower bound $a_\rho$ and upper bound $b_\rho$ for the smoothness parameter $\rho$. For additional details regarding BKMR and prior specification, see Bobb et al. (2015).

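When the exposure mixture is comprised of numerous elements, it may be of interest to fit (1) with component-wise variable selection. To allow for variable selection, the kernel function is augmented. In the case of the Gaussian kernel, the kernel function is expanded as:

$$K(\mathbf{z}_i, \mathbf{z}_j; \mathbf{r}) = \exp\left\{-\sum_{l=1}^{L} r_l (z_{il} - z_{jl})^2\right\} \qquad (3)$$

where $r_l = 1/\rho_l$, and we assume a “slab-and-spike” prior for the auxiliary parameters,

$$r_l \mid \delta_l \sim \delta_l f(r_l) + (1 - \delta_l) P_0, \qquad \delta_l \sim \text{Bernoulli}(\pi) \qquad (4)$$

where $\delta_l$ is an indicator that element $l$ is included in the kernel, $f(\cdot)$ denotes a pdf with support on $\mathbb{R}^{+}$, and $P_0$ is the density with a point mass at 0.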
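The role of the auxiliary parameters can be illustrated with a short sketch. Assuming the augmented Gaussian kernel weights each squared coordinate difference by a nonnegative component-specific parameter, setting that parameter to 0 (the point mass in the prior) removes the component from the similarity calculation. The function name `augmented_gaussian_kernel` and the toy data are hypothetical.

```python
import numpy as np

def augmented_gaussian_kernel(Z, r):
    """Kernel with entries exp(-sum_l r_l * (z_il - z_jl)^2) (assumed form)."""
    Z = np.asarray(Z, dtype=float)
    r = np.asarray(r, dtype=float)
    diffs = Z[:, None, :] - Z[None, :, :]  # pairwise differences, shape (n, n, L)
    return np.exp(-np.sum(r * diffs**2, axis=2))

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))  # toy profiles for 5 subjects and 3 exposures

K_all = augmented_gaussian_kernel(Z, r=[1.0, 1.0, 1.0])
# Setting r_2 = 0 excludes the second exposure from the kernel
K_drop = augmented_gaussian_kernel(Z, r=[1.0, 0.0, 1.0])
# Same result as building the kernel from the remaining two exposures only
K_two = augmented_gaussian_kernel(Z[:, [0, 2]], r=[1.0, 1.0])
```

Here `K_drop` and `K_two` coincide, which shows why a point mass at 0 on a component's parameter acts as variable selection.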

2.2 Causal mediation analysis

In order to define causal contrasts in a mediation context, we first define our notation. Let $Y_{\mathbf{a}m}$ denote the counterfactual outcome if the exposure level was set to $\mathbf{a}$ and mediator level was set to $m$. Let $M_{\mathbf{a}}$ be the counterfactual mediator level that would have been observed if the exposure was set to $\mathbf{a}$. Accordingly, $Y_{\mathbf{a}M_{\mathbf{a}^*}}$ represents the counterfactual outcome if the exposure level was set to $\mathbf{a}$ and the mediator was set to the level it would have taken if the exposure level was $\mathbf{a}^*$.

The mediated effects of interest, the natural direct effect (NDE), the natural indirect effect (NIE), and the controlled direct effects (CDEs), are formally defined as:

$$\text{NDE} = E[Y_{\mathbf{a}M_{\mathbf{a}^*}} - Y_{\mathbf{a}^*M_{\mathbf{a}^*}}] \qquad (5)$$
$$\text{NIE} = E[Y_{\mathbf{a}M_{\mathbf{a}}} - Y_{\mathbf{a}M_{\mathbf{a}^*}}] \qquad (6)$$
$$\text{CDE}(m) = E[Y_{\mathbf{a}m} - Y_{\mathbf{a}^*m}] \qquad (7)$$

The NDE captures the average difference in the counterfactual outcomes for a change in exposure level $\mathbf{a}^*$ to $\mathbf{a}$, while fixing the mediator to the level it would have taken if unexposed, $M_{\mathbf{a}^*}$. The NIE measures the average difference in counterfactual outcomes when fixing the exposure to level $\mathbf{a}$, while the mediator varies from the level it would have taken if exposed to $\mathbf{a}$, $M_{\mathbf{a}}$, to the level it would have taken if unexposed, $M_{\mathbf{a}^*}$. The CDE quantifies the average difference in the counterfactual outcomes for a change in exposure level from $\mathbf{a}^*$ to $\mathbf{a}$, while intervening to fix the mediator to a specified level, $m$.
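These definitions imply that the total effect splits into the two natural effects. The short derivation below assumes the total effect is defined as the contrast between the outcome under exposure $\mathbf{a}$ with its own mediator level and the outcome under $\mathbf{a}^*$ with its own mediator level, which is the usual convention but is not written out in this section. Adding and subtracting the cross-world term gives:

```latex
\begin{aligned}
\mathrm{TE}
  &= E\left[Y_{\mathbf{a}M_{\mathbf{a}}} - Y_{\mathbf{a}^*M_{\mathbf{a}^*}}\right] \\
  &= E\left[Y_{\mathbf{a}M_{\mathbf{a}}} - Y_{\mathbf{a}M_{\mathbf{a}^*}}\right]
   + E\left[Y_{\mathbf{a}M_{\mathbf{a}^*}} - Y_{\mathbf{a}^*M_{\mathbf{a}^*}}\right] \\
  &= \mathrm{NIE} + \mathrm{NDE}.
\end{aligned}
```

No modeling assumption is needed for this step; it only uses linearity of expectation.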

2.3 Bayesian kernel machine regression – causal mediation analysis

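We consider a single health outcome $Y$, single mediator variable $M$, exposure mixture A comprised of $L$ components, and confounder matrix C. To allow for potentially complex relationships between the mixture elements, we model the mediator variable using BKMR model (8):

$$M_i = h_M(\mathbf{a}_i) + \mathbf{c}_i^{\top}\boldsymbol{\beta} + \epsilon_{Mi} \qquad (8)$$

where $\epsilon_{Mi} \overset{\text{iid}}{\sim} N(0, \sigma_M^2)$. Since accounting for exposure-mediator interactions is important to obtain unbiased effect estimates, we include the mediator variable along with the exposure mixture in the kernel function when modeling the health outcome in (9):

$$Y_i = h_Y(\mathbf{a}_i, m_i) + \mathbf{c}_i^{\top}\boldsymbol{\theta} + \epsilon_{Yi} \qquad (9)$$

where $\epsilon_{Yi} \overset{\text{iid}}{\sim} N(0, \sigma_Y^2)$. By fitting the models separately, we assume $\epsilon_{Mi}$ and $\epsilon_{Yi}$ are independent. To model the total effect of the exposure mixture on the outcome, we consider BKMR model (10):

$$Y_i = h_{TE}(\mathbf{a}_i) + \mathbf{c}_i^{\top}\boldsymbol{\gamma} + \epsilon_{TEi} \qquad (10)$$

where $\epsilon_{TEi} \overset{\text{iid}}{\sim} N(0, \sigma_{TE}^2)$.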

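We estimate the NDE, NIE, and TE for a change in exposure profile from $\mathbf{a}^*$ to $\mathbf{a}$ via the following algorithm.

  1. Fit BKMR mediator, outcome, and total effect models (8), (9), and (10), respectively.

  2. For each Markov chain Monte Carlo (MCMC) iteration, $j = 1, \ldots, J$:

    1. Sample $B$ replicates of the mediator for the mean level of covariates under exposure level $\mathbf{a}^*$ from mediator model (8): $M_{\mathbf{a}^*}^{(j,b)}$, $b = 1, \ldots, B$.

    2. For each of the $j$ and $b$ samples of $M_{\mathbf{a}^*}$, estimate the average outcome value for the mean level of covariates for $\mathbf{a}$ from outcome model (9): $Y_{\mathbf{a}M_{\mathbf{a}^*}}^{(j,b)}$.

    3. Let the posterior sample of $Y_{\mathbf{a}M_{\mathbf{a}^*}}$ be $Y_{\mathbf{a}M_{\mathbf{a}^*}}^{(j)} = \frac{1}{B}\sum_{b=1}^{B} Y_{\mathbf{a}M_{\mathbf{a}^*}}^{(j,b)}$.

    4. Since $Y_{\mathbf{a}} = Y_{\mathbf{a}M_{\mathbf{a}}}$ and $Y_{\mathbf{a}^*} = Y_{\mathbf{a}^*M_{\mathbf{a}^*}}$, we sample the posterior sample of $Y_{\mathbf{a}}$ and $Y_{\mathbf{a}^*}$ from the total effect model instead of sampling $Y_{\mathbf{a}M_{\mathbf{a}}}$ and $Y_{\mathbf{a}^*M_{\mathbf{a}^*}}$ for ease of computation. Calculate the average outcome value for the mean level of covariates at $\mathbf{a}$ and $\mathbf{a}^*$ from (10): $Y_{\mathbf{a}}^{(j)}$ and $Y_{\mathbf{a}^*}^{(j)}$.

  3. Obtain the posterior sample of the NDE, NIE, and TE by: $\text{NDE}^{(j)} = Y_{\mathbf{a}M_{\mathbf{a}^*}}^{(j)} - Y_{\mathbf{a}^*}^{(j)}$, $\text{NIE}^{(j)} = Y_{\mathbf{a}}^{(j)} - Y_{\mathbf{a}M_{\mathbf{a}^*}}^{(j)}$, and $\text{TE}^{(j)} = Y_{\mathbf{a}}^{(j)} - Y_{\mathbf{a}^*}^{(j)}$.

  4. Estimate the NDE, NIE and 95% credible intervals from these posterior samples.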
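The steps above can be sketched in Python as follows. This is only an illustration of the bookkeeping, not the authors' implementation: the fitted BKMR models are replaced by hypothetical stand-ins (`mean_mediator`, `mean_outcome`, `mean_total`) driven by toy posterior draws in `post`, covariates are omitted, and the numbers of iterations `J` and replicates `B` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
J, B = 200, 50  # MCMC iterations and mediator replicates (illustrative values)

# Hypothetical posterior draws standing in for fitted mediator/outcome/TE models
post = {
    "b": rng.normal(-0.5, 0.05, J),    # mediator slope on the summed exposure
    "sd_m": np.full(J, 0.3),           # mediator residual SD
    "t_a": rng.normal(-0.2, 0.05, J),  # direct exposure slope in the outcome model
    "t_m": rng.normal(0.4, 0.05, J),   # mediator slope in the outcome model
}

def mean_mediator(j, a):
    return post["b"][j] * a.sum()

def mean_outcome(j, a, m):
    return post["t_a"][j] * a.sum() + post["t_m"][j] * m

def mean_total(j, a):
    # Toy total-effect model implied by the two linear stand-ins above
    return mean_outcome(j, a, mean_mediator(j, a))

def mediation_posterior(a_star, a):
    nde, nie, te = np.empty(J), np.empty(J), np.empty(J)
    for j in range(J):
        # Step 2.1: B replicates of the mediator under exposure a*
        m_astar = rng.normal(mean_mediator(j, a_star), post["sd_m"][j], B)
        # Steps 2.2-2.3: average outcome under a with the mediator drawn under a*
        y_a_mastar = mean_outcome(j, a, m_astar).mean()
        # Step 2.4: outcomes under a and a* from the total-effect model
        y_a, y_astar = mean_total(j, a), mean_total(j, a_star)
        # Step 3: posterior draws of the effects
        nde[j] = y_a_mastar - y_astar
        nie[j] = y_a - y_a_mastar
        te[j] = y_a - y_astar
    return nde, nie, te

a_star, a = np.full(3, -0.67), np.full(3, 0.67)  # hypothetical exposure profiles
nde, nie, te = mediation_posterior(a_star, a)

# Step 4: posterior means and 95% credible intervals
summary = {name: (draws.mean(), np.percentile(draws, [2.5, 97.5]))
           for name, draws in {"NDE": nde, "NIE": nie, "TE": te}.items()}
```

By construction each posterior draw satisfies NDE + NIE = TE, so the credible intervals for the three effects come from the same set of draws.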

Four no-unmeasured-confounding assumptions are required for the NDE and NIE to have a causal interpretation: (i) $Y_{\mathbf{a}m} \perp \mathbf{A} \mid \mathbf{C}$, (ii) $Y_{\mathbf{a}m} \perp M \mid \{\mathbf{A}, \mathbf{C}\}$, (iii) $M_{\mathbf{a}} \perp \mathbf{A} \mid \mathbf{C}$, and (iv) $Y_{\mathbf{a}m} \perp M_{\mathbf{a}^*} \mid \mathbf{C}$. Namely, there are no unmeasured exposure-outcome confounders, there are no unmeasured mediator-outcome confounders, there are no unmeasured exposure-mediator confounders, and the exposure does not affect any mediator-outcome confounders.

To estimate the CDE, only two no-unmeasured-confounding assumptions are required: (i) and (ii). The algorithm to estimate the CDE is similar to the algorithm presented above. We include explicit steps to estimate the CDE in Appendix A.

3 Simulation Study

We evaluated the ability of Bayesian kernel machine regression – causal mediation analysis (BKMR-CMA) to estimate the joint mediated effects of a mixture compared to traditional mediation methods (Baron and Kenny, 1986) and causal mediation analysis (VanderWeele and Vansteelandt, 2009, 2010; Valeri and VanderWeele, 2013) under numerous signal-to-noise ratios motivated from our Bangladesh application (Section 4).

3.1 Setup

We first generated a true underlying dataset for each simulation scenario. The dataset consisted of a health outcome $Y$, a mediator value $M$, and an exposure mixture $\mathbf{A}$ for $N = 50{,}000$ subjects. The exposure mixture was generated as $\mathbf{A} \sim \text{MVN}(\mathbf{0}, \boldsymbol{\Sigma})$, the mediator as $M \sim N(h_M(\mathbf{A}), \sigma_M^2)$, and the health outcome as $Y \sim N(h_Y(\mathbf{A}, M), \sigma_Y^2)$, where $\boldsymbol{\Sigma}$ is the covariance structure of in utero exposure to manganese (Mn), arsenic (As), and lead (Pb) after log transform and standardization from our Bangladesh application (Section 4), and $h_M$ and $h_Y$ are the dose-response surfaces observed in Bangladesh. The correlation structure of Mn, As, and Pb is depicted in Figure 1 and graphical summaries of $h_M$ and $h_Y$ are included in Figure 2. We set $\sigma_M^2$ and $\sigma_Y^2$ to various signal-to-noise ratios motivated from Bangladesh: we took the estimated residual variances after fitting BKMR models (8) and (9), $\hat{\sigma}_M^2$ and $\hat{\sigma}_Y^2$ respectively, and divided them by a range of constants. We took this approach instead of increasing the sample size to decrease the computational burden of our simulations. For each simulation scenario, we randomly sampled 500 datasets of 300 observations each, $n = 300$, from the true underlying dataset of 50,000 subjects.
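A minimal sketch of this data-generating scheme is given below. The covariance matrix `Sigma`, the surfaces `h_M` and `h_Y`, and the residual standard deviations are hypothetical placeholders, since the values estimated from the Bangladesh cohort are not reproduced in this section; drawing the subsample without replacement is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 50_000, 300  # size of the underlying dataset and of each simulated dataset

# Hypothetical correlation structure for standardized log Mn, As, and Pb
Sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.0, 0.4],
                  [0.2, 0.4, 1.0]])

def h_M(A):
    # Hypothetical nonlinear exposure-mediator surface
    return -0.3 * A[:, 0] ** 2 + 0.2 * A[:, 1]

def h_Y(A, M):
    # Hypothetical exposure-mediator-outcome surface with an interaction
    return -0.4 * A[:, 0] + 0.3 * M + 0.1 * A[:, 1] * M

sigma_M, sigma_Y = 0.5, 0.5  # hypothetical residual SDs (signal-to-noise knob)

# True underlying dataset
A = rng.multivariate_normal(np.zeros(3), Sigma, size=N)
M = rng.normal(h_M(A), sigma_M)
Y = rng.normal(h_Y(A, M), sigma_Y)

# One simulated dataset of n observations drawn from the underlying dataset
idx = rng.choice(N, size=n, replace=False)
A_sim, M_sim, Y_sim = A[idx], M[idx], Y[idx]
```

Shrinking `sigma_M` and `sigma_Y` raises the signal-to-noise ratio without changing `n`, which mirrors the rationale given above for scaling the residual variances rather than enlarging the sample.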

For each simulation dataset, we first fit BKMR models (11) - (13) with and without component-wise variable selection:

$$M_i = h_M(\mathbf{a}_i) + \epsilon_{Mi} \qquad (11)$$
$$Y_i = h_Y(\mathbf{a}_i, m_i) + \epsilon_{Yi} \qquad (12)$$
$$Y_i = h_{TE}(\mathbf{a}_i) + \epsilon_{TEi} \qquad (13)$$

where $\epsilon_{Mi} \sim N(0, \sigma_M^2)$, $\epsilon_{Yi} \sim N(0, \sigma_Y^2)$, and $\epsilon_{TEi} \sim N(0, \sigma_{TE}^2)$. Using our proposed BKMR–CMA approach, we estimated the NDE, NIE, and TE for a change of the exposure mixture from $\mathbf{a}^*$, the exposures set equal to a lower percentile in the true underlying dataset, to $\mathbf{a}$, the exposures set equal to an upper percentile in the true underlying dataset. Since the exposures were generated from a multivariate normal with mean 0 and variance 1, all elements in the $\mathbf{a}^*$ vectors were extremely close to the corresponding standard normal quantile, and likewise for all elements in the $\mathbf{a}$ vectors. We estimated the CDE for the same change in the exposures, from $\mathbf{a}^*$ to $\mathbf{a}$, fixing the mediator at the observed 25th, 50th, and 75th percentiles of the mediator in the corresponding true underlying dataset.

Second, we conducted mediation analyses for the joint effect of the metal mixture using both causal mediation methods and traditional approaches. For these analyses, we considered the same change in exposures as with BKMR–CMA, a change in the metals from their lower to upper percentiles in the underlying data for the NDE, NIE, TE, and CDE. For the CDE, we fixed the mediator to its 25th, 50th, and 75th percentiles in the corresponding true underlying dataset. We modeled the mediator and outcome using linear regression allowing for exposure-mediator interactions in the outcome model. Derivations of the closed form solutions to estimate causal mediation effects with multiple exposures are included in Appendix B. We refer to the approach that uses linear regression models for both the mediator and outcome and allows for exposure-mediator interactions in the outcome model as the “linear approach” (VanderWeele and Vansteelandt, 2009, 2010; Valeri and VanderWeele, 2013). We also estimated the NDE and NIE without considering exposure-mediator interactions using the product method extended for multiple exposures; we refer to this as the “traditional approach” (Baron and Kenny, 1986).
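For intuition about the linear approach, the sketch below evaluates regression-based mediation effects for a vector-valued exposure. It assumes a linear mediator model E[M | a, c] = beta0 + beta1'a + beta2'c and an outcome model E[Y | a, m, c] = theta0 + theta1'a + theta2 m + m theta3'a + theta4'c; these expressions follow from plugging the two models into the NDE, NIE, and CDE definitions and may differ in notation from the derivations in Appendix B. The function name and all coefficient values are hypothetical.

```python
import numpy as np

def linear_mediation_effects(beta0, beta1, beta2, theta1, theta2, theta3,
                             a_star, a, c, m):
    """Closed-form effects under the assumed linear mediator and outcome models."""
    d = a - a_star
    # Expected mediator under exposure a* at covariate level c
    m_astar = beta0 + beta1 @ a_star + beta2 @ c
    nde = (theta1 + theta3 * m_astar) @ d        # direct path, mediator held at M_{a*}
    nie = (theta2 + theta3 @ a) * (beta1 @ d)    # path through the mediator
    cde = (theta1 + theta3 * m) @ d              # mediator fixed at m
    return {"NDE": nde, "NIE": nie, "TE": nde + nie, "CDE": cde}

# Hypothetical coefficients; theta3 = 0 reproduces the product method,
# in which the NIE reduces to theta2 * beta1'(a - a*)
effects = linear_mediation_effects(
    beta0=0.1,
    beta1=np.array([0.5, -0.2, 0.1]),
    beta2=np.array([0.2]),
    theta1=np.array([-0.3, 0.0, 0.1]),
    theta2=0.4,
    theta3=np.zeros(3),
    a_star=np.full(3, -0.5),
    a=np.full(3, 0.5),
    c=np.array([0.0]),
    m=0.0,
)
```

With the interaction coefficients set to zero, the NDE and CDE coincide and the NIE is the product of the mediator-outcome slope and the exposure-mediator contrast, which corresponds to the traditional approach described above.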

3.2 Simulation results

The root mean squared error (rMSE) and coverage probabilities for the TE, NDE, NIE, and CDEs in our simulation are summarized in Table 1. We observe similar results for the TE as we do for the CDEs when we intervene to fix the mediator at its 25th, 50th, and 75th percentiles in the true underlying datasets. As the variance in the data generation decreased, we see a decrease in the rMSE across all effects and approaches considered. Although the rMSE is about the same for the TE and CDEs using our proposed BKMR–CMA approach compared to the linear approach at lower signal-to-noise ratios, there is a noticeable gap in the rMSE at higher signal-to-noise ratios. Specifically for the TE, the rMSE has levelled off at about 0.019 for the linear approach, but the rMSE for our BKMR–CMA approach, fit both with and without variable selection, is still decreasing at the higher signal-to-noise ratios we considered. We also observe substantial differences in the coverage probabilities for the TE and CDEs comparing the linear approach to our BKMR–CMA approach. While our BKMR–CMA approach is conservative, the coverage for the linear approach is poor, most notably for the TE: when the signal-to-noise ratio is high, the coverage probability is as low as 0.22.

For the NDE, the rMSE is similar for the linear and our BKMR–CMA approach, and slightly higher for the traditional approach. However, the coverage probabilities of the NDE for both the linear and traditional approaches are lower than acceptable. The rMSE of the NIE is lower for the linear and traditional approaches than for our BKMR–CMA approach. This results from the fact that the TE decomposes into the NDE and NIE; thus, the bias of the NIE is a function of the magnitude and direction of the biases for the TE and NDE. Although the biases for the TE and NDE are greater in magnitude for the linear approach compared to our BKMR–CMA approach, they are of similar magnitude and operate in the same direction for the linear approach, as opposed to our BKMR–CMA approach. Therefore, the biases cancel each other out for the linear and traditional approaches and the rMSE for the NIE is observed to be lower.
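The two performance measures reported in this section can be computed as in the sketch below, given one point estimate and one interval per simulated dataset. The helper names `rmse` and `coverage` and the toy numbers are illustrative only and are not results from the paper.

```python
import numpy as np

def rmse(estimates, truth):
    """Root mean squared error of the estimates around the true effect."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - truth) ** 2)))

def coverage(lower, upper, truth):
    """Proportion of intervals [lower, upper] that contain the true effect."""
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    return float(np.mean((lower <= truth) & (truth <= upper)))

# Toy example with two simulated datasets and a true effect of 2
toy_rmse = rmse([1.0, 3.0], truth=2.0)
toy_cov = coverage(lower=[0.0, 2.5], upper=[3.0, 4.0], truth=2.0)
```

In the toy example both estimates miss the truth by 1, and only the first interval contains the true effect.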

4 Data analysis

4.1 Study population

We apply our BKMR–CMA methodology to quantify the contribution of birth length (BL) as a mediator between in utero co-exposure to arsenic, manganese, and lead, and children’s neurodevelopment, in a prospective birth cohort in rural Bangladesh. This cohort has previously been described (Gleason et al., 2014; Kile et al., 2014; Valeri et al., 2017). For the purpose of illustrating our method, we only include mother-infant pairs enrolled at the Pabna clinic and exclude 5 pairs where the infant had outlying birth lengths (BL more than 3.7 standard deviations (SD) from the mean) to form the final analysis sample. Researchers measured in utero metal exposure to arsenic, manganese, and lead from umbilical cord venous blood samples. Collaborators in Bangladesh administered the Bayley Scales of Infant and Toddler Development™, Third Edition (BSID-III™) to children 20-40 months after birth, and neurodevelopment was measured as the raw cognitive development score (CS) (Bayley, 2006). We control for child sex, child’s age at the time of the Bayley Scale administration, maternal IQ, maternal education (less than high school vs. at least high school), maternal protein intake (low vs. medium vs. high tertiles), secondhand smoke exposure at baseline (smoking environment vs. non-smoking environment), HOME score, and maternal age at delivery in all analyses. When conducting our analyses, we log transformed, centered, and scaled metal concentrations, and centered and scaled CS, BL, and continuous confounder variables.
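The preprocessing described at the end of this paragraph amounts to a log transform followed by column-wise standardization. A minimal sketch follows, using simulated concentrations in place of the cohort data; the helper name `standardize`, the use of the sample standard deviation, and the toy values are assumptions.

```python
import numpy as np

def standardize(x):
    """Center each column at 0 and scale it to unit sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

rng = np.random.default_rng(2)
# Hypothetical raw metal concentrations (strictly positive), one column per metal
raw_metals = rng.lognormal(mean=0.0, sigma=0.5, size=(100, 3))

# Metals: log transform, then center and scale
metals_std = standardize(np.log(raw_metals))
```

Continuous variables that are not log transformed, such as the cognitive score or birth length, would be passed to `standardize` directly.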

4.2 Models

We model the effect of co-exposure to arsenic, manganese, and lead on birth length via a BKMR mediator model (8). To model the joint effect of the metal mixture and birth length on neurodevelopment, we fit a BKMR outcome model (9) with all three metals and birth length in the kernel function. We graphically examine the relationship between the metal mixture and birth length, and the joint relationship of the metal mixture and birth length with neurodevelopment.

We simulate counterfactuals to estimate the NDE, NIE, and TE for a change in the raw exposures (measured in μg/dL) from $\mathbf{a}^*$, with all metals set at their corresponding 25th percentiles, to $\mathbf{a}$, with all metals set at their 75th percentiles. We also calculate the CDE for a change in exposure from $\mathbf{a}^*$ to $\mathbf{a}$ when birth length is set to its 25th percentile of 46 cm, median value of 47 cm, and 75th percentile of 48 cm.

To further examine which metals mediate the effect on neurodevelopment through birth length, we calculate the NDE and NIE for a change of a single metal from its 25th to 75th percentile, while fixing the other metals at their 25th, 50th, or 75th percentile values.

4.3 Results

We found nonlinear associations between the metals and birth length, and between manganese and neurodevelopment, and an interaction between arsenic and manganese on neurodevelopment. There also appear to be small interactions between the metals and birth length on neurodevelopment. Graphical summaries of BKMR mediator, outcome, and total effect models are included in Figures 2 and 3.

Figure 3D suggests that the metal mixture has a harmful effect on child neurodevelopment when comparing higher percentiles to lower percentiles of metal exposure. A change in the metal mixture from the median to higher quantiles operates by significantly reducing birth length as seen in Figure 3E. We see in Figure 3A that manganese is responsible for the majority of the harmful effect of the metal mixture on neurodevelopment and in Figure 3B the significant inverse effect of the metal mixture on birth length also is driven by manganese. After adjustment for birth length and intervening to fix birth length at its median value of 47cm, the effect of the metal mixture on neurodevelopment is reduced (Figure 3F compared to Figure 3D).

Table 2 summarizes the mediation effects for a change of the raw metal mixture from $\mathbf{a}^*$ (all metals at their 25th percentiles) to $\mathbf{a}$ (all metals at their 75th percentiles). We see a negative association between the metal mixture and neurodevelopment, independent of birth length, comparing the co-exposure of metals at their 75th percentiles to their 25th percentiles, although it is not significant (NDE: -0.04, 95% CI: (-0.27, 0.18)). The percent mediated through birth length was estimated to be 73%, and a negative indirect effect through birth length was observed (NIE: -0.10, 95% CI: (-0.34, 0.14)). Upon hypothetical intervention to fix birth length at the 75th percentile value of 48 cm, the direct effect was reduced, suggesting that targeted interventions on fetal growth can block part of the adverse effect of metals on neurodevelopment (CDE: -0.02, 95% CI: (-0.22, 0.17)). Because of the flexibility of BKMR–CMA, we require greater sample sizes to detect mediation effects. In our Bangladeshi cohort, we are not even powered to detect the TE using BKMR, so although the mediation effects are not significant, the trends observed in the data are informative.

In Figure 4, we observe that the negative association between the metal mixture and neurodevelopment is driven by manganese, both indirectly through birth length and directly. When manganese and arsenic are set to higher levels, lead is mediating some of the effect of the metal mixture on neurodevelopment.

5 Discussion

We have proposed Bayesian kernel machine regression – causal mediation analysis (BKMR–CMA) as a way to estimate the direct and indirect effects of an environmental mixture on an outcome through an intermediate variable. To our knowledge, this is the first method presented in the causal inference literature to estimate these effects when the exposure of interest is a potentially complex mixture, without a priori knowledge of the exposure-mediator or exposure-mediator-outcome relationship. This method allows for complex relationships between the elements of the mixture and the mediator variable through the joint kernel specification in the outcome model. Our extension of causal mediation methodology that allows for a mixture of exposures is important for many environmental health applications.

We estimate the TE, NDE, NIE, and CDEs through simulation of counterfactuals from the posterior predictive distribution for each MCMC iteration and make inference from these posterior samples of the mediation effects. Our simulation showed that our proposed BKMR–CMA approach performs better than current methods in estimating the TE, NDE, and CDEs. We observe noticeable differences in the coverage probability for the linear and traditional approaches compared to our BKMR–CMA approach. In the presence of complex data generation scenarios, we advise using our approach over other methods.

Applying these methods to a prospective Bangladeshi birth cohort, we found a negative association between co-exposure to lead, arsenic, and manganese and neurodevelopment, a negative association between exposure to this metal mixture and birth length, and some evidence that birth length mediates the effect of co-exposure to lead, arsenic, and manganese on children’s neurodevelopment. If birth length were fixed to its 75th percentile value of 48 cm, the effect of the metal mixture on neurodevelopment would be smaller, suggesting that nutritional interventions to help increase birth length could potentially block the harmful effect of the metal mixture.

Our BKMR–CMA algorithm easily extends beyond linear models for the mediator and outcome. If the outcome is binary, the logistic regression option in the R package can be used and our code can be implemented to estimate the mediation effects. While one can also consider variable selection with BKMR–CMA for high dimensional exposures, we did not apply it to our Bangladeshi cohort because we were interested in the effect of three metals, so high dimensionality is not a primary concern here. The general approach we present can be used to estimate mediation effects for any Bayesian mediator and outcome models.

Many limitations of our method are due to the exponentially increasing computation time required to fit BKMR and predict counterfactuals as the number of exposures, the number of covariates, and the sample size increase. In many applications, exposure to mixtures with more than three elements is common. In the presence of a high dimensional exposure, simulation studies would need to be conducted to see how our method performs. In the current formulation of our algorithm, we assume the mediator and outcome models are independent. Although this is a common assumption in the causal mediation literature, it is a limitation of our method. In our data application, our results are limited by potential residual confounding by malnutrition and low power due to a small sample size.

In future work, we plan to consider joint specification of the mediator and outcome models to reduce the assumptions needed for BKMR–CMA to be interpreted causally. We also hope to extend these methods to allow for multiple mediators and/or multiple outcomes.

References

  • Baron and Kenny (1986) Baron, R. M. and Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6):1173–1182.
  • Bayley (2006) Bayley, N. (2006). Bayley Scales of Infant and Toddler Development. Harcourt Assessment Inc., San Antonio, TX, 3rd edition.
  • Bobb et al. (2015) Bobb, J. F., Valeri, L., Claus Henn, B., Christiani, D. C., Wright, R. O., Mazumdar, M., Godleski, J. J., and Coull, B. A. (2015). Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics, 16(3):493–508.
  • Bressler et al. (1999) Bressler, J., Kim, K.-a., Chakraborti, T., and Goldstein, G. (1999). Molecular mechanisms of lead neurotoxicity. Neurochemical Research, 24(4):595–600.
  • Carlin et al. (2013) Carlin, D. J., Rider, C. V., Woychik, R., and Birnbaum, L. S. (2013). Unraveling the health effects of environmental mixtures: an NIEHS priority. Environmental Health Perspectives, 121(1):A6–A8.
  • Clarkson (1987) Clarkson, T. W. (1987). Metal toxicity in the central nervous system. Environmental Health Perspectives, 75:59–64.
  • Claus Henn et al. (2014) Claus Henn, B., Coull, B. A., and Wright, R. O. (2014). Chemical mixtures and children’s health. Current Opinion in Pediatrics, 26(2):223–229.
  • Claus Henn et al. (2010) Claus Henn, B., Ettinger, A. S., Schwartz, J., Téllez-Rojo, M. M., Lamadrid-Figueroa, H., Hernández-Avila, M., Schnaas, L., Amarasiriwardena, C., Bellinger, D. C., Hu, H., et al. (2010). Early postnatal blood manganese levels and children’s neurodevelopment. Epidemiology (Cambridge, Mass.), 21(4):433–439.
  • Claus Henn et al. (2012) Claus Henn, B., Schnaas, L., Ettinger, A. S., Schwartz, J., Lamadrid-Figueroa, H., Hernández-Avila, M., Amarasiriwardena, C., Hu, H., Bellinger, D. C., Wright, R. O., et al. (2012). Associations of early childhood manganese and lead coexposure with neurodevelopment. Environmental Health Perspectives, 120(1):126–132.
  • Cristianini and Shawe-Taylor (2000) Cristianini, N. and Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press.
  • Gleason et al. (2014) Gleason, K., Shine, J. P., Shobnam, N., Rokoff, L. B., Suchanda, H. S., Hasan, I., Sharif, M. O., Mostofa, G., Amarasiriwardena, C., Quamruzzaman, Q., et al. (2014). Contaminated turmeric is a potential source of lead exposure for children in rural Bangladesh. Journal of Environmental and Public Health, 2014.
  • Hamadani et al. (2011) Hamadani, J., Tofail, F., Nermell, B., Gardner, R., Shiraji, S., Bottai, M., Arifeen, S., Huda, S. N., and Vahter, M. (2011). Critical windows of exposure for arsenic-associated impairment of cognitive function in pre-school girls and boys: a population-based cohort study. International Journal of Epidemiology, 40(6):1593–1604.
  • Imai et al. (2010) Imai, K., Keele, L., and Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 15(4):309–334.
  • Kile et al. (2009) Kile, M., Wright, R., Amarasiriwardena, C., Quamruzzaman, Q., Rahman, M., Mahiuddin, G., and Christiani, D. (2009). Maternal and umbilical cord blood levels of arsenic, cadmium, manganese, and lead in rural Bangladesh. Epidemiology, 20(6):S149–S150.
  • Kile et al. (2014) Kile, M. L., Rodrigues, E. G., Mazumdar, M., Dobson, C. B., Diao, N., Golam, M., Quamruzzaman, Q., Rahman, M., and Christiani, D. C. (2014). A prospective cohort study of the association between drinking water arsenic exposure and self-reported maternal health symptoms during pregnancy in Bangladesh. Environmental Health, 13(1):29.
  • Liu et al. (2007) Liu, D., Lin, X., and Ghosh, D. (2007). Semiparametric regression of multidimensional genetic pathway data: Least-squares kernel machines and linear mixed models. Biometrics, 63(4):1079–1088.
  • Pearl (2001) Pearl, J. (2001). Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, UAI’01, pages 411–420, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
  • Polańska et al. (2013) Polańska, K., Jurewicz, J., and Hanke, W. (2013). Review of current evidence on the impact of pesticides, polychlorinated biphenyls and selected metals on attention deficit/hyperactivity disorder in children. International Journal of Occupational Medicine and Environmental Health, 26(1):16–38.
  • Vahter (2008) Vahter, M. (2008). Health effects of early life exposure to arsenic. Basic & Clinical Pharmacology & Toxicology, 102(2):204–211.
  • Valeri et al. (2017) Valeri, L., Mazumdar, M. M., Bobb, J. F., Claus Henn, B., Rodrigues, E., Sharif, O. I., Kile, M. L., Quamruzzaman, Q., Afroz, S., Golam, M., et al. (2017). The joint effect of prenatal exposure to metal mixtures on neurodevelopmental outcomes at 20–40 months of age: Evidence from rural Bangladesh. Environmental Health Perspectives, 125(6):067015.
  • Valeri and VanderWeele (2013) Valeri, L. and VanderWeele, T. J. (2013). Mediation analysis allowing for exposure–mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18(2):137–150.
  • VanderWeele and Vansteelandt (2009) VanderWeele, T. J. and Vansteelandt, S. (2009). Conceptual issues concerning mediation, interventions and composition. Statistics and its Interface, 2(4):457–468.
  • VanderWeele and Vansteelandt (2010) VanderWeele, T. J. and Vansteelandt, S. (2010). Odds ratios for mediation analysis for a dichotomous outcome. American Journal of Epidemiology, 172(12):1339–1348.
  • Wasserman et al. (2008) Wasserman, G. A., Liu, X., Factor-Litvak, P., Gardner, J. M., and Graziano, J. H. (2008). Developmental impacts of heavy metals and undernutrition. Basic & Clinical Pharmacology & Toxicology, 102(2):212–217.
  • Wasserman et al. (2007) Wasserman, G. A., Liu, X., Parvez, F., Ahsan, H., Factor-Litvak, P., Kline, J., Van Geen, A., Slavkovich, V., LoIacono, N. J., Levy, D., et al. (2007). Water arsenic exposure and intellectual function in 6-year-old children in Araihazar, Bangladesh. Environmental Health Perspectives, 115(2):285–289.
  • Wasserman et al. (2004) Wasserman, G. A., Liu, X., Parvez, F., Ahsan, H., Factor-Litvak, P., van Geen, A., Slavkovich, V., LoIacono, N. J., Cheng, Z., Hussain, I., et al. (2004). Water arsenic exposure and children’s intellectual function in Araihazar, Bangladesh. Environmental Health Perspectives, 112(13):1329–1333.
  • Wasserman et al. (2006) Wasserman, G. A., Liu, X., Parvez, F., Ahsan, H., Levy, D., Factor-Litvak, P., Kline, J., van Geen, A., Slavkovich, V., LoIacono, N. J., et al. (2006). Water manganese exposure and children’s intellectual function in Araihazar, Bangladesh. Environmental Health Perspectives, 114(1):124–129.
  • Wright et al. (2006) Wright, R. O., Amarasiriwardena, C., Woolf, A. D., Jim, R., and Bellinger, D. C. (2006). Neuropsychological correlates of hair arsenic, manganese, and cadmium levels in school-age children residing near a hazardous waste site. Neurotoxicology, 27(2):210–216.
  • Zoni and Lucchini (2013) Zoni, S. and Lucchini, R. G. (2013). Manganese exposure: cognitive, motor and behavioral effects on children: a review of recent findings. Current Opinion in Pediatrics, 25(2):255–260.

Appendix A Algorithm to estimate CDEs using BKMR

  1. Fit BKMR outcome model (9).

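  2. For each MCMC iteration, $j = 1, \ldots, J$:

    1. Estimate the average outcome value for the mean level of covariates at the specific mediator value of interest $m$ for $\mathbf{a}$ and $\mathbf{a}^*$ from (9) (i.e., estimate $Y_{\mathbf{a}m}^{(j)}$ and $Y_{\mathbf{a}^*m}^{(j)}$ for each MCMC iteration).

    2. Obtain the posterior sample of the CDE for a change of exposure from $\mathbf{a}^*$ to $\mathbf{a}$, intervening to fix the mediator at $m$, by:

       $\mathrm{CDE}^{(j)} = Y_{\mathbf{a}m}^{(j)} - Y_{\mathbf{a}^*m}^{(j)}$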

  3. Estimate the CDE and 95% credible intervals from these posterior samples.

Only two no-unmeasured-confounding assumptions are required for the CDE to have a causal interpretation: $Y_{\mathbf{a}m} \perp \mathbf{A} \mid \mathbf{C}$ and $Y_{\mathbf{a}m} \perp M \mid \{\mathbf{A}, \mathbf{C}\}$. Namely, there are no unmeasured exposure-outcome confounders and there are no unmeasured mediator-outcome confounders.
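Steps 2 and 3 of the algorithm can be sketched as follows. This is a minimal illustration, assuming the per-iteration mean outcomes under the two exposure levels have already been extracted from a fitted outcome model; that extraction is not shown. The function and variable names are hypothetical, the draws are simulated stand-ins rather than study data, and the interval is a simple equal-tailed percentile interval, which may differ from the interval construction used in the paper.

```python
import random
import statistics


def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def cde_summary(y_a_m, y_astar_m, level=0.95):
    """Posterior mean and equal-tailed credible interval of the CDE.

    y_a_m[j] and y_astar_m[j] are the iteration-j mean outcomes with the
    mediator fixed at m, under exposure a and a* respectively
    (hypothetical inputs, assumed to come from a fitted outcome model).
    """
    if len(y_a_m) != len(y_astar_m):
        raise ValueError("need one draw per MCMC iteration for each exposure")
    # Step 2.2: one CDE draw per MCMC iteration.
    draws = sorted(ya - ys for ya, ys in zip(y_a_m, y_astar_m))
    tail = (1 - level) / 2
    # Step 3: point estimate and credible interval from the draws.
    interval = (quantile(draws, tail), quantile(draws, 1 - tail))
    return statistics.mean(draws), interval


# Simulated stand-ins for posterior draws (not study data).
rng = random.Random(0)
y_a = [rng.gauss(-0.05, 0.10) for _ in range(2000)]
y_astar = [rng.gauss(0.00, 0.10) for _ in range(2000)]
estimate, (lower, upper) = cde_summary(y_a, y_astar)
print(f"CDE = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Differencing within each iteration, rather than differencing two separately summarized quantities, keeps the posterior dependence between the two predicted outcomes in the interval.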

Appendix B Formulas to estimate causal mediation effects when the exposure is a mixture

Consider the following linear regression models for the mediator and outcome:

(B.1)
(B.2)

where is an exposure mixture of components, , , and .

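$E[Y_{\mathbf{a}m} \mid \mathbf{c}]$ represents the expected outcome value had everyone been exposed to level $\mathbf{a}$ and had their mediator been set to level $m$, fixing covariates to level $\mathbf{c}$. $E[Y_{\mathbf{a}M_{\mathbf{a}^*}} \mid \mathbf{c}]$ represents the expected outcome value had everyone been exposed to level $\mathbf{a}$ and had their mediator been set to the level it would have taken if exposure were set to $\mathbf{a}^*$, fixing covariates to level $\mathbf{c}$. Then, considering models (B.1) and (B.2), and assuming (i) $Y_{\mathbf{a}m} \perp \mathbf{A} \mid \mathbf{C}$, (ii) $Y_{\mathbf{a}m} \perp M \mid \{\mathbf{A}, \mathbf{C}\}$, (iii) $M_{\mathbf{a}} \perp \mathbf{A} \mid \mathbf{C}$, and (iv) $Y_{\mathbf{a}m} \perp M_{\mathbf{a}^*} \mid \mathbf{C}$, we can estimate these effects as: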

By similar logic,

Thus,

When considering traditional approaches to model the outcome, we do not include exposure-mediator interactions in (B.2). We therefore model the outcome as:

(B.3)

We estimate the traditional mediation effects for an exposure mixture as:
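As a hedged sketch of the closed forms this appendix refers to, suppose the mediator model is $M = \beta_0 + \boldsymbol{\beta}_A^\top \mathbf{A} + \boldsymbol{\beta}_C^\top \mathbf{C} + \epsilon_M$ and the outcome model with exposure-mediator interactions is $Y = \theta_0 + \boldsymbol{\theta}_A^\top \mathbf{A} + \theta_M M + \boldsymbol{\theta}_{AM}^\top \mathbf{A} M + \boldsymbol{\theta}_C^\top \mathbf{C} + \epsilon_Y$. The coefficient symbols are assumptions introduced here for illustration and may not match the notation of (B.1)–(B.3). Under the four assumptions listed above, standard regression-based mediation algebra would then give, for a change of the mixture from $\mathbf{a}^*$ to $\mathbf{a}$:

```latex
\begin{align*}
E[Y_{\mathbf{a}m} \mid \mathbf{c}]
  &= \theta_0 + \boldsymbol{\theta}_A^\top \mathbf{a} + \theta_M m
     + \boldsymbol{\theta}_{AM}^\top \mathbf{a}\, m
     + \boldsymbol{\theta}_C^\top \mathbf{c}, \\
E[M_{\mathbf{a}^*} \mid \mathbf{c}]
  &= \beta_0 + \boldsymbol{\beta}_A^\top \mathbf{a}^*
     + \boldsymbol{\beta}_C^\top \mathbf{c}, \\
\mathrm{CDE}(m)
  &= E[Y_{\mathbf{a}m} - Y_{\mathbf{a}^*m} \mid \mathbf{c}]
   = (\boldsymbol{\theta}_A + \boldsymbol{\theta}_{AM}\, m)^\top
     (\mathbf{a} - \mathbf{a}^*), \\
\mathrm{NDE}
  &= E[Y_{\mathbf{a}M_{\mathbf{a}^*}} - Y_{\mathbf{a}^*M_{\mathbf{a}^*}} \mid \mathbf{c}]
   = \bigl(\boldsymbol{\theta}_A + \boldsymbol{\theta}_{AM}\,
     E[M_{\mathbf{a}^*} \mid \mathbf{c}]\bigr)^\top (\mathbf{a} - \mathbf{a}^*), \\
\mathrm{NIE}
  &= E[Y_{\mathbf{a}M_{\mathbf{a}}} - Y_{\mathbf{a}M_{\mathbf{a}^*}} \mid \mathbf{c}]
   = \bigl(\theta_M + \boldsymbol{\theta}_{AM}^\top \mathbf{a}\bigr)\,
     \boldsymbol{\beta}_A^\top (\mathbf{a} - \mathbf{a}^*).
\end{align*}
```

Setting $\boldsymbol{\theta}_{AM} = \mathbf{0}$, as in the traditional outcome model without exposure-mediator interactions, these expressions reduce to $\mathrm{NDE} = \mathrm{CDE}(m) = \boldsymbol{\theta}_A^\top (\mathbf{a} - \mathbf{a}^*)$ and $\mathrm{NIE} = \theta_M \boldsymbol{\beta}_A^\top (\mathbf{a} - \mathbf{a}^*)$, so the direct effect no longer depends on the mediator level.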

Figure 1: Covariance structure considered in our simulation. The covariance for manganese (Mn), arsenic (As), and lead (Pb) from Bangladesh after log transform and standardization.
Figure 2: Exposure-mediator and exposure-mediator-response kernel functions considered in our simulation. These functions are obtained by fitting BKMR mediator model (8) and BKMR outcome model (9) in our Bangladeshi cohort. Bivariate (A) exposure-mediator and (B) exposure-mediator-response functions for each element listed on the top when the element listed on the right is fixed at its 25th, 50th, or 75th percentile.
| Effect | Model | rMSE /1 | rMSE /2 | rMSE /5 | rMSE /10 | rMSE /20 | rMSE /50 | Coverage /1 | Coverage /2 | Coverage /5 | Coverage /10 | Coverage /20 | Coverage /50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TE | BKMR | 0.043 | 0.033 | 0.023 | 0.017 | 0.013 | 0.009 | 0.964 | 0.972 | 0.968 | 0.962 | 0.958 | 0.960 |
| TE | BKMR-VS | 0.048 | 0.039 | 0.023 | 0.018 | 0.014 | 0.009 | 0.922 | 0.898 | 0.928 | 0.930 | 0.932 | 0.952 |
| TE | linear | 0.042 | 0.032 | 0.024 | 0.021 | 0.020 | 0.019 | 0.930 | 0.912 | 0.834 | 0.716 | 0.518 | 0.222 |
| NDE | BKMR | 0.038 | 0.029 | 0.021 | 0.017 | 0.015 | 0.013 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.990 |
| NDE | BKMR-VS | 0.038 | 0.031 | 0.021 | 0.017 | 0.014 | 0.013 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.990 |
| NDE | linear | 0.040 | 0.030 | 0.021 | 0.018 | 0.015 | 0.014 | 0.948 | 0.928 | 0.904 | 0.888 | 0.854 | 0.782 |
| NDE | traditional | 0.039 | 0.029 | 0.021 | 0.019 | 0.018 | 0.019 | 0.946 | 0.926 | 0.882 | 0.812 | 0.720 | 0.498 |
| NIE | BKMR | 0.026 | 0.020 | 0.018 | 0.017 | 0.016 | 0.014 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 |
| NIE | BKMR-VS | 0.026 | 0.021 | 0.015 | 0.015 | 0.015 | 0.014 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 |
| NIE | linear | 0.014 | 0.013 | 0.012 | 0.011 | 0.011 | 0.010 | 0.954 | 0.926 | 0.910 | 0.898 | 0.884 | 0.874 |
| NIE | traditional | 0.013 | 0.012 | 0.010 | 0.009 | 0.008 | 0.006 | 0.944 | 0.932 | 0.934 | 0.928 | 0.934 | 0.956 |
| CDE25 | BKMR | 0.044 | 0.032 | 0.022 | 0.017 | 0.013 | 0.011 | 0.984 | 0.980 | 0.980 | 0.978 | 0.972 | 0.980 |
| CDE25 | BKMR-VS | 0.042 | 0.038 | 0.031 | 0.023 | 0.016 | 0.012 | 0.926 | 0.876 | 0.866 | 0.910 | 0.932 | 0.964 |
| CDE25 | linear | 0.050 | 0.038 | 0.029 | 0.026 | 0.025 | 0.025 | 0.936 | 0.918 | 0.812 | 0.706 | 0.552 | 0.322 |
| CDE50 | BKMR | 0.042 | 0.031 | 0.021 | 0.016 | 0.013 | 0.010 | 0.976 | 0.976 | 0.970 | 0.966 | 0.958 | 0.960 |
| CDE50 | BKMR-VS | 0.039 | 0.036 | 0.030 | 0.022 | 0.015 | 0.011 | 0.902 | 0.848 | 0.850 | 0.900 | 0.928 | 0.932 |
| CDE50 | linear | 0.042 | 0.032 | 0.024 | 0.020 | 0.019 | 0.019 | 0.918 | 0.898 | 0.850 | 0.782 | 0.686 | 0.468 |
| CDE75 | BKMR | 0.044 | 0.033 | 0.022 | 0.017 | 0.013 | 0.010 | 0.980 | 0.976 | 0.972 | 0.966 | 0.966 | 0.956 |
| CDE75 | BKMR-VS | 0.033 | 0.031 | 0.028 | 0.021 | 0.015 | 0.012 | 0.936 | 0.892 | 0.878 | 0.900 | 0.938 | 0.936 |
| CDE75 | linear | 0.048 | 0.034 | 0.023 | 0.018 | 0.015 | 0.014 | 0.952 | 0.938 | 0.934 | 0.916 | 0.904 | 0.802 |
Table 1: Root mean squared error (rMSE) and coverage probabilities from our simulations under varying signal-to-noise ratios, comparing our BKMR–CMA approach using BKMR models fit with and without variable selection (BKMR-VS and BKMR, respectively), the linear approach, and the traditional approach. CDE25, CDE50, and CDE75 represent the CDE when the mediator is set to its 25th, 50th, and 75th percentile, respectively, in the true underlying dataset.
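The two performance measures reported in Table 1 can be computed as in the sketch below, assuming the usual definitions: rMSE is the square root of the average squared deviation of the per-dataset estimates from the true effect, and coverage probability is the fraction of simulated datasets whose 95% interval contains the true effect. The function names and the toy numbers are hypothetical and are not taken from the simulation study.

```python
import math


def rmse(estimates, truth):
    """Root mean squared error of estimates around a known true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def coverage_probability(intervals, truth):
    """Fraction of (lower, upper) intervals that contain the true value."""
    covered = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return covered / len(intervals)


# Toy values standing in for per-dataset simulation output (not from Table 1).
true_effect = 2.0
estimates = [1.0, 3.0, 2.0, 2.0]
intervals = [(0.0, 1.5), (2.5, 4.0), (1.0, 3.0), (1.5, 2.5)]
print(rmse(estimates, true_effect), coverage_probability(intervals, true_effect))
```

A coverage probability well below the nominal 0.95 indicates intervals that are too narrow or centered on a biased estimate, while values near 1.000 indicate conservative intervals.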
Figure 3: The joint effect of the metal mixture on cognitive score (CS) in our Bangladeshi cohort, adjusted and unadjusted for birth length, and the joint effect of the metal mixture on birth length, estimated by Bayesian kernel machine regression (BKMR). (A, C) Overall effect of the metal mixture on cognitive score, adjusted and unadjusted for birth length (estimates and 95% credible intervals). These figures show the average change in neurodevelopment for a change in each element of the metal mixture from a particular percentile to its median value, from the total effects model (unadjusted for birth length) and the outcome model (fixing birth length at its median value of 47 cm). (B) This plot shows the expected change in birth length for the change in the metal mixture described for A/C. (D, F) Single metal associations with neurodevelopment (estimates and 95% CI, gray dashed line at the null). These figures show the average change in neurodevelopment (adjusted and unadjusted for birth length) for a change in a single metal from its 25th to 75th percentile value, fixing the other metals at their 25th, 50th, or 75th percentiles. (E) Single metal associations with birth length.
| Effect | Estimate (95% CI) |
|---|---|
| TE | -0.14 (-0.36, 0.03) |
| NDE | -0.04 (-0.27, 0.18) |
| NIE | -0.10 (-0.34, 0.14) |
| CDE(m = 46 cm) | -0.06 (-0.28, 0.12) |
| CDE(m = 47 cm) | -0.04 (-0.25, 0.13) |
| CDE(m = 48 cm) | -0.02 (-0.22, 0.17) |
Table 2: Mediation effects estimated in Bangladesh using BKMR–CMA. All effects are estimated for a change of the mixture (As, Mn, Pb) from its raw percentile g/dL, g/dL, g/dL) to its raw percentile g/dL, g/dL, g/dL). The CDEs are calculated as the direct effect from $\mathbf{a}^*$ to $\mathbf{a}$, intervening to fix the mediator at its 25th, 50th, and 75th percentile values of 46, 47, and 48 cm, respectively.
Figure 4: Single metal NDE and NIE effects on neurodevelopment (estimates and 95% CI). These figures show the NDE and NIE for a change in one metal from its 25th to 75th percentile value, fixing the other metals at their 25th, 50th, or 75th percentiles.