Kernel Machine and Distributed Lag Models for Assessing Windows of Susceptibility to Mixtures of Time-Varying Environmental Exposures in Children's Health Studies

04/29/2019
by   Ander Wilson, et al.
Colorado State University
0

Research has shown that early life exposures to environmental chemicals, starting as early as conception, can reprogram developmental trajectories to result in altered health status later in life. These principles likely apply to complex mixtures as well as individual chemicals. We thus consider statistical methods to estimate the association between mixtures of multiple time-varying exposures and a future health outcome, e.g. exposure to multiple air pollutants observed weekly throughout pregnancy and birth weight. First, we illustrate how to use traditional distributed lag models, distributed lag nonlinear models, and Bayesian kernel machine regression to estimate the association between multiple time-varying exposures and a health outcome. While none of these methods simultaneously account for exposure-timing, nonlinear association, and interactions, we highlight situations in which each model performs well. Second, we propose a new method to estimate the association between multiple time-varying exposures and a health outcome. The proposed approach is, to our knowledge, the first method to simultaneously account for exposure-timing, nonlinear associations, and interactions between time-varying exposures. The proposed approach is a Bayesian kernel machine regression method that accounts for exposure timing using a functional weight component within the kernel. The weight function identifies developmental periods with increased association between exposure and a future health outcome, often referred to as a window of susceptibility. We demonstrate the proposed methods in an analysis of exposure to four ambient pollutants and birth weight in a Boston-area perinatal cohort.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 12

07/30/2021

Multiple exposure distributed lag models with variable selection

Distributed lag models are useful in environmental epidemiology as they ...
04/20/2021

Critical Window Variable Selection for Mixtures: Estimating the Impact of Multiple Air Pollutants on Stillbirth

Understanding the role of time-varying pollution mixtures on human healt...
10/13/2020

Treed distributed lag non-linear models

In studies of maternal exposure to air pollution a children's health out...
08/27/2020

Joint modeling of time-varying exposure history and subsequent health outcomes: identification of critical windows

Long-term behavioral and health risk factors constitute a primary focus ...
11/26/2018

Bayesian kernel machine causal mediation analysis

Exposure to complex mixtures is a real-world scenario. As such, it is im...
10/20/2021

Machine learning approaches to identify thresholds in a heat-health warning system context

During the last two decades, a number of countries or cities established...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Humans are inevitably exposed to a complex mixture of chemicals and other pollutants throughout the life course beginning with conception (Woodruff et al., 2011; Wright, 2017). Epidemiological evidence about the toxicity of environmental chemicals has traditionally come from studies of exposure observed during a single time window, such as the same day as a health outcome is assessed or averaged over a preceding time period determined a priori rather than data-driven. The one-chemical-at-a-time and one-exposure-window approaches can result in misleading estimates by failing to distinguish between the effects of multiple highly correlated chemical exposures (Braun et al., 2016) or by incorrectly identifying the time window during which someone is vulnerable to a chemical exposure (Wilson et al., 2017b). As such, recent research has focused on statistical approaches to quantify the health risks associated with time-varying exposures and with exposure to mixtures of multiple environmental chemicals.

In the study of the risks associated with maternal exposures to air pollution and children’s health, there is particular interest in exposure timing. There exist windows of susceptibility in time during which the developing fetus has increased vulnerability to chemical exposures. These windows correspond to specific developmental processes that may be altered by environmental insults and likely vary depending on the mechanisms of the chemical exposures which may impact specific aspects of development. Recent research has identified windows of susceptibility in the association between air pollution exposure and lower birth weight, increased risk of preterm birth, and decreased childhood respiratory health, among other outcomes (Chang et al., 2012; Warren et al., 2013; Hsu et al., 2015; Bose et al., 2017). All of these studies estimate the association between time-varying exposure to a single environmental chemical and a health outcome; thus a gap exists with regards to time-varying exposure and chemical mixtures

Assessing the relationship between mixtures of time-varying exposures and a health outcome is complicated by several factors. These include: 1) high correlation between exposure levels at each time point; 2) high temporal correlation within each exposure; 3) potential nonlinear associations; and 4) potential interactions between both simultaneous and sequential exposures. Statistical approaches have been proposed to address each of these challenges individually. However, no single approach has been proposed to address all of these challenges simultaneously.

Several approaches have been proposed to estimate the association between exposure to a chemical mixture at a single time point and a health outcome (for reviews see: Billionnet et al., 2012; Davalos et al., 2017; Hamra and Buckley, 2018). Herring (2010) proposed using nonparametric Bayesian shrinkage priors in a parametric regression model. Several groups have proposed various clustering approaches that attempt to group multivariate exposures into clusters having similar multi-pollutant profiles and then quantify differences in a health outcomes among exposure clusters (Molitor et al., 2010, 2011; Austin et al., 2012; Zanobetti et al., 2014; Pearce et al., 2014). Carrico et al. (2014) proposed creating an index as a weighted average of multiple exposures. Bobb et al. (2015)

proposed Bayesian kernel machine regression (BKMR) to estimate a flexible, high-dimensional exposure-response surface. Other approaches include estimating the exposure-response relation using machine learning and variable selection methods, such as the adaptive lasso, elastic net, Bayesian adaptive regression trees, and classification and regression trees

(Taylor et al., 2016). All of these methods estimate the relationship between a health outcome and exposure to a chemical mixture measured at a single point in time, or averaged over a single pre-specified exposure window. In the context of maternal exposures to environmental chemicals, some of these methods have been applied using averages taken over either individual trimesters or over the entire pregnancy (e.g. Dadvand et al., 2013).

A separate literature has developed on statistical methods to estimate the association between time-varying measures of exposure and a subsequent health outcome using the distributed lag model (DLM) framework. DLMs were originally developed in the econometrics literature (Almon, 1965) and later applied in environmental epidemiology for time-series analysis where an outcome observed on a given day is regressed on exposures over a previous time period (Schwartz, 2000; Zanobetti et al., 2000)

. DLMs are often constrained so that the exposure effect varies smoothly over time. To accomplish this goal one can use either a parametric model (such as a quadratic function of time), splines, Bayesian priors, Gaussian processes, or other penalization approaches

(Zanobetti et al., 2000; Peng et al., 2009; Heaton and Peng, 2012; Chen et al., 2017). Distributed lag nonlinear models (DLNMs) are extensions of DLMs that include a nonlinear exposure-response relation for a single chemical exposure (Gasparrini et al., 2010; Gasparrini, 2011; Gasparrini et al., 2017). When applied to perinatal cohorts, a DLM regresses a child’s health or birth outcome on maternal exposures recorded daily or weekly during pregnancy. The DLM can then estimate windows of susceptibility–periods in time during which there is a increased association between maternal exposure recorded at a given time and the subsequent outcome of interest. Several recent studies have applied DLMs to estimate the relationship between maternal exposure to air pollution during pregnancy and children’s outcomes, including: preterm birth (Warren et al., 2012; Chang et al., 2015), decreased birth weight (Warren et al., 2013), childhood asthma and lung function (Hsu et al., 2015; Lee et al., 2018a, b), disrupted neurodevelopment (Chiu et al., 2016), body composition (Chiu et al., 2017), and epigeneic outcomes (Brunst et al., 2018).

To extend DLM methods beyond time-varying measures of a single chemical exposure, Chen et al. (2018) proposed a two-pollutant DLM to estimate the linear effect of exposure, including the interaction, by estimating an interaction surface. For linear association between a continuous outcome and multiple time-varying exposures Bello et al. (2017)

proposed a lagged weighted quantile sums. The approach regresses the time-varying exposures on the outcome to estimate the association but does not account for nonlinearities of interactions.

Liu et al. (2018b) and Liu et al. (2018a) developed lagged kernel machine regression to estimate nonlinear associations between exposures observed at multiple points in time and interactions between simultaneous exposures (i.e. those recorded at the same timepoint). The approach is appropriate for exposures observed at a small number of times that are common to all individuals in the study (e.g. blood biomarkers measured once per trimester) and estimates only interactions between exposures at a single time point and not between exposures at different times. None of the proposed approaches estimate the effects of multiple time-varying exposures, including nonlinearities and interactions between both simultaneous and subsequent exposures.

In this paper we make two contributions to the literature on methods to quantify windows of susceptibility to exposure associated with an environmental mixture. First, as a baseline of comparison, we adapt existing methods that accommodate either a single time-varying exposure or exposures to chemical mixtures assessed at a single time point to handle time-varying measures of the mixture. Specifically, we consider a straightforward additive extension of traditional DLMs and DLNMs to multi-pollutant models. These additive DLM and DLNM approaches accommodate data on multiple time-varying exposures, but do not allow for interactions between pollutants. We also evaluate the performance of BKMR applied using exposure averaged over pregnancy for multiple pollutants. This strategy allows for nonlinear associations and interactions but does not account for exposure timing. Our second contribution is to propose a novel Bayesian kernel machine regression distributed lag model (BKMR-DLM) framework. This new framework estimates the association between time-varying measures of multiple exposures and a health outcome. The approach uses the concept of time-weighted exposures (Wilson et al., 2017a) to identify windows of susceptibility and models the potentially nonlinear and non-additive association between these time-weighted exposures and a health outcome using kernel machine regression. BKMR-DLM accounts for exposure timing, allows for nonlinearities in each exposure-response relationship, and allows for interactions among exposures measured both concurrently and at different times. We conduct a simulation study to explore the relative strengths and weaknesses of each modeling strategy. We find that BKMR-DLM can more accurately estimate the exposure-response function than methods that fail to account for exposure timing, nonlinearities, and interactions, whereas the simpler additive DLM and DLNM have more power to detect windows of susceptibility. We apply the proposed approaches to data from the Asthma Coalition on Community, Environment, and Social Stress (ACCESS) cohort (Wright et al., 2008) and estimate the association between estimated maternal exposure to four air pollutants throughout pregnancy and birth weight.

2 Data Description

ACCESS is a prospective, longitudinal study designed to examine the effects of psychosocial stressors and chemical stressors (e.g., air pollution and other environmental influences) on children’s birth and health outcomes. ACCESS includes 955 mother-child dyads recruited between August 2002 and January 2007 who continued active follow-up after birth in the Boston, MA area. Procedures were approved by the human studies committees at Brigham and Women’s Hospital and Boston Medical Center.

Previous analyses of the ACCESS cohort identified an association between increased maternal exposure to fine particulate matter (PM) averaged over pregnancy and decreased birth weight for gestational age -score (BWGAz) particularly among boys born to obese mothers (Lakshmanan et al., 2015). In this paper we consider the association between weekly levels of exposure to four components of particulate matter–elemental carbon (EC), organic carbon (OC), nitrate, and sulfate–and BWGAz among the same population of boy babies with obese mothers. We include as covariates maternal age at enrollment, an indicator of maternal education at high school level or above, maternal pre-pregnancy body mass index (BMI), indicators of black and Hispanic race/ethnicity, parity, and an indicator of season on birth.

Maternal exposures of EC, OC, nitrate, and sulfate were previously estimated with a hybrid land use regression model that incorporates satellite-derived aerosol optical depth measures and a chemical-transport model GEOS-Chem (Di et al., 2016). Each mother was assigned an average exposure level for each pollutant for each week of pregnancy based on the predicted value at her address of residence. We limit our analysis to full-term infants (born at weeks gestation) and their exposure during the first 37 weeks of pregnancy. A total of 109 children had complete exposure, outcome, and covariate data.

3 BKMR and DLMs for time-varying measures of an environmental mixture

3.1 Objectives and notation

Interest focuses on estimating the association between time-varying exposure to pollutants and a scalar outcome , while controlling for a

-vector of baseline covariates

. We assume these quantities are observed for a sample of size , with subject indexed by , and the exposures are observed over the time domain , in our case week of gestation. There are two primary objectives: 1) to identify windows of susceptibility during which the time-varying exposures are associated with a future health outcome and 2) estimate the exposure-response relationship while allowing for a nonlinear and non-additive relationship between the multiple exposures and the outcome. In this section we introduce methods that can address only one of these goals–DLM, DLNM and BKMR–and show how these approaches can be readily applied to time-varying measures of a multi-pollutant mixture. We then introduce our proposed methods that can simultaneously address both goals in Section 4.

3.2 Additive DLMs for time-varying measures of an environmental mixture

The Gaussian discrete-time DLM for a single exposure is

(1)

where parameterizes the time-varying association between exposure and outcome, is a -vector of unknown regression coefficients for the confounders, and are independent random variables. Constrained DLMs impose smoothness on the distrusted lag function. The smoothness constraint can be imposed by modeling using splines, Bayesian priors, Gaussian processes, or other penalization approaches (Zanobetti et al., 2000; Peng et al., 2009; Heaton and Peng, 2012; Chen et al., 2017). For multiple time-varying exposures, an additive DLM is

(2)

In prior work, researchers typically identify a window of susceptibility as a period of time during which the pointwise interval for does not contain 0. We take that approach here. Additionally, the cumulative effect is the expected change in outcome associated with a one unit increase in exposure occurring simultaneously at every exposure time point, or .

In this paper, we use natural splines impose smoothness on

and select the degrees of freedom as the value that minimizes the Akaike Information Criterion (AIC). The model in (

2) is then a linear model. An additive DLM is appealing because it can easily be implemented with existing software and is easy to visualize and interpret. However, DLMs do not allow for nonlinear associations or interactions between exposures.

3.3 Additive DLMs for mixtures of time-varying exposures

In their distributed lag nonlinear model (DLNM) framework, Gasparrini et al. (2010) extended DLMs to accommodate a nonlinear association between outcome and exposure at any given time; that is, they replaced linear association in (1) with nonlinear association . They model the function as the outer-product of two basis expansions, which allows the nonlinear association between outcome and exposure at given time point to vary smoothly over time. The additive DLNM for time-varying measures of a mixture is then

(3)

In this paper we estimate additive DLNMs using penalized splines and the dlnm package (Gasparrini, 2011). Using penalized splines for is particularly appealing for mixtures because it facilities simultaneous tuning of each without comparing an unwieldy number of degrees of freedom for the splines bases.

DLNMs allow for nonlinear association between each exposure and the outcome but do not allow for interactions. The identification of a window of susceptibility is more challenging within the DLNM framework because windows may appear at some levels of exposure but not others.

3.4 BKMR applied to pregnancy-averaged exposures

BKMR is a popular approach to estimating the association between multiple scalar exposures and a health outcome. Unlike DLM and DLNM, BKMR allows for nonlinear associations and interactions, but does not account for exposure timing. For scalar exposures , a BKMR model takes the form

(4)

The function is a potentially nonlinear and non-additive exposure-response function. In the context of maternal exposures during pregnancy, we take to be the levels of the exposure averaged over 37 weeks of pregnancy.

BKMR assumes that the exposure-response function resides in the functional space that is uniquely defined by the positive semidefinite reproducing kernel . The function can be represented with a positive-definite kernel function and coefficients as . According to Mercer’s Theorem (Cristianini and Shawe-Taylor, 2000), the kernel implicitly specifies a basis expansion. For example, the Gaussian kernel

corresponds to the set of Gaussian radial basis functions. The polynomial kernel

corresponds to a polynomial of order up to .

Using the kernel representation of , Liu et al. (2007) showed that the regression model in (4) is equivalent to the hierarchical model

(5)

where is an matrix with element . In this work we consider both Gaussian and polynomial kernel functions within BKMR. The Gaussian kernel allows for flexible estimation of , while the a polynomial kernel potentially yields increased power to detect associations together with a concomitant reduction in flexibility. For the Gaussian kernel, the , element of is

(6)

In this paper, for BKMR analyses using exposures averaged over the first 37 weeks of gestation, we fit the Gaussian kernel model as implemented in the R package bkmr (Bobb, 2017; Bobb et al., 2018).

For the BKMR-models specified in the next section, we also considered models using a polynomial kernel with (i.e. a quadratic kernel). The polynomial kernel of order is

(7)

Note both the Gaussian and polynomial kernels include feature weights (Allen, 2013; Bobb et al., 2015).

4 BKMR-DLM: KMR with time-weighted exposures

4.1 BKMR-DLM model specification

The methods presented in Section 3.4 are well established for scalar exposures. The proposed BKMR-DLM approach simultaneously identifies windows of susceptibility using pollutant-specific weight functions and estimates the potentially nonlinear and non-additive associations between a health outcome and time-weighted exposures.

Time-weighted exposure are closely related to DLMs and functional regression methods. Wilson et al. (2017a) proposed estimating the linear association between a single time-varying exposure and health outcome using the model

(8)

where is a functional representation of the exposure and is a functional weight parameter defined over . The approach constraints and to the weight function for identifiability.

The model can be viewed as a linear regression model using the weighted exposure

as a scalar covariate: . It is equivalent to the functional DLM with functional predictor .

For BKMR-DLM we insert the weighted exposure into the kernel function (6) or (7). The Gaussian kernel is then

(9)

and the polynomial kernel is

(10)

Under the constraints and , both and are identifiable.

When is constant in time, then BKMR-DLM is equivalent to a BKMR model that uses pregnancy-averaged exposure as a predictor. When varies in time, BKMR-DLM up- or down-weights exposures during certain time periods. The up-weighted time periods represent windows of susceptibility to exposure.

For each time-varying exposure, we parameterize both and using a basis function representation (Morris, 2015). We assume and , where is an exposure-specific orthogonal basis expansion and and are regression coefficients. The weighted exposure for individual can then be rewritten as where and . We estimate

using ordinary least squares which gives

, where is the row-vector of observed exposures for chemical and person and is design matrix of orthonormal basis functions with columns being the basis expansion at time point .

Using an orthonormal basis, the constraint is satisfied if and only if . The constraint is satisfied for a set of observed times if and only if , where is a -vector of ones. As such, the constraints on are now constraints on . The constrained parameter space is half of a unit

-ball on one side of a hyperplane defined by

.

Using the weighted exposures as inputs, the Gaussian kernel function in (6) is

(11)

and the polynomial kernel in (7) is

(12)

The parameters and represent the importance and the timing, respectively, of exposure . Because these parameters are only jointly identifiable and to ease computation, we reparameterize the model in terms of . The Gaussian kernel in (6) is then

(13)

and the polynomial kernel in (7) can be written as

(14)

In the above model written in terms of , both and are uniquely identified by as and . Hence, we can estimate the full model parameterized in terms of and then partition the posterior sample of into and , where uniquely describes the weight function .

To induce smoothness in the weight function we use the eigenfunctions of the covariance matrix of smoothed exposures. Specifically, we pre-smooth each exposure with a parsimonious natural spline bases and then use the eigenfunctions of the covariance matrix of the smoothed exposures in the model as specified above. An alternative is to use the eigenfunctions of the smoothed covariance matrix of the exposures obtained by the fast covariance estimation (FACE) method proposed by

Xiao et al. (2016) and implemented in the R package refund (Goldsmith et al., 2016), as done in Wilson et al. (2017a). However, we find that pre-smoothing the exposures with a parsimonious basis is more reliable in practice, particularly for moderate sample sizes.

4.2 Prior specification and posterior computation

We use as a prior for ,

a uniform distribution over its parameter space, which can be written as

, where is an indicator function. We then let for fixed value . It follows that , with . We complete the prior specification by assuming a flat prior on , , and .

To estimate the model parameters, we first integrate out from (5). This yields , where . The posterior can be estimated using the decomposition

In (4.2), and are the typical multivariate normal and gamma full conditionals. The final term in (4.2) takes the form

Our Markov chain Monte Carlo (MCMC) algorithm iteratively samples each

using an elliptical slice sampler (Murray et al., 2010) and the kernel of (4.2). Then we sample using random walk Metropolis-Hastings and the same integrated kernel in (4.2). Finally, we use a Gibbs sampler to sample , , and from their respective full conditionals. The full conditional for

is a generalized inverse-Gaussian distribution with density function

, where , , and . Algorithm 1 in the supplemental material shows the full MCMC approach.

4.3 Posterior inference for

Windows of susceptibility during which there is an increased association between exposure and outcome are identified using the estimated weight function. Let for be the posterior sample of size . We can identify and

. It is then straightforward to compute the pointwise posterior mean and credible interval for

. While the credible interval provides valid pointwise posterior inference for w(t) and can be used to identify windows of susceptibility, the posterior mean does not satisfy the constraint . We use the point estimate projected onto the parameter space of : with and

is the posterior mean. The resulting estimator, equivalent to the Bayes estimate with respect to the loss function

, is a central estimate in the parameter space of .

4.4 Posterior inference for

Estimates of for the observed exposure levels can be obtained by sampling from conditional distribution of from (5). Specifically, for each MCMC iteration, we sample from

(17)

To predict at new values, including over a regularly spaced grid, we predict

by considering the joint distribution

(18)

and the subsequent posterior predictive distribution

5 Simulation Study

We evaluate the effectiveness of the proposed BKMR-DLM and other methods for mixtures of time-varying exposures in two scenarios: one with two exposures and the second with five exposures. The study is design to both evaluate the effectiveness of BKMR-DLM and to identify the best method to estimate the mutlipollutant exposure-response relation and to identify windows of susceptibility in the setting of mixtures of time-varying exposures.

5.1 Scenario A: two pollutants

In scenario A we considered two exposures and compared BKMR-DLM with: 1) BKMR using mean exposure over pregnancy; 2) an additive DLM; and 3) an additive DLNM. To compare the results of DLM and DLNM to the true simulated weight functions, we normalized the estimates to match the constraints imposed on the true weight functions. For DLNM, where the distributed lag function varies smoothly with concentration and windows may only be identified at some concentrations, we made this comparison using the cross-section of the DLNM that shows the time-varying association at the mean exposure level for each pollutant. We fit BKMR-DLM using both a Gaussian kernel and a polynomial kernel of degree two.

For the simulation, we used real exposure data for two pollutant, PM and nitrogen dioxide (NO), taken from one Boston, MA, USA monitor and created weekly mean exposure values for births simulated at randomly selected birth dates. This simulation strategy yields a realistic correlation structure among weekly exposures within a pregnancy, as well as seasonal variation in exposure across multiple pregnancies. The simulated weight functions were a normal density function peaking mid-pregnancy () and a logistic link function identifying a window in the second half of gestation (). Both weight functions were truncated to span 37 weeks and scaled to meet the constraint (see Figure 3 for a visualization).

We simulated the outcomes using the model . The exposure-response function, , was with , , and and and are scaled and centered versions of the weighted exposures. Hence, was nonlinear in both and and there was a multiplicative interaction between and . We assumed five covariates, , each simulated independent standard normal and the covariate regression coefficients, , was also simulated independent standard normal. The random error

was simulated normal mean zero and standard deviation 3, 7.5, and 15 which represent approximately a 1:2, 1:5, and 1:10 ratio between the standard deviation of

and . Finally, we considered sample sizes of and and evaluated model permanence based on 200 simulated data sets.

We note that, because the multi-dimensionality of the problem (exposure timing, multiple pollutants, potential nonlinearity and non-additivity of the expoure-response function), we purposely simulate under a scenario in which all models are misspecified. BKMR-DLM is the only model that accounts for exposure-timing, nonlinearity, and interactions. However, the natural spline basis that we employ (4 degrees of freedom) is not sufficiently flexible to model the simulated weight functions and the polynomial kernel is not sufficiently flexible to accurately represent the true exposure-response function. We include additional scenarios C and D in the supplemental material for which BKMR-DLM is perfectly specified.

Figure 3 shows the true weight functions and the estimated weight functions for the first 100 simulated data sets from for and error standard deviation using both the Gaussian and polynomial kernel. Overall both approaches estimated the general pattern but over-smooth the weight function for the first exposure due to the fact that the spline basis used is not sufficiently flexible to match the peak of the window in the middle of pregnancy. Estimated weight functions for other settings are shown in the supplemental material. Additional simulation scenarios C and D with smoother weight functions are also shown in supplement. In those scenarios BKMR-DLM and BKMR-DLM-poly are better able to estimate the true weight function.

Table 1 compares the simulated performance of the five approaches. Overall, differences in root mean square error (RMSE) for

show that BKMR-DLM with a polynomial kernel estimated the exposure-response function characterizing the association between exposure and health the best. This is despite the fact that the polynomial kernel is not sufficiently flexible to match the true exposure-response function. For the larger sample size and smaller error variance scenario, the Gaussian kernel also performed well but lacked power to estimate the exposure-response function in the more challenging scenarios. For a small sample size and larger error variance, BKMR applied to the 37-week average exposures performed slightly better. While BKMR using 37-week average exposures had lower RMSE in some cases, interval coverage was well below the nominal level. BKMR-DLM with a polynomial kernel did not have 95% coverage for

in some scenarios due the insufficient flexibility of the quadratic kernel. Direct comparison of BKMR-DLM with a Gaussian kernel and BKMR with averaged exposures, which also uses the same Gaussian kernel, highlights the importance accounting for exposure timing. When there exists a sufficiently strong signal that provides information about exposure timing, the BKMR-DLM model better estimates the exposure-response function. DLM and DLNM also had low interval coverage and had larger RMSE for than BKMR-DLM with a polynomial kernel, illustrating the importance of accounting for interactions.

Similarly, BKMR-DLM had the lowest RMSE for estimating the weight functions in all scenarios except the lower signal-to-noise ones. However, The intervals for the weight functions from BKMR-DLM did not have 95% coverage. DLM and DLNM, which do not account for interactions, had larger RMSE for the weight functions and did not have adequate interval coverage. However, DLM was the most likely to identify a window of susceptibility due to increased power resulting from parsimony.

(a) BKMR-DLM with Gaussian kernel.
(b) BKMR-DLM with polynomial kernel of order up to 2.
Figure 3: Simulated and estimated with functions for scenario A. The thick black line shows the true weight function. Exposure 1 has weight function and exposure 2 has . The thin grey lines show the estimated weight functions for the first 100 simulated data sets. The dashed black line shows the mean weight function across all 200 simulated data sets. Results are for scenario A with and . Results for additional scenarios are shown in supplementary material.
Model RMSE Coverage RMSE Coverage Pr(window)
, noise:
BKMR 1.351 0.601 NA NA NA
BKMR-DLM 1.201 0.978 0.600 0.888 0.062
BKMR-DLM-poly 1.027 0.917 0.543 0.888 0.180
DLM 1.176 0.838 0.812 0.860 0.630
DLNM 1.484 0.972 1.182 0.845 0.282
, noise:
BKMR 1.676 0.827 NA NA NA
BKMR-DLM 2.900 0.992 0.719 0.903 0.000
BKMR-DLM-poly 1.766 0.974 0.709 0.891 0.018
DLM 2.216 0.922 1.076 0.819 0.320
DLNM 3.417 1.000 1.268 0.827 0.230
, noise:
BKMR 2.287 0.935 NA NA NA
BKMR-DLM 5.175 0.998 0.732 0.905 0.000
BKMR-DLM-poly 2.958 0.992 0.757 0.891 0.002
DLM 4.155 0.937 1.207 0.794 0.238
DLNM 6.748 1.000 1.302 0.815 0.228
, noise:
BKMR 1.231 0.405 NA NA NA
BKMR-DLM 0.622 0.932 0.453 0.832 0.792
BKMR-DLM-poly 0.604 0.856 0.418 0.819 0.908
DLM 0.888 0.604 0.596 0.828 0.990
DLNM 0.865 0.601 0.940 0.887 0.425
, noise:
BKMR 1.401 0.577 NA NA NA
BKMR-DLM 1.295 0.993 0.634 0.886 0.060
BKMR-DLM-poly 1.094 0.920 0.568 0.889 0.145
DLM 1.227 0.839 0.844 0.839 0.592
DLNM 1.579 0.984 1.137 0.870 0.225
, noise:
BKMR 1.613 0.758 NA NA NA
BKMR-DLM 2.509 1.000 0.731 0.899 0.000
BKMR-DLM-poly 1.605 0.975 0.682 0.894 0.012
DLM 1.949 0.921 1.066 0.817 0.360
DLNM 2.956 1.000 1.233 0.858 0.182
Table 1: Simulation results for scenario A, two active exposures. The table shows statistics summarizing the estimation of at each observed value of the multivariate exposure, and estimation of the two weight functions and .

5.2 Scenario B: five pollutants

Scenario B was largely the same as scenario A except we included five exposures (PM, NO, CO, O, and SO), all taken from the same Boston monitor. The first two exposures had the same weights and exposure-response functions as described above (same , , , , and ). We added a third active exposure, CO, and let and . We added two additional exposures, O and SO, that had no association with the outcome. All other details remained the same as in scenario A.

Table 2 shows results for scenario B. For the purpose of estimating the exposure-response and weight functions, the pattern of relative performance of the different methods was similar to that in scenario A. For a larger sample size and lower error variance, BLMR-DLM with a polynomial kernel had the lowest RMSE for and for the weight functions. For smaller sample sizes and larger error variance, BKMR using 37-week average exposure yielded lower RMSE for but also low interval coverage. BKMR-DLM did not have sufficient power to identify windows of susceptibility; however, the improvement in estimation of and the weight functions indicate that, despite not formally identifying windows associated with exposure, some information can be learned about the timing, and this information can improve exposure-response estimation. While DLM and DLNM did not estimate the exposure-response function as well as BKMD-DLM with the polynomial kernel based on RMSE, these two simpler models had the best power to identify windows of susceptibility.

90 Model RMSE Coverage RMSE Coverage Pr(windowactive) Pr(windownull) , noise: BKMR 1.639 0.637 NA NA NA NA BKMR-DLM 1.856 0.947 0.658 0.937 0.002 0.000 BKMR-DLM-poly 1.283 0.950 0.608 0.919 0.090 0.005 DLM 1.462 0.915 1.425 0.706 0.455 0.398 DLNM 2.194 0.998 1.311 0.839 0.235 0.260 , noise: BKMR 1.988 0.793 NA NA NA NA BKMR-DLM 3.323 0.986 0.679 0.937 0.000 0.000 BKMR-DLM-poly 2.388 0.984 0.696 0.925 0.005 0.000 DLM 3.240 0.939 1.413 0.783 0.297 0.300 DLNM 5.367 1.000 1.319 0.844 0.210 0.215 , noise: BKMR 2.548 0.913 NA NA NA NA BKMR-DLM 5.947 0.995 0.682 0.937 0.000 0.000 BKMR-DLM-poly 4.277 0.992 0.719 0.927 0.003 0.000 DLM 6.332 0.942 1.392 0.803 0.268 0.275 DLNM 10.697 1.000 1.327 0.842 0.223 0.220 , noise: BKMR 1.167 0.609 NA NA NA NA BKMR-DLM 0.745 0.972 0.429 0.920 0.362 0.008 BKMR-DLM-poly 0.714 0.936 0.436 0.902 0.450 0.030 DLM 0.879 0.815 1.387 0.574 0.740 0.715 DLNM 1.048 0.901 1.249 0.841 0.268 0.310 , noise: BKMR 1.701 0.599 NA NA NA NA BKMR-DLM 2.137 0.994 0.655 0.937 0.000 0.000 BKMR-DLM-poly 1.322 0.969 0.610 0.919 0.092 0.000 DLM 1.567 0.904 1.428 0.695 0.453 0.385 DLNM 2.321 1.000 1.261 0.865 0.182 0.182 , noise: BKMR 1.971 0.680 NA NA NA NA BKMR-DLM 3.052 1.000 0.687 0.937 0.000 0.000 BKMR-DLM-poly 2.071 0.993 0.678 0.926 0.012 0.000 DLM 2.830 0.934 1.421 0.764 0.285 0.290 DLNM 4.552 1.000 1.268 0.867 0.165 0.178

Table 2: Simulation results for scenario B, three active exposure and one null. The table shows statistics summarizing the estimation of , the mean response at each observed value the estimation of the weight functions.

6 Data Analysis

We applied BKMR-DLM with a quadratic kernel to birthweight z-score in the ACCESS cohort. For comparison, we considered an additive DLM model and BKMR with 37-week averaged exposure. In total, 109 mother-child dyads were analyzed. The analysis included four pollutant indicators: nitrate, OC, EC, and sulfate. All covariates listed in Section 

2 were included in each model.

Figure (a)a shows results from the additive DLM analysis. We identified a susceptibility window in weeks 29-33 for nitrate, in weeks 26-27 for OC, and weeks 9-13 in sulfate. There was moderate evidence of a cumulative effect, representing the change in birth weight associated with a one unit increase in exposure at every time point, for sulfate (-value) but not for any other pollutant.

Figure (b)b shows the estimated weight functions from BKMR-DLM. No windows were identified. Recall that the weight functions are constrained. As such, the magnitude and sign do not reflect that of the association. They do identify periods of time with increased importance as periods where there is a “bump” in the weight function. With that in mind, the weight functions for OC and sulfate show similarities in shape to the estimated distributed lag functions from the additive DLM but with increased uncertainty.

(a) Estimated association between exposures and birth weight -score in ACCESS using the additive DLMs. The distributed lag function is the estimated association at each time point.
(b) Estimated weight functions from the analysis of birth weight -score in ACCESS from BKMR-DLM. The weight function is constrained and does not reflect the magnitude of the association or the direction of the association. It only reflects the timing of the association.
Figure 6: Estimated windows of susceptibility for birth weight -score in ACCESS from single pollutant DLM models (panel (a)a) and with the multipollutant BKMR-DLM (panel (b)b)

Figure 7 shows estimates of the exposure-response function from BKMR-DLM. The diagonal shows as a function of weighted exposure for one pollutant at the median value of the weighted exposures for the other pollutants. We found some evidence of a negative association between OC and BWGAz. Supplemental Figures S9 and S10 shows a similar estimated association using BKMR with 37-week averaged exposures. The off diagonals show the posterior mean of at different quantiles of one co-pollutant and the median of the other two co-pollutants. There was slight evidence that nitrate modifies the OC, EC, and sulfate exposure-response functions and that sulfate may modify the OC exposure response. Supplement Figures S11 and S12 and Table S3 show results for univariate DLM analyses of OC, EC, and sulfate stratified by nitrate and OC stratified by sulfate. This simpler analysis supports the conclusion that there may be an association between OC, EC, and sulfate at higher levels of nitrate. This follow-up analysis showed that the windows of susceptibility associated with OC, EC, and sulfate only exist at higher levels of co-exposure to nitrate.

Figure 7: Cross sections of the estimated exposure-response function (). Each column shows the exposure-response function as the weighted exposure for a single pollutant varies. The panels on the diagonal show the main effect, the association between a weighted exposure and the outcome at the median level of all other weighted exposures. The off-diagonals show the exposure-response function at different quantiles of the co-exposure.

7 Discussion

In this paper we consider multiple strategies for quantifying the association between time-varying measures of an environmental mixture and a prospectively assessed health outcome, and compared the relative advantages and disadvantages of each approach through simulation and application of each to a case study involving prenatal exposures to multiple air pollutants and birth weight. In this setting there are three key challenges: accounting for exposure timing, accounting for nonlinear associations, and accounting for interactions. Table 3 summarizes methods that can be applied in this setting and the advantages and disadvantages of each approach.

We proposed BKMR-DLM to estimate the association between multiple time-varying exposures and an outcome. To our knowledge, this is the first approach that accounts for exposure timing, interactions between exposures, and nonlinear associations–thereby more comprehensively modeling the underlying complexity of the relationships. The approach uses time-weighted exposures in a kernel machine regression framework. The weight-functions identify windows of susceptibility during which there is an increased association between exposure and outcome. Such information will be important as developmental processes are both timed and linked to windows of susceptibility, thus exposure timing provides hints to the underlying mechanisms. By using kernel machine regression we allow for nonlinear associations and interactions among the multiple weighted exposures.

Method accounts for by design Method recommended use
Estimating ER
exposure inter- non- detecting wk signal st signal
Method timing actions linearity windows modest large
Add. DLM
Add. DLNM
BKMR
BKMR-DLM

denotes weaker signal and stronger signal, respectively.

ER: exposure-response

Table 3: Summary of methods and performance. The first three columns show what data features the method accounts for by design. The last three columns summarize relative performance of the methods and indicates the recommended method based on study objectives of estimating the exposure-response function (ER) and detecting windows of susceptibility. These recommendations are based on the the simulation study results.

In addition to this novel method we considered BKMR using averaged exposures. BKMR allows for nonlinear associations and interactions but does not account for exposure timing. Further, we used additive versions of both DLMs and DLNMs to estimate exposure-response relationships. DLM accounts for exposure timing but limits the analysis to linear associations and no interactions. DLNM accounts for exposure-timing and nonlinear associations, but also limits the analysis to main effects only.

In a simulation study, we showed that BKMR-DLM with a polynomial kernel function was best able to estimate the exposure-response relation in most situations. When the sample size is small or the signal-to-noise ratio is small, BKMR, which is a simpler model because exposure timing is not estimated, performed slight better. While DLNM and DLM were not as accurate in estimating the exposure-response relationship, these approaches had greater power to identify windows of susceptibility. DLM, which is the simplest model, was the most effective at identifying windows of susceptibility. Hence, while BKMR-DLM has the advantage of accounting for exposure timing, nonlinear associations, and interactions in a single model and performed best with estimation of the exposure-response function, the preferred method for time-varying measures of a multi-pollutant mixture may depend on both the data, the strength of the exposure-response relationship, and the primary study objectives.

We applied this strategy of combining the use of BKMR-DLM to identify potential interactions and DLM to identify windows of susceptibility to analyze data on prenatal exposure to an air pollution mixture and birth weight in the ACCESS cohort. In a sample of 109 boys born to obese mothers, we estimated the association between four ambient pollutants and birth weight using BKMR-DLM as a screening method to identify potential interactions between time-varying exposures. We then performed a stratified DLM analysis to confirm that the findings from the primary BKMR-DLM analysis were not driven solely by modeling assumptions.

The analysis in this paper included 109 dyads and four pollutants each estimated at 37 time points. While this analysis may have been slightly under-powered there are several larger studies that will be yielding larger data in the near future. In particular the National Institutes of Health Environmental influences on Child Health Outcomes (ECHO) program is pooling similarly designed birth cohorts and will provide the potential and power to investigate this complex interplay of multiple pollutants, complex exposure-response functions, and timing. Therefore the knowledge gained from this investigation will provide guidelines for the statistical analysis of these future studies

Both the estimation of health effects associated with multi-pollutant mixtures and the identification of windows of susceptibility are important areas of environmental health research. BKMR-DLM and other related distributed lag methods are tools that can be used to combine these two important areas of research and simultaneously estimate windows of susceptibility and milti-pollutant exposure-response functions, while properly quantifying uncertainty arising from uncertain exposure timing.

Acknowledgements

This work was supported in part by NIH grants R01ES028811, R01ES013744, P30ES000002, P30ES023515, and UH3OD023337 and US EPA grant RD-83587201. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the US EPA. Further, US EPA does not endorse the purchase of any commercial products or services mentioned in the publication. The ACCESS cohort has been supported by NIH grants R01ES010932, U01HL072494, and R01HL080674. This work utilized the RMACC Summit supercomputer, which is supported by the NSF (awards ACI-1532235 and ACI-1532236), the University of Colorado Boulder and Colorado State University. The RMACC Summit supercomputer is a joint effort of the University of Colorado Boulder and Colorado State University.

References

  • Allen (2013) Allen, G. I. (2013).

    Automatic feature selection via weighted kernels and regularization.

    Journal of Computational and Graphical Statistics, 22(2):284–299.
  • Almon (1965) Almon, S. (1965). The Distributed Lag Between Capital Appropriations and Expenditures. Econometrica, 33(1):178–196.
  • Austin et al. (2012) Austin, E., Coull, B., Thomas, D., and Koutrakis, P. (2012). A framework for identifying distinct multipollutant profiles in air pollution data. Environment International, 45(1):112–121.
  • Bello et al. (2017) Bello, G. A., Arora, M., Austin, C., Horton, M. K., Wright, R. O., and Gennings, C. (2017). Extending the Distributed Lag Model framework to handle chemical mixtures. Environmental Research, 156(December 2016):253–264.
  • Billionnet et al. (2012) Billionnet, C., Sherrill, D., and Annesi-Maesano, I. (2012). Estimating the Health Effects of Exposure to Multi-Pollutant Mixture. Annals of Epidemiology, 22(2):126–141.
  • Bobb (2017) Bobb, J. F. (2017). bkmr: Bayesian Kernel Machine Regression.
  • Bobb et al. (2018) Bobb, J. F., Claus Henn, B., Valeri, L., and Coull, B. A. (2018). Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression. Environmental Health, 17(1):67.
  • Bobb et al. (2015) Bobb, J. F., Valeri, L., Claus Henn, B., Christiani, D. C., Wright, R. O., Mazumdar, M., Godleski, J. J., and Coull, B. A. (2015). Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics, 16(3):493–508.
  • Bose et al. (2017) Bose, S., Chiu, Y.-H. M., Hsu, H.-H. L., Di, Q., Rosa, M. J., Lee, A., Kloog, I., Wilson, A., Schwartz, J., Wright, R. O., Cohen, S., Coull, B. A., and Wright, R. J. (2017). Prenatal Nitrate Exposure and Childhood Asthma. Influence of Maternal Prenatal Stress and Fetal Sex. American Journal of Respiratory and Critical Care Medicine, 196(11):1396–1403.
  • Braun et al. (2016) Braun, J. M., Gennings, C., Hauser, R., and Webster, T. F. (2016). What can epidemiological studies tell us about the impact of chemical mixtures on human health? Environmental Health Perspectives, 124(1):A6–A9.
  • Brunst et al. (2018) Brunst, K. J., Sanchez-Guerra, M., Chiu, Y.-H. M., Wilson, A., Coull, B. A., Kloog, I., Schwartz, J., Brennan, K. J., Bosquet Enlow, M., Wright, R. O., Baccarelli, A. A., and Wright, R. J. (2018). Prenatal particulate matter exposure and mitochondrial dysfunction at the maternal-fetal interface: Effect modification by maternal lifetime trauma and child sex. Environment International, 112(March 2018):49–58.
  • Carrico et al. (2014) Carrico, C., Gennings, C., Wheeler, D. C., and Factor-Litvak, P. (2014). Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting. Journal of Agricultural, Biological, and Environmental Statistics, 20(1):100–120.
  • Chang et al. (2012) Chang, H. H., Reich, B. J., and Miranda, M. L. (2012). Time-to-Event Analysis of Fine Particle Air Pollution and Preterm Birth: Results From North Carolina, 2001-2005. American Journal of Epidemiology, 175(2):91–98.
  • Chang et al. (2015) Chang, H. H., Warren, J. L., Darrow, L. A., Reich, B. J., and Waller, L. A. (2015). Assessment of critical exposure and outcome windows in time-to-event analysis with application to air pollution and preterm birth study. Biostatistics, 16(3):509–521.
  • Chen et al. (2018) Chen, Y.-H., Mukherjee, B., and Berrocal, V. J. (2018). Distributed lag interaction models with two pollutants. Journal of the Royal Statistical Society: Series C (Applied Statistics).
  • Chen et al. (2017) Chen, Y.-H., Mukherjee, B., Berrocal, V. J., and Coull, B. A. (2017). Robust distributed lag models using data adaptive shrinkage. Biostatistics, (August):1–18.
  • Chiu et al. (2016) Chiu, Y.-H. M., Hsu, H.-H. L., Coull, B. A., Bellinger, D. C., Kloog, I., Schwartz, J., Wright, R. O., and Wright, R. J. (2016). Prenatal particulate air pollution and neurodevelopment in urban children: Examining sensitive windows and sex-specific associations. Environment International, 87:56–65.
  • Chiu et al. (2017) Chiu, Y.-H. M., Hsu, H.-H. L., Wilson, A., Coull, B. A., Pendo, M. P., Baccarelli, A., Kloog, I., Schwartz, J., Wright, R. O., Taveras, E. M., and Wright, R. J. (2017). Prenatal particulate air pollution exposure and body composition in urban preschool children: Examining sensitive windows and sex-specific associations. Environmental Research, 158(March):798–805.
  • Cristianini and Shawe-Taylor (2000) Cristianini, N. and Shawe-Taylor, J. (2000).

    An introduction to support vector machines and other kernel-based learning methods

    .
    Cambridge University Press.
  • Dadvand et al. (2013) Dadvand, P., Parker, J., Bell, M. L., Bonzini, M., Brauer, M., Darrow, L. A., Gehring, U., Glinianaia, S. V., Gouveia, N., Ha, E.-h., Leem, J. H., van den Hooven, E. H., Jalaludin, B., Jesdale, B. M., Lepeule, J., Morello-Frosch, R., Morgan, G. G., Pesatori, A. C., Pierik, F. H., Pless-Mulloli, T., Rich, D. Q., Sathyanarayana, S., Seo, J., Slama, R., Strickland, M., Tamburic, L., Wartenberg, D., Nieuwenhuijsen, M. J., and Woodruff, T. J. (2013). Maternal Exposure to Particulate Air Pollution and Term Birth Weight: A Multi-Country Evaluation of Effect and Heterogeneity. Environmental Health Perspectives, 121(3):267–373.
  • Davalos et al. (2017) Davalos, A. D., Luben, T. J., Herring, A. H., and Sacks, J. D. (2017). Current approaches used in epidemiologic studies to examine short-term multipollutant air pollution exposures. Annals of Epidemiology, 27(2):145–153.e1.
  • Di et al. (2016) Di, Q., Koutrakis, P., and Schwartz, J. (2016). A hybrid prediction model for PM2.5 mass and components using a chemical transport model and land use regression. Atmospheric Environment, 131:390–399.
  • Gasparrini (2011) Gasparrini, A. (2011). Distributed Lag Linear and Non-Linear Models in R: The Package dlnm. Journal of statistical software, 43(8):1–20.
  • Gasparrini et al. (2010) Gasparrini, A., Armstrong, B., and Kenward, M. G. (2010). Distributed lag non-linear models. Statistics in Medicine, 29(November 2009):2224–2234.
  • Gasparrini et al. (2017) Gasparrini, A., Scheipl, F., Armstrong, B., and Kenward, M. G. (2017). A penalized framework for distributed lag non-linear models. Biometrics.
  • Goldsmith et al. (2016) Goldsmith, J., Scheipl, F., Huang, L., Wrobel, J., Gellar, J., Harezlak, J., McLean, M. W., Swihart, B., Xiao, L., Crainiceanu, C., and Reiss, P. T. (2016). refund: Regression with Functional Data.
  • Hamra and Buckley (2018) Hamra, G. B. and Buckley, J. P. (2018). Environmental Exposure Mixtures: Questions and Methods to Address Them. pages 160–165.
  • Heaton and Peng (2012) Heaton, M. J. and Peng, R. D. (2012). Flexible Distributed Lag Models using Random Functions with Application to Estimating Mortality Displacement from Heat-Related Deaths. Journal of Agricultural, Biological, and Environmental Statistics, 17(3):313–331.
  • Herring (2010) Herring, A. H. (2010). Nonparametric bayes shrinkage for assessing exposures to mixtures subject to limits of detection. Epidemiology, 21(4):S71–6.
  • Hsu et al. (2015) Hsu, H.-H. L., Chiu, Y.-H. M., Coull, B. A., Kloog, I., Schwartz, J., Lee, A., Wright, R. O., and Wright, R. J. (2015). Prenatal Particulate Air Pollution and Asthma Onset in Urban Children. Identifying Sensitive Windows and Sex Differences. American Journal of Respiratory and Critical Care Medicine, 192(9):1052–1059.
  • Lakshmanan et al. (2015) Lakshmanan, A., Chiu, Y.-H. M., Coull, B. A., Just, A. C., Maxwell, S. L., Schwartz, J., Gryparis, A., Kloog, I., Wright, R. J., and Wright, R. O. (2015). Associations between prenatal traffic-related air pollution exposure and birth weight: Modification by sex and maternal pre-pregnancy body mass index. Environmental research, 137:268–277.
  • Lee et al. (2018a) Lee, A., Leon Hsu, H.-H., Mathilda Chiu, Y.-H., Bose, S., Rosa, M. J., Kloog, I., Wilson, A., Schwartz, J., Cohen, S., Coull, B. A., Wright, R. O., and Wright, R. J. (2018a). Prenatal fine particulate exposure and early childhood asthma: Effect of maternal stress and fetal sex. Journal of Allergy and Clinical Immunology, 141(5):1880–1886.
  • Lee et al. (2018b) Lee, A. G., Le Grand, B., Hsu, H.-H. L., Chiu, Y.-H. M., Brennan, K. J., Bose, S., Rosa, M. J., Brunst, K. J., Kloog, I., Wilson, A., Schwartz, J., Morgan, W., Coull, B. A., Wright, R. O., Baccarelli, A. A., and Wright, R. J. (2018b). Prenatal fine particulate exposure associated with reduced childhood lung function and nasal epithelia GSTP1 hypermethylation: Sex-specific effects. Respiratory Research, 19(1):76.
  • Liu et al. (2007) Liu, D., Lin, X., and Ghosh, D. (2007).

    Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models.

    Biometrics, 63(4):1079–88.
  • Liu et al. (2018a) Liu, S. H., Bobb, J. F., Claus Henn, B., Schnaas, L., Tellez-Rojo, M. M., Gennings, C., Arora, M., Wright, R. O., Coull, B. A., and Wand, M. P. (2018a). Modeling the health effects of time-varying complex environmental mixtures: Mean field variational Bayes for lagged kernel machine regression. Environmetrics, 29(4):1–17.
  • Liu et al. (2018b) Liu, S. H., Bobb, J. F., Lee, K. H., Gennings, C., Claus Henn, B., Bellinger, D., Austin, C., Schnaas, L., Tellez-Rojo, M. M., Hu, H., Wright, R. O., Arora, M., and Coull, B. A. (2018b). Lagged kernel machine regression for identifying time windows of susceptibility to exposures of complex mixtures. Biostatistics, 19(3):325–341.
  • Molitor et al. (2010) Molitor, J., Papathomas, M., Jerrett, M., and Richardson, S. (2010). Bayesian profile regression with an application to the National Survey of Children’s Health. Biostatistics, 11(3):484–498.
  • Molitor et al. (2011) Molitor, J., Su, J. G., Molitor, N. T., Rubio, V. G., Richardson, S., Hastie, D., Morello-Frosch, R., and Jerrett, M. (2011). Identifying vulnerable populations through an examination of the association between multipollutant profiles and poverty. Environmental Science and Technology, 45(18):7754–7760.
  • Morris (2015) Morris, J. S. (2015). Functional Regression. Annual Review of Statistics and Its Application, 2(1):321–359.
  • Murray et al. (2010) Murray, I., Adams, R. P., and MacKay, D. J. (2010). Elliptical slice sampling. Journal of Machine Learning Research: W&CP, 9:541–548.
  • Pearce et al. (2014) Pearce, J. L., Waller, L. A., Chang, H. H., Klein, M., Mulholland, J. A., Sarnat, J. A., Sarnat, S. E., Strickland, M. J., and Tolbert, P. E. (2014).

    Using self-organizing maps to develop ambient air quality classifications: a time series example.

    Environmental Health, 13(1):56.
  • Peng et al. (2009) Peng, R. D., Dominici, F., and Welty, L. J. (2009). A Bayesian hierarchical distributed lag model for estimating the time course of risk of hospitalization associated with particulate matter air pollution. Journal of the Royal Statistical Society: Series C, 58(1):3–24.
  • Schwartz (2000) Schwartz, J. D. (2000). The distributed lag between air pollution and daily deaths. Epidemiology, 11(3):320–326.
  • Taylor et al. (2016) Taylor, K. W., Joubert, B. R., Braun, J. M., Dilworth, C., Gennings, C., Hauser, R., Heindel, J. J., Rider, C. V., Webster, T. F., and Carlin, D. J. (2016). Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology: Lessons from an Innovative Workshop. Environmental Health Perspectives, 124(12):227–229.
  • Warren et al. (2012) Warren, J., Fuentes, M., Herring, A. H., and Langlois, P. H. (2012). Spatial-temporal modeling of the association between air pollution exposure and preterm birth: identifying critical windows of exposure. Biometrics, 68(4):1157–67.
  • Warren et al. (2013) Warren, J. L., Fuentes, M., Herring, A. H., and Langlois, P. H. (2013). Air Pollution Metric Analysis While Determining Susceptible Periods of Pregnancy for Low Birth Weight. ISRN Obstetrics and Gynecology, 2013:1–9.
  • Wilson et al. (2017a) Wilson, A., Chiu, Y.-H. M., Hsu, H.-H. L., Wright, R. O., Wright, R. J., and Coull, B. A. (2017a). Bayesian distributed lag interaction models to identify perinatal windows of vulnerability in children’s health. Biostatistics, 18(3):537–552.
  • Wilson et al. (2017b) Wilson, A., Chiu, Y.-H. M., Hsu, H.-H. L., Wright, R. O., Wright, R. J., and Coull, B. A. (2017b). Potential for Bias When Estimating Critical Windows for Air Pollution in Children’s Health. American Journal of Epidemiology, 186(11):1281–1289.
  • Woodruff et al. (2011) Woodruff, T. J., Zota, A. R., and Schwartz, J. M. (2011). Environmental Chemicals in Pregnant Women in the United States: NHANES 2003–2004. Environmental Health Perspectives, 119(6):878–885.
  • Wright et al. (2008) Wright, R. J., Suglia, S. F., Levy, J., Fortun, K., Shields, A., Subramanian, S., and Wright, R. (2008). Transdisciplinary research strategies for understanding socially patterned disease: the Asthma Coalition on Community, Environment, and Social Stress (ACCESS) project as a case study. Ciencia & Saude Coletiva, 13(6):1729–1742.
  • Wright (2017) Wright, R. O. (2017). Environment, susceptibility windows, development, and child health. Current Opinion in Pediatrics, 29(2):211–217.
  • Xiao et al. (2016) Xiao, L., Zipunnikov, V., Ruppert, D., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26(1-2):409–421.
  • Zanobetti et al. (2014) Zanobetti, A., Austin, E., Coull, B. A., Schwartz, J., and Koutrakis, P. (2014). Health effects of multi-pollutant profiles. Environment International, 71:13–19.
  • Zanobetti et al. (2000) Zanobetti, A., Wand, M. P., Schwartz, J., and Ryan, L. M. (2000). Generalized additive distributed lag models: quantifying mortality displacement. Biostatistics (Oxford, England), 1(3):279–92.