Performance of variable and function selection methods for estimating the non-linear health effects of correlated chemical mixtures: a simulation study

08/05/2019
by Nina Lazarevic, et al.

Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Non-monotonic relationships are increasingly recognised (e.g., for endocrine-disrupting chemicals); however, the impact of non-monotonicity on exposure selection has not been evaluated. In a simulation study, we assessed the performance of Bayesian kernel machine regression (BKMR), Bayesian additive regression trees (BART), Bayesian structured additive regression with spike-slab priors (BSTARSS), and lasso penalised regression. We used data on exposure to 12 phthalates and phenols in pregnant women from the U.S. National Health and Nutrition Examination Survey to simulate realistic exposure data using a multivariate copula. We simulated datasets of size N = 250 and compared methods across 32 scenarios, varying by model size and sparsity, signal-to-noise ratio, correlation structure, and exposure-response relationship shapes. We compared methods in terms of their sensitivity, specificity, and estimation accuracy. In most scenarios, BKMR and BSTARSS achieved moderate to high specificity (0.56–0.91 and 0.57–0.96, respectively) and sensitivity (0.49–0.98 and 0.25–0.97, respectively). BART achieved high specificity (≥ 0.96), but low to moderate sensitivity (0.13–0.66). Lasso was highly sensitive (0.75–0.99), except for symmetric inverse-U-shaped relationships (≤ 0.2). Performance was affected by the signal-to-noise ratio, but not substantially by the correlation structure. Penalised regression methods that assume linearity, such as lasso, may not be suitable for studies of environmental chemicals hypothesised to have non-monotonic relationships with outcomes. Instead, BKMR and BSTARSS are attractive methods for flexibly estimating the shapes of exposure-response relationships and selecting among correlated exposures.
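The two ideas behind these results can be sketched in a few lines of Python. This is an illustration only, not the authors' simulation code. The abstract does not give each method's selection rule, so the helper `selection_metrics` (a hypothetical name) assumes the standard definitions: sensitivity is the share of truly active exposures that are selected, and specificity is the share of inactive exposures that are not selected. The second part uses a noise-free exposure on a symmetric grid, which is far simpler than the copula-based exposures in the study, to show why a method that assumes linearity finds almost no signal in a symmetric inverse-U-shaped relationship.

```python
import math


def selection_metrics(selected, truly_active, all_exposures):
    """Sensitivity and specificity of one variable-selection result.

    Assumes the standard definitions; the function name and its
    arguments are illustrative and do not come from the paper.
    """
    selected = set(selected)
    truly_active = set(truly_active)
    inactive = set(all_exposures) - truly_active
    true_positives = len(selected & truly_active)
    true_negatives = len(inactive - selected)
    sensitivity = true_positives / len(truly_active) if truly_active else float("nan")
    specificity = true_negatives / len(inactive) if inactive else float("nan")
    return sensitivity, specificity


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy exposure: N = 250 points on a grid symmetric about zero (an assumption
# made for clarity; the study simulated realistic, correlated exposures).
n = 250
exposure = [-2.0 + 4.0 * i / (n - 1) for i in range(n)]

# Symmetric inverse-U-shaped exposure-response relationship, without noise.
outcome = [-(value ** 2) for value in exposure]

# The linear association is essentially zero, so a linear term carries no signal.
r_linear_term = pearson(exposure, outcome)

# A squared term captures the relationship exactly.
r_squared_term = pearson([value ** 2 for value in exposure], outcome)

# Example scoring: 2 of 3 active exposures found, 1 of 3 inactive ones selected.
sens, spec = selection_metrics(
    selected=["x1", "x2", "x5"],
    truly_active=["x1", "x2", "x3"],
    all_exposures=["x1", "x2", "x3", "x4", "x5", "x6"],
)
```

Here `r_linear_term` is zero up to floating-point error, while `r_squared_term` equals -1. That is consistent with the low lasso sensitivity reported for symmetric inverse-U-shaped relationships, and with the better results of the flexible methods, which can represent curvature.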

