Unveiling Challenges in Mendelian Randomization for Gene-Environment Interaction
Many diseases and traits involve a complex interplay between genes and environment, generating significant interest in studying gene-environment interaction through observational data. However, lifestyle and environmental risk factors are often susceptible to unmeasured confounding, which may bias the assessment of the joint effect of gene and environment. Recently, Mendelian randomization (MR) has evolved into a versatile method for assessing causal relationships from observational data while accounting for unmeasured confounders. This approach uses genetic variants as instrumental variables (IVs) and aims to offer reliable statistical tests and estimates of causal effects. MR has gained substantial popularity in recent years, largely due to the success of large-scale genome-wide association studies in identifying genetic variants associated with lifestyle and environmental factors. Many methods have been developed for MR; however, little work has been done on evaluating gene-environment interaction. In this paper, we focus on two primary IV approaches, the 2-stage predictor substitution (2SPS) and the 2-stage residual inclusion (2SRI), and extend them to accommodate gene-environment interaction under the linear regression model for continuous outcomes and the logistic regression model for binary outcomes. Extensive simulations and analytical derivations show that finding solutions in the linear regression setting is relatively straightforward, whereas the logistic regression setting is significantly more complex and demands additional effort.
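To make the two IV approaches concrete, the following is a minimal sketch of 2SPS and 2SRI with a gene-environment interaction term in the linear (continuous-outcome) setting only. It is not the paper's implementation: the data-generating model, the effect sizes, and all names (`simulate`, `first_stage`, `two_sps`, `two_sri`, `naive`) are assumptions chosen for illustration. In both approaches the first stage regresses the exposure on the instrument; 2SPS then substitutes the predicted exposure into the outcome model, while 2SRI keeps the observed exposure and adds the first-stage residual as an extra covariate.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares coefficients for design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def design(*cols):
    """Stack the given columns behind an intercept column."""
    return np.column_stack([np.ones(len(cols[0])), *cols])


def simulate(n=200_000, seed=0):
    """Hypothetical data: z is the instrument for exposure e, g is the
    gene of interest, u is an unmeasured confounder of e and y.
    Assumed true effects: G = 0.3, E = 0.5, GxE = 0.2."""
    rng = np.random.default_rng(seed)
    z = rng.binomial(2, 0.3, n).astype(float)
    g = rng.binomial(2, 0.3, n).astype(float)
    u = rng.normal(size=n)
    e = 1.0 * z + u + rng.normal(size=n)
    y = 0.3 * g + 0.5 * e + 0.2 * g * e + u + rng.normal(size=n)
    return z, g, e, y


def first_stage(z, g, e):
    """Regress exposure on instrument and gene; return fit and residual."""
    X1 = design(z, g)
    e_hat = X1 @ ols(X1, e)
    return e_hat, e - e_hat


def two_sps(z, g, e, y):
    """2SPS: replace e by its first-stage prediction, including in GxE.
    Returns [intercept, G, E, GxE]."""
    e_hat, _ = first_stage(z, g, e)
    return ols(design(g, e_hat, g * e_hat), y)


def two_sri(z, g, e, y):
    """2SRI: keep observed e and add the first-stage residual.
    Returns [intercept, G, E, GxE, residual]."""
    _, resid = first_stage(z, g, e)
    return ols(design(g, e, g * e, resid), y)


def naive(g, e, y):
    """Confounded regression that ignores the instrument."""
    return ols(design(g, e, g * e), y)


if __name__ == "__main__":
    z, g, e, y = simulate()
    print("naive:", naive(g, e, y)[:4])
    print("2SPS :", two_sps(z, g, e, y)[:4])
    print("2SRI :", two_sri(z, g, e, y)[:4])
```

Under these assumptions, the naive regression overstates the exposure effect because `u` drives both `e` and `y`, while both two-stage estimators recover the assumed main and interaction effects in large samples. The sketch does not cover the logistic case, where simply swapping the second stage for a logistic fit is not guaranteed to behave the same way.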