Powerful genome-wide design and robust statistical inference in two-sample summary-data Mendelian randomization

04/19/2018
by Qingyuan Zhao, et al.

Mendelian randomization (MR) uses genetic variants as instrumental variables to estimate the causal effect of risk exposures in epidemiology. Two-sample summary-data MR, which uses publicly available genome-wide association study (GWAS) summary data, has become a popular design in practice. As GWAS sample sizes continue to increase, it is now possible to utilize genetic instruments that are only weakly associated with the exposure. To maximize the statistical power of MR, we propose a genome-wide design in which more than a thousand genetic instruments are used. For the statistical analysis, we use an empirical partially Bayes approach in which instruments are weighted according to their true strength, so that weak instruments contribute less variation to the estimator. The final estimator is highly efficient in the presence of many weak genetic instruments and is robust to balanced and/or sparse pleiotropy. We apply our method to estimate the causal effect of blood lipids on coronary artery disease. In our primary analysis, the estimated odds ratios (95% confidence intervals) for LDL cholesterol, HDL cholesterol, and triglycerides are 1.61 (1.45 to 1.80), 0.82 (0.73 to 0.91), and 1.00 (0.84 to 1.21), respectively. Compared to previous MR studies, these numbers are closer to observational epidemiology estimates and much more precise. We also discuss diagnostics of the modeling assumptions and caveats of the results. By employing a genome-wide design and robust statistical methods, the statistical power of MR studies can be greatly improved. Unlike previous MR studies, which all reported null findings for HDL cholesterol, our results support the much-debated HDL hypothesis.
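To make the two-sample summary-data setting concrete, the following is a minimal sketch of how a causal effect can be estimated from per-variant GWAS summary statistics. It uses the standard inverse-variance weighted (IVW) estimator as a simple stand-in; it is not the empirical partially Bayes estimator proposed in the paper. The function names, the simulated effect sizes, and the standard errors are all assumptions chosen for illustration, not values from the study.

```python
import numpy as np


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted (IVW) estimate of the causal effect.

    Inputs are per-variant summary statistics: the variant-exposure
    associations, the variant-outcome associations, and the standard
    errors of the variant-outcome associations.
    NOTE: IVW is a standard baseline, not the paper's estimator.
    """
    beta_exposure = np.asarray(beta_exposure, dtype=float)
    beta_outcome = np.asarray(beta_outcome, dtype=float)
    se_outcome = np.asarray(se_outcome, dtype=float)

    weights = 1.0 / se_outcome**2
    denom = np.sum(weights * beta_exposure**2)
    estimate = np.sum(weights * beta_exposure * beta_outcome) / denom
    std_error = np.sqrt(1.0 / denom)
    return estimate, std_error


def odds_ratio_with_ci(log_odds_ratio, std_error, z=1.96):
    """Convert a log odds ratio and its standard error to an odds ratio
    with an approximate 95% confidence interval (normal approximation)."""
    return (
        np.exp(log_odds_ratio),
        np.exp(log_odds_ratio - z * std_error),
        np.exp(log_odds_ratio + z * std_error),
    )


if __name__ == "__main__":
    # Hypothetical simulation: 1000 weak instruments, no pleiotropy.
    rng = np.random.default_rng(0)
    n_variants = 1000
    true_effect = 0.5          # assumed causal effect (log odds scale)
    se_x = se_y = 0.005        # assumed GWAS standard errors

    gamma = rng.normal(0.0, 0.01, n_variants)   # true variant-exposure effects
    beta_x = gamma + rng.normal(0.0, se_x, n_variants)
    beta_y = true_effect * gamma + rng.normal(0.0, se_y, n_variants)

    est, se = ivw_estimate(beta_x, beta_y, np.full(n_variants, se_y))
    print("IVW estimate and SE:", est, se)
    print("Odds ratio (95% CI):", odds_ratio_with_ci(est, se))
```

Because IVW ignores the measurement error in the variant-exposure associations, its estimate tends to be pulled toward zero when many instruments are weak, which is the kind of problem that weighting instruments by their strength is meant to address.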
