Inferring the Direction of a Causal Link and Estimating Its Effect via a Bayesian Mendelian Randomization Approach

12/18/2020 ∙ by Ioan Gabriel Bucur, et al. ∙ Radboud Universiteit 0

The use of genetic variants as instrumental variables - an approach known as Mendelian randomization - is a popular epidemiological method for estimating the causal effect of an exposure (phenotype, biomarker, risk factor) on a disease or health-related outcome from observational data. Instrumental variables must satisfy strong, often untestable assumptions, which means that finding good genetic instruments among a large list of potential candidates is challenging. This difficulty is compounded by the fact that many genetic variants influence more than one phenotype through different causal pathways, a phenomenon called horizontal pleiotropy. This leads to errors not only in estimating the magnitude of the causal effect but also in inferring the direction of the putative causal link. In this paper, we propose a Bayesian approach called BayesMR that is a generalization of the Mendelian randomization technique in which we allow for pleiotropic effects and, crucially, for the possibility of reverse causation. The output of the method is a posterior distribution over the target causal effect, which provides an immediate and easily interpretable measure of the uncertainty in the estimation. More importantly, we use Bayesian model averaging to determine how much more likely the inferred direction is relative to the reverse direction.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Finding and estimating causal effects from various measured exposures (phenotypes, biomarkers, risk factors) to disease and health-related outcomes is a crucial task in epidemiology. If we knew there was a causal link from a modifiable exposure (e.g., the level of LDL-cholesterol) to an outcome of interest (e.g., the incidence of cardiovascular disease), we could intervene on the former to produce a desired effect on the latter. Moreover, if we knew the magnitude of the causal effect, we could then predict the efficacy of treating the disease by altering the level of the exposure.

Randomized controlled trials (RCTs) are the best means of establishing the causal relation between exposure and outcome. Fletcher et al. (2012) In epidemiological research, RCTs have been used, for example, to identify causal risk factors or to determine the efficacy of a new drug. However, it is often not practical or ethical to perform an RCT, in which case we want to be able to extract causal information from observational epidemiological studies. Directly inferring causation from observational data is problematic since observed correlations (associations) do not imply any particular causal relationship on their own. The correlation between two variables and can be explained by the existence of a causal link from and (direct causation) or from to (reverse causation), but it can also be explained by an unobserved common cause (confounder) influencing the two variables simultaneously. It can even be the result of a combination of causal and confounding effects.

One method for finding and estimating the magnitude of causal effects from observational data that has recently gained traction in the field of epidemiology is Mendelian randomization (MR). In MR, germline genetic variants are used as proxy variables for environmentally modifiable exposures with the goal of making causal inferences about the outcomes of the modifiable exposure. Lawlor et al. (2008) The basic principle behind Mendelian randomization is that the differences between individuals resulting from genetic variation are not subject to the confounding or reverse causation biases that distort observational findings. Davey Smith et al. (2017) The natural “randomization” of alleles (variant forms) can thus be likened to the random allocation of treatment performed in RCTs in the sense that groups defined by the value of a relevant genetic variant will experience an on-average difference in exposure, while not differing with respect to confounding factors. Davey Smith and Hemani (2014)

Mendelian randomization has already proved very useful in extracting causal information from observational studies. MR has been used successfully for drug target validation, for predicting side effects of drug targets, for discovering causal risk factors in clinical practice, or for a better understanding of complex molecular traits. Zheng et al. (2017) In a classic Mendelian randomization study, Chen et al. Chen et al. (2008) examined the nature of the association between alcohol intake and blood pressure. In this example, the exposure variable (alcohol intake) is a modifiable behavior, but conducting a randomized controlled trial would be problematic for both ethical and practical reasons: alcohol consumption is potentially damaging to the subject’s health and measuring the intake is prone to error. Chen et al. Chen et al. (2008) employed the genetic variant ALDH2 as a surrogate for measuring alcohol intake (see Figure 2). ALDH2 encodes aldehyde dehydrogenase, a major enzyme involved in alcohol metabolism. The (*2*2) variant of the ALDH2 gene is associated with an accumulation of acetaldehyde in the body after drinking alcohol, and therefore unpleasant symptoms. Because of this, carriers of this particular variant tend to limit their alcohol consumption. Chen et al. Chen et al. (2008) found that the individuals possessing the (*2*2) variant also have lower blood pressure on average. The authors considered the possibility that the (*2*2) genotype influences blood pressure via a different causal pathway not mediated by alcohol intake (horizontal pleiotropy). However, they found no association between ALDH2 and hypertension in females, for whom drinking levels in any genotype group were similar (very low). If there were pleiotropic effects, an effect on blood pressure would be seen in both sexes. The findings of Chen et al. Chen et al. (2008) support the idea that alcohol intake increases blood pressure and the risk of hypertension, to a much greater extent than previously thought. Their results indicate that previous observational epidemiological studies suggesting the cardioprotective effects of moderate alcohol consumption might have been misleading.

Figure 1: Chen et al. Chen et al. (2008) used the ALDH2 genetic variant as a proxy for alcohol intake to determine if there is a causal link between the latter and blood pressure. They found that alcohol intake has a marked positive causal effect on blood pressure.
Figure 2: Ference et al. Ference et al. (2012) used Mendelian randomization for estimating the (positive) causal effect of low-density lipoprotein cholesterol (LDL-C) on the risk of coronary heart disease (CHD). They estimated that for each lower LDL-C, the risk of CHD is reduced by 54.5% (95% CI [48.8%, 59.5%]).

In a more recent study, Ference et al. Ference et al. (2012) performed an MR meta-analysis for the purpose of estimating the effect of long-term exposure to lower-plasma low-density lipoprotein cholesterol (LDL-C) on the risk of coronary heart disease (CHD). Unlike in the previous example, the causal link from LDL-C to CHD was already well established at the time of the study, but the magnitude of the (long-term) causal effect had not been reliably quantified (see Figure 2). To estimate the causal effect of LDL-C exposure on the risk of CHD, Ference et al. Ference et al. (2012) considered nine polymorphisms (genetic variations resulting in the occurrence of several different alleles at a locus) in six different genes. For each polymorphism, they computed a causal effect estimate using Mendelian randomization and then they combined the results to obtain a more precise estimate. Finally, they compared the estimated causal effect of long-term exposure to lower LDL-C with the clinical benefit resulting from the same magnitude of LDL-C reduction during treatment with a statin. Ference et al. Ference et al. (2012) found that Mendelian random allocation to lower LDL-C levels was responsible for a 54.5% reduction in CHD risk for each lower LDL-C. They concluded that prolonged (lifetime) exposure to lower LDL-C results in a threefold greater reduction in the risk of CHD compared to the clinical benefit of a statin-based treatment producing the same magnitude of LDL-C reduction.

Figure 3: Vimaleswaran et al. Vimaleswaran et al. (2013) explored the causal direction of the relationship between body mass index (BMI) and 25-hydroxivitamin D [25(OH)D] via bidirectional Mendelian randomization. They concluded that higher BMI leads to lower vitamin D levels and not the other way around.

Finally, Vimaleswaran et al. Vimaleswaran et al. (2013) investigated the causality and direction of the association between body mass index (BMI) and 25-hydroxivitamin D [25(OH)D]. In this example, it is not clear whether the larger vitamin D storage capacity of obese individuals (vitamin D is stored in fatty tissues) lowers their 25(OH)D levels or whether 25(OH)D levels influence fat accumulation. The latter is an example of reverse causation, which occurs when the outcome of interest precedes and leads to changes in the exposure instead of the other way around. Rothman et al. (2008) Reverse causation is an important source of bias in observational epidemiological studies, Davey Smith and Ebrahim (2003); Lawlor et al. (2008) which can lead to misinterpretation in the observed association with respect to the potential impact of interventions. Wade et al. (2018) In their work, Vimaleswaran et al. Vimaleswaran et al. (2013) used a bidirectional Mendelian randomization approach (see Figure 3), which implies performing an MR analysis in both directions, to distinguish between the two causal models. For this approach to work, it must be known on which phenotypic trait each genetic variant has a primary influence. Davey Smith and Hemani (2014) If vitamin D deficiency influences obesity, then the genetic variants primarily associated with lower 25(OH)D levels should also be associated with higher BMI. Conversely, if obesity leads to lower vitamin D levels, then the genetic variants primarily associated with higher BMI should also be related to lower 25(OH)D concentrations. Vimaleswaran et al. Vimaleswaran et al. (2013) concluded that higher BMI (obesity) leads to lower vitamin D levels and not the other way around. Their findings provided evidence for the role of obesity as a causal risk factor for vitamin D deficiency and suggested that interventions meant to reduce BMI are expected to decrease the prevalence of vitamin D deficiency, as later shown in intervention studies. Ibero-Baraibar et al. (2015); Gangloff et al. (2015)

In this paper, we introduce a Bayesian framework (BayesMR) that extends the Mendelian randomization approach to situations where the direction of the causal effect between the two phenotypes of interest is unknown or uncertain. In our approach, we do not need to know in advance which of the two phenotypes is primarily influenced by the genetic variants. In the experimental section, we look at both directions of the association between smoking and coffee consumption using the same genetic variants in an attempt to determine which direction is more likely. This flexibility enables the researcher to apply the instrumental variable approach to a much broader set of genetic variants. The rest of this paper is organized as follows. In Section 2, we discuss the assumptions behind Mendelian randomization and recent extensions to the approach. Then, we introduce our proposed Bayesian model in Sections 3 and 4. We perform a series of simulations in Section 5 to illustrate the strengths of our approach. In Section 6, we apply our BayesMR method to a couple of real-world problems where the standard Mendelian randomization approaches might produce biased results. We conclude by discussing the advantages and disadvantages of our proposed method as well as potential applications in Section 7.

2 Instrumental variables and Mendelian randomization

2.1 Instrumental variable assumptions

Figure 4: Directed acyclic graph (DAG) showing the instrumental variable assumptions. Causal effects are unidirectional and denoted by arrows. The association entailed by IV1 is highlighted in red, while the crossed-out dashed lines signify the absence of an association (causal or non-causal). The dotted border of means that the variable is unobserved.

Mendelian randomization is an approach in which genetic variants are used as instrumental variables for estimating the efffect of a modifiable phenotype or exposure on an outcome variable of clinical interest. A (valid) instrumental variable (IV) or instrument Bowden and Turkington (1990) is a factor which:

  • is associated with the exposure;

  • is independent of all confounders of the exposure-outcome association;

  • influences the outcome only through its effect on the exposure.

The three core IV assumptions are depicted by the directed acyclic graph (DAG) in Figure 4, where is the genetic variant used as an instrumental variable, is the variable measuring the exposure, is the variable measuring the outcome of interest, and is an unmeasured variable that encapsulates the potential confounding effects affecting the exposure-outcome association. Note that for the graphical model to be acyclic, there must be no feedback loops in the system. Acyclicity is a standard assumption in Mendelian randomization and implies that the causal link of interest is unidirectional. Didelez and Sheehan (2007) Assumption IV1 is represented in Figure 4 by the edge between and . Note that the association between the genetic variant and the exposure need not be causal, although this is often assumed in the context of MR. Assumption IV2 is denoted by a lack of paths from to . Finally, assumption IV3 is denoted by an absence of paths from to , except for the one that goes through . Genetic variants are particularly suitable as candidate instrumental variables because they are fixed at conception and “randomized” in a population according to Mendel’s laws of inheritance, meaning that their associations with non-genetic measures are generally not susceptible to confounding or reverse causation. Davey Smith and Ebrahim (2003); Zheng et al. (2017) In graphical terms, this is equivalent to removing all arrows from non-genetic variables ( in Figure 4) to genetic variables ( in Figure 4).

2.2 The Mendelian randomization pipeline

The suitability of using genetic variants as instrumental variables combined with the wide availability of data on genetic associations extracted by means of high-throughput genomic technologies have contributed to the increasing popularity of Mendelian randomization applications. Davey Smith and Hemani (2014); Visscher et al. (2017) The first step in Mendelian randomization is to find genetic variants that are robustly associated with the exposure under study. Bowden et al. (2015)

Ideally, MR is performed using a single variant whose biological effect on the exposure is well understood. In practice, however, such a single variant is not always available or known. Moreover, genetic variants typically only explain a small proportion of the variance in the exposure. 

Brion et al. (2013) When a single reliable genetic instrument is not available, it has become common to use multiple genetic variants in MR studies. Genome-wide association studies (GWAS) provide a rich source of potential instruments for MR analysis. Swerdlow et al. (2016) Increasing the number of exposure-related genetic variants will increase the proportion of variance explained by the instruments, which will lead to improved precision in the causal estimate and to increased power in testing causal hypotheses. Bowden et al. (2015) What’s more, the use of multiple genetic instruments provides opportunities to test for violations of the IV assumptions. Palmer et al. (2012)

Genetic variants are typically selected based on the size of the genetic effect on the exposure of interest and the specificity of the association. Swerdlow et al. (2016) After deciding on a set of relevant exposure-related genetic instruments, the next step is to examine if they are also associated with the disease outcome. If no genotype-outcome association is found, then it becomes likely that there is no causal link from exposure to outcome. Otherwise, by making additional assumptions, it is possible to compute a consistent estimate of the causal relationship’s strength. Didelez and Sheehan (2007) A commonly made assumption is that the causal relation between the exposure and the outcome is linear:

(1)

The strength or magnitude of this relationship is then given by the parameter , which is the causal effect of the exposure on the outcome. In an MR analysis, the estimate of is derived from the associations with the genetic instruments. For example, when using a single genetic variant as an instrumental variable, a consistent estimate of the causal effect under the linear model assumption (1) is Wald’s ratio Lawlor et al. (2008) (the Wald method Bowden et al. (2015)):

(2)

where is the coefficient of the regression of the outcome on the genetic variant and is the coefficient of the regression of the exposure on the genetic variant. When using multiple genetic instruments, the individual IV estimates can be combined similarly to performing a meta-analysis:

(3)

where

is the standard error of

and can be approximated using the delta method. Burgess et al. (2013) is known as the inverse-variance weighted

(IVW) estimator and is equivalent to a zero-intercept weighted linear regression of the genetic associations with the outcome on the genetic associations with the exposure. 

Bowden et al. (2015)

Unfortunately, when selecting multiple genetic variants for Mendelian randomization based on their association with the exposure, we cannot be certain that all of them are valid instrumental variables. In the worst possible case, none of the selected variants are valid instruments for the exposure-outcome associations. As a result, there is great interest in developing methods robust to the validity of the IV assumptions. Davey Smith and Hemani (2014); Zheng et al. (2017)

2.3 Challenges of Mendelian randomization

The application of MR methods requires genetic variants that are valid instruments for the exposure-outcome association, i.e., genetic variants satisfying the core IV assumptions. If a causal link from exposure to outcome exists, then a genetic variant associated with the exposure will also be associated with the outcome. Under the core IV assumptions, the converse also holds: an association between the genetic variant and the outcome implies the existence of a causal link from the exposure to the outcome. When these assumptions are violated, however, we can no longer be sure that a causal link from exposure the outcome exists. A discovered genetic association with the disease outcome could be produced via a different causal pathway which is not mediated by the exposure (see Figure 7). In this case, the bias in the causal effect estimate obtained through the MR approach is potentially unbounded. Cornia and Mooij (2014) Even worse, if we are dealing with reverse causation for the exposure-outcome association, then the causal link whose effect we are trying to estimate does not exist (see Figure 7).

Figure 5: The Local Causal Discovery Cooper (1997) (LCD) pattern can be discovered from observational data by testing if the genetic variant is independent from the outcome when conditioning on the exposure .

Unfortunately, only assumption IV1 is directly testable from data. It is typically fulfilled by selecting only those genetic variants that are associated with an exposure of interest. In general, the validity of assumptions IV2 and IV3 cannot be established from the data, since they involve the unobserved confounder Didelez and Sheehan (2007) However, if the exposure-outcome association is unconfounded (see Figure 5), then we can check the validity of IV2 and IV3 by performing a conditional independence test. In this particular scenario, the genetic variant becomes independent of the disease outcome after conditioning on the exposure variable. This principle of testing the IV assumptions by looking for a conditional independence is used to orient the causal relationship between the two traits in mediation-based approaches Hemani et al. (2017) such as the likelihood-based causality model selection Schadt et al. (2005) (LCMS), the regression-based causal inference test Millstein et al. (2009) (CIT) or the Findr Wang and Michoel (2017)

algorithm. In the artificial intelligence literature, this particular model is called the

Local Causal Discovery Cooper (1997) (LCD) pattern. The testability of the LCD pattern has been exploited, for example, in the Trigger Chen et al. (2007) algorithm for the purpose of learning the structure of gene regulatory networks.

Assumption IV2 is not completely testable since there can always be unmeasured confounding variables for which we cannot account, but we can at least filter out the genetic variants associated with any potential confounding traits that have been measured. Yin and Voight (2015) Fortunately, it is not very likely that genetic variants are associated with confounders of the exposure-outcome association. Davey Smith et al. (2007) The assumption that is most likely to be violated is IV3, which states that the instrumental variable is only related to the outcome via its effect on the risk factor. Lawlor (2016) Assumption IV3 does not hold if there are alternative causal pathways between the genetic variant and the disease outcome that are not mediated by the exposure (see Figure 7). This often happens when a genetic variant influences more than one phenotypic traits, a property called pleiotropyvan Kippersluis and Rietveld (2017)

As the GWAS sample sizes continue to grow, the number of candidate genetic variants to be used as potential instruments is also increasing. Unfortunately, more and more of the supplied candidates are likely to be invalid instruments. Hemani et al. Hemani et al. (2017) show that horizontal pleiotropy and reverse causation can easily lead to invalid instrument selection in MR and suggest that the presence of horizontal pleiotropy should be considered the rule rather than the exception. This severely limits the application of Mendelian randomization to cases where previous background knowledge can be used to validate the core assumptions. By relaxing (some of) these assumptions, however, the MR approach can be applied to a much broader range of genetic variants.

2.3.1 Pleiotropy

In the genetics of complex diseases it is common for genetic variants to affect multiple disease-related phenotypes. Sivakumaran et al. (2011); Solovieff et al. (2013) If such a pleiotropic genetic variant affects the disease outcome through different causal pathways, then IV2 or IV3 or both are potentially violated. This makes pleiotropy problematic for MR studies. Didelez and Sheehan (2007); Davey Smith and Hemani (2014) As a result, many recent extensions of the MR approach have been aimed at producing better causal estimates in the presence of pleiotropic effects. Bowden et al. (2015, 2016); Hartwig et al. (2017); Li (2017); van Kippersluis and Rietveld (2017); Burgess et al. (2017); Berzuini et al. (2018) Pleiotropy can be divided into horizontal (biological) pleiotropy and vertical (mediated) pleiotropy. Horizontal pleiotropy occurs when the same genetic variant affects multiple trait via different biological pathways, whereas vertical pleiotropy occurs when the same genetic variant is associated with multiple traits found on the same biological pathway. We will only focus on the former, since only horizontal pleiotropy can lead to violations of IV3, and refer to it hereafter as ‘pleiotropy’, for brevity. van Kippersluis and Rietveld (2017)

Two main approaches have been used for robustifying Mendelian randomization against potentially pleiotropic genetic variants. The first approach is to replace some of the (untestable) IV assumptions, e.g., IV2 or IV3 or both, with a weaker set of alternative assumptions. The second approach is to assume that the IV assumptions hold for some, but not necessarily for all the genetic variants. In this case, the invalid genetic instruments are allowed to violate the IV assumptions in an arbitrary way. Burgess and Thompson (2017)

MR-Egger regression Bowden et al. (2015) is a pleiotropy-robust method that falls into the first category. MR-Egger is an extension to the IVW method introduced in Subsection 2.2, in which an intercept term is added to the weighted linear regression of the genetic associations with the outcome on the genetic associations with the exposure. While IVW is consistent if the pleiotropic effects average to zero across the genetic variants, this extension allows for consistency even when the pleiotropic effects do not average to zero. Both the IVW and MR-Egger methods work under the InSIDE assumption, which stipulates that the association with the exposure is independent from the direct (pleiotropic) effect on the outcome across different genetic variants. Burgess and Thompson (2017) Another robust alternative to the IVW estimator is the weighted median estimator Bowden et al. (2016) (WME), for which we compute a weighted median of the IV estimates instead of a weighted average. The WME is a consistent estimate of the true causal effect as long as at least 50% of the genetic instruments are valid. In a similar vein, Kang et al. Kang et al. (2016) have introduced a LASSO-type procedure to identify the valid instruments from within a set of candidate genetic variants. Hartwig et al. Hartwig et al. (2017) have proposed instead to compute the weighted mode of the individual IV estimates to obtain the mode-based estimate (MBE). The MBE can be a consistent estimate of the true causal effect even when fewer than 50% of the genetic instruments are valid. However, the application of the method relies on the assumption that the largest number of similar individual IV estimates comes from valid instruments, an assumption termed the Zero Modal Pleiotropy Assumption (ZEMPA). Finally, Hemani et al. Hemani et al. (2017)

suggest combining these methods (along with others) in a “mixture of experts” machine learning approach. 

Bishop (2006)

A number of Bayesian modeling approaches for dealing with pleiotropic genetic variants have been proposed, in which all variants used as instruments are allowed to exhibit pleiotropic effects on the outcome. Li Li (2017) introduced empirical Bayes hierarchical models for estimating the causal effect in MR studies where many instruments are invalid, while Berzuini et al. Berzuini et al. (2018) performed a fully Bayesian treatment of the Mendelian randomization problem. In both cases, the authors handled pleiotropy by putting shrinkage priors on the pleiotropic effects, thereby forcing them to take values close to zero unless there is strong posterior evidence for pleiotropy. Burgess et al. (2018) Thompson et al. Thompson et al. (2017), on the other hand, modeled pleiotropy explicitly and used Bayesian model averaging to provide estimates that allow for uncertainty regarding the nature of pleiotropy.

2.3.2 Reverse causation

A crucial aspect that is often overlooked in Mendelian randomization studies is the implicit assumption made regarding the direction of the causal link between the exposure and the outcome. While it is sometimes possible to rule out reverse causation, in most situations there is not enough background knowledge available about the underlying causal mechanism. The observed relationship between low levels of LDL cholesterol and risk of cancer is an example where we cannot a-priori assume a causal direction. It may be that low levels of LDL cholesterol are causal for cancer, but it is also possible that the presence of the disease has a negative effect on LDL cholesterol. Zheng et al. (2017) In the case of complex systems such as gene regulatory networks it is also unclear in which direction the gene regulation occurs. Aten et al. (2008)

Figure 6: The genetic variant is pleiotropic since it affects both phenotypic measures ( and ). In this illustration, exhibits both vertical pleiotropy ( and are influenced by via the same causal pathway ) and horizontal pleiotropy ( and are influenced by via different causal pathways). The former is crucial to the application of MR, while the latter results here in a violation of the IV3 assumption.
Figure 7: Reverse causation refers to the situation when the outcome precedes and causes the exposure instead of the other way around. Flegal et al. (2011) The term ‘reverse’ refers to the fact that the direction of the causal effect is opposite to what was expected based on the study design. We emphasize that although we denote the exposure by and the outcome by , these terms do not have any causal meaning, i.e., they do not imply a particular causal ordering

.

The existence of alternative causal paths from a genetic variant to the outcome of interest affects not only the estimation of causal effects, but also model identification. If there exists a causal pathway from the genetic variant to the outcome that is not mediated by the exposure, then an exposure-outcome causal link is no longer required to explain their association (see Figure 7). In fact, the causal link between the two traits may have the opposite direction (see Figure 7). Ignoring the possibility of reverse causation for the exposure-outcome association is potentially dangerous. If the true causal link is in the expected direction (from to , Figure 7), we will be underestimating the uncertainty of our estimate by not considering the alternative model. If the true causal link is in the opposite direction (from to , Figure 7), then the MR approach will produce a misleading result, as we are trying to estimate the wrong causal effect. Bidirectional Mendelian randomization Davey Smith and Hemani (2014) is one approach used to distinguish between an “exposure” having a causal effect on an “outcome” and the reverse. The idea is to perform MR in both directions to tease apart the causal relationships and Zheng et al. (2017) To make use of this approach, it is essential to know if a genetic variant primarily influences the “exposure” or the “outcome”. However, this can be difficult to assess when utilizing variants with little or no understanding of their biological effects. Davey Smith and Hemani (2014)

Agakov et al. Agakov et al. (2010) proposed a Bayesian framework built on sparse linear methods developed for regression, which is similar in spirit to BayesMR. In their approach, they first produce a sparse representation of the data by putting a Laplace prior on the linear coefficients and performing MAP inference. They then compute the evidences (marginal likelihoods) using the Laplace approximation for the direct, reverse, and pleiotropic models, with and without latent confounders. This sparse instrumental variables (SPIV) approach has been used to infer the direction of causality between vitamin D levels (plasma 25-hydroxyvitamin D) and colorectal cancer. Zgaga et al. (2013) More recently, Hemani et al. Hemani et al. (2017) have suggested using the Steiger Z-test Steiger (1980) of correlated correlations for elucidating the direction of the causal link. Their MR Steiger Hemani et al. (2017) approach is based on the simple principle that in most circumstances the genetic variant will exhibit stronger correlation with the trait located upstream in the causal chain.

2.4 Summary

Mendelian randomization is a powerful principle that can help strengthen causal inference by combining observational data with genetic knowledge. However, MR methods require good candidate genetic variants to be used as instruments, which often limits their application to variants with well understood biology. The wide spread of pleiotropic genetic variants, which violate key IV assumptions (IV2 and especially IV3), is a particularly troublesome issue. Davey Smith and Hemani (2014); Hemani et al. (2018) Many methods have been designed to improve causal effect estimation in the presence of pleiotropic effects, Li (2017); Thompson et al. (2017); Berzuini et al. (2018) but they do not incorporate the possibility of reverse causation. At the same time, a number of methods have been developed for inferring the correct direction of the causal link between two phenotypes, Hemani et al. (2017, 2017) but these do not provide a measure of uncertainty in the model selection. In the following section, we introduce a general method with which we aim to handle all of these issues simultaneously. We first perform Bayesian model averaging to account for the uncertainty in the direction of the causal link and then, for each model, we obtain an informative posterior distribution for the causal effect of interest.

3 Bayesian model specification

3.1 Setup

Figure 8: Graphical description of our assumed generative model. We denote the exposure variable by and the outcome variable by . We are interested in the causal effect from to , which is denoted by . The association between and is obfuscated by the unobserved variable , which we use to model unmeasured confounding explicitly. The shaded plate indicates replication across the independent genetic variants . Note that the replication also applies to the parameters and and their corresponding edges.

We assume that the data is generated from the model illustrated in the causal diagram from Figure 8, where all relationships between variables are linear with no effect modification. We consider independent genetic variables, which we denote by , . Their genetic variation is given by the allele (variant form) found at a specific locus. The two allele copies inherited at a specific locus form the genotype of an individual. Here we only consider binary polymorphisms, i.e., each allele has two possible forms. Lawlor et al. (2008) For convenience will refer to these forms as reference alleles and effect alleles, without assigning any biological meaning to the terms. Each variable can take the values 0, 1 or 2 corresponding to the number of effect alleles, as opposed to reference alleles, at the locus. Their corresponding effect allele frequencies (EAF) are denoted by , .

The generative model from Figure 8 is described by the following linear structural equation system:

(4)

where

is the random vector containing the

independent variables , is the corresponding vector of instrument strengths, while the vector collects the potential pleiotropic effects of the genetic variants. represents the putative causal effect of the exposure on the outcome , which is what we wish to estimate. and represent the effect of the confounder on and , respectively. Finally, and are the average values of the exposure and outcome variables, while and represent independent measurement error.

In this work, we restrict our attention to the case where the exposure and outcome variables are continuous. We also assume without loss of generality that . We denote the variance of the intrinsic error terms by and . Without loss of generality, we can fix the mean of the confounding variable and of the error terms to zero, i.e., . Also w.l.o.g., we can fix the variance of the confounding variable to one () by appropriately rescaling the confounding coefficients and .

3.2 Model assumptions

The first two core IV assumptions are implied by our proposed model. The existence of direct edges from each to the exposure implies that IV1 is satisfied for each genetic variant. At the same time, we assume that for any we have , where we use the notation Dawid (1979) to denote that is conditionally independent of given . These independence statements signify that the genetic variants are not correlated with any confounding variables (IV2). On the other hand, we allow for unrestricted violations of assumption IV3, as long as they do not lead to violation of IV2. We summarize the contribution of the pleiotropic effects in our system with a single directed edge between each genetic variant and the outcome variable (See Figure 8). The strength of the pleiotropic effect is denoted by . A genetic variant is a valid instrumental variable only if its direct effect on the outcome is exactly zero, i.e., .

In our model, we assume that the instrument strengths and the pleiotropic effects are independent a-priori. Berzuini et al. have termed this Instrument Effects Orthogonality (IEO), Berzuini et al. (2018) which is the Bayesian interpretation of the Instrument Strength Independent of Direct Effect (InSIDE) assumption. Bowden et al. (2015) This relaxation of the stronger IV3 assumption is frequently used in pleiotropy-robust MR methods Zheng et al. (2017); Verweij et al. (2018) and is in line with the principle of independent (causal) mechanisms. Peters et al. (2017) Here, we distinguish between the causal mechanism describing the direct association between the genetic variants and , which is parametrized by , and the causal mechanism describing the direct (not through ) association between the genetic variants and , which is parametrized by .

We also assume that the genetic variants are independent, i.e., , although our model could be easily extended to handle correlated genetic variants by introducing additional parameters. This means that we do not allow for the possibility of linkage disequilibrium, where one genetic variant can potentially have a causal effect on another related genetic variant. We do not assume, however, that the genetic variants primarily influence one phenotype or the other, as required for instance in the application of standard bidirectional Mendelian randomization. This enables us to use the same set of instruments and the same prior distributions for both the forward and reverse analyses, as the variables and become essentially interchangeable.

3.3 Likelihood

We assume that the data generating process is described by equation system (4) and that each data sample is a complete set of observed values for the exposure , the outcome and the genetic variants . The input data set consists of i.i.d. samples of . Our model parameters consist of:

  • – the structural coefficients denoting the direct causal effects from to (instrument strengths), from to (the target causal effect), and from to (pleiotropic effects), respectively;

  • – the scale parameters of the intrinsic error terms and ;

  • – the coefficients corresponding to the confounding effect of .

We collect all parameters of the model in the vector .

We model each genetic variable

as binomially distributed, where we count the number of effect alleles at the genetic locus, either 0, 1 or 2. If

is the EAF of , then the binomial distribution has parameters and , so . In our model, the intrinsic error terms of and

are independent and follow a Gaussian distribution:

. The confounding variable

follows a standard normal distribution:

. Note that we can set the variance of to one without any loss of generality by appropriately rescaling the confounding coefficients and . We obtain a linear-Gaussian model conditional on the value of the genetic variants, in the same vein as Jones et al., Jones et al. (2012) Li, Li (2017) or Berzuini et al.: Berzuini et al. (2018)

(5)

Given the conditional linear-Gaussian model, the log-likelihood is equal, up to constant terms, to:

(6)

where . We use to denote the -th observed value of the genetic vector, exposure variable, and outcome variable, respectively.

Both Berzuini et al. Berzuini et al. (2018) and Li Li (2017) consider the same linear-Gaussian model, but use a slightly different parametrization: . The difference between our approach and theirs is that we model the confounding explicitly by defining the linear confounding effects and , while they model the confounding implicitly by defining a nonzero covariance parameter for . In other words, using the terminology from Jones et al. Jones et al. (2012), we employ a “full model” parametrization, while Li Li (2017) and Berzuini et al. Berzuini et al. (2018) consider a parametrization equivalent to the “correlated errors model”. In both the full and correlated errors model, the likelihood of the observables given is bivariate normal. Jones et al. (2012) Thompson et al. Thompson et al. (2017) also assume a linear-Gaussian model, but approximate the discrete distribution over the set of genetic variants with a continuous normal distribution.

3.4 Priors

According to Jones et al., Jones et al. (2012) informative priors are crucial to the success of Bayesian MR methods, regardless of whether or not they are identifiable, and so-called ‘vague’ (uninformative) priors should be avoided whenever possible. Thompson et al. Thompson et al. (2017) mention two arguments against the use of an uninformative prior in the Mendelian randomization context: 1. that the analysis is already subjective due to the selection of the genetic variants and; 2. the data alone is sufficient to produce useful estimates of the causal effect. However, care must be taken when specifying informative priors, as substantial variation in parameter estimates can occur as the prescribed priors are modified.

For most genetic variants, it is unrealistic to expect exactly zero pleiotropy, i.e., that the effect on the outcome variable is completely mediated by the exposure. Verbanck et al. (2018) However, we can differentiate between pleiotropic effects that are weak and strong relative to the other interactions. A weak pleiotropic effect will not introduce significant bias, and therefore still enable us to produce a useful estimate of the causal effect, in spite of the violation of IV3Zhu et al. (2018) For this reason, in choosing the prior for our structural parameters, we start from the assumption that the interactions between variables are either ‘strong’ (relevant) or ‘weak’ (irrelevant), in a similar vein to the considerations made by Berzuini et al. Berzuini et al. (2018) This is different from the standard assumption that the pleiotropic effects are either zero or nonzero, as used for example in the LASSO-based method of Kang et al. Kang et al. (2016) A reasonable choice for the prior (of ) is then a scale mixture of two Gaussian distributions centered at zero:

(7)

where and . In the limit of , we obtain the spike-and-slab prior. Ishwaran and Rao (2005) This Gaussian mixture prior has a straightforward interpretation, in which the component with higher variance is meant to capture the ‘strong’ interactions, while the component with lower variance is meant to capture the ‘weak’ interactions. It is informative in the sense that we expect some of the interactions in our data to be negligible, though not exactly zero, and therefore encourage the exclusion of irrelevant nonzero effects. George and McCulloch (1993) For example, we might expect that some (most) of the genetic variants do not have significant pleiotropic effects, but exhibit some weak unmediated residual interaction.

Our Gaussian mixture prior induces model parsimony by giving a preference for the model with the smallest number of strong parameters, which will then allow us to choose the most parsimonious causal explanation (direct causation, reverse causation, no causation). Bucur et al. (2017) Moreover, since every interaction is shrunk selectively, in the sense that smaller coefficient estimates are shrunk towards zero more sharply than larger coefficients, a significant gain in efficiency can be achieved. Zhao et al. (2018) Other shrinkage priors, such as the horseshoe employed by Berzuini et al. Berzuini et al. (2018), have also been used successfully for Bayesian MR. However, we have chosen the spike-and-slab prior due to ease of implementation and interpretation. In our experiments, we found that nested sampling works better when we specify the structural parameter (marginal) prior density directly instead of using a hierarchical approach. Consequently, we found it helpful that the spike-and-slab prior density is simply a mixture of Gaussian densities, as opposed to the horseshoe prior density which has no closed-form expression. Carvalho et al. (2009)

3.4.1 Rescaling

To make our method independent of the measured variables’ scale, we rescale our model parameters as follows. We divide the first equation in (4) by , the scale of the first error term , and we divide the second equation by , the scale of the second error term :

(8)

The standardized variables

are dimensionless quantities. The parameters of the model are also appropriately rescaled and become dimensionless:

We then put priors on the set of rescaled parameters .

3.4.2 Choice of priors

  • A Gaussian mixture prior suitably reflects the uncertainty in the pleiotropy of a genetic variant. If the pleiotropic effect is weak, then the induced bias for computing the causal effect is very small, even though the exclusion restriction (IV3) assumption is violated. On the other hand, if the pleiotropic effect is strong, we would like to correct for this bias. Our model incorporates both situations naturally by assigning a Gaussian mixture (spike-and-slab) prior to the (rescaled) pleiotropic interaction. If the interaction is weak (irrelevant), then it will be captured by the ‘spike’. If the interaction is strong (relevant), then it will be captured by the ‘slab’.

  • The genetic variants are often preselected using the criterion of association with the exposure of interest, which suggests that the instrument strength is nonzero. A normal prior with large enough variance would be appropriate for this parameter. If we have prior knowledge that the instrument is weak, we can appropriately choose a smaller variance for the prior. It is also possible that variants selected for the association with the exposure are actually more strongly related (upstream) to the outcome. The strength of the genotype-exposure association could then be seen as ‘weak’ or ‘irrelevant’ compared to the genotype-outcome association. With genome-wide association studies growing ever larger, the statistical power to detect significant associations that may be influencing the trait downstream of many other pathways increases, Hemani et al. (2017) in which case a Gaussian mixture prior could also be appropriate.

  • The mixture of Gaussians prior is also suitable for the exposure-outcome causal effect, since it is possible that the dependence is explained by pleiotropy and not by a causal effect . Because we do not know a-priori if we should expect a significant causal effect, we fix the weights of the mixture to , which corresponds to an indifference prior. Ishwaran and Rao (2005)

  • We also put a Gaussian mixture prior on the confounding coefficients to express the uncertainty with regard to the presence of confounding.

  • For the scale parameters, we choose a weakly informative, but proper half-Gaussian prior with a large standard deviation, as suggested by Gelman. 

    Gelman (2006)

3.4.3 Prior hyperparameters

When we do not know if the (potentially pleiotropic) genetic variants we intend to use for Mendelian randomization are valid or not, it is unclear whether their pleiotropic effects are more likely to be ‘large’ or ‘small’. To express this uncertainty, we define the hyperparameter

, corresponding to the weight of the Gaussian mixture prior for the pleiotropic effects and assign to it an uninformative uniform hyperprior:

We can make a similar argument for the strength of the association between the genetic variants and the exposures, i.e., the ‘instrument strengths’. Analogously, we obtain the hierarchical prior:

4 Bayesian model averaging and inference

4.1 Computing the model evidence

The Bayesian view of model comparison involves the use of probabilities to represent uncertainty in the choice of model. 

Bishop (2006)

We define the prior probability of a model

to express our preference for different models. We then wish to compute the posterior distribution

which will give us the probability of the model given the data. Bayesian model comparison and selection is a challenging problem, as it involves computing the Bayesian model evidence, also called the marginal likelihoodBishop (2006)

(9)

The model evidence can be viewed as the probability of generating the data set from a model whose parameters are sampled at random from the prior, and therefore expresses the preference shown by the data for different models. Evaluating the model evidence involves marginalizing the likelihood-prior product over all the parameters of the model (see Equation (9)). With the exception of a few special cases, this marginalization is analytically intractable and can potentially become very difficult to compute as the number of parameters increases. Fortunately, there are sampling schemes such as nested sampling Skilling (2006) that are designed to accurately compute the model evidence.

When performing model comparison, it is convenient to work with probability ratios. The prior odds ratio expresses the preference we give to model over model before looking at any of the data. The ratio of evidences for the two competing models, , is called the Bayes factorBishop (2006)

Combining the prior odds with the Bayes factor, we get the

posterior odds ratio:

In our approach, we compare two competing models corresponding to the two possible causal orderings for the non-genetic variables: and . In the first model (Figure 8(a)), which we term , the causal relationship is in the expected direction, i.e., from exposure to outcome. In the alternate model (Figure 8(b)), which we term , the causal relationship is in the opposite direction, which means we are dealing with reverse causation for the exposure-outcome association. Mendelian randomization rules out any other orderings for and , as the genotype assignment must have temporally preceded the other two variables.

(a) Model where the causal relationship is in the expected direction. is the parameter measuring the (linear) causal effect of the exposure on the outcome . The shaded plate indicates replication across the variants .
(b) Alternative model, where the causal relationship is in the reverse direction, from outcome to exposure. Note that is a different parameter than . The latter is equal to zero for this model, as there is no causal link from to .
Figure 9: Competing models encompassing both possible directions for the causal link between exposure and outcome.

The marginal likelihood for these models is also difficult to compute, with the Gaussian mixture prior on the parameters leading to a multimodal posterior. To solve our problem, we have used the PolyChord Handley et al. (2015, 2015) algorithm, which employs a nested sampling scheme, for computing the (log-)evidences of the two models. We then estimate the causal effect in both directions and combine the results by weighting them appropriately with the model evidence. This combined result properly reflects the uncertainty of not knowing the direction of the causal relationship.

If we have background knowledge that can be used to express a prior preference for one of the two models, we can easily incorporate it into our framework. We simply have to change the prior probabilities for each model. For example, if we know that precedes temporally, we can eliminate the possibility of reverse causation. In that case, , and hence , so we are only considering the ‘expected direction’ model. This is equivalent to the implicit assumption that the causal link is from exposure to outcome, which is often made in MR studies.

4.2 Posterior inference

Figure 10:

Compact description of the Bayesian inference process using plate notation. The small rectangles indicate our model parameters, while the circles denote random variables (observed or unobserved). The gray superposed areas signify replication across the

genetic variants and the data points. This is also suggested by the subscripts used for parameters and variables.

We take as input the complete data set of i.i.d. observations or the sample covariance matrix , which is a sufficient statistic for our model (see Subsection 6.1). As discussed in the previous section, we fit the model described in Figure 8 twice by considering both directions for the causal link between and , namely (Figure 8(a)) and (Figure 8(b)). We first assume a causal link from exposure to outcome and perform Bayesian inference to derive the posterior distribution of . We then consider the reverse direction of causation and perform Bayesian inference to obtain the posterior distribution of .

The Bayesian inference process is described in Figure 10 for the expected direction of the causal link. For the reverse direction, we simply switch the positions of and . Given our choice of prior distributions for the rescaled parameters (see Subsection 3.4), we draw samples from the full posterior distribution:

where . After sampling, we revert our parameter of interest to the original scale:

Finally, we compute the evidence for both directions, i.e., and , and combine the posterior distributions over the parameters:

(10)

where is the Dirac delta function centered at zero.

4.3 Example: instrumental variable setting

We first consider a simple generating instrumental variable model with one genetic instrument (see Figure 12) to illustrate the model comparison and averaging procedure. In this example, the instrument strength , the causal effect and the confounding effect are of the same size (). We have purposefully chosen unrealistic structural parameter values for ease of illustration. Note, however, that the parameters can be scaled appropriately to reflect more realistic values. The genetic variant has effect allele frequency . Assuming follows a binomial distribution with two allele copies, then . We also set and . We generated 10000 samples from this model, resulting in the following sample covariance and correlation over the observed variables:

Figure 11: Generating model where the causal relationship is from to (the expected direction). This model satisfies all three instrumental variable assumptions.
Figure 12: Alternative generating model, statistically equivalent to the one in Figure 12, where the causal relationship is from to (the reverse direction).

We now fit models (Figure 8(a)) and (Figure 8(b)) to the generated data and compute their log-evidence using a mixture Gaussian prior on each of the rescaled structural parameters (), as described in Section 3. We fixed the hyperparameters of the Gaussian mixture prior to for each structural parameter. We obtained a Bayes factor of in favor of the expected direction, which means that the data is 1.55 (95% CI [0.959, 2.505]) times more likely to have been generated from than . If we have no a-priori preference for over or vice versa, then the prior odds are and the corresponding posterior odds are

This result is perhaps surprising at first glance, as it implies that there is at least a one in three chance that the causal relationship is in the reverse direction. However, when we allow for pleiotropy, it becomes possible to fit a similarly sparse model where the edge is reversed. Such a model is illustrated in Figure 12. Note that data generated from either the model in Figure 12 or the one in Figure 12

have the same underlying probability distribution over the observed variables. When we incorporate the assumption that pleiotropic effects are ‘weak’ (close to zero), then the preference for

over becomes very strong. We can express this preference in the prior for , by setting in Equation (7), which is equivalent to putting a strongly informative (low variance) Gaussian prior on the size of the pleiotropic effect. Note that this a weaker assumption than IV3 (), which is equivalent to setting in the Gaussian mixture prior from Equation (7). After appropriately changing the prior of to reflect the assumption of ‘weak’ pleiotropy by setting the hyperparameter to 0.0 instead of 0.5, the Bayes factor estimate becomes 100.36 (95% CI [59.42, 169.51]), which translates into a probability that the model was given the data of less than 1%.

4.4 Example: near-conditional independence (near-LCD)

We now look at an example where the confounding and pleiotropic effects are weak relative to the instrument strength and to the exposure-outcome causal effect. The generating model is depicted in Figure 14. We call this example near-LCD, because if we made the confounding and pleiotropic effects infinitely weak, i.e., , we would arrive at the LCD pattern from Figure 5. Again, we generated 10000 samples from the model, resulting in the following sample covariance and correlation matrices over the observed variables:

Figure 13: Generating model where the causal relationship is from to () and in which there is weak confounding () and pleiotropy (). There is a weak dependence that results in a mild violation of the IV3 assumption.
Figure 14: Estimated posterior density of the causal effect for the data generated from the model in Figure 14. The dashed vertical line at zero indicates the estimated probability of the reverse direction (Figure 8(b)), in which case since there is no causal link from to .

For the prior hyperparameters , we obtained a Bayes Factor in favor of the expected direction of:

The confidence interval for the Bayes factor value (95% CI

) suggests that the preference for the expected direction is significant. This example shows that under mild violations of the IV3 assumption and given weak confounding effects, we can reliably indicate the correct causal direction using our method. Moreover, BayesMR appropriately computes the degree of uncertainty in the model selection, which increases for stronger confounding and/or pleiotropic effects. Finally, our approach results in reasonable posterior samples, which we show in Figure 14. We can see that most of the mass of the posterior is concentrated around the true value . The density has two components, given by the two possible causal directions. The dashed vertical line at zero indicates the estimated probability of the reverse direction, for which since there is no causal link from to .

5 Simulation studies

5.1 Estimation robustness

We imagined a typical MR scenario where there are a number () of genetic variants that may be used as instruments, but it is not known how many of them satisfy the IV assumptions. We analyzed the robustness of the causal effect estimate produced by our method to the presence of pleiotropy. We first treated the case where all variants were valid instruments. We then added pleiotropic effects to the genetic variants, such that only 80%, 60%, 40%, 20%, and finally 0% of them were valid instruments. We generated data according to the Bayesian model specification in Section 3 (see Figure 8

). The effect allele frequencies of each genetic variant were randomly simulated from a uniform distribution. We simulated the instrument strengths (

) from a normal distribution with standard deviation . The average absolute strength of the instruments is the mean of the corresponding half-normal distribution: . We simulated the intrinsic noise of the exposure and outcome from a half-normal distribution with standard deviation equal to one, which means that . The causal effect from to was set at . The pleiotropic effects introduced were simulated from a normal distribution with mean zero and standard deviation , which means that they have a similar magnitude to that of the instrument strengths. This type of pleiotropy is referred to as balanced in the literature. Bowden et al. (2017)

Figure 15: The effect of introducing pleiotropic effects on the posterior estimates is that the distribution moves away from the true value

. The shaded area in each posterior distribution corresponds to the 50% posterior uncertainty (credible) interval, with the posterior median in the center depicted with a vertical line. In the worst case scenario, where no genetic variant is a valid instrument, we observe the appearance of a second mode of the distribution, which is close to zero. This mode corresponds to the model explanation of the data where there is a ‘weak’ causal effect from exposure to outcome. We notice, however, that the posterior distribution progression is gradual, thereby showcasing the robustness of

BayesMR to the presence of pleiotropy. When only 40% of the genetic variants were valid instruments, the posterior distribution remained robustly centered around . Even when none of the genetic variants satisfied the IV assumptions, a significant proportion of the probability mass could be found around the true value.
Figure 16: Posterior distribution of for different hyperparameter settings (only is varied) in the near-LCD scenario (Figure 14). The true value for is depicted by a dashed vertical line.

For the inference, we set the prior hyperparameters to . The weights indicating the proportion of relevant genetic effects are learned from the data ( and ), while for the other parameters having a Gaussian mixture prior, the weight hyperparameter is set to . In Figure 16 we show how the causal effect estimate is affected as fewer and fewer of the genetic variants are valid instruments. We notice that even when only 40% of the genetic variants satisfied the IV assumptions, the posterior distribution remained robustly centered around the true value . When none of the genetic variants were valid instrumental variables, this was properly reflected in the uncertainty of the posterior distribution. In that particular case, the posterior became bimodal, suggesting two explanations for the observed associations, one where there is a strong causal link between exposure and outcome (the peak around one) and one when the associations are due to a combination of pleiotropic genetic effects and confounding (the peak close to zero).

5.2 Sensitivity to prior hyperparameters

In this simulation, we analyzed how the posterior computed by BayesMR is influenced by the choice of hyperparameters in the Gaussian mixture prior. For this purpose, we again considered the near-LCD scenario introduced in Subsection 4.4 (see Figure 14). We fixed and varied the parameter in equally sized steps on the logarithmic scale. In this configuration, the higher variance (‘slab’) component of the Gaussian mixture prior has fixed width, while the lower variance (‘spike’) component’s width depends on . The mixture weights were fixed at 0.5. For each of the structural parameters on which we put a Gaussian mixture parameter, namely , we used the same hyperparameter configuration. The posterior distributions obtained for each value of are shown in Figure 16. We notice that for every different spike width, the posterior centers around the correct value . For narrower spikes, the posterior becomes multimodal, indicating other possible sparse solutions. For example, the mode centered around corresponds to the ‘weak pleiotropy’ solution, where . The mode centered at zero, on the other hand, indicates that there is a very small chance that there is no causal effect from to .

The obtained posterior is sensitive to the hyperparameter choice, as reflected by the different mode heights in the multimodal posterior. This is due to the fact that the prior incorporates our belief regarding the magnitude of ‘weak’ and ‘strong’ effects. For instance, the appearance of the second mode around for smaller values of can be explained by the change in the prior belief that the pleiotropic effect, which is of magnitude , is relevant or irrelevant. For , an effect of magnitude has a prior odds of coming from the ‘spike’ (irrelevant interaction) instead of the ‘slab’ (relevant interaction) roughly equal to . This means that for

, it is easy to ‘fit’ a parameter of that magnitude into the ‘spike’ component, where it is assigned a relatively high prior probability which is similar to the prior probability other small values, including zero. In that case, a solution close to the ground truth, where the confounding and pleiotropy parameters are fit to irrelevant values, will have a high posterior probability. This is reflected by the posterior mode for

around in Figure 16. In contrast, when , then the prior odds for an effect of magnitude to come from the ‘spike’, instead of the ‘slab’, is of order . This means that an effect of magnitude is a-priori extremely likely to be considered relevant. Because of the sparsifying Gaussian mixture prior, it is however much more advantageous to fit an irrelevant horizontal pleiotropy parameter instead of a relevant one, even though the ground truth is . This means that the statistically equivalent (resulting in the same covariance matrix) solution where and now has a high enough posterior probability, which leads to the appearance of the mode around .

Even though, as we have shown, the posterior distribution is driven by the choice of prior hyperparameters, our approach also enables the researcher to easily incorporate useful background knowledge, such as previously established weak or strong interactions, into the system. The advantage lies in the intuitive interpretation of the hyperparameters for the Gaussian mixture prior (see Subsection 3.4).

5.3 Model averaging robustness

In this experiment, we considered a bidirectional Mendelian randomization setting (see Subsection 2.3.2), where we cannot exclude the presence of pleiotropic effects. For simplicity, we considered a single genetic variant known to be associated with and a single genetic variant associated with . The true causal direction in the generating model was from to . This generating model is described by the set of equations:

(11)

where and are the instrument strengths (for and for respectively), and are the pleiotropic (secondary) effects, and are the confounding coefficients and is the causal effect of interest.

In our simulation, the genetic variants were strongly associated with their respective phenotypes: . The causal effect parameter was set to . We varied the pleiotropic effects for each simulation by setting , with . At the same time, we considered strong confounding: , while the error terms were normally distributed with dispersion given by . We generated 10000 samples from the above structural equation model with these parameter configurations.

We used BayesMR to determine how much more likely the link is than and to produce a robust causal estimate by combining the estimates for both causal directions. The results of a bidirectional MR analysis, which amounts to computing the Wald ratio from Equation (2) for both causal directions, are shown in Table 1 for each value of . When , the conditions are ideal for the application of bidirectional Mendelian randomization, since both genetic variants constitute valid instruments for their respective phenotypes. In this case, bidirectional MR would yield an estimate of (95% CI [0.934, 1.078]) for the causal direction and (95% CI [-0.025, 0.062]) for the causal direction. Since the causal effect estimate is very close to zero for the link , the inferred direction would be , for which the causal effect estimate is close to the true value.

Table 1: We show the results of performing a simple bidirectional Mendelian randomization analysis for various degrees of pleiotropy. In the first step, we used as an instrument for to estimate the causal effect (the correct direction). We then used as an instrument for to estimate the causal effect (the wrong direction). We compared these estimates against our posterior probability estimate of given the data, in which analysis we used both instruments concomitantly.

In the presence of pleiotropic effects (), the bidirectional MR analysis is no longer suitable, since it is possible to arrive at a nonzero causal effect in both directions. In Table 1, we obtain a nonzero causal effect for both and anytime . Furthermore, it is possible that the causal effect estimated in the wrong direction () is stronger in magnitude than the causal effect estimated in the correct direction (). This shows that bidirectional MR is not a reliable means of inferring the causal direction when dealing with pleiotropic genetic variants. Our approach, on the other hand, was robust to the presence of weak pleiotropic effects. In the last column of Table 1, we observe a significant preference for the correct direction when the pleiotropic effects are small.

5.4 Sensitivity to nonlinearity and nonnormality

Our assumed Bayesian model, as described in Section 3, entails a linear relationship between the exposure and the outcome , as well as a conditional Gaussian distribution given the instrument values, . In this section, we evaluate the robustness of BayesMR in situations where the parametric assumptions of linearity and normality are violated. We revisit the near-LCD scenario from Subsection 4.4, where our approach is able to recover the correct direction of the causal relationship with high probability and to produce a reasonable estimate for the causal effect, and use it as the starting point for our sensitivity analysis.

A Q1 Median Mean Q3
Table 2: We show the results of running BayesMR on data generated from the model in Figure 14 for various degrees of nonlinearity, as parametrized by (lower corresponds to stronger nonlinearity). We report the 95% CI for the estimated probability that the direction of the causal link is , as opposed to . For the causal effect estimate

, we report summary measures, including the first (Q1) and third quartile (Q3).

Q1 Median Mean Q3
Table 3: We show the results of running BayesMR on data generated from the model in Figure 14

for various degrees of nonnormality, as parametrized by the number of degrees of freedom of the

-distributed noise (lower corresponds to stronger nonnormality). We report the 95% CI for the estimated probability that the direction of the causal link is , as opposed to . For the causal effect estimate , we report summary measures, including the first (Q1) and third quartile (Q3).

We first explore the situation in which the relationship between exposure and outcome is nonlinear. To make the results comparable to the linear case, we have simulated data where the linear term in the structural equation of from  (1) is replaced by , where the parameter controls the degree of nonlinearity. In the limit , , so we recover linearity. Furthermore, the coefficient of the linear tangent approximation at has the same value (one) as the structural coefficient in the near-LCD scenario. We keep the other model parameters as in the example in Subsection 4.4. The results in Table 2 show that despite a nonlinear dependence, we are still able to detect the causal effect and its sign. Moreover, the method returns a strong preference for the correct direction for all values of . The causal effect estimate remains robust against small to medium deviations from the linear case. It is important to note that when the linearity assumption does not hold, then the estimate returned by our method cannot be interpreted as the effect of an increase in the exposure for an individual, but represents an average causal effect across the population. Burgess et al. (2014) In case the exposure-outcome relationship is non-monotonic, for instance if it is U-shaped, then it is no longer guaranteed that the method can infer any causal effect. One possible solution then would be to perform a piecewise Mendelian randomization analysis, as suggested by Burgess et al. Burgess et al. (2014) The idea is to stratify on the “IV-free” exposure distribution so that the localized exposure-outcome relation is approximately linear and then run BayesMR on each stratum.

In the second simulation, we analyze the effects of including non-Gaussian noise in the generating model. To make the results comparable to the Gaussian case, we have simulated data where the noise term in the structural equation of from (1) is distributed according to Student’s -distribution. Here, the -distribution’s number of degrees of freedom controls the degree of normality. In the limit , the -distribution becomes a standard normal distribution, so we recover normality. We keep the other model parameters as in the example in Subsection 4.4. The results in Table 3 show that the causal effect estimate for remains robust for a large range of values for , corresponding to small to medium deviations from the linear case. Moreover, the method returns a strong preference for the correct direction of the causal link for all values of . When , which corresponds to a -distribution with undefined mean and variance, our approach can no longer detect the causal link from to , as the posterior distribution of the causal effect estimate becomes too broad. One potential solution for dealing with strong violations of the normality assumption would then be to adopt a non-Gaussian structural equation model, in the vein of Shimizu and Bollen. Shimizu and Bollen (2014)

6 Real-world applications

In this section, we will showcase the potential of BayesMR by applying it to a number of real-world problems. We start, however, by explaining how our method can be applied when only summary data is available. In the first application, we focus on estimating the causal effect of birth weight on adult fasting glucose levels. In the second application, we analyze the effect of body mass index on the risk of Parkinson’s disease. In the third and final application, we use our approach to examine the direction of the causal link between coffee consumption and cigarette smoking.

6.1 Using summary data

BayesMR

requires as input the first and second-order moments of the observed data vector

. These moments are needed in the expression of the likelihood function and constitute the sufficient statistics with respect to our model (Section 3). We can easily compute estimates for these moments when individual-level data is available. However, we often only have access to summary statistics in the form of regression (beta) coefficients and their standard errors. In this subsection, we show how to derive approximations for the first and second-order moments from the summary data in order to obtain the required sufficient statistics for our method. We require (at least) the following summary statistics:

  • : the effect allele frequency (EAF) of

  • : the number of allele copies (very often equal to two, since humans are diploid organisms)

  • : measures of the gene-exposure association, including the coefficient obtained by regressing on , its standard error and the sample size

  • : measures of the gene-outcome association, including the coefficient obtained by regressing on , its standard error and the sample size

  • : the coefficient obtained by regressing and (observational exposure-outcome association)

Summary data on gene-exposure and gene-outcome associations from genome-wide association studies has become increasingly available, so we can typically get estimates for , together with the associated standard errors and sample sizes. The effect allele frequency is usually also reported. In addition, we require a measure of the association between the exposure and the outcome () to derive an estimate of . This estimate can be obtained from observational studies for determining potential risk factors for the outcome.

To derive the expected values and variances for the genetic variants, we employ the binomial distribution assumption by plugging in the EAF as the estimated success probability and using the appropriate formulas. We can assume without loss of generality that the mean parameters and in Equation (4) are zero, as their location does not influence the regression slope. For estimating the second-order moments, we then employ the following well-known approximations from simple linear regression:

Note that these approximations also apply in a multivariate setting when the regressors are independent. We use these approximations to finally derive the following estimates for the moments from summary statistics:

(12)

When we have information on multiple genetic variants, we obtain multiple estimates of and in (12), in which case we use the average over the estimates. Our approach also requires specifying a sample size. Since the summary statistics are likely to be computed from different samples, we conservatively choose the minimum of their sizes as input to BayesMR in order not to overestimate the precision of the data. If the sample size for the exposure-outcome association measure is also available, we take it into consideration when calculating the minimum of the sample sizes.

6.2 Effect of birth weight on fasting glucose

Del Greco et al. Del Greco et al. (2015) performed an IVW meta-analysis of MR Wald estimates (see Equations (2), (3)) to analyze the relationship between low birth weight and adult fasting glucose levels. The authors chose seven genetic variants associated with birth weight to use as instruments in an MR analysis. The meta-analysis of the seven MR estimates suggested a significant protective effect of -0.155 (95% CI [-0.233, -0.088]) reduction in adult fasting glucose level per standard deviation increase (484 g) in birth weight. In their analysis, Del Greco et al. Del Greco et al. (2015) investigated the presence of pleiotropy by means of the between-instrument heterogeneity test. They reported significant evidence of heterogeneity across instruments (), which they believe suggests that IV3 might be violated for some of the genetic variants. The heterogeneity was primarily due to the variant rs9883024 in gene ADCY5. However, even after removing said variant from the analysis, they obtained a significant negative effect estimate of -0.098 (95% CI [-0.168, -0.027]).

Figure 17: Estimated posterior distribution for the causal effect of birth weight on adult fasting glucose levels. The light shaded area in the posterior represents the interquartile range (IQR), while the dark shaded line indicates the median. For the Gaussian mixture prior in Equation (7), we have taken and . The IVW estimate reported in Del Greco et al. Del Greco et al. (2015) and its confidence bounds are shown for comparison.
Figure 18: By adapting the prior knowledge to fit the classic instrumental variable setting, we are able to recover the IVW estimate. The light shaded area in the posterior represents the interquartile range (IQR), while the dark shaded line indicates the median. For the Gaussian mixture prior in Equation (7), we have taken and , respectively. The IVW estimate reported in Del Greco et al. Del Greco et al. (2015) and its confidence bounds are shown for comparison.

We ran our model using the summary statistics for the genotype-exposure and genotype-outcome associations provided in Del Greco et al. Del Greco et al. (2015) (Table IV). The regression coefficient for the observational exposure-outcome association is taken from Daly et al. Daly et al. (2005), who reported a decrease of 0.01 in fasting glucose levels per kilogram increase in birth weight in a retrospective cohort study on a multiethnic sample of 855 New Zealand adolescents. The regression coefficients they computed were adjusted for age, sex, and ethnicity. One potential limitation of using this data is that there may be biases induced by the different adjustments made in the study. Lawlor (2016) Another potential issue is the fact that the observational study sample is taken from an underlying population which is different in terms of ethnicity and age. Thompson et al. (2016) Nevertheless, the associations found in the observational study are consistent with those found in subsequent studies. Norris et al. (2011)

Our estimated posterior in Figure 18 indicates that no causal effect is necessary to explain the observed data. This conclusion differs significantly from that of Del Greco et al. Del Greco et al. (2015) because we are making less stringent assumptions regarding the parameter strengths. However, we can arrive at their conclusion if we adapt our prior knowledge by incorporating stronger prior assumptions. To obtain the posterior distribution in Figure 18, we assumed that the pleiotropic effects are negligible (by setting , equivalent to a weaker form of IV3) and that the instrument strengths are relevant (by setting , equivalent to a weaker form of IV1). Additionally, we set in the Gaussian mixture prior (Equation (7)) for and , so as to not penalize (potentially) relevant confounding and causal effects too strongly. Since this new set of assumptions is much closer to the typical assumptions made in an IV analysis, we were able to recover the IVW estimate. Our results show, however, that if we penalize every strong interaction between variables equally (see Subsection 3.4), then the posterior estimate of indicates a negligible causal effect of birth weight on adult fasting glucose.

6.3 Effect of body mass index on the risk of Parkinson’s disease

Figure 19: Estimated causal effect of BMI on the risk of PD expressed as the difference in log-odds of PD per 5 increase in BMI. The light shaded area in the posterior represents the interquartile range (IQR), while the dark shaded line indicates the median. The IVW estimate derived from (3) along with its 95% confidence bounds are shown for comparison.

There have been conflicting findings regarding the association between body mass index (BMI) and Parkinson’s disease (PD) in observational studies. Both positive and negative associations between higher BMI and the risk of PD have been reported. Noyce et al. (2017) Wang et al. Wang et al. (29-Jun-2015) performed a meta-analysis of ten cohort studies and their obtained pooled risk ratio (RR) for the association of a 5 higher BMI with the risk of PD was 1.00 (95% CI [0.89, 1.12]), which suggests that BMI is not associated with the risk of PD. In a Mendelian randomization analysis, Noyce et al. Noyce et al. (2017) estimated the causal effect of BMI on the risk of PD by combining the ratio estimates corresponding to 77 selected SNPs via an inverse-variance weighted scheme. The results of their analysis suggest a putative protective causal effect of BMI. More specifically, their MR analysis yielded an IVW log-odds ratio (log-OR) of -0.195 (95% CI [-0.368, -0.021]), meaning that a lifetime exposure of 5 higher BMI was associated with a lower risk of PD.

We also took the log-odds ratio as the continuous outcome variable. We used the summary data from Noyce et al. Noyce et al. (2017) for the genetic associations with BMI and the log-OR of PD. We used the RR reported by Wang et al. Wang et al. (29-Jun-2015) as an approximation for the OR, since the incidence of PD is very low, resulting in an estimated log-OR of zero for the observational association of a 5 higher BMI with the risk of PD. We fitted our model assuming the direction of causality from BMI to PD and we obtained a causal effect estimate centered around zero, with 96.3% of the probability mass in the interval [-0.1, 0.1] (see Figure 19). This result casts doubt on the existence of a causal link between BMI and the risk of PD. When looking at the scatter plot comparing the genetic associations of the 77 variants with the outcomes and their associations with the exposures (Figure 21

), we observe the existence of two outliers, corresponding to the red triangles. These two variants show a low association with the outcome relative to the others given their strength as instruments. It is possible that the unusually large negative association with the risk of PD is due to unobserved pleiotropic effects. This claim is supported by our estimates of the pleiotropic effect

for these two genetic variants, which are shown in Figure 21.

To further substantiate our claim that these variants represent outliers for the IVW and MR-Egger analysis, we used the radial regression approach of Bowden et al. Bowden et al. (2017). By running the radial IVW and MR-Egger regressions using first order weights and specifying a statistical significance threshold of 0.01, we discovered the same two outliers observed visually in the scatter plot. Alternatively, we used the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) test Verbanck et al. (2018) to identify potential pleiotropic outliers. With MR-PRESSO, we detected the two genetic variants highlighted in Figure 21 as outliers for a significance threshold of 0.1. The outlier-corrected causal estimate produced by MR-PRESSO remained ‘barely’ significant at the 0.05 level: = -0.1669 (95% CI [-0.3335, -0.0003]).

Figure 20: Scatter plot of the genetic associations with BMI (horizontal axis) and PD risk (vertical axis) for 77 genetic variants. The two outliers (triangles) show a relatively strong association with the outcome given their association with the exposure. The regression line including the outliers is dashed, while the regression line obtained without the outliers is continuous.
Figure 21: Estimated pleiotropic effects for the two genetic variants suspected of being pleiotropic outliers (rs17001654 and rs13107325). The light shaded area in the posterior represents the interquartile range (IQR), while the dark shaded line indicates the median. Most of the posterior mass is distributed away from zero, thereby supporting the suspicion that these two variants exhibit horizontal pleiotropy.

To sum up, the results of our investigation suggest that the protective causal effect of BMI on the risk of PD discovered by Noyce et al. Noyce et al. (2017) could also be due to some of the genetic variants exhibiting horizontal pleiotropy. We suspect two variants in particular to be pleiotropic outliers in the MR analysis, as suggested by our method and other approaches. Bowden et al. (2017); Verbanck et al. (2018)

6.4 Does coffee consumption influence smoking?

In this experiment, we investigated the association between coffee consumption and cigarette smoking. It is unclear whether this association is causal. A Mendelian randomization study performed by Treur et al. Treur et al. (2016) provided no evidence for causal effects of smoking on caffeine or vice versa. Bjørngaard et al., Bjørngaard et al. (2017) on the other hand, found that higher cigarette consumption causally increases coffee intake. If a causal link between coffee consumption and cigarette smoking exists, its direction is also unclear. Verweij et al. Verweij et al. (2018) employed a two-sample bidirectional Mendelian randomization study to investigate, among other things, the causal association between the use of nicotine and caffeine, but found little evidence of a causal relationship in either direction. In another study, Ware et al. Ware et al. (2017) assessed the impact of coffee consumption on the heaviness of smoking, but also obtained inconclusive results: one of their two-sample MR analyses indicated that heavier consumption of caffeine might lead to reduced heaviness of smoking, while in other MR analyses they found no evidence of a causal relationship between coffee consumption and heaviness of smoking. Ware et al. Ware et al. (2017) concluded it is unlikely that coffee consumption has a major causal impact on cigarette smoking and suggested the possibility of reverse causation, i.e., smoking impacting coffee consumption, or confounding as alternative explanations for the observed association.

We used the summary statistics reported by Ware et al. Ware et al. (2017) to explore this association. The exposure variable is coffee consumption measured in cups per day, while the outcome variable is smoking measured in cigarettes per day. The summary measurements for the gene-exposure association were taken from the European replication sample ( 30 062) of the Coffee and Caffeine Genetics Consortium (CCGC) genome-wide association study meta-analysis. Cornelis et al. (2015) Eight independent coffee related variants meeting the threshold for genome-wide significance in the trans-ethnic GWAS meta-analysis were considered. The associations of coffee-related SNPs with the outcome were obtained from the UK Biobank ( 8 072). Ware et al. Ware et al. (2017) also computed an observational association of 0.45 additional cigarettes per day for each consumed cup of coffee among the 8 072 current daily smokers who reported consuming coffee.

(a) Posterior distribution of the putative causal effect of coffee consumption on smoking. In the case of reverse causation (causal link from smoking to coffee consumption), this effect is zero, as indicated by the vertical dashed line. The estimate next to the line (54.67%) is the evidence for the reverse model.
(b) Posterior distribution of the putative causal effect of smoking on coffee consumption. In the case of reverse causation (causal link from coffee consumption to smoking), this effect is zero, as indicated by the vertical dashed line. The estimate next to the line (45.33%) is the evidence for the reverse model.
Figure 22: Comparison of the causal effect estimates between (coffee consumption in cups per day) and (heaviness of smoking in cigarettes per day) for the two possible causal directions. The estimated evidence for the two models is and , respectively. In the left figure, we see the estimate of , which is the causal effect of coffee consumption on heaviness of smoking, under the assumption that the causal link exists. In the right figure, we see the estimate of , which is the causal effect of heaviness of smoking on coffee consumption, under the assumption that the causal link exists.

In our approach, we considered two candidate models corresponding to the two possible causal directions. In the first model (Figure 7), we assumed a causal link from coffee consumption to smoking, while in the second model Figure 7) we assumed the reverse causal link. We computed the evidence for the two models and then, assuming a-priori that either of the two models is equally likely, we obtained a posterior probability of 42.6% for the first model and 57.4% for the second model. The posterior over the causal effect is shown for both directions in Figures 21(a) and 21(b). According to our results, there is significant uncertainty in the direction of the causal effect. This suggests that the observed association may in fact be due to some confounding factor. Still, that does not exclude the possibility of a causal effect between coffee consumption and heaviness of smoking, but more data is required to substantiate this claim.

7 Discussion

In this paper, we have proposed a Bayesian approach to Mendelian randomization (BayesMR) for finding the probable direction of a causal link between two phenotypes and for estimating the magnitude of its causal effect. The novelty in our method consists in inferring the direction of causality, estimating the causal effects and promoting model sparsity in a single-step approach. This is achieved by using a nested sampling scheme Skilling (2006); Handley et al. (2015, 2015) to compute the model evidence for both causal directions and to sample from the parameter posterior distribution simultaneously.

Many traditional MR methods can be seen as limiting cases of our more general approach. For example, the “no pleiotropy” assumption Bowden et al. (2017) made when using genetic variants as instrumental variables can be incorporated by setting the prior proportion of the slab component for the pleiotropic effect to zero () and then taking . Berzuini et al. Berzuini et al. (2018), for example, have proposed a similar Bayesian approach, but they assume a particular causal direction. This assumption can be incorporated into our model by setting the prior probability of reverse causation to be zero, as shown in Section 4. Our Bayesian solution subsumes standard Mendelian randomization (Figure 4) and the LCD pattern (Figure 5), is robust to the presence of pleiotropic effects (Figure 7), and incorporates the possibility of reverse causation in the exposure-outcome association (Figure 7). The resulting output appropriately reflects the uncertainty of having to consider two possible causal directions due to the existence of alternative causal pathways.

In the era of high-throughput genomics, it is becoming increasingly likely to select invalid instruments as GWAS sample sizes continue to grow. Hemani et al. (2017) Our approach allows for the selection of a much broader range of genetic variants, since it does not require any of the instruments to be valid. Furthermore, we can select even those variants for which it is not clear whether they primarily influence the exposure or the outcome. In this sense, we envision running BayesMR on a large number of potential genetic candidates collected from genome-wide association studies, which can be related to either the exposure or the outcome. Another advantage of our method is that it does not rely on individual-level data. This is becoming increasingly important in the era of large GWA studies, Visscher et al. (2017) where summary statistics regarding genetic associations obtained from large independent samples are made publicly available. Our approach, however, is currently not designed to handle potentially correlated genetic variants, which means it cannot use variants that are in linkage disequilibrium with each other as input. Another concern is the computational scalability of the Bayesian inference to a large number of instrumental variables. In the future, we plan to explore more scalable nested sampling approaches, for instance the one proposed by Buchner, Buchner (2017) and to extend BayesMR to account for correlation among genetic variants.

A key aspect of our approach lies in the structural assumption that interactions between variables are either ‘weak’ (irrelevant) or ‘strong’ (relevant). Bucur et al. (2017) This assumption is different from the traditional zero-nonzero dichotomy and, in our opinion, more realistic than assuming, for example, that a genetic effect on an outcome variable is “completely mediated” by an another variable. The Gaussian mixture prior is a natural choice for expressing this assumption and has the intuitive interpretation of capturing the small, irrelevant effects in the ‘spike’ (lower variance) component and the large, relevant effects in the ‘slab’ (higher variance) component. The mixture Gaussian prior has already been used for example by Li Li (2017) to induce sparsity in the estimation of pleiotropic effects. Here, we extend the usage of this prior to the other structural parameters, which allows for a more general view of Mendelian randomization analyses. For example, putting a Gaussian mixture prior on the strength of instruments will enable us to make an automatic selection of the relevant instruments in a batch of preselected genetic variants, where the proportion of relevant instruments can be determined by learning the shared hyperparameter .

Our chosen prior is informative in the sense that it informs which effects we consider a-priori to be relevant. Because of this, care must be taken when setting the prior hyperparameters. While this informativeness may be seen as a weakness of the Bayesian approach, it also empowers the research to input sensible assumptions regarding the expected magnitude of effects in an intuitive fashion. If one has a prior idea regarding which effect sizes are deemed relevant and which irrelevant, then one can appropriately tune the variances of the ‘spike’ and the ‘slab’ to reflect this belief. When it is not clear how to distinguish between relevant and irrelevant effects, one can treat all detectable effects as relevant by letting in a first attempt. At the same time, one can choose large enough to give support to effect sizes that are substantively different from zero, but not so large that unrealistic values are supported. George and McCulloch (1993) Furthermore, if one has a prior belief regarding the relevance of a particular parameter, this can be expressed in the prior by appropriately setting the parameter, which determines the mixture component proportion. Other sparsifying priors have been proposed for use in MR Bayesian methods, such as the horseshoe prior Berzuini et al. (2018) or the Laplace priorAgakov et al. (2010) It might be interesting to compare the inferences obtained by using these different priors and to assess how easy it is to incorporate prior biological knowledge for each of them.

One important limitation of the approach proposed here lies in the strong parametric assumptions of linearity and Gaussianity. Didelez and Sheehan (2007) These standard assumptions are commonly made for simplicity and computational convenience in similar works such as those by Thompson et al. Thompson et al. (2017), Li Li (2017), or Berzuini et al. Berzuini et al. (2018), but when they do not hold, the estimands can potentially be far off the mark and their interpretation is rendered incorrect. We have shown that BayesMR is robust against mild violations of linearity and Gaussianity, even though great care must be taken when interpreting the results. In future work, it would be interesting to test our method as well as other established methods against a broader range of violations. The current method is also not directly applicable to discrete phenotypic variables. However, as we have shown in Section 6

, we can incorporate log-odds ratios of binary values in our model by applying a logit transformation. With our approach, we allow for violations of the exclusion restriction assumption (

IV3) and we even allow for violations of IV1 (genetic variant is not robustly associated with the exposure), although this is typically not considered an issue. However, we still rely on the genetic variants being independent from any unmeasured confounding variables (IV2).

For future work, we plan to make our approach even more general by relaxing the IV2 assumption, and therefore taking into account the possibility that the genetic variables might be associated with unmeasured confounders. This would also mean that the InSIDE (Instrument Strength is Independent from Direct Effect) assumption, which is required for applying MR-Egger Bowden et al. (2015) or the Bayesian method proposed by Berzuini et al. Berzuini et al. (2018), does not hold. Another immediate extension to our work is handling potential measurement error for the exposure and outcome variables. Traditional methods such as MR-Egger rely on having no measurement error in the exposure (the so-called NOME Bowden et al. (2018) assumption), which is difficult to achieve in practical applications and may lead to erroneous results. Finally, we are also interested in analyzing the links between variables that exert a mutual causal influence on each other. For this purpose, we intend to extend BayesMR to handle cyclic causal models.

The code implementing the proposed method will be made publicly available by the authors at https://github.com/igbucur/BayesMR.

References

  • F. V. Agakov, P. McKeigue, J. Krohn, and A. J. Storkey (2010) Sparse Instrumental Variables (SPIV) for Genome-Wide Studies. In Advances in Neural Information Processing Systems 23, J. D. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, and A. Culotta (Eds.), pp. 28–36. Cited by: §2.3.2, §7.
  • J. E. Aten, T. F. Fuller, A. J. Lusis, and S. Horvath (2008) Using Genetic Markers to Orient the Edges in Quantitative Trait Networks: The NEO Software. BMC Systems Biology 2, pp. 34. External Links: ISSN 1752-0509 Cited by: §2.3.2.
  • C. Berzuini, H. Guo, S. Burgess, and L. Bernardinelli (2018) A Bayesian Approach to Mendelian Randomization with Multiple Pleiotropic Variants. Biostatistics (en). External Links: Document Cited by: §2.3.1, §2.3.1, §2.4, §3.2, §3.3, §3.3, §3.4, §3.4, §7, §7, §7, §7.
  • C. Bishop (2006) Pattern Recognition and Machine Learning. Information Science and Statistics, Springer-Verlag, New York (en). External Links: ISBN 978-0-387-31073-2 Cited by: §2.3.1, §4.1, §4.1.
  • J. H. Bjørngaard, A. T. Nordestgaard, A. E. Taylor, J. L. Treur, M. E. Gabrielsen, M. R. Munafò, B. G. Nordestgaard, B. O. AAsvold, P. Romundstad, and G. Davey Smith (2017) Heavier Smoking Increases Coffee Consumption: Findings from a Mendelian Randomization Analysis. International Journal of Epidemiology 46 (6), pp. 1958–1967 (en). External Links: ISSN 0300-5771 Cited by: §6.4.
  • J. Bowden, G. Davey Smith, and S. Burgess (2015) Mendelian Randomization with Invalid Instruments: Effect Estimation and Bias Detection through Egger Regression. International Journal of Epidemiology 44 (2), pp. 512–525 (en). External Links: ISSN 0300-5771 Cited by: §2.2, §2.2, §2.3.1, §2.3.1, §3.2, §7.
  • J. Bowden, G. Davey Smith, P. C. Haycock, and S. Burgess (2016) Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genetic Epidemiology 40 (4), pp. 304–314 (en). External Links: ISSN 1098-2272 Cited by: §2.3.1, §2.3.1.
  • J. Bowden, F. M. Del Greco, C. Minelli, G. Davey Smith, N. Sheehan, and J. Thompson (2017) A Framework for the Investigation of Pleiotropy in Two-Sample Summary Data Mendelian Randomization. Statistics in Medicine 36 (11), pp. 1783–1802 (en). External Links: ISSN 1097-0258 Cited by: §5.1, §7.
  • J. Bowden, F. Del Greco, C. Minelli, D. Lawlor, Q. Zhao, N. A. Sheehan, J. Thompson, and G. D. Smith (2018) Improving the Accuracy of Two-Sample Summary Data Mendelian Randomization: Moving beyond the NOME Assumption. bioRxiv, pp. 159442 (en). Cited by: §7.
  • J. Bowden, W. Spiller, F. Del-Greco, N. Sheehan, J. Thompson, C. Minelli, and G. D. Smith (2017) Improving the Visualisation, Interpretation and Analysis of Two-Sample Summary Data Mendelian Randomization via the Radial Plot and Radial Regression. bioRxiv, pp. 200378 (en). Cited by: §6.3, §6.3.
  • R. J. Bowden and D. A. Turkington (1990) Instrumental Variables. Cambridge University Press (en). External Links: ISBN 978-0-521-38582-4 Cited by: §2.1.
  • M. A. Brion, K. Shakhbazov, and P. M. Visscher (2013) Calculating Statistical Power in Mendelian Randomization Studies. International Journal of Epidemiology 42 (5), pp. 1497–1501 (en). External Links: ISSN 0300-5771 Cited by: §2.2.
  • J. Buchner (2017) Collaborative Nested Sampling: Big Data vs. complex physical models. arXiv:1707.04476 [astro-ph, physics:physics, stat]. Cited by: §7.
  • I. G. Bucur, T. Claassen, and T. Heskes (2017) Robust Causal Estimation in the Large-Sample Limit without Strict Faithfulness. In Artificial Intelligence and Statistics, pp. 1523–1531 (en). Cited by: §3.4, §7.
  • S. Burgess, A. Butterworth, and S. G. Thompson (2013) Mendelian Randomization Analysis With Multiple Genetic Variants Using Summarized Data. Genetic Epidemiology 37 (7), pp. 658–665 (en). External Links: ISSN 1098-2272 Cited by: §2.2.
  • S. Burgess, N. M. Davies, and S. G. Thompson (2014) Instrumental Variable Analysis with a Nonlinear Exposure–Outcome Relationship. Epidemiology (Cambridge, Mass.) 25 (6), pp. 877–885. External Links: ISSN 1044-3983, Document Cited by: §5.4.
  • S. Burgess, C. N. Foley, and V. Zuber (2018) Inferring Causal Relationships Between Risk Factors and Outcomes from Genome-Wide Association Study Data. Annual Review of Genomics and Human Genetics 19 (1), pp. null. Cited by: §2.3.1.
  • S. Burgess and S. G. Thompson (2017) Interpreting Findings from Mendelian Randomization Using the MR-Egger Method. European Journal of Epidemiology 32 (5), pp. 377–389 (en). External Links: ISSN 0393-2990, 1573-7284 Cited by: §2.3.1, §2.3.1.
  • S. Burgess, V. Zuber, A. Gkatzionis, J. M. B. Rees, and C. Foley (2017) Improving on a Modal-Based Estimation Method: Model Averaging for Consistent and Efficient Estimation in Mendelian Randomization When a Plurality of Candidate Instruments Are Valid. bioRxiv, pp. 175372 (en). Cited by: §2.3.1.
  • C. M. Carvalho, N. G. Polson, and J. G. Scott (2009) Handling Sparsity via the Horseshoe. In Artificial Intelligence and Statistics, pp. 73–80 (en). Cited by: §3.4.
  • L. S. Chen, F. Emmert-Streib, and J. D. Storey (2007) Harnessing Naturally Randomized Transcription to Infer Regulatory Relationships among Genes. Genome Biology 8 (10), pp. R219 (en). External Links: ISSN 1474-760X Cited by: §2.3.
  • L. Chen, G. D. Smith, R. M. Harbord, and S. J. Lewis (2008) Alcohol Intake and Blood Pressure: A Systematic Review Implementing a Mendelian Randomization Approach. PLOS Medicine 5 (3), pp. e52. External Links: ISSN 1549-1676 Cited by: Figure 2, §1.
  • G. F. Cooper (1997) A Simple Constraint-Based Algorithm for Efficiently Mining Observational Databases for Causal Relationships. Data Mining and Knowledge Discovery 1 (2), pp. 203–224 (en). External Links: ISSN 1384-5810, 1573-756X Cited by: Figure 5, §2.3.
  • M. C. Cornelis, E. M. Byrne, T. Esko, M. A. Nalls, A. Ganna, N. Paynter, K. L. Monda, N. Amin, K. Fischer, F. Renstrom, J. S. Ngwa, V. Huikari, A. Cavadino, I. M. Nolte, A. Teumer, K. Yu, P. Marques-Vidal, R. Rawal, A. Manichaikul, M. K. Wojczynski, J. M. Vink, J. H. Zhao, G. Burlutsky, J. Lahti, V. Mikkilä, R. N. Lemaitre, J. Eriksson, S. K. Musani, T. Tanaka, F. Geller, J. Luan, J. Hui, R. Mägi, M. Dimitriou, M. E. Garcia, W.-K. Ho, M. J. Wright, L. M. Rose, P. K. E. Magnusson, N. L. Pedersen, D. Couper, B. A. Oostra, A. Hofman, M. A. Ikram, H. W. Tiemeier, A. G. Uitterlinden, F. J. A. van Rooij, I. Barroso, I. Johansson, L. Xue, M. Kaakinen, L. Milani, C. Power, H. Snieder, R. P. Stolk, S. E. Baumeister, R. Biffar, F. Gu, F. Bastardot, Z. Kutalik, D. R. J. Jr, N. G. Forouhi, E. Mihailov, L. Lind, C. Lindgren, K. Michaëlsson, A. Morris, M. Jensen, K.-T. Khaw, R. N. Luben, J. J. Wang, S. Männistö, M.-M. Perälä, M. Kähönen, T. Lehtimäki, J. Viikari, D. Mozaffarian, K. Mukamal, B. M. Psaty, A. Döring, A. C. Heath, G. W. Montgomery, N. Dahmen, T. Carithers, K. L. Tucker, L. Ferrucci, H. A. Boyd, M. Melbye, J. L. Treur, D. Mellström, J. J. Hottenga, I. Prokopenko, A. Tönjes, P. Deloukas, S. Kanoni, M. Lorentzon, D. K. Houston, Y. Liu, J. Danesh, A. Rasheed, M. A. Mason, A. B. Zonderman, L. Franke, B. S. Kristal, J. Karjalainen, D. R. Reed, H.-J. Westra, M. K. Evans, D. Saleheen, T. B. Harris, G. Dedoussis, G. Curhan, M. Stumvoll, J. Beilby, L. R. Pasquale, B. Feenstra, S. Bandinelli, J. M. Ordovas, A. T. Chan, U. Peters, C. Ohlsson, C. Gieger, N. G. Martin, M. Waldenberger, D. S. Siscovick, O. Raitakari, J. G. Eriksson, P. Mitchell, D. J. Hunter, P. Kraft, E. B. Rimm, D. I. Boomsma, I. B. Borecki, R. J. F. Loos, N. J. Wareham, P. Vollenweider, N. Caporaso, H. J. Grabe, M. L. Neuhouser, B. H. R. Wolffenbuttel, F. B. Hu, E. Hyppönen, M.-R. Järvelin, L. A. Cupples, P. W. Franks, P. M. Ridker, C. M. van Duijn, G. Heiss, A. Metspalu, K. E. North, E. Ingelsson, J. A. Nettleton, R. M. van Dam, and D. I. Chasman (2015) Genome-Wide Meta-Analysis Identifies Six Novel Loci Associated with Habitual Coffee Consumption. Molecular Psychiatry 20 (5), pp. 647–656 (en). External Links: ISSN 1476-5578 Cited by: §6.4.
  • N. Cornia and J. M. Mooij (2014) Type-II Errors of Independence Tests Can Lead to Arbitrarily Large Errors in Estimated Causal Effects: An Illustrative Example. In Proceedings of the UAI 2014 Conference on Causal Inference: Learning and Prediction - Volume 1274, CI’14, Aachen, Germany, Germany, pp. 35–42. Cited by: §2.3.
  • B. Daly, R. Scragg, D. Schaff, and P. A. Metcalf (2005) Low birth weight and cardiovascular risk factors in Auckland adolescents: A retrospective cohort study. http://www.nzma.org.nz/journal/118-1220/1612/. External Links: ISSN 1175-8716 Cited by: §6.2.
  • G. Davey Smith and S. Ebrahim (2003) ‘Mendelian Randomization’: Can Genetic Epidemiology Contribute to Understanding Environmental Determinants of Disease?. International Journal of Epidemiology 32 (1), pp. 1–22 (en). External Links: ISSN 0300-5771 Cited by: §1, §2.1.
  • G. Davey Smith and G. Hemani (2014) Mendelian Randomization: Genetic Anchors for Causal Inference in Epidemiological Studies. Human Molecular Genetics 23 (R1), pp. R89–R98. External Links: ISSN 0964-6906 Cited by: §1, §1, §2.2, §2.2, §2.3.1, §2.3.2, §2.4.
  • G. Davey Smith, D. A. Lawlor, R. Harbord, N. Timpson, I. Day, and S. Ebrahim (2007) Clustered Environments and Randomized Genes: A Fundamental Distinction between Conventional and Genetic Epidemiology. PLOS Medicine 4 (12), pp. e352. External Links: ISSN 1549-1676 Cited by: §2.3.
  • G. Davey Smith, L. Paternoster, and C. Relton (2017) When Will Mendelian Randomization Become Relevant for Clinical Practice and Public Health?. JAMA 317 (6), pp. 589–591 (en). External Links: ISSN 0098-7484 Cited by: §1.
  • A. P. Dawid (1979) Conditional Independence in Statistical Theory. Journal of the Royal Statistical Society. Series B (Methodological) 41 (1), pp. 1–31. External Links: ISSN 0035-9246 Cited by: §3.2.
  • F. M. Del Greco, C. Minelli, N. A. Sheehan, and J. R. Thompson (2015) Detecting Pleiotropy in Mendelian Randomisation Studies with Summary Data and a Continuous Outcome. Statistics in Medicine 34 (21), pp. 2926–2940 (en). External Links: ISSN 1097-0258 Cited by: Figure 18, §6.2, §6.2, §6.2.
  • V. Didelez and N. Sheehan (2007) Mendelian Randomization as an Instrumental Variable Approach to Causal Inference. Statistical Methods in Medical Research 16 (4), pp. 309–330 (en). External Links: ISSN 0962-2802 Cited by: §2.1, §2.2, §2.3.1, §2.3, §7.
  • B. A. Ference, W. Yoo, I. Alesh, N. Mahajan, K. K. Mirowska, A. Mewada, J. Kahn, L. Afonso, K. A. Williams, and J. M. Flack (2012) Effect of Long-Term Exposure to Lower Low-Density Lipoprotein Cholesterol Beginning Early in Life on the Risk of Coronary Heart Disease: A Mendelian Randomization Analysis. Journal of the American College of Cardiology 60 (25), pp. 2631–2639. External Links: ISSN 0735-1097 Cited by: Figure 2, §1.
  • K. M. Flegal, B. I. Graubard, D. F. Williamson, and R. S. Cooper (2011) Reverse Causation and Illness-related Weight Loss in Observational Studies of Body Weight and Mortality. American Journal of Epidemiology 173 (1), pp. 1–9 (en). External Links: ISSN 0002-9262, Document Cited by: Figure 7.
  • R. H. Fletcher, S. W. Fletcher, and G. S. Fletcher (2012) Clinical Epidemiology: The Essentials. Lippincott Williams & Wilkins (en). External Links: ISBN 978-1-4511-4447-5 Cited by: §1.
  • A. Gangloff, J. Bergeron, E. Pelletier-Beaumont, J.-A. Nazare, J. Smith, A.-L. Borel, I. Lemieux, A. Tremblay, P. Poirier, N. Alméras, and J.-P. Després (2015) Effect of adipose tissue volume loss on circulating 25-hydroxyvitamin D levels: results from a 1-year lifestyle intervention in viscerally obese men. International Journal of Obesity 39 (11), pp. 1638–1643 (en). External Links: ISSN 1476-5497, Document Cited by: §1.
  • A. Gelman (2006) Prior Distributions for Variance Parameters in Hierarchical Models (Comment on Article by Browne and Draper). Bayesian Analysis 1 (3), pp. 515–534 (EN). External Links: ISSN 1936-0975, 1931-6690 Cited by: item .
  • E. I. George and R. E. McCulloch (1993) Variable Selection via Gibbs Sampling. Journal of the American Statistical Association 88 (423), pp. 881–889. External Links: ISSN 0162-1459 Cited by: §3.4, §7.
  • W. J. Handley, M. P. Hobson, and A. N. Lasenby (2015) Polychord: Nested Sampling for Cosmology. Monthly Notices of the Royal Astronomical Society: Letters 450 (1), pp. L61–L65 (en). External Links: ISSN 1745-3925 Cited by: §4.1, §7.
  • W. J. Handley, M. P. Hobson, and A. N. Lasenby (2015) Polychord: Next-Generation Nested Sampling. Monthly Notices of the Royal Astronomical Society 453 (4), pp. 4384–4398 (en). External Links: ISSN 0035-8711 Cited by: §4.1, §7.
  • F. P. Hartwig, G. Davey Smith, and J. Bowden (2017) Robust Inference in Summary Data Mendelian Randomization via the Zero Modal Pleiotropy Assumption. International Journal of Epidemiology 46 (6), pp. 1985–1998 (en). External Links: ISSN 0300-5771 Cited by: §2.3.1, §2.3.1.
  • G. Hemani, J. Bowden, and G. Davey Smith (2018) Evaluating the Potential Role of Pleiotropy in Mendelian Randomization Studies. Human Molecular Genetics 27 (R2), pp. 195–208 (en). Cited by: §2.4.
  • G. Hemani, J. Bowden, P. C. Haycock, J. Zheng, O. Davis, P. Flach, T. R. Gaunt, and G. D. Smith (2017) Automating Mendelian Randomization through Machine Learning to Construct a Putative Causal Map of the Human Phenome. bioRxiv, pp. 173682 (en). Cited by: §2.3.1, §2.3, §2.4, item , §7.
  • G. Hemani, K. Tilling, and G. D. Smith (2017) Orienting The Causal Relationship Between Imprecisely Measured Traits Using Genetic Instruments. bioRxiv, pp. 117101 (en). Cited by: §2.3.2, §2.3, §2.4.
  • I. Ibero-Baraibar, S. Navas-Carretero, I. Abete, J. A. Martinez, and M. A. Zulet (2015)

    Increases in plasma 25(OH)D levels are related to improvements in body composition and blood pressure in middle-aged subjects after a weight loss intervention: Longitudinal study

    .
    Clinical Nutrition 34 (5), pp. 1010–1017. External Links: ISSN 0261-5614, Document Cited by: §1.
  • H. Ishwaran and J. S. Rao (2005) Spike and Slab Variable Selection: Frequentist and Bayesian Strategies. The Annals of Statistics 33 (2), pp. 730–773. External Links: ISSN 0090-5364, 2168-8966 Cited by: item , §3.4.
  • E. M. Jones, J. R. Thompson, V. Didelez, and N. A. Sheehan (2012) On the Choice of Parameterisation and Priors for the Bayesian Analyses of Mendelian Randomisation Studies. Statistics in Medicine 31 (14), pp. 1483–1501 (en). External Links: ISSN 1097-0258 Cited by: §3.3, §3.3, §3.4.
  • H. Kang, A. Zhang, T. T. Cai, and D. S. Small (2016) Instrumental Variables Estimation With Some Invalid Instruments and Its Application to Mendelian Randomization. Journal of the American Statistical Association 111 (513), pp. 132–144. External Links: ISSN 0162-1459 Cited by: §2.3.1, §3.4.
  • D. A. Lawlor, R. M. Harbord, J. A. C. Sterne, N. Timpson, and G. Davey Smith (2008) Mendelian Randomization: Using Genes as Instruments for Making Causal Inferences in Epidemiology. Statistics in Medicine 27 (8), pp. 1133–1163 (en). External Links: ISSN 1097-0258 Cited by: §1, §1, §2.2, §3.1.
  • D. A. Lawlor (2016) Commentary: Two-Sample Mendelian Randomization: Opportunities and Challenges. International Journal of Epidemiology 45 (3), pp. 908–915. External Links: ISSN 0300-5771 Cited by: §2.3, §6.2.