A Bayesian Nonparametric Conditional Two-sample Test with an Application to Local Causal Discovery
The performance of constraint-based causal discovery algorithms is largely determined by the performance of the (conditional) independence tests they use. A default choice is the (partial) correlation test, which can fail in the presence of nonlinear relations between the variables. Recent research proposes a Bayesian nonparametric two-sample test (Holmes et al., 2015), an independence test between continuous variables (Filippi and Holmes, 2017), and a conditional independence test between continuous variables (Teymur and Filippi, 2019). We extend this work by proposing a novel Bayesian nonparametric conditional two-sample test. We use this conditional two-sample test to test the conditional independence C ⊥⊥ Y | X, where C denotes a Bernoulli random variable and X and Y are continuous one-dimensional random variables. This enables a nonparametric implementation of the Local Causal Discovery (LCD) algorithm with binary variables in the experimental setup (e.g. an indicator of treatment/control group). We propose a fair performance measure for comparing frequentist and Bayesian tests in the LCD setting. Using this measure, we compare our Bayesian ensemble with state-of-the-art frequentist tests and conclude that the Bayesian ensemble outperforms its frequentist counterparts. We apply our nonparametric implementation of the LCD algorithm to protein expression data.
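The claim that the (partial) correlation test can miss non-linear relations can be illustrated with a small synthetic sketch. The code below is not the Bayesian test proposed here; it only simulates a binary C and continuous X and Y in which C changes the spread of Y but not its mean, so that C and Y are dependent given X while their partial correlation given X stays near zero. The data-generating process and the helper names `residualize` and `partial_corr` are assumptions made for illustration.

```python
import numpy as np


def residualize(v, z):
    """Return the residuals of v after a least-squares fit on [1, z]."""
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef


def partial_corr(a, b, z):
    """Partial correlation of a and b given z, via correlated residuals."""
    return float(np.corrcoef(residualize(a, z), residualize(b, z))[0, 1])


rng = np.random.default_rng(0)
n = 20_000

# Hypothetical setup: C is a binary context variable (e.g. treatment/control).
c = rng.integers(0, 2, size=n).astype(float)
x = rng.normal(size=n)
# C affects only the noise scale of Y, not its mean, so Y depends on C given X.
y = x + (1.0 + 2.0 * c) * rng.normal(size=n)

rho = partial_corr(c, y, x)

# A simple two-sample comparison of the residual spread exposes the dependence.
res_y = residualize(y, x)
var_ratio = res_y[c == 1].var() / res_y[c == 0].var()

print(f"partial correlation of C and Y given X: {rho:.3f}")
print(f"residual variance ratio (C=1 vs C=0):   {var_ratio:.2f}")
```

In this simulation the partial correlation is close to zero even though the conditional distribution of Y given X clearly differs between the two groups, which is the kind of difference a conditional two-sample test is meant to detect.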