Disjunct Support Spike and Slab Priors for Variable Selection in Regression under Quasi-sparseness
Sparseness of the regression coefficient vector is often a desirable property, since, among other benefits, it improves interpretability. In practice, many true regression coefficients may be negligibly small but non-zero, a setting we refer to as quasi-sparseness. Spike-and-slab priors as introduced in (Chipman et al., 2001) can be tuned to ignore very small regression coefficients and, as a consequence, provide a trade-off between prediction accuracy and interpretability. However, spike-and-slab priors with full support lead to inconsistent Bayes factors, in the sense that the Bayes factor of any two models remains bounded in probability. This is clearly an undesirable property for Bayesian hypothesis testing, where we wish increasing sample sizes to yield increasing Bayes factors in favor of the true model. The moment-matching priors of (Johnson and Rossell, 2012) can resolve this issue, but are unsuitable for the quasi-sparse setting because they have full support everywhere except at the exact value 0. As a remedy, we suggest disjunct support spike-and-slab priors, for which we prove Bayes factor consistency in the quasi-sparse setting and experimentally show fast-growing Bayes factors favoring the true model. Several experiments on simulated and real data confirm that the proposed method identifies models with high effect size while offering better control over false positives than hard thresholding.
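To make the idea of disjoint supports concrete, the following is a minimal sketch of one plausible way to write down such a prior: a spike component supported only on a small interval around zero and a slab component supported only on its complement. The threshold `tau`, mixture weight `w`, and the specific uniform spike / truncated-normal slab choices are illustrative assumptions, not the construction used in the paper.

```python
# Hypothetical illustration of a spike-and-slab prior whose spike and slab
# components have disjoint supports, separated at a threshold tau.
import numpy as np
from scipy import stats

def disjoint_spike_slab_logpdf(beta, tau=0.1, w=0.5, slab_scale=2.0):
    """Log-density of a two-component mixture:
    - with probability w: a spike, here a uniform density on [-tau, tau];
    - with probability 1 - w: a slab, here a N(0, slab_scale^2) density
      renormalized to the region |beta| > tau."""
    beta = np.asarray(beta, dtype=float)
    logpdf = np.full_like(beta, -np.inf)

    inside = np.abs(beta) <= tau
    # Spike: uniform density 1 / (2 * tau) on [-tau, tau].
    logpdf[inside] = np.log(w) - np.log(2.0 * tau)

    outside = ~inside
    # Slab: normal density truncated to |beta| > tau; divide by the mass
    # the untruncated normal places outside [-tau, tau] to renormalize.
    norm = stats.norm(loc=0.0, scale=slab_scale)
    tail_mass = 2.0 * norm.sf(tau)
    logpdf[outside] = np.log(1.0 - w) + norm.logpdf(beta[outside]) - np.log(tail_mass)
    return logpdf

if __name__ == "__main__":
    grid = np.array([-0.5, -0.05, 0.0, 0.05, 0.5, 2.0])
    print(disjoint_spike_slab_logpdf(grid, tau=0.1))
```

Because the two supports do not overlap, any coefficient value is explained by exactly one component, which is the structural difference from full-support spike-and-slab priors alluded to above.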