varbvs: Fast Variable Selection for Large-scale Regression

09/19/2017
by Peter Carbonetto, et al.

We introduce varbvs, a suite of functions written in R and MATLAB for regression analysis of large-scale data sets using Bayesian variable selection methods. We have developed numerical optimization algorithms based on variational approximation methods that make it feasible to apply Bayesian variable selection to very large data sets. With a focus on examples from genome-wide association studies, we demonstrate that varbvs scales well to data sets with hundreds of thousands of variables and thousands of samples, and has features that facilitate rapid data analyses. Moreover, varbvs allows for extensive model customization, which can be used to incorporate external information into the analysis. We expect that the combination of an easy-to-use interface and robust, scalable algorithms for posterior computation will encourage broader use of Bayesian variable selection in areas of applied statistics and computational biology. The most recent R and MATLAB source code is available for download from GitHub (https://github.com/pcarbo/varbvs), and the R package can be installed from CRAN (https://cran.r-project.org/package=varbvs).

research
03/28/2018

BIVAS: A scalable Bayesian method for bi-level variable selection with applications

In this paper, we consider a Bayesian bi-level variable selection proble...
research
11/30/2021

Efficient and robust high-dimensional sparse logistic regression via nonlinear primal-dual hybrid gradient algorithms

Logistic regression is a widely used statistical model to describe the r...
research
05/13/2022

A Relaxation Approach to Feature Selection for Linear Mixed Effects Models

Linear Mixed-Effects (LME) models are a fundamental tool for modeling co...
research
10/25/2022

Redistributor: Transforming Empirical Data Distributions

We present an algorithm and package, Redistributor, which forces a colle...
research
12/17/2018

Variational Discriminant Analysis with Variable Selection

A Bayesian method that seamlessly fuses classification via discriminant ...
research
03/16/2020

Variable selection with multiply-imputed datasets: choosing between stacked and grouped methods

Penalized regression methods, such as lasso and elastic net, are used in...
research
12/21/2022

kalis: A Modern Implementation of the Li Stephens Model for Local Ancestry Inference in R

Approximating the recent phylogeny of N phased haplotypes at a set of va...
