Bayesian Approximate Kernel Regression with Variable Selection

08/05/2015
by Lorin Crawford, et al.

Nonlinear kernel regression models are often used in statistics and machine learning because they can be more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike in the linear regression setting, there is no clear concept of an effect size for the regression coefficients. In this paper, we propose a novel framework that provides an effect size analog for each explanatory variable in Bayesian kernel regression models when the kernel is shift-invariant (for example, the Gaussian kernel). We use function-analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that (i) captures nonlinear structure and (ii) can be projected onto the original explanatory variables; this projection serves as an analog of effect sizes. The specific function-analytic property we exploit is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on this random Fourier expansion, we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e., phenotypic prediction) and association mapping (i.e., inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.
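The two ingredients described above, a random Fourier approximation of a shift-invariant kernel and a projection of the fitted nonlinear function onto the original explanatory variables, can be sketched in a few lines. The Python snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it approximates a Gaussian kernel with random Fourier features, replaces the full Bayesian sampler with a conjugate Gaussian (ridge-type) posterior mean in the feature space, and recovers effect size analogs by projecting the fitted function values onto the span of the design matrix. The simulated data, bandwidth, and prior settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative): n samples, p explanatory variables, nonlinear signal.
n, p = 200, 50
X = rng.standard_normal((n, p))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(n)

# Random Fourier features approximating a Gaussian kernel:
# k(x, x') = exp(-||x - x'||^2 / (2 * h^2)) is approximated by z(x) @ z(x'),
# where z(x) = sqrt(2/D) * cos(W.T @ x + b), columns of W ~ N(0, I / h^2),
# and b ~ Uniform(0, 2*pi).
D = 500                      # number of random Fourier bases (assumed)
h = np.sqrt(p)               # kernel bandwidth (assumed)
W = rng.standard_normal((p, D)) / h
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Bayesian linear model on the random features: with a conjugate Gaussian prior
# theta ~ N(0, tau2 * I) and noise variance sigma2, the posterior mean of theta
# has the closed form below (a ridge-type estimator); the full model would use MCMC.
sigma2, tau2 = 0.1, 1.0
theta_hat = np.linalg.solve(Z.T @ Z + (sigma2 / tau2) * np.eye(D), Z.T @ y)
f_hat = Z @ theta_hat        # fitted nonlinear function values

# Effect size analog: project the fitted function onto the span of the original
# explanatory variables, beta_tilde = pinv(X) @ f_hat.
beta_tilde = np.linalg.pinv(X) @ f_hat
top = np.argsort(-np.abs(beta_tilde))[:5]
print("variables ranked by |effect size analog|:", top)
```

In a full Bayesian treatment, the projection would be carried through the posterior samples of the feature coefficients, so that each explanatory variable receives a posterior distribution over its effect size analog rather than the single point estimate computed here.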


