Targeted Random Projection for Prediction from High-Dimensional Features

12/06/2017
by Minerva Mukhopadhyay, et al.

We consider the problem of computationally efficient prediction from high-dimensional and highly correlated predictors in challenging settings where accurate variable selection is effectively impossible. Direct application of penalization or Bayesian methods implemented with Markov chain Monte Carlo can be computationally daunting and unstable. Hence, some type of dimensionality reduction prior to statistical analysis is in order. Common solutions include application of screening algorithms to reduce the regressors, or dimension reduction using projections of the design matrix. The former approach can be highly sensitive to threshold choice in finite samples, while the latter can perform poorly in very high-dimensional settings. We propose a TArgeted Random Projection (TARP) approach that combines positive aspects of both strategies to boost performance. In particular, we propose to use information from independent screening to order the inclusion probabilities of the features in the projection matrix used for dimension reduction, leading to data-informed sparsity. We provide theoretical support for a Bayesian predictive algorithm based on TARP, including both statistical and computational complexity guarantees. Examples for simulated and real data applications illustrate gains relative to a variety of competitors.
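The core idea, screening scores that set per-feature inclusion probabilities for a random projection, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the abstract: the absolute marginal correlation as the screening score, the mapping `q_j = (|r_j| / max_k |r_k|) ** delta`, the random-sign projection entries, and the hypothetical names `inclusion_probabilities`, `targeted_projection`, `delta`, and `m`. The Bayesian predictive step that would be fitted on the projected features is not shown.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return abs(sxy) / math.sqrt(sxx * syy)


def inclusion_probabilities(X, y, delta=2.0):
    """Map marginal screening scores to probabilities in [0, 1].

    Assumed form (illustrative only): q_j = (|r_j| / max_k |r_k|) ** delta,
    so features that screen better are more likely to be included.
    """
    p = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
    top = max(scores)
    if top == 0:
        return [0.0] * p
    return [(s / top) ** delta for s in scores]


def targeted_projection(X, y, m, delta=2.0, rng=None):
    """Project X onto m random directions supported on screened features."""
    rng = rng or random.Random(0)
    q = inclusion_probabilities(X, y, delta)
    # Each feature enters the projection independently with probability q_j.
    selected = [j for j, qj in enumerate(q) if rng.random() < qj]
    # Assumed projection: random signs on the selected columns only, which
    # makes the full m x p projection matrix sparse in a data-informed way.
    R = [[rng.choice((-1.0, 1.0)) for _ in selected] for _ in range(m)]
    Z = [
        [sum(R[k][i] * row[j] for i, j in enumerate(selected)) for k in range(m)]
        for row in X
    ]
    return Z, selected, q


# Tiny made-up data set: column 0 equals the response, column 1 is constant.
y_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X_demo = [
    [1.0, 7.0, 2.0, 1.0],
    [2.0, 7.0, 1.0, -1.0],
    [3.0, 7.0, 4.0, 1.0],
    [4.0, 7.0, 3.0, -1.0],
    [5.0, 7.0, 6.0, 1.0],
    [6.0, 7.0, 5.0, -1.0],
]
Z_demo, selected_demo, q_demo = targeted_projection(X_demo, y_demo, m=3)
```

In this toy setup the best-screened feature has inclusion probability 1 and the constant feature has probability 0, so the projection is built only from features with some marginal association with the response. Repeating the draw with different random seeds and averaging the resulting predictions is one natural way to reduce the dependence on any single projection.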

research
05/03/2021

A Metropolized adaptive subspace algorithm for high-dimensional Bayesian variable selection

A simple and efficient adaptive Markov Chain Monte Carlo (MCMC) method, ...
research
10/22/2021

Adaptive random neighbourhood informed Markov chain Monte Carlo for high-dimensional Bayesian variable selection

We introduce a framework for efficient Markov Chain Monte Carlo (MCMC) a...
research
03/04/2013

Bayesian Compressed Regression

As an alternative to variable selection or shrinkage in high dimensional...
research
04/28/2021

BayesSUR: An R package for high-dimensional multivariate Bayesian variable and covariance selection in linear regression

In molecular biology, advances in high-throughput technologies have made...
research
05/27/2019

Efficient posterior sampling for high-dimensional imbalanced logistic regression

High-dimensional data are routinely collected in many application areas....
research
09/08/2017

Likelihood informed dimension reduction for inverse problems in remote sensing of atmospheric constituent profiles

We use likelihood informed dimension reduction (LIS) (T. Cui et al. 2014...
research
12/12/2013

Sparse Matrix-based Random Projection for Classification

As a typical dimensionality reduction technique, random projection can b...
