The False Positive Control Lasso

03/29/2019
by Erik Drysdale, et al.

In high-dimensional settings where only a small number of regressors are expected to be important, the Lasso estimator can be used to obtain a sparse solution vector, with the expectation that most of the non-zero coefficients are associated with true signals. While several approaches have been developed to control the inclusion of false predictors with the Lasso, these approaches are limited by relying on asymptotic theory, having to empirically estimate terms based on theoretical quantities, assuming a continuous response class with Gaussian noise and design matrices, or incurring high computational costs. In this paper we show that: (1) an existing model (the SQRT-Lasso) can be recast as a method of controlling the expected number of false positives, (2) a similar estimator can be used for all other generalized linear model classes, and (3) this approach can be fit with existing fast Lasso optimization solvers. Our justification for false positive control using randomly weighted self-normalized sum theory is, to our knowledge, novel. Moreover, our estimator's properties hold in finite samples up to some approximation error, which we find to be negligible in practical settings under a strict mutual incoherence condition.
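To make point (3) concrete, here is a minimal sketch of one way a SQRT-Lasso fit can be reduced to repeated calls to an ordinary Lasso solver. It is not taken from the paper: it uses the standard SQRT-Lasso objective, ||y - Xb||_2 / sqrt(n) + lam * ||b||_1, and the well-known alternating ("scaled Lasso") scheme that updates a noise-scale estimate sigma and then solves a Lasso with penalty lam * sigma. The function names (`lasso_cd`, `sqrt_lasso`), the simulated data, and the value `lam = 0.25` are all illustrative assumptions; in particular, `lam` is not the false-positive-controlling choice derived in the paper.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, beta0=None, n_iter=200, tol=1e-8):
    """Plain coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1.

    Stands in for any fast Lasso solver (hypothetical helper, not the paper's code).
    """
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            # Partial-residual correlation for coordinate j.
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def sqrt_lasso(X, y, lam, n_outer=50, tol=1e-8):
    """Minimise ||y - Xb||_2 / sqrt(n) + lam * ||b||_1 by alternating updates.

    Assumes the residual never becomes exactly zero (no perfect fit).
    """
    n, p = X.shape
    beta = np.zeros(p)
    sigma = np.linalg.norm(y) / np.sqrt(n)
    for _ in range(n_outer):
        # Lasso step with the penalty rescaled by the current noise-scale estimate.
        beta = lasso_cd(X, y, lam * sigma, beta0=beta)
        new_sigma = np.linalg.norm(y - X @ beta) / np.sqrt(n)
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma


# Illustrative simulated data: 5 true signals among 50 regressors.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 2.0
y = X @ beta_true + rng.standard_normal(n)

beta_hat, sigma_hat = sqrt_lasso(X, y, lam=0.25)
print("selected:", np.flatnonzero(beta_hat))
print("estimated noise scale:", sigma_hat)
```

Because the objective is homogeneous in the scale of y, multiplying y by a constant rescales the fitted coefficients by that constant without changing `lam`, so the penalty level does not require a separate estimate of the noise variance.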

research
05/14/2018

Model selection with lasso-zero: adding straw to the haystack to better find needles

The high-dimensional linear model y = X β^0 + ϵ is considered and the fo...
research
11/05/2015

False Discoveries Occur Early on the Lasso Path

In regression settings where explanatory variables have very low correla...
research
03/26/2019

Non-asymptotic error controlled sparse high dimensional precision matrix estimation

Estimation of a high dimensional precision matrix is a critical problem ...
research
04/24/2020

Estimating the Lasso's Effective Noise

Much of the theory for the lasso in the linear model Y = X β^* + ε hinge...
research
06/24/2023

Information criteria for structured parameter selection in high dimensional tree and graph models

Parameter selection in high-dimensional models is typically finetuned in...
research
11/07/2018

Bicoherence analysis of nonstationary and nonlinear processes

Bicoherence analysis is a well established method for identifying the qu...
research
08/02/2021

Sequential Multivariate Change Detection with Calibrated and Memoryless False Detection Rates

Responding appropriately to the detections of a sequential change detect...
