Online Debiasing for Adaptively Collected High-dimensional Data

11/04/2019
by Yash Deshpande, et al.

Adaptive collection of data is increasingly commonplace in many applications. From the point of view of statistical inference, however, adaptive collection induces memory and correlation in the samples and poses a significant challenge. We consider high-dimensional linear regression, where the samples are collected adaptively and the sample size n can be smaller than p, the number of covariates. In this setting, there are two distinct sources of bias: the first due to the regularization imposed for estimation, e.g. using the LASSO, and the second due to adaptivity in collecting the samples. We propose 'online debiasing', a general procedure for estimators such as the LASSO, which addresses both sources of bias. In two concrete contexts, (i) batched data collection and (ii) high-dimensional time series analysis, we demonstrate that online debiasing optimally debiases the LASSO estimate when the underlying parameter θ_0 has sparsity of order o(√(n)/log p). In this regime, the debiased estimator can be used to compute p-values and confidence intervals of optimal size.
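For orientation, the setting described in the abstract can be written out as follows. This is a sketch in standard notation, not taken from the paper itself: the symbols x_i, y_i, ε_i, λ, s_0, X and M are assumptions introduced here, and the last display is the usual offline debiased LASSO from the wider literature, shown only as a point of comparison rather than as the online procedure the authors propose.

```latex
% Linear model with n samples and p covariates (n may be smaller than p).
% Under adaptive collection, x_i may depend on (x_1, y_1, ..., x_{i-1}, y_{i-1}).
\[
  y_i = \langle x_i, \theta_0 \rangle + \varepsilon_i, \qquad i = 1, \dots, n,
  \qquad \theta_0 \in \mathbb{R}^p .
\]

% LASSO estimate with regularization parameter \lambda > 0 (first source of bias).
\[
  \widehat{\theta}^{\mathrm{L}}
  \in \arg\min_{\theta \in \mathbb{R}^p}
  \left\{ \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \langle x_i, \theta \rangle \bigr)^2
          + \lambda \lVert \theta \rVert_1 \right\}.
\]

% Sparsity regime stated in the abstract, with s_0 the number of nonzero entries of \theta_0.
\[
  s_0 = \lVert \theta_0 \rVert_0 = o\!\left( \frac{\sqrt{n}}{\log p} \right).
\]

% Standard offline debiased LASSO (for comparison only), with X the n x p design matrix,
% y the response vector, and M an approximate inverse of the sample covariance X^T X / n.
\[
  \widehat{\theta}^{\mathrm{d}}
  = \widehat{\theta}^{\mathrm{L}}
  + \frac{1}{n} M X^{\mathsf{T}} \bigl( y - X \widehat{\theta}^{\mathrm{L}} \bigr).
\]
```

The offline correction uses a single matrix M computed from all the samples; the abstract's point is that when the samples are collected adaptively, a correction of this kind no longer removes the second source of bias.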


