Robust Inference Under Heteroskedasticity via the Hadamard Estimator

07/01/2018
by Edgar Dobriban, et al.

Drawing statistical inferences from large datasets in a model-robust way is an important problem in statistics and data science. In this paper, we propose methods that are robust to large and unequal noise in different observational units (i.e., heteroskedasticity) for statistical inference in linear regression. We leverage the Hadamard estimator, which is unbiased for the variances of ordinary least-squares regression. This is in contrast to the popular White's sandwich estimator, which can be substantially biased in high dimensions. We propose to estimate the signal strength, noise level, signal-to-noise ratio, and mean squared error via the Hadamard estimator. We develop a new degrees of freedom adjustment that gives more accurate confidence intervals than variants of White's sandwich estimator. Moreover, we provide conditions ensuring the estimator is well-defined, by studying a new random matrix ensemble in which the entries of a random orthogonal projection matrix are squared. We also show approximate normality, using the second-order Poincaré inequality. Our work provides improved statistical theory and methods for linear regression in high dimensions.
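
To make the contrast between the two variance estimators concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes the model y = Xβ + ε with independent, mean-zero noise of unequal variances, writes S = (XᵀX)⁻¹Xᵀ and Q = I − XS, and uses the fact that the squared residuals have expectation (Q∘Q)σ², where ∘ is the entrywise (Hadamard) product. Solving that linear system and mapping through S∘S gives an unbiased estimate of the coordinate-wise variances of the OLS coefficients, whereas White's estimator plugs the squared residuals in directly. The function name `ols_variance_estimates` and the simulated data are hypothetical.

```python
import numpy as np

def ols_variance_estimates(X, y):
    """Return (white, hadamard) estimates of Var(beta_hat_j), j = 1..p.

    Sketch only: assumes X has full column rank and that the entrywise
    square of the residual projection, Q * Q, is invertible.
    """
    n, p = X.shape
    S = np.linalg.solve(X.T @ X, X.T)   # p x n, so beta_hat = S @ y
    Q = np.eye(n) - X @ S               # n x n projection onto residual space
    resid_sq = (Q @ y) ** 2             # squared OLS residuals

    # White's sandwich estimator, diagonal only: (S o S) (e o e).
    white = (S * S) @ resid_sq

    # Hadamard-type estimator: first de-bias the squared residuals by
    # solving (Q o Q) s = e o e, then map through S o S.
    hadamard = (S * S) @ np.linalg.solve(Q * Q, resid_sq)
    return white, hadamard

# Simulated heteroskedastic data (all values here are illustrative).
rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
sigma = rng.uniform(0.5, 3.0, size=n)   # unequal noise levels
beta = np.ones(p)
y = X @ beta + sigma * rng.standard_normal(n)

white, hadamard = ols_variance_estimates(X, y)
```

Both estimates depend on y only through the residuals, so they do not change when the signal Xβ changes. Unlike White's estimate, the de-biased one is not guaranteed to be non-negative in a given sample, and it requires Q∘Q to be invertible, which is the well-definedness question the abstract refers to.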

