Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)

08/17/2018
by   Ying Zhu, et al.

We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients (θ^* ∈ R^p) in high dimensional generalized regression models where p can exceed the sample size. Given a function h: R^p → R^m, we consider H_0: h(θ^*) = 0_m against H_1: h(θ^*) ≠ 0_m, where m can be any integer in [1, p] and h can be nonlinear in θ^*. Our test statistic is based on the sample "quasi score" vector evaluated at an estimate θ̂_α that satisfies h(θ̂_α) = 0_m, where α is the prespecified Type I error. By exploiting the concentration phenomenon in Lipschitz functions, the key component reflecting the dimension complexity in our non-asymptotic thresholds uses a Monte Carlo approximation to mimic the expectation around which the statistic concentrates, and it automatically captures the dependencies between the coordinates. We provide probabilistic guarantees in terms of the Type I and Type II errors for the quasi score test. Confidence regions are also constructed for the population quasi-score vector evaluated at θ^*. The first set of our results is specific to the standard Gaussian linear regression models; the second set allows for reasonably flexible forms of non-Gaussian responses, heteroscedastic noise, and nonlinearity in the regression coefficients, while only requiring the correct specification of the conditional means E(Y_i | X_i). The novelty of our methods is that their validity does not rely on good behavior of ‖θ̂_α − θ^*‖_2 (or even ‖n^{-1/2} X(θ̂_α − θ^*)‖_2 in the linear regression case) nonasymptotically or asymptotically.
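To make the idea of a concentration-based threshold concrete, here is a minimal sketch in Python. It is not the paper's procedure. It assumes the simplest special case: a Gaussian linear model Y = Xθ^* + W with W ~ N(0, σ²I_n), a known noise level σ, and a fully specified null θ^* = θ_0, so that the score X^T(Y − Xθ_0)/n equals X^T W/n under H_0. The function names (`mc_score_threshold`, `score_test`), the sup-norm statistic, and the number of Monte Carlo draws are illustrative choices, and the Monte Carlo error in the estimated expectation is ignored.

```python
import numpy as np

def mc_score_threshold(X, sigma, alpha, n_draws=2000, rng=None):
    """Threshold = Monte Carlo estimate of E||X^T W / n||_inf + Gaussian concentration term.

    Assumes W ~ N(0, sigma^2 I_n) with sigma known (an assumption of this sketch).
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    # Simulate noise vectors and evaluate the sup-norm of the score for each draw.
    W = sigma * rng.standard_normal((n_draws, n))
    sup_norms = np.max(np.abs(W @ X) / n, axis=1)
    expectation = sup_norms.mean()
    # w -> ||X^T w / n||_inf is Lipschitz in the Euclidean norm with constant
    # max_j ||X_j||_2 / n, where X_j is the j-th column of X.
    lipschitz = np.max(np.linalg.norm(X, axis=0)) / n
    # Gaussian concentration for Lipschitz functions:
    # P(f >= E f + t) <= exp(-t^2 / (2 sigma^2 L^2)); solve for level alpha.
    deviation = sigma * lipschitz * np.sqrt(2.0 * np.log(1.0 / alpha))
    return expectation + deviation

def score_test(X, y, theta0, sigma, alpha, **kwargs):
    """Reject H_0: theta* = theta0 when the score's sup-norm exceeds the threshold."""
    n = X.shape[0]
    stat = np.max(np.abs(X.T @ (y - X @ theta0)) / n)
    threshold = mc_score_threshold(X, sigma, alpha, **kwargs)
    return stat, threshold, stat > threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 100, 200  # p exceeds the sample size
    X = rng.standard_normal((n, p))
    theta_true = np.zeros(p)
    theta_true[0] = 5.0  # a strong departure from the null theta0 = 0
    y = X @ theta_true + rng.standard_normal(n)
    stat, thr, reject = score_test(X, y, np.zeros(p), sigma=1.0, alpha=0.05, rng=1)
    print(f"statistic={stat:.3f}, threshold={thr:.3f}, reject={reject}")
```

Because the simulated draws use the actual design matrix X, the estimated expectation reflects the correlation between the coordinates of the score rather than a worst-case union bound, which is the role the abstract attributes to the Monte Carlo step.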


research
08/17/2018

Concentration Based Inference for High Dimensional (Generalized) Regression Models: New Phenomena in Hypothesis Testing

We develop simple and non-asymptotically justified methods for hypothesi...
research
03/09/2018

On frequentist coverage errors of Bayesian credible sets in high dimensions

In this paper, we study frequentist coverage errors of Bayesian credible...
research
11/27/2020

Two-sample testing of high-dimensional linear regression coefficients via complementary sketching

We introduce a new method for two-sample testing of high-dimensional lin...
research
09/09/2019

Robust testing in generalized linear models by sign-flipping score contributions

Generalized linear models are often misspecified due to overdispersion, ...
research
10/22/2018

Comparing Two Approaches in Heteroscedastic Regression Models

Recently, a generalized test approach is proposed by Sadooghi-alvandi et...
research
06/04/2020

Inject Machine Learning into Significance Test for Misspecified Linear Models

Due to its strong interpretability, linear regression is widely used in ...
research
03/08/2022

Gaussian quasi-information criteria for ergodic Lévy driven SDE

We consider relative model comparison for the parametric coefficients of...
