Concentration Based Inference for High Dimensional (Generalized) Regression Models: New Phenomena in Hypothesis Testing

08/17/2018
by Ying Zhu, et al.

We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients (θ^*∈R^p) in high dimensional (generalized) regression models where p can exceed the sample size n. We consider H_0: h(θ^*)=0_m against H_1: h(θ^*)≠0_m, where m can be as large as p and h can be nonlinear in θ^*. Our test statistic is based on the sample score vector evaluated at an estimate θ̂_α that satisfies h(θ̂_α)=0_m, where α is the prespecified Type I error. We provide nonasymptotic control on the Type I and Type II errors for the score test, as well as confidence regions. By exploiting the concentration phenomenon in Lipschitz functions, the key component reflecting the "dimension complexity" in our non-asymptotic thresholds uses a Monte-Carlo approximation to "mimic" the expectation around which the relevant quantity concentrates, and it automatically captures the dependencies between the coordinates. The novelty of our methods is that their validity does not rely on good behavior of ‖θ̂_α-θ^*‖_2 or even n^{-1/2}‖X(θ̂_α-θ^*)‖_2, nonasymptotically or asymptotically. Most interestingly, we discover phenomena that are opposite to those in the existing literature: (1) more restrictions (larger m) in H_0 make our procedures more powerful; (2) whether θ^* is sparse or not, it is possible for our procedures to detect alternatives with probability at least 1 - Type II error when p≥n and m>p-n; (3) the coverage probability of our procedures is not affected by how sparse θ^* is. The proposed procedures are evaluated with simulation studies, where the empirical evidence supports our key insights.
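The Monte-Carlo idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact threshold, so everything below is an assumption made for illustration: the function name `mc_sup_norm_expectation` is hypothetical, the quantity approximated is E‖X^T g/n‖_∞ with standard Gaussian g (the sup norm is a Lipschitz function of g), and the two simulated designs are arbitrary. The sketch is not the paper's procedure; it only shows how averaging simulated draws picks up the dependence between coordinates without any explicit correction.

```python
import numpy as np

def mc_sup_norm_expectation(X, n_draws=500, seed=0):
    """Monte Carlo estimate of E || X^T g / n ||_inf for g ~ N(0, I_n).

    Hypothetical helper for illustration; not the threshold from the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    G = rng.standard_normal((n, n_draws))   # each column is one draw of g
    scores = X.T @ G / n                    # shape (p, n_draws)
    sup_norms = np.abs(scores).max(axis=0)  # sup norm of each simulated score
    return sup_norms.mean()                 # average over draws

rng = np.random.default_rng(1)
n, p = 50, 200  # p exceeds n, as in the high dimensional setting

# Design with weakly dependent columns.
X_indep = rng.standard_normal((n, p))

# Design with strongly dependent columns: a common factor plus small noise.
factor = rng.standard_normal((n, 1))
X_dep = factor + 0.1 * rng.standard_normal((n, p))

est_indep = mc_sup_norm_expectation(X_indep)
est_dep = mc_sup_norm_expectation(X_dep)
print(f"weakly dependent columns:   {est_indep:.3f}")
print(f"strongly dependent columns: {est_dep:.3f}")
```

In this toy setup the estimate for the strongly dependent design comes out smaller, because the maximum over nearly identical coordinates behaves like a single coordinate. A worst-case bound over p coordinates would not distinguish the two designs, whereas the simulated expectation does.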


