Goodness-of-fit testing in high-dimensional generalized linear models

08/09/2019
by Jana Jankova, et al.

We propose a family of tests to assess the goodness-of-fit of a high-dimensional generalized linear model. Our framework is flexible: it may be used to construct an omnibus test, to direct a test against specific non-linearities and interaction effects, or to test the significance of groups of variables. The methodology is based on extracting left-over signal in the residuals from an initial fit of a generalized linear model. This can be achieved by predicting this signal from the residuals using modern flexible regression or machine learning methods such as random forests or boosted trees. Under the null hypothesis that the generalized linear model is correct, no signal is left in the residuals and our test statistic has a Gaussian limiting distribution, which yields asymptotic control of the type I error. Under a local alternative, we establish a guarantee on the power of the test. We illustrate the effectiveness of the methodology on simulated and real data examples by testing goodness-of-fit in logistic regression models.
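The residual-prediction idea can be illustrated with a deliberately simplified sketch. It is not the paper's test statistic. It assumes that `residuals` are GLM residuals computed on a held-out part of the data, and that `predictions` come from some flexible regression method (for example a random forest) trained on an independent part of the data to predict residuals. The function name `residual_prediction_test` and the self-normalized form of the statistic are assumptions made for illustration. The sketch also ignores the error from estimating the GLM coefficients, which a full high-dimensional procedure would have to account for.

```python
import math


def residual_prediction_test(residuals, predictions):
    """Self-normalized one-sided z-test (illustrative sketch only).

    residuals   : held-out GLM residuals r_i (hypothetical input)
    predictions : predicted residual signal f_i from a model trained on
                  independent data (hypothetical input)

    If the GLM is correct, each product r_i * f_i has mean zero, so the
    self-normalized sum is approximately standard normal. Large positive
    values suggest left-over signal in the residuals.
    """
    if len(residuals) != len(predictions):
        raise ValueError("residuals and predictions must have equal length")

    products = [r * f for r, f in zip(residuals, predictions)]
    numerator = sum(products)
    denominator = math.sqrt(sum(p * p for p in products))
    if denominator == 0.0:
        raise ValueError("all products are zero; statistic is undefined")

    z = numerator / denominator
    # One-sided p-value P(Z >= z) for a standard normal Z.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Toy usage: predictions that align perfectly with the residuals.
z, p = residual_prediction_test([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0])
print(z, p)  # z is 2.0; p is roughly 0.0228
```

A small p-value here indicates that the predictions track the held-out residuals, which is the kind of evidence against the fitted model that the abstract describes. The one-sided form is chosen because only positive alignment between predictions and residuals points to left-over signal.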


research
05/17/2018

Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models

High-dimensional logistic regression is widely used in analyzing data wi...
research
07/21/2020

A Generalized Hosmer-Lemeshow Goodness-of-Fit Test for a Family of Generalized Linear Models

Generalized linear models (GLMs) are used within a vast number of applic...
research
09/24/2019

Double-estimation-friendly inference for high-dimensional misspecified models

All models may be wrong—but that is not necessarily a problem for infere...
research
11/08/2019

A Binary Regression Adaptive Goodness-of-fit Test (BAGofT)

The Pearson's χ^2 test and residual deviance test are two classical good...
research
11/03/2022

The Projected Covariance Measure for assumption-lean variable significance testing

Testing the significance of a variable or group of variables X for predi...
research
12/05/2022

Testing for Regression Heteroskedasticity with High-Dimensional Random Forests

Statistical inference for high-dimensional regression heteroskedasticity...
research
06/10/2018

Generalized Goodness-Of-Fit Tests for Correlated Data

This paper concerns the problem of applying the generalized goodness-of-...
