The Asymptotic Distribution of the MLE in High-dimensional Logistic Models: Arbitrary Covariance

01/25/2020
by Qian Zhao, et al.

We study the distribution of the maximum likelihood estimate (MLE) in high-dimensional logistic models, extending the recent results from Sur (2019) to the case where the Gaussian covariates may have an arbitrary covariance structure. We prove that in the limit of large problems holding the ratio between the number p of covariates and the sample size n constant, every finite list of MLE coordinates follows a multivariate normal distribution. Concretely, the jth coordinate β̂_j of the MLE is asymptotically normally distributed with mean α_⋆ β_j and standard deviation σ_⋆/τ_j; here, β_j is the value of the true regression coefficient, and τ_j the standard deviation of the jth predictor conditional on all the others. The numerical parameters α_⋆ > 1 and σ_⋆ only depend upon the problem dimensionality p/n and the overall signal strength, and can be accurately estimated. Our results imply that the MLE's magnitude is biased upwards and that the MLE's standard deviation is greater than that predicted by classical theory. We present a series of experiments on simulated and real data showing excellent agreement with the theory.
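The upward bias described in the abstract can be seen in a small simulation. The sketch below is illustrative only and is not the authors' code: the AR(1) covariance with ρ = 0.5, the sizes n = 2000 and p = 400, the signal strength Var(xᵀβ) = 5, and the helper names `conditional_sd`, `log_lik` and `logistic_mle` are all assumptions chosen for the example. It computes τ_j from the standard Gaussian identity τ_j² = 1/(Σ⁻¹)_jj, fits the unpenalized MLE by Newton's method with step halving, and uses the least-squares slope of β̂ on β as a rough empirical stand-in for α_⋆.

```python
import numpy as np

def conditional_sd(Sigma):
    """tau_j: sd of X_j given all other coordinates when X ~ N(0, Sigma).
    Uses the Gaussian identity tau_j^2 = 1 / (Sigma^{-1})_{jj}."""
    Theta = np.linalg.inv(Sigma)
    return 1.0 / np.sqrt(np.diag(Theta))

def log_lik(X, y, b):
    """Logistic log-likelihood, written in a numerically stable form."""
    eta = X @ b
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def logistic_mle(X, y, max_iter=50, tol=1e-8):
    """Unpenalized logistic MLE via Newton's method with step halving
    (a generic solver, assumed here for illustration)."""
    b = np.zeros(X.shape[1])
    ll = log_lik(X, y, b)
    for _ in range(max_iter):
        eta = X @ b
        mu = np.exp(eta - np.logaddexp(0.0, eta))      # sigmoid(eta)
        grad = X.T @ (y - mu)
        hess = (X * (mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        t = 1.0
        # Halve the step until the log-likelihood does not decrease.
        while log_lik(X, y, b + t * step) < ll and t > 1e-4:
            t *= 0.5
        b = b + t * step
        new_ll = log_lik(X, y, b)
        if abs(new_ll - ll) < tol:
            break
        ll = new_ll
    return b

rng = np.random.default_rng(0)
n, p, rho = 2000, 400, 0.5                  # p/n = 0.2 (assumed setting)

# Assumed AR(1) covariance: Sigma_ij = rho^|i-j|.
idx = np.arange(p)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

# Half the coefficients are non-null; rescale so Var(x' beta) = 5.
beta = np.zeros(p)
beta[: p // 2] = rng.choice([-1.0, 1.0], size=p // 2)
beta *= np.sqrt(5.0 / (beta @ Sigma @ beta))

# Gaussian covariates with covariance Sigma, then logistic responses.
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, p)) @ L.T
eta_true = X @ beta
prob = np.exp(eta_true - np.logaddexp(0.0, eta_true))
y = rng.binomial(1, prob).astype(float)

beta_hat = logistic_mle(X, y)
tau = conditional_sd(Sigma)

# Least-squares slope (through the origin) of beta_hat on beta:
# a rough empirical proxy for the inflation factor alpha_*.
slope = (beta_hat @ beta) / (beta @ beta)
print("empirical inflation factor:", slope)
print("tau range:", tau.min(), tau.max())
```

Classical fixed-p theory would put the slope near 1; a value clearly above 1 is consistent with the claim that the MLE's magnitude is biased upwards. The exact value depends on the assumed settings and the random seed.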


research
04/25/2018

The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression

This paper rigorously establishes that the existence of the maximum like...
research
05/28/2023

Multinomial Logistic Regression: Asymptotic Normality on Null Covariates in High-Dimensions

This paper investigates the asymptotic distribution of the maximum-likel...
research
08/18/2022

An Adaptively Resized Parametric Bootstrap for Inference in High-dimensional Generalized Linear Models

Accurate statistical inference in logistic regression models remains a c...
research
03/19/2018

A modern maximum-likelihood theory for high-dimensional logistic regression

Every student in statistics or data science learns early on that when th...
research
03/23/2021

SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression

Logistic regression remains one of the most widely used tools in applied...
research
02/19/2018

Maximum value of the standardized log of odds ratio and celestial mechanics

The odds ratio (OR) is a widely used measure of the effect size in obser...
research
07/20/2021

Directional testing for high-dimensional multivariate normal distributions

Thanks to its favorable properties, the multivariate normal distribution...
