The Existence of Maximum Likelihood Estimate in High-Dimensional Generalized Linear Models with Binary Responses
Motivated by recent work on high-dimensional logistic regression, we establish that the existence of the maximum likelihood estimate (MLE) exhibits a phase transition for a wide range of generalized linear models (GLMs) with binary responses and elliptical covariates. This extends a previous result of Candès and Sur, who proved the phase transition for logistic regression with Gaussian covariates. Precisely, we consider the high-dimensional regime in which the number of observations n and the number of covariates p are proportional, i.e., p/n → κ. We provide an explicit threshold h_MLE, depending on the unknown regression coefficients and the scaling parameter of the covariates, such that in this high-dimensional regime, if κ > h_MLE, then the MLE does not exist with probability 1, and if κ < h_MLE, then the MLE exists with probability 1. The main tools for deriving the result are data separation, convex geometry, and stochastic approximation. We also conduct simulation studies to corroborate our theoretical findings and to explore other features of the problem.
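The link between MLE existence and data separation can be illustrated with a small numerical check. The sketch below is not the authors' code. It assumes labels coded as +1/-1, a model without intercept, a design matrix with full column rank, and the standard logistic-regression characterization that the MLE fails to exist exactly when some nonzero direction b satisfies y_i x_i^T b ≥ 0 for every observation. The function names `mle_exists` and `fraction_mle_exists`, the tolerance, and the simulation settings (Gaussian covariates, labels independent of the covariates) are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def mle_exists(X, y, tol=1e-8):
    """Return True if no nonzero b satisfies y_i * x_i^T b >= 0 for all i.

    X: (n, p) covariates, assumed to have full column rank.
    y: (n,) labels coded as +1 / -1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = y[:, None] * X  # rows are y_i * x_i
    # Maximize sum_i y_i x_i^T b, i.e. minimize -(sum_i y_i x_i)^T b.
    c = -Z.sum(axis=0)
    # Constraints y_i x_i^T b >= 0 are written as -Z b <= 0.
    # The box -1 <= b_j <= 1 keeps the linear program bounded.
    res = linprog(
        c,
        A_ub=-Z,
        b_ub=np.zeros(len(y)),
        bounds=[(-1.0, 1.0)] * X.shape[1],
        method="highs",
    )
    # A strictly positive optimum means a separating direction exists,
    # in which case the likelihood has no finite maximizer.
    return bool(-res.fun <= tol)


def fraction_mle_exists(n, kappa, reps=50, seed=0):
    """Monte Carlo fraction of data sets for which the MLE exists.

    Illustrative setting only: i.i.d. standard Gaussian covariates and
    labels drawn independently of X (all regression coefficients zero).
    """
    rng = np.random.default_rng(seed)
    p = max(1, int(round(kappa * n)))
    hits = 0
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.choice([-1.0, 1.0], size=n)
        hits += mle_exists(X, y)
    return hits / reps
```

Calling `fraction_mle_exists(n, kappa)` over a grid of `kappa` values for a moderately large `n` gives an empirical existence curve that can be compared against a theoretical threshold. Other link functions or elliptical covariates would need a different data-generating step, which is not shown here.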