
Stochastic Approximation EM for Logistic Regression with Missing Values

by Wei Jiang, et al.

Logistic regression is a common classification method in supervised learning. Surprisingly, very few solutions exist for fitting it and selecting variables in the presence of missing values. We propose a stochastic approximation version of the EM algorithm, based on Metropolis-Hastings sampling, to perform statistical inference for logistic regression with incomplete data. The approach is complete: it covers the estimation of parameters and their variance, the derivation of confidence intervals, a model selection procedure, and a method for prediction on test sets with missing values. The method is computationally efficient, and a simulation study demonstrates its good coverage and variable selection properties. We then illustrate the method on a dataset of polytraumatized patients from Paris hospitals to predict the occurrence of hemorrhagic shock, a leading cause of early preventable death in severe trauma cases. The aim is to consolidate the current red flag procedure, a binary alert identifying patients with a high risk of severe hemorrhage. The methodology is implemented in the R package misaem.
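
To make the idea of a stochastic approximation EM with Metropolis-Hastings sampling more concrete, here is a minimal Python sketch. It is not the misaem implementation and does not reproduce its API. It assumes, for illustration only, that the covariates are jointly Gaussian with mean mu and covariance Sigma and that the missingness mechanism can be ignored. The function names (mh_impute, fit_logistic, saem_logistic), the step-size schedule, and the parameter-averaging update are all hypothetical choices. The simulation step proposes missing entries from the Gaussian conditional given the observed entries, so the acceptance ratio reduces to a ratio of logistic likelihoods.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def fit_logistic(Z, y, beta, n_newton=5):
    # A few Newton-Raphson steps on completed data (Z includes an intercept column).
    for _ in range(n_newton):
        p = sigmoid(Z @ beta)
        H = Z.T @ (Z * (p * (1 - p))[:, None]) + 1e-6 * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(H, Z.T @ (y - p))
    return beta

def mh_impute(x, miss, y, beta, mu, Sigma, rng, n_steps=5):
    # Metropolis-Hastings draw of the missing entries of one row x.
    # Target: p(x_mis | x_obs, y). Proposal: Gaussian p(x_mis | x_obs).
    if not miss.any():
        return x
    obs = ~miss
    if obs.any():
        S_mo = Sigma[np.ix_(miss, obs)]
        K = np.linalg.solve(Sigma[np.ix_(obs, obs)], S_mo.T).T
        cmu = mu[miss] + K @ (x[obs] - mu[obs])
        cS = Sigma[np.ix_(miss, miss)] - K @ S_mo.T
    else:
        cmu, cS = mu, Sigma
    L = np.linalg.cholesky(cS + 1e-9 * np.eye(cS.shape[0]))

    def loglik(v):
        # log p(y | v; beta) for the logistic model.
        eta = beta[0] + v @ beta[1:]
        return y * eta - np.logaddexp(0.0, eta)

    x = x.copy()
    cur = loglik(x)
    for _ in range(n_steps):
        prop = x.copy()
        prop[miss] = cmu + L @ rng.standard_normal(int(miss.sum()))
        new = loglik(prop)
        # Independence sampler: proposal density cancels against the prior term.
        if np.log(rng.uniform()) < new - cur:
            x, cur = prop, new
    return x

def saem_logistic(X, y, n_iter=40, burn=10, seed=0):
    # X: (n, p) array with np.nan for missing values, p >= 2; y: (n,) array of 0/1.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    M = np.isnan(X)
    Xs = np.where(M, np.nanmean(X, axis=0), X)   # start from mean imputation
    mu, Sigma = Xs.mean(axis=0), np.cov(Xs, rowvar=False)
    beta = np.zeros(p + 1)
    for k in range(1, n_iter + 1):
        # Simulation step: refresh the missing entries of every row.
        for i in range(n):
            Xs[i] = mh_impute(Xs[i], M[i], y[i], beta, mu, Sigma, rng)
        # Step size: constant during burn-in, then decreasing (assumed schedule).
        gamma = 1.0 if k <= burn else 1.0 / (k - burn)
        # Stochastic approximation, applied here directly to the parameters.
        Z = np.column_stack([np.ones(n), Xs])
        beta = beta + gamma * (fit_logistic(Z, y, beta) - beta)
        mu = mu + gamma * (Xs.mean(axis=0) - mu)
        Sigma = Sigma + gamma * (np.cov(Xs, rowvar=False) - Sigma)
    return beta, mu, Sigma

# Usage on simulated data with about 20% of covariate entries removed at random.
rng = np.random.default_rng(1)
X_full = rng.standard_normal((400, 3))
beta_true = np.array([-0.5, 1.0, -1.0, 0.5])
y_sim = (rng.uniform(size=400) < sigmoid(beta_true[0] + X_full @ beta_true[1:])).astype(float)
X_miss = np.where(rng.uniform(size=X_full.shape) < 0.2, np.nan, X_full)
beta_hat, mu_hat, Sigma_hat = saem_logistic(X_miss, y_sim)
```

The sketch only covers point estimation. The variance estimates, confidence intervals, model selection, and prediction with missing test data mentioned in the abstract are not shown.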



