What to Expect of Classifiers? Reasoning about Logistic Regression with Missing Features

03/05/2019
by Pasha Khosravi, et al.

While discriminative classifiers often yield strong predictive performance, missing feature values at prediction time remain a challenge. Classifiers may not behave as expected under certain ways of substituting the missing values, since they inherently make assumptions about the data distribution they were trained on. In this paper, we propose a novel framework that classifies examples with missing features by computing the expected prediction with respect to a given feature distribution. We then use geometric programming to learn a naive Bayes distribution that embeds a given logistic regression classifier and supports efficient computation of its expected predictions. Empirical evaluations show that our model matches the performance of the logistic regression classifier when all features are observed, and outperforms standard imputation techniques when features are missing at prediction time. Furthermore, we demonstrate that our method can be used to generate 'sufficient explanations' of logistic regression classifications, by removing features that do not affect the classification.
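
To make the notion of an expected prediction concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes binary features, and it assumes the missing features are independent of each other and of the observed ones, which is a simpler distribution than the naive Bayes model described above. The function names, the `marginals` dictionary, and the brute-force enumeration are all illustrative choices; enumeration is exponential in the number of missing features and is only feasible for small cases. Mean imputation is included for comparison, since plugging feature means into the classifier generally gives a different value than averaging the classifier's outputs.

```python
import itertools
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def expected_prediction(weights, bias, observed, marginals):
    """Expected logistic regression output over the missing binary features.

    observed:  dict mapping feature index -> observed value (0 or 1)
    marginals: dict mapping missing feature index -> assumed P(x_i = 1)
    Assumption (illustrative): missing features are mutually independent.
    """
    missing = sorted(marginals)
    # Contribution of the bias and the observed features to the logit.
    base = bias + sum(weights[i] * v for i, v in observed.items())
    total = 0.0
    # Enumerate every joint assignment of the missing features.
    for values in itertools.product([0, 1], repeat=len(missing)):
        prob = 1.0
        logit = base
        for i, v in zip(missing, values):
            prob *= marginals[i] if v == 1 else 1.0 - marginals[i]
            logit += weights[i] * v
        total += prob * sigmoid(logit)
    return total


def mean_imputation_prediction(weights, bias, observed, marginals):
    """Prediction after replacing each missing feature by its mean."""
    logit = bias + sum(weights[i] * v for i, v in observed.items())
    logit += sum(weights[i] * p for i, p in marginals.items())
    return sigmoid(logit)


if __name__ == "__main__":
    # Hypothetical classifier and marginals, chosen only for illustration.
    w, b = [2.0, -3.0, 1.5], -0.5
    obs = {0: 1}
    marg = {1: 0.5, 2: 0.5}
    print("expected prediction:", expected_prediction(w, b, obs, marg))
    print("mean imputation:    ", mean_imputation_prediction(w, b, obs, marg))
```

The two quantities coincide when nothing is missing or when every marginal is 0 or 1; otherwise they differ because the logistic function is nonlinear.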
