High stakes classification with multiple unknown classes based on imperfect data

04/26/2023
by   Haakon Bakka, et al.
0

High stakes classification refers to classification problems where erroneously predicting the wrong class is very bad, but assigning "unknown" is acceptable. We make the argument that these problems require us to give multiple unknown classes, to get the most information out of our analysis. With imperfect data we refer to covariates with a large number of missing values, large noise variance, and some errors in the data. The combination of high stakes classification and imperfect data is very common in practice, but it is very difficult to work on using current methods. We present a one-class classifier (OCC) to solve this problem, and we call it NBP. The classifier is based on Naive Bayes, simple to implement, and interpretable. We show that NBP gives both good predictive performance, and works for high stakes classification based on imperfect data. The model we present is quite simple; it is just an OCC based on density estimation. However, we have always felt a big gap between the applied classification problems we have worked on and the theory and models we use for classification, and this model closes that gap. Our main contribution is the motivation for why this model is a good approach, and we hope that this paper will inspire further development down this path.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/04/2019

Fast Multi-Class Probabilistic Classifier by Sparse Non-parametric Density Estimation

The model interpretation is essential in many application scenarios and ...
research
09/08/2023

Circles: Inter-Model Comparison of Multi-Classification Problems with High Number of Classes

The recent advancements in machine learning have motivated researchers t...
research
01/23/2013

A Bayesian Network Classifier that Combines a Finite Mixture Model and a Naive Bayes Model

In this paper we present a new Bayesian network model for classification...
research
04/13/2023

Bayes classifier cannot be learned from noisy responses with unknown noise rates

Training a classifier with noisy labels typically requires the learner t...
research
08/09/2017

Classification without labels: Learning from mixed samples in high energy physics

Modern machine learning techniques can be used to construct powerful mod...
research
06/13/2018

Benchmarks for Image Classification and Other High-dimensional Pattern Recognition Problems

A good classification method should yield more accurate results than sim...
research
03/05/2021

SCRIB: Set-classifier with Class-specific Risk Bounds for Blackbox Models

Despite deep learning (DL) success in classification problems, DL classi...

Please sign up or login with your details

Forgot password? Click here to reset