Supersparse Linear Integer Models for Optimized Medical Scoring Systems

02/15/2015
by Berk Ustun, et al.

Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new method for creating data-driven scoring systems called a Supersparse Linear Integer Model (SLIM). SLIM scoring systems are built by solving an integer program that directly encodes measures of accuracy (the 0-1 loss) and sparsity (the ℓ_0-seminorm) while restricting coefficients to coprime integers. SLIM can seamlessly incorporate a wide range of operational constraints related to accuracy and sparsity, and can produce highly tailored models without parameter tuning. We provide bounds on the testing and training accuracy of SLIM scoring systems, and present a new data reduction technique that can improve scalability by eliminating a portion of the training data beforehand. Our paper includes results from a collaboration with the Massachusetts General Hospital Sleep Laboratory, where SLIM was used to create a highly tailored scoring system for sleep apnea screening.
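To make the quantities named in the abstract concrete, the following is a minimal Python sketch of how a scoring system makes a prediction and how the 0-1 loss, the ℓ_0 size, and the coprimality condition could be evaluated for a fixed model. The feature names, point values, intercept, sample patients, and the trade-off parameter `c0` are all hypothetical and are not taken from the paper. The sketch only evaluates a given model; SLIM itself finds the coefficients by solving an integer program, which is not shown here.

```python
from functools import reduce
from math import gcd

# Hypothetical scoring system: names and integer points are illustrative only.
POINTS = {"age_ge_60": 4, "hypertension": 3, "bmi_ge_30": 2, "female": -5}
INTERCEPT = -1  # hypothetical offset; predict +1 when the total score is > 0


def score(patient):
    """Add the points of every binary feature that is present (value 1)."""
    return INTERCEPT + sum(pts * patient.get(name, 0) for name, pts in POINTS.items())


def predict(patient):
    """Return +1 if the total score is positive, otherwise -1."""
    return 1 if score(patient) > 0 else -1


def zero_one_loss(patients, labels):
    """Fraction of examples the scoring system misclassifies."""
    mistakes = sum(predict(p) != y for p, y in zip(patients, labels))
    return mistakes / len(labels)


def l0_size(points):
    """Number of nonzero coefficients (the sparsity measure)."""
    return sum(1 for v in points.values() if v != 0)


def are_coprime(points):
    """True if the nonzero coefficients share no common divisor above 1."""
    nonzero = [abs(v) for v in points.values() if v != 0]
    return reduce(gcd, nonzero, 0) == 1


def objective(patients, labels, c0):
    """Assumed form: 0-1 loss plus c0 times the number of nonzero coefficients."""
    return zero_one_loss(patients, labels) + c0 * l0_size(POINTS)


# Hypothetical patients with binary features and labels in {+1, -1}.
PATIENTS = [
    {"age_ge_60": 1, "hypertension": 1},
    {"female": 1, "bmi_ge_30": 1},
    {"bmi_ge_30": 1},
]
LABELS = [1, -1, -1]

if __name__ == "__main__":
    print([score(p) for p in PATIENTS])
    print(zero_one_loss(PATIENTS, LABELS), l0_size(POINTS), are_coprime(POINTS))
```

Because every coefficient is a small integer, each prediction can be reproduced by hand from a printed table of points, which is the property the abstract emphasizes.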


research
06/25/2013

Supersparse Linear Integer Models for Predictive Scoring Systems

We introduce Supersparse Linear Integer Models (SLIM) as a tool to creat...
research
05/16/2014

Methods and Models for Interpretable Linear Classification

We present an integer programming framework to build accurate and interp...
research
10/01/2016

Learning Optimized Risk Scores on Large-Scale Datasets

Risk scores are simple classification models that let users quickly asse...
research
06/27/2013

Supersparse Linear Integer Models for Interpretable Classification

Scoring systems are classification models that only require users to add...
research
11/26/2019

Interactivity and Transparency in Medical Risk Assessment with Supersparse Linear Integer Models

Scoring systems are linear classification models that only require users ...
research
08/21/2020

Automatic sleep stage classification with deep residual networks in a mixed-cohort setting

Study Objectives: Sleep stage scoring is performed manually by sleep exp...
research
08/01/2020

Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models

In this paper, we revisit the challenging problem of unsupervised single...