Feature Representation for ICU Mortality
Good predictors of ICU Mortality have the potential to identify high-risk patients earlier, improve ICU resource allocation, or create more accurate population-level risk models. Machine learning practitioners typically make choices about how to represent features in a particular model, but these choices are seldom evaluated quantitatively. This study compares the performance of different representations of clinical event data from MIMIC II in a logistic regression model to predict 36-hour ICU mortality. The most common representations are linear (normalized counts) and binary (yes/no). These, along with a new representation termed "hill", are compared using both L1 and L2 regularization. Results indicate that the introduced "hill" representation outperforms both the binary and linear representations, the hill representation thus has the potential to improve existing models of ICU mortality.
READ FULL TEXT