EPP: interpretable score of model predictive power

08/24/2019
by Alicja Gosiewska, et al.

The most important part of model selection and hyperparameter tuning is the evaluation of model performance. The most popular measures, such as AUC, F1, and accuracy for binary classification, RMSE and MAD for regression, or cross-entropy for multilabel classification, share two common weaknesses. First, they are not on an interval scale: the difference in performance between two models has no direct interpretation, so it makes no sense to compare such differences across datasets. Second, in k-fold cross-validation the model performance is in most cases calculated as an average over folds, which discards the information about how stable the performance is across folds. In this talk, we introduce a new EPP rating system for predictive models. We also demonstrate numerous advantages of this system. First, differences in EPP scores have a probabilistic interpretation: based on them we can assess the probability that one model will achieve better performance than another. Second, EPP scores can be directly compared between datasets. Third, they can be used to navigate hyperparameter tuning and model selection. Fourth, we can create embeddings for datasets based on EPP scores.
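The abstract does not spell out the estimation procedure, but the probabilistic reading of EPP differences suggests an Elo-style logistic link. Below is a minimal, hypothetical sketch in Python: per-fold scores are turned into pairwise win/loss outcomes, a logistic regression's coefficients act as EPP-like ratings, and the inverse logit of a rating difference estimates the probability that one model beats another. The function names (estimate_epp, win_probability), the all-pairs fold pairing, and the regularized estimator are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: estimating Elo-style EPP ratings from k-fold CV
# scores. The pairing scheme and estimator are illustrative assumptions.
import itertools

import numpy as np
from sklearn.linear_model import LogisticRegression


def estimate_epp(fold_scores):
    """Estimate EPP-like ratings from per-fold performance scores.

    fold_scores: dict mapping model name -> sequence of per-fold scores
                 (e.g. AUC on each of the k folds).
    Returns a dict mapping model name -> rating, with the last model
    fixed at 0 as the reference level.
    """
    models = list(fold_scores)
    X, y = [], []
    # One binary comparison per pair of models and per pairing of their
    # fold scores: outcome 1 if the first model's fold score is higher.
    for i, j in itertools.combinations(range(len(models)), 2):
        for s_i in fold_scores[models[i]]:
            for s_j in fold_scores[models[j]]:
                row = np.zeros(len(models) - 1)
                if i < len(models) - 1:
                    row[i] = 1.0
                if j < len(models) - 1:
                    row[j] = -1.0
                X.append(row)
                y.append(1 if s_i > s_j else 0)
    # Logistic regression without intercept: each coefficient is the
    # rating of one model relative to the reference model. The default
    # L2 penalty keeps ratings finite when one model always wins.
    lr = LogisticRegression(fit_intercept=False)
    lr.fit(np.array(X), np.array(y))
    epp = dict(zip(models[:-1], lr.coef_[0]))
    epp[models[-1]] = 0.0
    return epp


def win_probability(epp_a, epp_b):
    """P(model A outperforms model B) as inverse logit of the rating gap."""
    return 1.0 / (1.0 + np.exp(-(epp_a - epp_b)))


# Toy usage: three models evaluated with 4-fold cross-validation.
fold_auc = {
    "rf":  [0.81, 0.85, 0.80, 0.82],
    "gbm": [0.84, 0.83, 0.83, 0.86],
    "glm": [0.78, 0.77, 0.79, 0.78],
}
epp = estimate_epp(fold_auc)
print(win_probability(epp["gbm"], epp["rf"]))   # chance gbm beats rf
print(win_probability(epp["gbm"], epp["glm"]))  # chance gbm beats glm
```

Note the design choice this sketch encodes: by treating each fold-level comparison as a Bernoulli trial instead of averaging fold scores first, the fitted ratings retain the stability information that plain cross-validation averages discard.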

research
06/02/2020

Interpretable Meta-Measure for Model Performance

Measures for evaluation of model performance play an important role in M...
research
08/16/2019

Selection of Exponential-Family Random Graph Models via Held-Out Predictive Evaluation (HOPE)

Statistical models for networks with complex dependencies pose particula...
research
09/11/2019

Counterfactual Cross-Validation: Effective Causal Model Selection from Observational Data

What is the most effective way to select the best causal model among pot...
research
11/27/2014

Convex Techniques for Model Selection

We develop a robust convex algorithm to select the regularization parame...
research
03/29/2018

Performance evaluation and hyperparameter tuning of statistical and machine-learning models using spatial data

Machine-learning algorithms have gained popularity in recent years in th...
research
06/19/2018

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

K-fold cross validation (CV) is a popular method for estimating the true...
research
11/18/2022

Prediction scoring of data-driven discoveries for reproducible research

Predictive modeling uncovers knowledge and insights regarding a hypothes...
