Automatic Generation of Attention Rules For Containment of Machine Learning Model Errors

by Samuel Ackerman, et al.

Machine learning (ML) solutions are prevalent in many applications. However, many challenges exist in making these solutions business-grade. One such challenge is keeping the error rate of the underlying ML models at an acceptably low level. Typically, the true relationship between the input features and the target feature to be predicted is uncertain, and hence statistical in nature. The approach we propose is to separate the observations that are most likely to be predicted incorrectly into 'attention sets'. These sets can directly aid model diagnosis and improvement, and can be used to decide on alternative courses of action for these problematic observations. We present several algorithms ('strategies') for determining optimal rules to separate these observations. In particular, we prefer strategies that use feature-based slicing because they are human-interpretable, model-agnostic, and require minimal supplementary inputs or knowledge. In addition, we show that these strategies outperform several common baselines, such as selecting observations with prediction confidence below a threshold. To evaluate the strategies, we introduce metrics that measure various desired qualities, such as performance, stability, and generalizability to unseen data; the strategies are evaluated on several publicly available datasets. We use TOPSIS, a Multiple Criteria Decision Making method, to aggregate these metrics into a single quality score for each strategy, which allows them to be compared.
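The abstract does not say how TOPSIS is configured, so the following is only a minimal sketch of the textbook TOPSIS procedure, not the authors' implementation. The function name `topsis`, the equal weights, the two metrics (a performance score where higher is better, and an instability score where lower is better), and all numeric values are hypothetical, chosen purely for illustration.

```python
import math

def topsis(matrix, weights, benefit):
    """Textbook TOPSIS: score each row (alternative) in [0, 1]; higher is better.

    matrix  -- one row per alternative, one column per criterion
    weights -- one non-negative weight per criterion
    benefit -- True where larger values are better, False where smaller are better
    """
    n = len(weights)
    # Normalize each column by its Euclidean norm (fall back to 1.0 for an all-zero column).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Ideal best and ideal worst points, taken column by column.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        # Relative closeness to the ideal; 0.5 if all alternatives are identical.
        scores.append(d_worst / total if total > 0 else 0.5)
    return scores

# Hypothetical metric values for three strategies: [performance, instability].
metrics = [
    [0.9, 0.1],  # strategy A: best on both criteria
    [0.5, 0.5],  # strategy B: worst on both criteria
    [0.7, 0.3],  # strategy C: halfway between A and B
]
scores = topsis(metrics, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy input, strategy A coincides with the ideal point and B with the anti-ideal point, so they receive the extreme scores and C falls in between. In practice the choice of weights and of which metrics count as benefits or costs determines the ranking, and the abstract does not state these choices.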




Machine Learning Pipeline for Pulsar Star Dataset

This work brings together some of the most common machine learning (ML) ...

Classifier Data Quality: A Geometric Complexity Based Method for Automated Baseline And Insights Generation

Testing Machine Learning (ML) models and AI-Infused Applications (AIIAs)...

Model Agnostic Defence against Backdoor Attacks in Machine Learning

Machine Learning (ML) has automated a multitude of our day-to-day decisi...

Prediction Confidence from Neighbors

The inability of Machine Learning (ML) models to successfully extrapolat...

Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design

Machine learning practitioners often end up tunneling on low-level techn...

FreaAI: Automated extraction of data slices to test machine learning models

Machine learning (ML) solutions are prevalent. However, many challenges ...

The Principle of Uncertain Maximum Entropy

The principle of maximum entropy, as introduced by Jaynes in information...
