AutoSlicer: Scalable Automated Data Slicing for ML Model Analysis

12/18/2022
by   Zifan Liu, et al.

Automated slicing aims to identify subsets of evaluation data on which a trained model performs anomalously. This is an important problem for machine learning pipelines in production, since it plays a key role in model debugging and comparison as well as in the diagnosis of fairness issues. Scalability has become a critical requirement for any automated slicing system because of the large search space of possible slices and the growing scale of data. We present AutoSlicer, a scalable system that searches for problematic slices through distributed metric computation and hypothesis testing. We develop an efficient strategy that reduces the search space through pruning and prioritization. In experiments, we show that our search strategy finds most of the anomalous slices by inspecting a small portion of the search space.
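To make the idea of slice search with metric computation, hypothesis testing, and pruning concrete, here is a minimal single-machine sketch. It is not AutoSlicer's implementation: the function names (`welch_t`, `find_slices`), the choice of Welch's t-statistic, the minimum-size pruning rule, and the thresholds are all assumptions made for illustration, and the distributed computation and prioritization described in the abstract are not modeled.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for mean(a) - mean(b); large positive means `a` is worse."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def find_slices(rows, losses, features, min_size=5, t_threshold=2.0, max_order=2):
    """Return (slice, size, t) for slices whose mean loss is significantly
    higher than that of the remaining examples (hypothetical helper)."""
    all_idx = set(range(len(rows)))

    # Order-1 candidates: one (feature, value) predicate each.
    frontier = {}
    for f in features:
        for v in {r[f] for r in rows}:
            idx = frozenset(i for i, r in enumerate(rows) if r[f] == v)
            if len(idx) >= min_size:  # prune: a small slice has no large sub-slices
                frontier[frozenset([(f, v)])] = idx

    found = []
    for order in range(1, max_order + 1):
        # Metric computation and test for every surviving candidate.
        for preds, idx in frontier.items():
            rest = all_idx - idx
            if len(rest) < 2:
                continue
            t = welch_t([losses[i] for i in idx], [losses[i] for i in rest])
            if t >= t_threshold:
                found.append((dict(preds), len(idx), t))
        if order == max_order:
            break
        # Next level: intersect pairs of surviving slices over distinct features.
        next_frontier = {}
        for (p1, i1), (p2, i2) in combinations(frontier.items(), 2):
            preds = p1 | p2
            if len({f for f, _ in preds}) != order + 1:
                continue
            idx = i1 & i2
            if len(idx) >= min_size:
                next_frontier[preds] = idx
        frontier = next_frontier

    # Rank by test statistic so the most anomalous slices come first.
    found.sort(key=lambda s: -s[2])
    return found
```

Here `rows` is a list of feature dictionaries and `losses` holds the per-example loss in the same order. Because a conjunction of predicates can never cover more examples than either of its parents, discarding slices below `min_size` also removes all of their refinements, which is one simple way a search space of this kind can be pruned.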

research
07/16/2018

Slice Finder: Automated Data Slicing for Model Validation

As machine learning (ML) systems become democratized, it becomes increas...
research
07/16/2018

Automated Data Slicing for Model Validation: A Big data - AI Integration Approach

As machine learning systems become democratized, it becomes increasingly...
research
02/01/2023

Faster Convergence with Lexicase Selection in Tree-based Automated Machine Learning

In many evolutionary computation systems, parent selection methods can a...
research
02/13/2019

ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning

To relieve the pain of manually selecting machine learning algorithms an...
research
02/10/2023

Shrinking the Inductive Programming Search Space with Instruction Subsets

Inductive programming frequently relies on some form of search in order ...
research
10/18/2019

b-Bit Sketch Trie: Scalable Similarity Search on Integer Sketches

Recently, randomly mapping vectorial data to strings of discrete symbols...
