Preprocessor Selection for Machine Learning Pipelines

10/23/2018
by Brandon Schoenfeld, et al.

Much of the work in metalearning has focused on classifier selection, combined more recently with hyperparameter optimization, with little concern for data preprocessing. Yet it is generally accepted that machine learning applications require not only model building but also data preprocessing. In other words, practical solutions consist of pipelines of machine learning operators rather than single algorithms. Interestingly, our experiments suggest that, on average, data preprocessing hinders accuracy, even though the best-performing pipelines do make use of preprocessors. Here, we conduct an extensive empirical study over a wide range of learning algorithms and preprocessors, and use metalearning to determine when preprocessors should be used in ML pipeline design.
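To make the notion of a pipeline concrete, the following is a minimal sketch of the kind of comparison the abstract describes: the same classifier evaluated with and without a preprocessing step. It is not the authors' experimental setup. The standardization preprocessor, the nearest-centroid classifier, the synthetic data, and every function name here are assumptions chosen for illustration, and the toy data is deliberately built so that feature scale matters.

```python
import random
import statistics


def standardize_fit(rows):
    """Learn per-feature mean and standard deviation from the training rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for constant features to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def standardize_apply(rows, params):
    """Rescale rows using statistics learned on the training set only."""
    means, stds = params
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def nearest_centroid_fit(rows, labels):
    """Compute one mean vector (centroid) per class label."""
    by_label = {}
    for row, y in zip(rows, labels):
        by_label.setdefault(y, []).append(row)
    return {y: [statistics.fmean(c) for c in zip(*rs)] for y, rs in by_label.items()}


def nearest_centroid_predict(centroids, rows):
    """Assign each row to the class whose centroid is closest (squared Euclidean)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda y: dist(centroids[y], row)) for row in rows]


def run_pipeline(train_X, train_y, test_X, test_y, use_preprocessor):
    """Run an optional preprocessor followed by the classifier; return test accuracy."""
    if use_preprocessor:
        params = standardize_fit(train_X)
        train_X = standardize_apply(train_X, params)
        test_X = standardize_apply(test_X, params)
    model = nearest_centroid_fit(train_X, train_y)
    preds = nearest_centroid_predict(model, test_X)
    return sum(p == t for p, t in zip(preds, test_y)) / len(test_y)


def make_data(n, rng):
    """Hypothetical toy data: feature 0 is informative, feature 1 is large-scale noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(label, 0.3), rng.gauss(0.0, 100.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
train_X, train_y = make_data(200, rng)
test_X, test_y = make_data(200, rng)

acc_raw = run_pipeline(train_X, train_y, test_X, test_y, use_preprocessor=False)
acc_pre = run_pipeline(train_X, train_y, test_X, test_y, use_preprocessor=True)
print(f"classifier only:           {acc_raw:.3f}")
print(f"preprocessor + classifier: {acc_pre:.3f}")
```

Whether the preprocessor helps depends on the data and the learner, which is the question the study addresses across many datasets. A single toy comparison like this one says nothing about the average effect reported in the abstract.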
