Incremental Search Space Construction for Machine Learning Pipeline Synthesis

01/26/2021
by Marc-André Zöller, et al.

Automated machine learning (AutoML) aims to construct machine learning (ML) pipelines automatically. Many studies have investigated efficient methods for algorithm selection and hyperparameter optimization. However, methods for ML pipeline synthesis and optimization that consider the impact of complex pipeline structures containing multiple preprocessing and classification algorithms have not been studied thoroughly. In this paper, we propose a data-centric approach, inspired by human behavior, that uses meta-features for pipeline construction and hyperparameter optimization. By expanding the pipeline search space incrementally and using meta-features of the intermediate data sets, we are able to prune the pipeline structure search space efficiently. Consequently, flexible and data-set-specific ML pipelines can be constructed. We demonstrate the effectiveness and competitiveness of our approach on 28 data sets used in well-established AutoML benchmarks in comparison with state-of-the-art AutoML frameworks.
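The idea of growing the search space one step at a time, while letting the meta-features of each intermediate data set decide which steps are worth adding, can be illustrated with a toy sketch. This is not the authors' implementation: the step names, the three boolean meta-features, and the applicability rules below are all hypothetical and chosen only to make the pruning visible.

```python
from typing import Callable, Dict, List, Tuple

MetaFeatures = Dict[str, bool]
Rule = Tuple[Callable[[MetaFeatures], bool], Callable[[MetaFeatures], MetaFeatures]]

# Hypothetical preprocessing steps: name -> (applicability test, effect on meta-features).
# A step is only added when the intermediate data set's meta-features call for it.
STEPS: Dict[str, Rule] = {
    "imputer": (lambda m: m["missing"],
                lambda m: {**m, "missing": False}),
    "encoder": (lambda m: m["categorical"] and not m["missing"],
                lambda m: {**m, "categorical": False}),
    "scaler": (lambda m: not m["missing"] and not m["categorical"] and not m["scaled"],
               lambda m: {**m, "scaled": True}),
}


def classifier_ready(m: MetaFeatures) -> bool:
    """Assumed rule: a classifier needs complete, fully numeric data."""
    return not m["missing"] and not m["categorical"]


def expand(meta: MetaFeatures, max_steps: int) -> List[List[str]]:
    """Grow pipelines one step at a time, pruning inapplicable steps."""
    complete: List[List[str]] = []
    frontier: List[Tuple[List[str], MetaFeatures]] = [([], meta)]
    for _ in range(max_steps + 1):
        next_frontier = []
        for pipeline, m in frontier:
            if classifier_ready(m):
                complete.append(pipeline + ["classifier"])
            if len(pipeline) < max_steps:
                for name, (applicable, effect) in STEPS.items():
                    if applicable(m):  # prune: skip steps the data does not need
                        next_frontier.append((pipeline + [name], effect(m)))
        frontier = next_frontier
    return complete


raw = {"missing": True, "categorical": True, "scaled": False}
for candidate in expand(raw, max_steps=3):
    print(" -> ".join(candidate))
```

In this toy setting only two of the many possible step orderings survive, because every other ordering is rejected as soon as an intermediate data set shows the step is not applicable. A real system would replace the boolean flags with computed meta-features and would additionally score and tune the surviving candidates.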

research
08/20/2023

DiffPrep: Differentiable Data Preprocessing Pipeline Search for Learning over Tabular Data

Data preprocessing is a crucial step in the machine learning process tha...
research
02/18/2022

SapientML: Synthesizing Machine Learning Pipelines by Learning from Human-Written Solutions

Automatic machine learning, or AutoML, holds the promise of truly democr...
research
07/01/2019

Two-stage Optimization for Machine Learning Workflow

Machine learning techniques play a preponderant role in dealing with m...
research
03/12/2019

Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning

Hyperparameter tuning of multi-stage pipelines introduces a significant ...
research
03/19/2023

AutoEn: An AutoML method based on ensembles of predefined Machine Learning pipelines for supervised Traffic Forecasting

Intelligent Transportation Systems are producing tons of hardly manageab...
research
02/01/2023

Faster Convergence with Lexicase Selection in Tree-based Automated Machine Learning

In many evolutionary computation systems, parent selection methods can a...
research
02/28/2023

Towards Personalized Preprocessing Pipeline Search

Feature preprocessing, which transforms raw input features into numerica...
