Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution

04/01/2020
by Filipe Assunção, et al.

The deployment of Machine Learning (ML) models is a difficult and time-consuming job that comprises a series of sequential and correlated tasks, ranging from data pre-processing and the design and extraction of features to the choice of the ML algorithm and its parameterisation. The task is even more challenging because the design of features is in many cases problem-specific and thus requires domain expertise. To overcome these limitations, Automated Machine Learning (AutoML) methods seek to automate, with little or no human intervention, the design of pipelines, i.e., the selection of the sequence of methods that have to be applied to the raw data. These methods have the potential to enable non-expert users to use ML, and to provide expert users with solutions that they would be unlikely to consider. In particular, this paper describes AutoML-DSGE, a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines. The experiments compare AutoML-DSGE with another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE), and show that the average performance of the classification pipelines generated by AutoML-DSGE is always superior to the average performance of RECIPE; the differences are statistically significant in 3 of the 10 datasets used.
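To make the idea of a grammar-based pipeline search more concrete, the following is a minimal sketch of how an integer genotype could be decoded into a sequence of pipeline steps, loosely in the spirit of structured grammatical evolution. It is not the AutoML-DSGE grammar or implementation: the grammar, the function name `decode`, and the one-gene-per-non-terminal layout are assumptions made purely for illustration. The terminals are names of Scikit-Learn classes, but the sketch only produces the names and does not build or train any estimator.

```python
# Hypothetical toy grammar (illustrative only, not the grammar from the paper).
# Each non-terminal maps to a list of productions; a production is a list of symbols.
GRAMMAR = {
    "<pipeline>": [["<preprocessing>", "<classifier>"], ["<classifier>"]],
    "<preprocessing>": [["StandardScaler"], ["MinMaxScaler"], ["PCA"]],
    "<classifier>": [["DecisionTreeClassifier"], ["GaussianNB"], ["KNeighborsClassifier"]],
}


def decode(genotype, start="<pipeline>"):
    """Map a genotype to a list of step names.

    The genotype holds, for every non-terminal, its own list of integers.
    Each integer directly indexes one production of that non-terminal and is
    consumed the next time the non-terminal is expanded.
    """
    positions = {symbol: 0 for symbol in GRAMMAR}  # next unread gene per non-terminal

    def expand(symbol):
        if symbol not in GRAMMAR:  # terminal: emit it as a pipeline step
            return [symbol]
        choice = genotype[symbol][positions[symbol]]
        positions[symbol] += 1
        steps = []
        for child in GRAMMAR[symbol][choice]:
            steps.extend(expand(child))
        return steps

    return expand(start)


# Assumed example genotype: production 0 for the pipeline, PCA, then GaussianNB.
example = {"<pipeline>": [0], "<preprocessing>": [2], "<classifier>": [1]}
steps = decode(example)
```

In an evolutionary search, candidate genotypes like `example` would be mutated and recombined, and each decoded sequence would be turned into an actual pipeline and scored on held-out data; those parts are left out of this sketch.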


research
06/11/2019

Toward Best Practices for Explainable B2B Machine Learning

To design tools and data pipelines for explainable B2B machine learning ...
research
08/15/2019

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

There has been considerable growth and interest in industrial applicatio...
research
07/15/2022

Modeling Quality and Machine Learning Pipelines through Extended Feature Models

The recently increased complexity of Machine Learning (ML) methods, led ...
research
09/16/2021

A Comparative Study of Machine Learning Methods for Predicting the Evolution of Brain Connectivity from a Baseline Timepoint

Predicting the evolution of the brain network, also called connectome, b...
research
06/02/2021

Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins

Structured data, or data that adheres to a pre-defined schema, can suffe...
research
05/21/2022

Probabilistic Structured Grammatical Evolution

The grammars used in grammar-based Genetic Programming (GP) methods have...
research
01/28/2020

An Adaptive and Near Parameter-free Evolutionary Computation Approach Towards True Automation in AutoML

A common claim of evolutionary computation methods is that they can achi...
