AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline Composition and Optimisation

11/21/2020
by Tien-Dung Nguyen, et al.

Automated machine learning (ML) pipeline composition and optimisation aims to automate the process of finding the most promising ML pipelines within allocated resources (i.e., time, CPU and memory). Existing methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Pipeline composition and optimisation with these methods therefore frequently requires so much time that it prevents them from exploring complex pipelines to find better predictive models. To investigate this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid in the first place, so attempting to execute them wastes time and resources. To address this issue, we propose AVATAR, a novel method that uses a surrogate model to evaluate the validity of ML pipelines without executing them. AVATAR generates a knowledge base by automatically learning the capabilities and effects of ML algorithms on datasets' characteristics. This knowledge base is used for a simplified mapping from an original ML pipeline to a surrogate model, which is a Petri net based pipeline. Instead of executing the original ML pipeline to evaluate its validity, AVATAR evaluates its surrogate model, which is constructed from the capabilities and effects of the ML pipeline components and simplified input/output mappings. Evaluating this surrogate model is less resource-intensive than executing the original pipeline. As a result, AVATAR enables pipeline composition and optimisation methods to evaluate more pipelines by quickly rejecting invalid ones. We integrate AVATAR into sequential model-based algorithm configuration (SMAC). Our experiments show that when SMAC employs AVATAR, it finds better solutions than it does on its own.
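The idea of checking validity from capabilities and effects, rather than by execution, can be illustrated with a minimal Python sketch. This is not the authors' Petri net implementation or the AutoWeka4MCPS API: the `Component` class, the `is_valid` function, the characteristic names and the toy knowledge base are all hypothetical, chosen only to show how dataset characteristics might be propagated through a pipeline so that an invalid ordering is rejected without training anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical knowledge-base entry for one pipeline component."""
    name: str
    capabilities: frozenset            # characteristics the component can accept
    adds: frozenset = frozenset()      # characteristics its output gains
    removes: frozenset = frozenset()   # characteristics its output loses


def is_valid(pipeline, characteristics):
    """Return (valid, name_of_failing_component) without running any algorithm."""
    state = set(characteristics)
    for comp in pipeline:
        # A component fails if the data still has a characteristic it cannot handle.
        if state - comp.capabilities:
            return False, comp.name
        # Apply the component's effects to obtain the characteristics of its output.
        state = (state - comp.removes) | comp.adds
    return True, None


# Toy knowledge base (assumed values, not learned from real algorithms).
ALL = frozenset({"missing_values", "nominal_attributes", "numeric_attributes"})
imputer = Component("imputer", capabilities=ALL,
                    removes=frozenset({"missing_values"}))
one_hot = Component("one_hot", capabilities=ALL - {"missing_values"},
                    removes=frozenset({"nominal_attributes"}),
                    adds=frozenset({"numeric_attributes"}))
classifier = Component("numeric_classifier",
                       capabilities=frozenset({"numeric_attributes"}))

data = {"missing_values", "nominal_attributes", "numeric_attributes"}
good = [imputer, one_hot, classifier]
bad = [one_hot, imputer, classifier]   # encoder placed before imputation

# Only pipelines that pass the cheap check would go on to real execution.
to_execute = [p for p in (good, bad) if is_valid(p, data)[0]]
```

In this sketch an optimiser such as SMAC would call `is_valid` before each costly evaluation and skip candidates like `bad`, which is the kind of early rejection the abstract describes.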


research
01/30/2020

AVATAR – Machine Learning Pipeline Evaluation Using Surrogate Model

The evaluation of machine learning (ML) pipelines is essential during au...
research
06/07/2022

SubStrat: A Subset-Based Strategy for Faster AutoML

Automated machine learning (AutoML) frameworks have become important too...
research
05/01/2021

Exploring Opportunistic Meta-knowledge to Reduce Search Spaces for Automated Machine Learning

Machine learning (ML) pipeline composition and optimisation have been st...
research
03/19/2023

AutoEn: An AutoML method based on ensembles of predefined Machine Learning pipelines for supervised Traffic Forecasting

Intelligent Transportation Systems are producing tons of hardly manageab...
research
11/10/2021

Towards Green Automated Machine Learning: Status Quo and Future Directions

Automated machine learning (AutoML) strives for the automatic configurat...
research
06/02/2023

Automating Pipelines of A/B Tests with Population Split Using Self-Adaptation and Machine Learning

A/B testing is a common approach used in industry to facilitate innovati...
research
08/08/2022

On Taking Advantage of Opportunistic Meta-knowledge to Reduce Configuration Spaces for Automated Machine Learning

The automated machine learning (AutoML) process can require searching th...
