Exploring Opportunistic Meta-knowledge to Reduce Search Spaces for Automated Machine Learning

05/01/2021
by Tien-Dung Nguyen, et al.

Machine learning (ML) pipeline composition and optimisation have been studied to seek multi-stage ML models, i.e. models that include preprocessors, that are both valid and well-performing. These processes typically require the design and traversal of complex configuration spaces consisting of not just individual ML components and their hyperparameters, but also higher-level pipeline structures that link these components together. Optimisation efficiency and resulting ML-model accuracy both suffer if this pipeline search space is excessively large, so it is appealing to rule out poorly performing ML components ahead of time rather than evaluate them at cost. Accordingly, this paper investigates whether, based on previous experience, a pool of available classifiers/regressors can be preemptively culled before initiating a pipeline composition/optimisation process for a new ML problem, i.e. dataset. The previous experience comes in the form of classifier/regressor accuracy rankings derived, with loose assumptions, from a substantial but non-exhaustive number of pipeline evaluations; this meta-knowledge is considered 'opportunistic'. Numerous experiments with the AutoWeka4MCPS package, including ones leveraging similarities between datasets via the relative landmarking method, show that, despite its seeming unreliability, opportunistic meta-knowledge can improve ML outcomes. However, results also indicate that the culling of classifiers/regressors should not be too severe. In effect, it is better to search through a 'top tier' of recommended predictors than to pin hopes onto one previously supreme performer.
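The culling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method or the AutoWeka4MCPS API: the function names `rank_predictors` and `cull_pool`, the mean-rank aggregation, the dataset labels, and all accuracy figures below are assumptions made purely for illustration. The sketch averages each predictor's rank over previously seen datasets and keeps only a 'top tier' of the pool before any search begins.

```python
from collections import defaultdict
from statistics import mean


def rank_predictors(scores_by_dataset):
    """Order predictors by mean rank (1 = best) over prior datasets.

    A predictor missing from a dataset is simply not ranked there, which
    mirrors the non-exhaustive nature of the prior evaluations.
    """
    ranks = defaultdict(list)
    for scores in scores_by_dataset.values():
        # Higher accuracy first within each dataset.
        ordered = sorted(scores, key=scores.get, reverse=True)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    # Ties in mean rank are broken alphabetically to keep the order stable.
    return sorted(ranks, key=lambda name: (mean(ranks[name]), name))


def cull_pool(pool, ranking, keep):
    """Keep only pool members among the `keep` best-ranked predictors.

    Falls back to the full pool if culling would leave nothing to search.
    """
    top_tier = set(ranking[:keep])
    kept = [name for name in pool if name in top_tier]
    return kept if kept else list(pool)


# Hypothetical accuracies; the numbers are invented for this example.
prior = {
    "dataset_a": {"RandomForest": 0.91, "SMO": 0.88, "NaiveBayes": 0.80, "IBk": 0.75},
    "dataset_b": {"RandomForest": 0.85, "SMO": 0.87, "NaiveBayes": 0.70, "IBk": 0.78},
    "dataset_c": {"RandomForest": 0.90, "SMO": 0.86, "NaiveBayes": 0.82, "IBk": 0.79},
}
ranking = rank_predictors(prior)
pool = ["IBk", "NaiveBayes", "RandomForest", "SMO"]
top_tier = cull_pool(pool, ranking, keep=2)
```

Setting `keep=1` corresponds to trusting a single previously best performer, whereas a larger `keep` retains a top tier, which is the less severe culling the abstract reports as preferable.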
