Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML

07/17/2023
by Lennart Purucker, et al.

Automated machine learning (AutoML) systems commonly ensemble models post hoc to improve predictive performance, typically via greedy ensemble selection (GES). However, we believe that GES may not always be optimal, as it performs a simple deterministic greedy search. In this work, we introduce two novel population-based ensemble selection methods, QO-ES and QDO-ES, and compare them to GES. While QO-ES optimises solely for predictive performance, QDO-ES also considers the diversity of ensembles within the population, maintaining a diverse set of well-performing ensembles during optimisation based on ideas from quality diversity optimisation. We evaluate the methods on 71 classification datasets from the AutoML benchmark and find that QO-ES and QDO-ES often outrank GES, although the difference is statistically significant only on validation data. Our results further suggest that diversity can be beneficial for post hoc ensembling but also increases the risk of overfitting.
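
For readers unfamiliar with the GES baseline, the following is a minimal sketch of greedy ensemble selection with replacement, not the authors' implementation. It assumes each base model's validation predictions are available as class probabilities and uses log loss as the metric; the function names, the `n_iterations` parameter, and the toy data are all hypothetical. As a simplification, it always returns the ensemble after the final iteration rather than the best ensemble seen along the way.

```python
import numpy as np


def log_loss(probs, y_true, eps=1e-12):
    """Mean negative log-likelihood of the true class."""
    true_class_probs = probs[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(np.clip(true_class_probs, eps, 1.0))))


def greedy_ensemble_selection(val_probs, y_val, n_iterations=10):
    """Greedy ensemble selection with replacement (illustrative sketch).

    val_probs: array of shape (n_models, n_samples, n_classes) holding each
               base model's predicted probabilities on validation data.
    y_val:     integer class labels of shape (n_samples,).
    Returns a weight vector of shape (n_models,) that sums to 1.
    """
    val_probs = np.asarray(val_probs, dtype=float)
    y_val = np.asarray(y_val)
    counts = np.zeros(val_probs.shape[0], dtype=int)
    running_sum = np.zeros_like(val_probs[0])

    for step in range(1, n_iterations + 1):
        # Averaged prediction of the current ensemble plus each candidate model.
        candidates = (running_sum[None, :, :] + val_probs) / step
        losses = [log_loss(p, y_val) for p in candidates]
        # argmin breaks ties by lowest index, so the search is deterministic.
        chosen = int(np.argmin(losses))
        counts[chosen] += 1
        running_sum += val_probs[chosen]

    # A model's weight is the fraction of iterations in which it was selected.
    return counts / counts.sum()


# Toy example: three hypothetical models on four validation samples.
y_val = np.array([0, 1, 0, 1])


def toy_model(p_true):
    """Binary probabilities that assign p_true to the correct class."""
    probs = np.full((len(y_val), 2), 1.0 - p_true)
    probs[np.arange(len(y_val)), y_val] = p_true
    return probs


val_probs = np.stack([toy_model(0.9), toy_model(0.2), toy_model(0.6)])
weights = greedy_ensemble_selection(val_probs, y_val, n_iterations=5)
print(weights)
```

Because each step commits to the single best addition and never revisits earlier choices, the procedure follows one fixed trajectory through the space of ensembles. This is the behaviour that population-based methods such as QO-ES and QDO-ES aim to improve on.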

research
07/01/2023

CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and Salvageable Failure

Many state-of-the-art automated machine learning (AutoML) systems use gr...
research
05/05/2018

Developing parsimonious ensembles using ensemble diversity within a reinforcement learning framework

Heterogeneous ensembles built from the predictions of a wide variety and...
research
02/15/2021

Developing parsimonious ensembles using predictor diversity within a reinforcement learning framework

Heterogeneous ensembles that can aggregate an unrestricted number and va...
research
06/06/2023

Bayesian post-hoc regularization of random forests

Random Forests are powerful ensemble learning algorithms widely used in ...
research
12/10/2020

Ensemble Squared: A Meta AutoML System

The continuing rise in the number of problems amenable to machine learni...
research
06/15/2018

The Limits of Post-Selection Generalization

While statistics and machine learning offers numerous methods for ensuri...
research
11/29/2021

Conceptually Diverse Base Model Selection for Meta-Learners in Concept Drifting Data Streams

Meta-learners and ensembles aim to combine a set of relevant yet diverse...
