HAMLET – A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection

01/30/2020
by Mischa Schmidt, et al.

Automated algorithm selection and hyperparameter tuning facilitate the application of machine learning. Traditional multi-armed bandit strategies look to the history of observed rewards to identify the most promising arms for maximizing expected total reward in the long run. Under limited time budgets and computational resources, this backward view of rewards is inappropriate: the bandit should instead look into the future, anticipating which arm will yield the highest final reward at the end of a specified time budget. This work addresses that insight by introducing HAMLET, which extends the bandit approach with learning curve extrapolation and computation-time awareness for selecting among a set of machine learning algorithms. In experiments with recorded hyperparameter tuning traces, HAMLET Variants 1-3 perform as well as or better than other bandit-based algorithm selection strategies for the majority of considered time budgets. The best-performing Variant 3 combines learning curve extrapolation with the well-known upper confidence bound exploration bonus; it outperforms all non-HAMLET policies with statistical significance at the 95% level.
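The core idea, picking the arm whose extrapolated end-of-budget reward (plus a UCB exploration bonus) is highest, can be sketched as follows. This is an illustrative toy, not the paper's implementation: the geometric-damping extrapolation model and all function names here are assumptions made for the sketch.

```python
import math

def extrapolate(history, steps_ahead):
    """Crude forward projection of a learning curve: continue the last
    observed improvement with geometrically diminishing returns.
    (Illustrative; the paper's extrapolation model may differ.)"""
    if len(history) < 2:
        return history[-1]
    slope = history[-1] - history[-2]   # most recent improvement
    projected = history[-1]
    for k in range(steps_ahead):
        projected += slope * (0.5 ** (k + 1))
    return min(projected, 1.0)          # accuracies are bounded by 1

def select_arm(histories, total_pulls, budget_remaining):
    """UCB-style selection on *extrapolated* final rewards rather than
    on the average of past rewards (the backward view)."""
    best, best_score = None, -math.inf
    for arm, hist in histories.items():
        n = len(hist)
        if n == 0:
            return arm                  # pull each arm at least once
        forecast = extrapolate(hist, budget_remaining)
        bonus = math.sqrt(2.0 * math.log(total_pulls) / n)
        if forecast + bonus > best_score:
            best, best_score = arm, forecast + bonus
    return best

# Example: with equal exploration bonuses, the arm whose projected
# end-of-budget accuracy is higher is chosen.
histories = {"svm": [0.60, 0.70], "tree": [0.800, 0.805]}
print(select_arm(histories, total_pulls=4, budget_remaining=5))
```

The contrast with a classical bandit is in the `forecast` term: a standard UCB policy would score each arm by its empirical mean reward, whereas here the score anticipates where each learning curve is heading by the time the budget expires.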

