Progressive Sampling-Based Bayesian Optimization for Efficient and Automatic Machine Learning Model Selection

by Xueqiang Zeng, et al.

Purpose: Machine learning is broadly used for clinical data analysis. Before a model can be trained, a machine learning algorithm must be selected, and the values of one or more model parameters, termed hyper-parameters, must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the barrier to using machine learning, various automatic selection methods for algorithms and/or hyper-parameter values have been proposed. However, existing automatic selection methods are inefficient on large data sets, which poses a challenge for using machine learning in the clinical big data era.

Methods: To address this challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values.

Results: We report an implementation of the method. We show that, compared to a state-of-the-art automatic selection method, our method can significantly reduce search time, classification error rate, and the standard deviation of the error rate due to randomization.

Conclusions: This is major progress towards enabling the fast turnaround in identifying high-quality solutions that many machine learning-based clinical data analysis tasks require.
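The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind progressive sampling for joint algorithm and hyper-parameter selection, not the authors' method. It assumes a synthetic two-class data set, two toy "algorithms" (k-nearest neighbors and nearest centroid), and a simple prune-and-double schedule in the style of successive halving, which stands in for the Bayesian optimization step. All function names and parameters (`make_data`, `progressive_search`, `start_size`, `keep_fraction`) are hypothetical.

```python
import random

def make_data(n, rng):
    """Two noisy 2-D Gaussian classes (synthetic, for illustration only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.5 * label
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def knn_predict(train, x, k):
    """Toy algorithm 1: majority vote among the k nearest training points."""
    nearest = sorted(
        train, key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def centroid_predict(train, x, _unused=None):
    """Toy algorithm 2: label of the nearest class centroid (no hyper-parameter)."""
    sums = {}
    for (a, b), label in train:
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += a
        s[1] += b
        s[2] += 1

    def dist(label):
        sx, sy, n = sums[label]
        return (sx / n - x[0]) ** 2 + (sy / n - x[1]) ** 2

    return min(sums, key=dist)

def error_rate(predict, param, train, valid):
    """Fraction of validation points misclassified by one candidate."""
    wrong = sum(predict(train, x, param) != y for x, y in valid)
    return wrong / len(valid)

def progressive_search(candidates, train, valid, start_size=50, keep_fraction=0.5):
    """Score candidates on a small training sample, keep the best fraction,
    then double the sample size and repeat (assumed schedule, not the paper's)."""
    size = start_size
    survivors = list(candidates)
    while True:
        # The data are i.i.d., so a prefix of the list acts as a random sample.
        sample = train[:min(size, len(train))]
        scored = sorted(
            survivors, key=lambda c: error_rate(c[1], c[2], sample, valid)
        )
        if len(scored) == 1 or size >= len(train):
            best = scored[0]
            return best, error_rate(best[1], best[2], sample, valid)
        survivors = scored[:max(1, int(len(scored) * keep_fraction))]
        size *= 2

rng = random.Random(0)
train, valid = make_data(400, rng), make_data(200, rng)

# Each candidate is (algorithm name, predict function, hyper-parameter value).
candidates = [("knn", knn_predict, k) for k in (1, 3, 5, 9, 15)]
candidates.append(("centroid", centroid_predict, None))

best, err = progressive_search(candidates, train, valid)
print(best[0], best[2], round(err, 3))
```

The point of the schedule is that most candidates are discarded after being trained on only a small sample, so the full training set is used for few (here, at most one or two) configurations. A Bayesian optimization variant would replace the fixed candidate list with a surrogate model that proposes new configurations at each sample size.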




