eTOP: Early Termination of Pipelines for Faster Training of AutoML Systems

04/17/2023
by   Haoxiang Zhang, et al.

Recent advances in software and hardware technologies have enabled the use of AI/ML models in everyday applications, significantly improving the quality of service rendered. However, for a given application, finding the right AI/ML model is a complex and costly process that involves the generation, training, and evaluation of multiple interlinked steps (called pipelines), such as data pre-processing, feature engineering, selection, and model tuning. These pipelines are complex in structure and costly in both compute resources and time to execute end-to-end, with a hyper-parameter associated with each step. AutoML systems automate the search over these hyper-parameters but are slow, as they rely on optimizing the pipeline's end output. We propose the eTOP framework, which works on top of any AutoML system and decides whether to execute a pipeline to the end or terminate it at an intermediate step. Experimental evaluation on 26 benchmark datasets shows that integrating eTOP with MLBox reduces the training time of the AutoML system by up to 40x compared to baseline MLBox.
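The idea of terminating a pipeline at an intermediate step can be illustrated with a minimal sketch. The abstract does not say how eTOP makes its decision, so everything below is an assumption for illustration only: the function `run_with_early_termination`, the `intermediate_score` callback, the fixed `threshold`, and the toy steps are hypothetical and are not taken from eTOP or MLBox.

```python
from typing import Callable, Optional, Sequence, Tuple

# A pipeline step maps data to transformed data (hypothetical type alias).
Step = Callable[[list], list]


def run_with_early_termination(
    steps: Sequence[Step],
    data: list,
    intermediate_score: Callable[[list], float],
    threshold: float,
) -> Tuple[Optional[list], int]:
    """Run steps in order and stop once an intermediate result looks unpromising.

    Assumption: a cheap score on the intermediate data predicts final quality.
    Returns (output, steps_executed); output is None if terminated early.
    """
    executed = 0
    for step in steps:
        data = step(data)
        executed += 1
        # Check after every step except the last one.
        if executed < len(steps) and intermediate_score(data) < threshold:
            return None, executed
    return data, executed


# Toy steps standing in for pre-processing, feature work, and model fitting.
def drop_negatives(xs: list) -> list:
    return [x for x in xs if x >= 0]


def double(xs: list) -> list:
    return [2 * x for x in xs]


def expensive_final(xs: list) -> list:
    return [sum(xs)]


def mean_score(xs: list) -> float:
    # Hypothetical proxy for "how promising is this intermediate output".
    return sum(xs) / len(xs) if xs else 0.0


pipeline = [drop_negatives, double, expensive_final]
full_run = run_with_early_termination(pipeline, [1, 2, 3], mean_score, 1.0)
cut_run = run_with_early_termination(pipeline, [-1, -2, 0], mean_score, 1.0)
print(full_run)  # ([12], 3)
print(cut_run)   # (None, 1)
```

In this sketch the second configuration is abandoned after one step, so the two remaining steps are never paid for. A wrapper of this kind sits around the pipeline execution rather than inside the search procedure, which is one way to read the claim that the approach can work on top of any AutoML system.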


