Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks

08/17/2018
by   Liang Bao, et al.
0

Big data analytics frameworks (BDAFs) have been widely used for data processing applications. These frameworks provide a large number of configuration parameters to users, which leads to a tuning issue that overwhelms users. To address this issue, many automatic tuning approaches have been proposed. However, it remains a critical challenge to generate enough samples in a high-dimensional parameter space within a time constraint. In this paper, we present AutoTune--an automatic parameter tuning system that aims to optimize application execution time on BDAFs. AutoTune first constructs a smaller-scale testbed from the production system so that it can generate more samples, and thus train a better prediction model, under a given time constraint. Furthermore, the AutoTune algorithm produces a set of samples that can provide a wide coverage over the high-dimensional parameter space, and searches for more promising configurations using the trained prediction model. AutoTune is implemented and evaluated using the Spark framework and HiBench benchmark deployed on a public cloud. Extensive experimental results illustrate that AutoTune improves on default configurations by 63.70 the five state-of-the-art tuning algorithms by 6

READ FULL TEXT

page 2

page 3

page 5

page 6

page 7

page 9

page 10

page 12

research
12/14/2022

Towards Interactive, Adaptive and Result-aware Big Data Analytics

As data volumes grow across applications, analytics of large amounts of ...
research
09/07/2020

OneStopTuner: An End to End Architecture for JVM Tuning of Spark Applications

Java is the backbone of widely used big data frameworks, such as Apache ...
research
01/22/2020

Tuneful: An Online Significance-Aware Configuration Tuner for Big Data Analytics

Distributed analytics engines such as Spark are a common choice for proc...
research
08/22/2023

Karasu: A Collaborative Approach to Efficient Cluster Configuration for Big Data Analytics

Selecting the right resources for big data analytics jobs is hard becaus...
research
05/07/2020

Boosting Cloud Data Analytics using Multi-Objective Optimization

Data analytics in the cloud has become an integral part of enterprise bu...
research
11/28/2021

On the Scalability of Big Data Cyber Security Analytics Systems

Big Data Cyber Security Analytics (BDCA) systems use big data technologi...
research
05/23/2020

Benchmarking and Performance Modelling of MapReduce Communication Pattern

Understanding and predicting the performance of big data applications ru...

Please sign up or login with your details

Forgot password? Click here to reset