
Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks

08/17/2018
by Liang Bao, et al.
University of California-Davis
Xidian University

Big data analytics frameworks (BDAFs) are widely used for data processing applications. These frameworks expose a large number of configuration parameters, and tuning them can overwhelm users. Many automatic tuning approaches have been proposed to address this issue, but generating enough samples in a high-dimensional parameter space within a time constraint remains a critical challenge. In this paper, we present AutoTune, an automatic parameter tuning system that aims to optimize application execution time on BDAFs. AutoTune first constructs a smaller-scale testbed from the production system so that it can generate more samples, and thus train a better prediction model, under a given time constraint. Furthermore, the AutoTune algorithm produces a set of samples that provides wide coverage of the high-dimensional parameter space, and searches for more promising configurations using the trained prediction model. AutoTune is implemented and evaluated using the Spark framework and the HiBench benchmark deployed on a public cloud. Extensive experimental results illustrate that AutoTune improves on default configurations by 63.70% ... the five state-of-the-art tuning algorithms by 6 ...
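The workflow described in the abstract (sample widely on a small testbed, train a prediction model, then use the model to pick promising configurations) can be illustrated with a minimal Python sketch. This is not the AutoTune algorithm itself, since the abstract does not specify it. Everything here is an assumption made for illustration: the parameter names and ranges, the use of Latin hypercube sampling for wide coverage, the k-nearest-neighbour predictor, and the synthetic `run_on_testbed` cost function standing in for a real Spark run.

```python
import random

# Hypothetical parameter ranges; names and bounds are illustrative only.
PARAM_RANGES = {
    "executor_memory_gb": (1.0, 16.0),
    "executor_cores": (1.0, 8.0),
    "shuffle_partitions": (8.0, 512.0),
}


def latin_hypercube(n_samples, ranges, rng):
    """Split each parameter range into n_samples strata and use each once."""
    columns = {}
    for name, (low, high) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        width = (high - low) / n_samples
        columns[name] = [low + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


def run_on_testbed(config):
    """Synthetic stand-in for measuring execution time on the small testbed."""
    return (
        (config["executor_memory_gb"] - 8.0) ** 2
        + 4.0 * (config["executor_cores"] - 4.0) ** 2
        + ((config["shuffle_partitions"] - 200.0) / 50.0) ** 2
        + 10.0
    )


def normalize(config, ranges):
    """Scale every parameter to [0, 1] so distances are comparable."""
    return [(config[n] - lo) / (hi - lo) for n, (lo, hi) in ranges.items()]


def predict(config, observed, ranges, k=3):
    """Predict execution time as the mean of the k nearest observed samples."""
    point = normalize(config, ranges)
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(point, normalize(c, ranges))), t)
        for c, t in observed
    )
    nearest = dists[:k]
    return sum(t for _, t in nearest) / len(nearest)


def tune(n_initial=30, n_candidates=500, n_rounds=5, seed=0):
    """Sample, fit, and search: returns ((best_config, best_time), observed)."""
    rng = random.Random(seed)
    # Step 1: wide-coverage samples, each measured on the testbed.
    observed = [
        (c, run_on_testbed(c)) for c in latin_hypercube(n_initial, PARAM_RANGES, rng)
    ]
    # Step 2: use the model to choose the most promising candidate, then measure it.
    for _ in range(n_rounds):
        candidates = latin_hypercube(n_candidates, PARAM_RANGES, rng)
        best_candidate = min(
            candidates, key=lambda c: predict(c, observed, PARAM_RANGES)
        )
        observed.append((best_candidate, run_on_testbed(best_candidate)))
    return min(observed, key=lambda pair: pair[1]), observed


if __name__ == "__main__":
    (best_config, best_time), _ = tune()
    print(best_config, best_time)
```

In a real setting, `run_on_testbed` would submit the application with the given configuration and return the measured execution time. The cheaper that call is, the more samples fit in the time budget, which is the motivation the abstract gives for the smaller-scale testbed.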
