
Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks

by Liang Bao, et al.
University of California-Davis
Xidian University

Big data analytics frameworks (BDAFs) are widely used for data processing applications. These frameworks expose a large number of configuration parameters, and tuning them by hand overwhelms users. Many automatic tuning approaches have been proposed to address this issue. However, generating enough samples in a high-dimensional parameter space within a time constraint remains a critical challenge. In this paper, we present AutoTune, an automatic parameter tuning system that aims to optimize application execution time on BDAFs. AutoTune first constructs a smaller-scale testbed from the production system so that it can generate more samples, and thus train a better prediction model, under a given time constraint. Furthermore, the AutoTune algorithm produces a set of samples that provides wide coverage of the high-dimensional parameter space, and searches for more promising configurations using the trained prediction model. AutoTune is implemented and evaluated using the Spark framework and the HiBench benchmark deployed on a public cloud. Extensive experimental results illustrate that AutoTune improves on default configurations by 63.70% and on the five state-of-the-art tuning algorithms by 6...
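The sample-then-search loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the AutoTune implementation: the abstract does not name the sampling scheme or the prediction model, so Latin hypercube sampling and a k-nearest-neighbour predictor are used here as stand-ins. The parameter names, their ranges, and the `run_on_testbed` cost function are all hypothetical; in a real setting `run_on_testbed` would launch the application on the smaller-scale testbed and return its measured execution time.

```python
import random

# Hypothetical parameter ranges, for illustration only (not taken from the paper).
PARAM_RANGES = {
    "executor_memory_gb": (1.0, 8.0),
    "executor_cores": (1.0, 8.0),
    "shuffle_partitions": (50.0, 500.0),
}


def latin_hypercube(n, ranges, rng):
    """Draw n configurations. Each parameter's range is split into n strata,
    and every stratum is used exactly once per parameter."""
    columns = {}
    for name, (lo, hi) in ranges.items():
        strata = list(range(n))
        rng.shuffle(strata)
        width = (hi - lo) / n
        columns[name] = [lo + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in ranges} for i in range(n)]


def run_on_testbed(cfg):
    """Stand-in for one run on the testbed: a synthetic cost function,
    not a real Spark measurement."""
    return (
        (cfg["executor_memory_gb"] - 6.0) ** 2
        + (cfg["executor_cores"] - 4.0) ** 2
        + ((cfg["shuffle_partitions"] - 200.0) / 50.0) ** 2
        + 10.0
    )


def normalize(cfg, ranges):
    """Scale every parameter to [0, 1] so distances are comparable."""
    return [(cfg[k] - lo) / (hi - lo) for k, (lo, hi) in ranges.items()]


def knn_predict(cfg, samples, times, ranges, k=3):
    """Predict execution time as the mean of the k nearest measured samples."""
    x = normalize(cfg, ranges)
    dists = []
    for s, t in zip(samples, times):
        y = normalize(s, ranges)
        dists.append((sum((a - b) ** 2 for a, b in zip(x, y)), t))
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(t for _, t in nearest) / len(nearest)


def tune(n_samples=30, n_candidates=500, seed=0):
    rng = random.Random(seed)
    # 1. Space-filling samples, each measured on the (simulated) testbed.
    samples = latin_hypercube(n_samples, PARAM_RANGES, rng)
    times = [run_on_testbed(c) for c in samples]
    # 2. Use the trained predictor to rank many unmeasured candidates.
    candidates = latin_hypercube(n_candidates, PARAM_RANGES, rng)
    pick = min(
        candidates,
        key=lambda c: knn_predict(c, samples, times, PARAM_RANGES),
    )
    # 3. Measure the predictor's pick and keep it only if it beats
    #    the best configuration already measured.
    measured = run_on_testbed(pick)
    best_time = min(times)
    if measured < best_time:
        return pick, measured
    return samples[times.index(best_time)], best_time


if __name__ == "__main__":
    best_cfg, best_time = tune()
    print(best_cfg, best_time)
```

The point of the sketch is the division of labour: cheap testbed runs buy broad coverage of the space, and the predictor then screens far more candidates than could be measured directly within the time budget.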



Towards Interactive, Adaptive and Result-aware Big Data Analytics

As data volumes grow across applications, analytics of large amounts of ...

OneStopTuner: An End to End Architecture for JVM Tuning of Spark Applications

Java is the backbone of widely used big data frameworks, such as Apache ...

Tuneful: An Online Significance-Aware Configuration Tuner for Big Data Analytics

Distributed analytics engines such as Spark are a common choice for proc...

Auto Tuning of Hadoop and Spark parameters

Data of the order of terabytes, petabytes, or beyond is known as Big Dat...

Boosting Cloud Data Analytics using Multi-Objective Optimization

Data analytics in the cloud has become an integral part of enterprise bu...

On the Scalability of Big Data Cyber Security Analytics Systems

Big Data Cyber Security Analytics (BDCA) systems use big data technologi...

Two Dimensional Stochastic Configuration Networks for Image Data Analytics

Stochastic configuration networks (SCNs) as a class of randomized learne...