SODA: A Semantics-Aware Optimization Framework for Data-Intensive Applications Using Hybrid Program Analysis

07/24/2021
by   Bingbing Rao, et al.

In the era of data explosion, a growing number of data-intensive computing frameworks, such as Apache Hadoop and Spark, have been proposed to process massive volumes of unstructured data in parallel. Because the programming models of these frameworks let users specify complex and diverse user-defined functions (UDFs) on top of predefined operations, tuning overall system performance becomes a grand challenge when programmers do not fully understand the semantics of the code, the data, and the runtime system. In this paper, we design SODA, a holistic semantics-aware optimization framework for data-intensive applications that uses hybrid program analysis to help programmers tune performance. SODA is a two-phase framework: the offline phase is a static analysis that analyzes the code together with performance profiling data collected by the online phase during prior executions, and generates a parameterized and instrumented application; the online phase is a dynamic analysis that tracks the application's execution and collects runtime information about the data and the system. Extensive experimental results on four real-world Spark applications show that SODA can gain up to 60 three proposed optimization strategies, i.e., cache management, operation reordering, and element pruning, respectively.
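As a rough illustration of the operation-reordering idea named in the abstract, the following minimal sketch uses plain Python lists rather than Spark, and it is not SODA's implementation. The record layout, `expensive_udf`, and `keep` are hypothetical names introduced only for this example. The sketch assumes the filter predicate reads only a field that the UDF never modifies, which is the condition under which moving the filter ahead of the map preserves the result.

```python
from typing import List, Tuple

Record = Tuple[str, int]  # hypothetical (key, value) record


def expensive_udf(record: Record, calls: List[int]) -> Record:
    """Stand-in for a costly user-defined function; counts its invocations."""
    calls[0] += 1
    key, value = record
    return key, value * value  # changes the value, never the key


def keep(record: Record) -> bool:
    """Predicate that reads only the key, so it commutes with expensive_udf."""
    return record[0].startswith("a")


def map_then_filter(records: List[Record], calls: List[int]) -> List[Record]:
    # Original order: the UDF runs on every record, then most are discarded.
    mapped = [expensive_udf(r, calls) for r in records]
    return [r for r in mapped if keep(r)]


def filter_then_map(records: List[Record], calls: List[int]) -> List[Record]:
    # Reordered: discard records first, so the UDF runs only on survivors.
    return [expensive_udf(r, calls) for r in records if keep(r)]


records = [("a1", 2), ("b1", 3), ("a2", 4), ("c1", 5)]
naive_calls, reordered_calls = [0], [0]
naive = map_then_filter(records, naive_calls)
reordered = filter_then_map(records, reordered_calls)
```

Both pipelines produce the same records, but the reordered one invokes the UDF only for records that pass the filter. Deciding whether a predicate and a UDF commute in this way requires knowing which fields each one reads and writes, which is the kind of code semantics the abstract says a static analysis would have to recover.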


