Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications

01/30/2018
by Maaz Bin Safeer Ahmad, et al.

MapReduce is a popular programming paradigm for running large-scale data-intensive computations. Recently, many frameworks that implement this paradigm have been developed. To leverage such frameworks, however, developers must familiarize themselves with each framework's API and rewrite their code. We present CORA, a new tool that automatically translates sequential Java programs to the MapReduce paradigm. Rather than building a compiler by tediously designing pattern-matching rules that identify which code fragments to translate, CORA translates the input program in two steps. First, CORA uses program synthesis to identify input code fragments and search for a program summary (i.e., a functional specification) of each fragment. The summary is expressed in a high-level intermediate language resembling the MapReduce paradigm. Next, each summary found is verified to be semantically equivalent to the original fragment using a theorem prover. CORA then generates executable code from the summary, using the Hadoop, Spark, or Flink API. We have evaluated CORA by automatically converting real-world sequential Java benchmarks to MapReduce. The resulting benchmarks run up to 32.2x faster than the originals, and all are translated without designing any pattern-matching rules.
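
To make the idea of translating a sequential fragment into a MapReduce-style form concrete, the following is a minimal sketch, not output produced by CORA. It assumes a hypothetical word-count loop as the input fragment and uses plain Java streams in place of the Hadoop, Spark, or Flink APIs. The class and method names (`WordCountSketch`, `countSequential`, `countMapReduce`) are illustrative assumptions and do not come from the paper.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: names and structure are assumptions,
// not the tool's actual input or output.
public class WordCountSketch {

    // Sequential fragment: the kind of loop a translator would take as input.
    static Map<String, Integer> countSequential(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words) {
            counts.put(w, counts.getOrDefault(w, 0) + 1);
        }
        return counts;
    }

    // MapReduce-shaped equivalent: the map step emits (word, 1) pairs,
    // and the reduce step sums the values that share a key.
    static Map<String, Integer> countMapReduce(List<String> words) {
        return words.stream()
                .map(w -> new AbstractMap.SimpleEntry<>(w, 1))   // map: emit (word, 1)
                .collect(Collectors.toMap(
                        e -> e.getKey(),
                        e -> e.getValue(),
                        Integer::sum,                            // reduce: sum per key
                        TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(countSequential(words));
        System.out.println(countMapReduce(words));
    }
}
```

Both methods should return the same counts for any input list. Comparing outputs on sample inputs, as `main` does here, is only a spot check; the abstract describes establishing equivalence with a theorem prover, which this sketch does not attempt.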

research
02/16/2021

ArCode: Facilitating the Use of Application Frameworks to Implement Tactics and Patterns

Software designers and developers are increasingly relying on applicatio...
research
08/15/2023

SEER: Super-Optimization Explorer for HLS using E-graph Rewriting with MLIR

High-level synthesis (HLS) is a process that automatically translates a ...
research
05/27/2018

A Formal Model of the Safety-Critical Java Level 2 Paradigm

Safety-Critical Java (SCJ) introduces a new programming paradigm for app...
research
07/07/2020

From API to NLI: A New Interface for Library Reuse

Developers frequently reuse APIs from existing libraries to implement ce...
research
03/11/2021

ArCode: A Tool for Supporting Comprehension and Implementation of Architectural Concerns

Integrated development environments (IDE) play an important role in supp...
research
05/16/2023

Experiences in Building a Composable and Functional API for Runtime SPIR-V Code Generation

This paper presents the Beehive SPIR-V Toolkit; a framework that can aut...
research
01/08/2019

A Journey Among Java Neutral Program Variants

Neutral program variants are functionally similar to an original program...
