A unified framework to improve the interoperability between HPC and Big Data languages and programming models

12/01/2021
by César Piñeiro, et al.

One of the main obstacles to the convergence of HPC and Big Data is the difference between their software stacks. Despite some research efforts, the interoperability between their programming models and languages is still limited. To address this problem we introduce a new computing framework called IgnisHPC, whose main objective is to unify the execution of Big Data and HPC workloads in the same framework. IgnisHPC has native support for multi-language applications using both JVM and non-JVM-based languages. Because MPI is its backbone technology, IgnisHPC takes advantage of many communication models and network architectures. Moreover, MPI applications can be executed directly and efficiently in the framework. The main consequence is that users can combine, in the same multi-language code, HPC tasks (using MPI) with Big Data tasks (using MapReduce operations). The experimental evaluation demonstrates the benefits of our proposal in terms of performance and productivity with respect to other frameworks such as Apache Spark. IgnisHPC is publicly available for the Big Data and HPC research community.
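As a rough illustration of what combining the two styles in one program can look like, the following sketch chains a MapReduce-style word count with a collective-style global sum. It is plain Python and does not use the IgnisHPC API or a real MPI library: the function names (`map_reduce_word_count`, `simulated_allreduce_sum`, `normalized_frequencies`) are hypothetical, and the "allreduce" is only a single-process stand-in for the MPI collective of that name.

```python
from collections import Counter
from functools import reduce


def map_reduce_word_count(partitions):
    """Big Data-style stage: map each partition to local word counts,
    then reduce the per-partition counts into one global Counter."""
    local_counts = [
        Counter(word for line in part for word in line.split())
        for part in partitions
    ]
    return reduce(lambda a, b: a + b, local_counts, Counter())


def simulated_allreduce_sum(local_values):
    """HPC-style stage (assumption: a single-process stand-in for an MPI
    allreduce). Every simulated rank receives the same global sum."""
    total = sum(local_values)
    return [total for _ in local_values]


def normalized_frequencies(partitions):
    """Combine both stages: counts come from the MapReduce step, the
    normalizing total comes from the collective-style step."""
    counts = map_reduce_word_count(partitions)
    # Each simulated rank holds the number of words in its own partition.
    local_sizes = [sum(len(line.split()) for line in part) for part in partitions]
    global_size = simulated_allreduce_sum(local_sizes)[0]
    return {word: count / global_size for word, count in counts.items()}


if __name__ == "__main__":
    parts = [["hpc big data", "big data"], ["hpc mpi"]]
    print(normalized_frequencies(parts))
```

In a real deployment the partitions would live on different nodes and the collective would be carried out by MPI; the sketch only shows how the two kinds of stage can feed each other within one code path.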


