Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform

05/13/2018
by Nikolay Malitsky, et al.

Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences. Concurrently, dramatic increases in the three Vs (Volume, Velocity, and Variety) of experimental data and in the scale of computational tasks have created demand for new real-time processing systems at experimental facilities. Recently, this demand was addressed by the Spark-MPI approach, which connects the Spark data-intensive platform with the MPI high-performance framework. In contrast with existing data management and analytics systems, Spark introduced a new middleware based on resilient distributed datasets (RDDs), which decouples various data sources from high-level processing algorithms. The RDD middleware significantly broadened the scope of data-intensive applications, from SQL queries to machine learning to graph processing. Spark-MPI further extends the Spark ecosystem with MPI applications using the Process Management Interface (PMI). This paper explores the integrated platform in the context of online ptychographic and tomographic reconstruction pipelines.


