Design and Development of a Java Parallel I/O Library

05/12/2023
by Muhammad Sohaib Ayub, et al.

Parallel I/O refers to the ability of scientific programs to concurrently read from and write to a single file from multiple processes executing on distributed-memory platforms such as compute clusters. In HPC, I/O is a significant bottleneck for many real-world scientific applications. Over the last two decades, there has been significant research on improving the performance of I/O operations in scientific computing for traditional languages, including C, C++, and Fortran. As a result, several mature, high-performance libraries are available today that provide efficient I/O for scientific applications, including ROMIO (an implementation of MPI-IO), parallel HDF5, Parallel I/O (PIO), and parallel netCDF. However, very little research has been done to evaluate and improve the I/O performance of Java-based HPC applications. The main hindrance to the development of efficient parallel I/O libraries for Java is the lack of a standard API (something equivalent to MPI-IO). Some ad hoc solutions have been developed and used in proprietary applications, but there is no general-purpose solution that performance-hungry applications can use. As part of this project, we plan to develop a Java-based parallel I/O API inspired by the MPI-IO bindings (MPI 2.0 standard document) for C, C++, and Fortran. Once the Java equivalent of the MPI-IO API has been developed, we will develop a reference implementation on top of existing Java messaging libraries. Later, we will evaluate and compare the performance of our reference Java parallel I/O library with its C/C++ counterparts using benchmarks and real-world applications.
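To make the access pattern concrete, here is a minimal sketch of several writers filling disjoint regions of one shared file. It is not the API proposed in this work, and it is not MPI-IO: it assumes that threads in a single JVM stand in for the processes of a distributed-memory job, and it uses only the standard java.nio FileChannel positional write. The class name SharedFileWriteSketch and its methods are hypothetical, chosen for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: threads play the role of ranks (an assumption,
// real parallel I/O involves separate processes on separate nodes).
class SharedFileWriteSketch {

    // Write blockSize bytes, each equal to the rank id, at this rank's own
    // offset. A positional write does not move the channel's shared position,
    // so writers targeting disjoint regions do not interfere with each other.
    static void writeBlock(FileChannel channel, int rank, int blockSize) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(blockSize);
        for (int i = 0; i < blockSize; i++) {
            buf.put((byte) rank);
        }
        buf.flip();
        long offset = (long) rank * blockSize;
        while (buf.hasRemaining()) {
            offset += channel.write(buf, offset);
        }
    }

    // Launch one writer per simulated rank against a single temporary file,
    // wait for all of them, then return the file contents for inspection.
    static byte[] writeAndReadBack(int ranks, int blockSize) {
        ExecutorService pool = Executors.newFixedThreadPool(ranks);
        try {
            Path file = Files.createTempFile("shared", ".bin");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                List<Future<Void>> pending = new ArrayList<>();
                for (int rank = 0; rank < ranks; rank++) {
                    final int r = rank;
                    Callable<Void> task = () -> {
                        writeBlock(channel, r, blockSize);
                        return null;
                    };
                    pending.add(pool.submit(task));
                }
                for (Future<Void> f : pending) {
                    f.get(); // rethrows any writer failure
                }
            }
            byte[] data = Files.readAllBytes(file);
            Files.delete(file);
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        byte[] data = writeAndReadBack(4, 8);
        System.out.println("bytes in shared file: " + data.length);
    }
}
```

Each writer computes its offset from its rank, which mirrors the explicit-offset style of access found in MPI-IO. A library of the kind described above would additionally have to coordinate separate processes across nodes, which this single-JVM sketch does not attempt.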
