Improving I/O Performance for Exascale Applications through Online Data Layout Reorganization

07/15/2021
by Lipeng Wan, et al.

The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need to read and process the generated data efficiently introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and write performance. We demonstrate the benefits of using these two approaches for the ECP particle-in-cell simulation WarpX, which serves as a motif for a large class of important Exascale applications. We show that by understanding application I/O patterns and carefully designing data layouts we can increase read performance by more than 80
