DisTRaC: Accelerating High Performance Compute Processing for Temporary Data Storage

12/06/2022
by   Gabryel Mason-Williams, et al.

High Performance Compute (HPC) clusters often produce intermediate files as part of code execution, and message passing cannot always be used to supply data to these cluster jobs. In these cases, I/O goes back to central distributed storage to allow cross-node data sharing. These storage systems are often high performance and are characterised by a high cost per TB and a sensitivity to workload type, such as being tuned for either small or large file I/O. However, compute nodes often have large amounts of RAM, so for intermediate files, where the longevity and reliability of the storage are less important, local RAM disks can be used to obtain performance benefits. In this paper we show how this problem was tackled by creating a RAM block that could interact with the object storage system Ceph, as well as a deployment tool to deploy Ceph on HPC infrastructure effectively. This work resulted in a system that was more performant than the central high performance distributed storage system used at Diamond, reducing I/O overhead and processing time for Savu, a tomography data processing application, by 81.04

research
11/16/2022

Performance Comparison of DAOS and Lustre for Object Data Storage Approaches

High-performance object stores are an emerging technology which offers a...
research
12/09/2017

Code Generation Techniques for Raw Data Processing

The motivation of the current study was to design an algorithm that can ...
research
12/20/2016

CannyFS: Opportunistically Maximizing I/O Throughput Exploiting the Transactional Nature of Batch-Mode Data Processing

We introduce a user mode file system, CannyFS, that hides latency by ass...
research
09/03/2019

Large Scale Parallelization Using File-Based Communications

In this paper, we present a novel and new file-based communication archi...
research
01/01/2020

AIR – A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing

Distributed Stream Processing Systems (DSPSs) are among the currently mo...
research
12/04/2021

Towards Aggregated Asynchronous Checkpointing

High-Performance Computing (HPC) applications need to checkpoint massive...
research
01/31/2022

Fragmented ARES: Dynamic Storage for Large Objects

Data availability is one of the most important features in distributed s...
