Survey the storage systems used in HPC and BDA ecosystems

12/22/2021
by Priyam Shah, et al.

Advances in the HPC and BDA ecosystems demand a better understanding of storage systems in order to plan effective solutions. To let applications access data more efficiently for computation, the HPC and BDA ecosystems adopt different storage systems, each with its own pros and cons. It is therefore worthwhile to explore the storage systems used in HPC and in BDA, and to understand how such systems handle data consistency and fault tolerance at massive scale. In this paper, we survey four storage systems: Lustre, Ceph, HDFS, and CockroachDB. Lustre and HDFS are among the most prominent file systems in the HPC and BDA ecosystems. Ceph is an emerging file system that is being used by supercomputers. CockroachDB is a NewSQL database, an approach used in industry for BDA applications. The study helps us understand the underlying architecture of these storage systems and the building blocks used to create them. We also give an overview of the protocols and mechanisms used for data storage, data access, data consistency, fault tolerance, and recovery from failures. The comparative study will help system designers understand the key features and architectural goals of these storage systems and select better storage solutions.

research
01/20/2020

BAASH: Enabling Blockchain-as-a-Service on High-Performance Computing Systems

The state-of-the-art approach to manage blockchains is to process blocks...
research
12/01/2017

DAOS for Extreme-scale Systems in Scientific Applications

Exascale I/O initiatives will require new and fully integrated I/O model...
research
05/27/2021

Characterizing Impacts of Storage Faults on HPC Applications: A Methodology and Insights

In recent years, the increasing complexity in scientific simulations and...
research
11/27/2019

Dynamically Provisioning Cray DataWarp Storage

Complex applications and workflows needs are often exclusively expressed...
research
05/16/2018

Client-side Straggler-Aware I/O Scheduler for Object-based Parallel File Systems

Object-based parallel file systems have emerged as promising storage sol...
research
02/13/2022

Towards Decentralised Cloud Storage with IPFS: Opportunities, Challenges, and Future Directions

The InterPlanetary File System (IPFS) is a novel decentralised storage a...
research
10/24/2017

High-Performance Code Generation though Fusion and Vectorization

We present a technique for automatically transforming kernel-based compu...
