FT-LADS: Fault-Tolerant Object-Logging based Big Data Transfer System using Layout-Aware Data Scheduling

05/16/2018
by Preethika Kasu, et al.

The Layout-Aware Data Scheduler (LADS) data transfer tool identifies and addresses the issues that lead to congestion on the path of an end-to-end data transfer in terabit network environments. It exploits the underlying storage layout at each endpoint to maximize throughput without negatively impacting the performance of shared storage resources for other users. LADS can avoid congested storage elements within the shared storage resource, improving input/output bandwidth and hence data transfer rates across high-speed networks. However, the absence of fault tolerance (FT) support in LADS results in data retransmission overhead, along with possible integrity issues, upon errors. In this paper, we propose object-based logging methods to avoid retransmitting objects that have already been successfully written to the Parallel File System (PFS) at the sink end. Depending on the number of logger files created for the whole dataset, we classify our fault tolerance mechanisms into three categories: File logger, Transaction logger, and Universal logger. To address the space overhead of these object-based logging mechanisms, we also propose different methods of populating logger files with information about the completed objects. We evaluated the data transfer performance and recovery time overhead of the proposed object-based logging fault tolerance mechanisms on the LADS data transfer tool. Our experimental results show that LADS, in conjunction with the proposed object-based fault tolerance mechanisms, exhibits an overhead of less than 1 time overhead is around 10
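
The object-based logging idea described above can be illustrated with a minimal Python sketch. It is not the LADS or FT-LADS implementation: the class name `FileLogger`, the one-bit-per-object bitmap encoding, and the `demo` helper are all hypothetical choices made for illustration, and the abstract does not specify how the paper's logger files are actually populated. The sketch only shows the general pattern of recording each object once the sink has written it, then reading the log after a fault so that completed objects are skipped.

```python
import os
import tempfile


class FileLogger:
    """Hypothetical per-file logger: one bit per object, set once the
    sink reports that the object has been written (assumed encoding)."""

    def __init__(self, log_path, num_objects):
        self.log_path = log_path
        self.num_objects = num_objects
        size = (num_objects + 7) // 8  # bytes needed for the bitmap
        if os.path.exists(log_path):
            # Recovery path: reload the bitmap left by an earlier run.
            with open(log_path, "rb") as f:
                data = f.read()
            self.bits = bytearray(data.ljust(size, b"\0")[:size])
        else:
            self.bits = bytearray(size)

    def mark_done(self, index):
        """Record that object `index` is safely written, then persist."""
        self.bits[index // 8] |= 1 << (index % 8)
        with open(self.log_path, "wb") as f:
            f.write(self.bits)
            f.flush()
            os.fsync(f.fileno())  # push the log to stable storage

    def is_done(self, index):
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def pending(self):
        """Objects that still have to be (re)transmitted."""
        return [i for i in range(self.num_objects) if not self.is_done(i)]


def demo():
    """Simulate a fault after three objects and resume from the log."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file.log")
        log = FileLogger(path, 10)
        for i in (0, 1, 4):
            log.mark_done(i)
        # A new instance stands in for the restarted transfer.
        resumed = FileLogger(path, 10)
        return resumed.pending()
```

In this sketch the whole bitmap is rewritten on every update, which is simple but not tuned for throughput; a real transfer tool would presumably batch or append log updates.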

research
04/22/2019

Pangolin: A Fault-Tolerant Persistent Memory Programming Library

Non-volatile main memory (NVMM) allows programmers to build complex, per...
research
12/01/2017

DAOS for Extreme-scale Systems in Scientific Applications

Exascale I/O initiatives will require new and fully integrated I/O model...
research
05/01/2023

Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs

General Matrix Multiplication (GEMM) is a crucial algorithm for various ...
research
11/03/2018

Fast Integrity Verification for High-Speed File Transfers

The amount of data generated by scientific and commercial applications i...
research
01/11/2021

A Fault Tolerant Mechanism for Partitioning and Offloading Framework in Pervasive Environments

Application partitioning and code offloading are being researched extens...
research
04/30/2018

Improving Performance of Iterative Methods by Lossy Checkponting

Iterative methods are commonly used approaches to solve large, sparse li...
research
10/29/2019

Disaggregation and the Application

This paper examines disaggregated data center architectures from the per...
