Understanding and taming SSD read performance variability: HDFS case study

03/22/2019
by María F. Borge, et al.

In this paper we analyze the influence that lower layers (file system, OS, SSD) have on HDFS' ability to extract maximum performance from SSDs on the read path. We uncover and analyze three surprising performance slowdowns induced by lower layers that result in HDFS read throughput loss. First, intrinsic slowdown affects reads from every new file system extent for a variable amount of time. Second, temporal slowdown appears temporarily and periodically and is workload-agnostic. Third, in permanent slowdown, some files can individually and permanently become slower after a period of time. We analyze the impact of these slowdowns on HDFS and show significant throughput loss. Individually, each of the slowdowns can cause a read throughput loss of 10-15%, and their effect is cumulative. When all slowdowns happen concurrently, read throughput drops by as much as 30%. We show that two of the three slowdowns could be addressed via increased IO request parallelism in the lower layers. Unfortunately, HDFS cannot automatically adapt to use such additional parallelism. Our results point to a need for adaptability in storage stacks. The reason is that an access pattern that maximizes performance in the common case is not necessarily the same one that can mask performance fluctuations.
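
To make "increased IO request parallelism" concrete, the following is a minimal sketch, not code from the paper or from HDFS. It contrasts reading a file with one outstanding request at a time against reading it with several requests in flight. The function names `read_sequential` and `read_parallel`, the chunk size, and the worker count are all hypothetical choices for illustration. It assumes a POSIX system (`os.pread`) and a regular file, where a read returns a short buffer only at end of file.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_sequential(path, chunk_size):
    """Read the file with a single outstanding request at a time."""
    parts = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            buf = os.pread(fd, chunk_size, offset)
            if not buf:  # empty result means end of file
                break
            parts.append(buf)
            offset += len(buf)
    finally:
        os.close(fd)
    return b"".join(parts)


def read_parallel(path, chunk_size, workers):
    """Read the same file with up to `workers` requests in flight."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map yields results in offset order, so the chunks
            # can be concatenated directly.
            chunks = pool.map(lambda off: os.pread(fd, chunk_size, off), offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)
```

Both functions return identical bytes; only the number of concurrent requests differs. Any throughput difference would depend on the device and on whether the data is already in the page cache, so this sketch shows the access pattern rather than reproducing the measurements described above.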
