In Search of Optimal Data Placement for Eliminating Write Amplification in Log-Structured Storage

04/26/2021
by Qiuping Wang, et al.
Log-structured storage is widely deployed across storage systems for its high performance. However, its garbage collection (GC) incurs severe write amplification (WA) because live data is frequently rewritten. This has motivated many research studies, particularly on data placement strategies, that mitigate WA in log-structured storage. We show how to design an optimal data placement scheme that achieves the minimum WA given future knowledge of the block invalidation time (BIT) of each written block. Guided by this observation, we propose InferBIT, a novel data placement algorithm that aims to minimize WA in log-structured storage. Its core idea is to infer the BITs of written blocks from the underlying storage workloads, and to place blocks with similar estimated BITs into the same group in a fine-grained manner. We show via both mathematical and trace analyses that InferBIT can infer the BITs by leveraging the write skewness of real-world storage workloads. Evaluation on block-level I/O traces from real-world cloud block storage workloads shows that InferBIT achieves a lower WA than eight state-of-the-art data placement schemes.
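
To make the notion of BIT-based placement concrete, the following is a minimal sketch and not the InferBIT algorithm itself. It assumes a trace given as a list of logical block addresses, measures a block's BIT as the index of the next write to the same address (an oracle with full future knowledge), and groups writes whose BITs fall into the same fixed-size window. The function names, the `window` parameter, and the "never" group are hypothetical choices made for illustration. The last function uses the common definition of WA as total writes divided by user writes.

```python
import math
from collections import defaultdict


def block_invalidation_times(trace):
    """Return, for each write in `trace`, the index of the next write to the
    same LBA, or math.inf if the block is never overwritten.
    Assumption: time is measured in user-write indices."""
    bits = [math.inf] * len(trace)
    next_write = {}  # LBA -> index of the closest later write seen so far
    for i in range(len(trace) - 1, -1, -1):
        lba = trace[i]
        bits[i] = next_write.get(lba, math.inf)
        next_write[lba] = i
    return bits


def group_by_bit(bits, window):
    """Group write indices whose BITs fall into the same window of size
    `window` (hypothetical grouping rule). Blocks that are never
    invalidated go into a separate "never" group."""
    groups = defaultdict(list)
    for i, bit in enumerate(bits):
        key = "never" if bit == math.inf else bit // window
        groups[key].append(i)
    return dict(groups)


def write_amplification(user_writes, gc_writes):
    """WA as (user writes + GC rewrites) / user writes."""
    return (user_writes + gc_writes) / user_writes


trace = [1, 2, 1, 3, 2, 1]  # toy trace of LBAs, for illustration only
bits = block_invalidation_times(trace)
groups = group_by_bit(bits, window=2)
```

In this toy setting, blocks placed in the same group are invalidated at about the same time, so a segment filled from one group tends to hold little live data when it is collected. A practical scheme has no access to future writes and would have to estimate the BITs from the workload, as the abstract describes.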
