Balancing Garbage Collection vs I/O Amplification using hybrid Key-Value Placement in LSM-based Key-Value Stores

06/07/2021
by Giorgos Xanthakis, et al.

Key-value (KV) separation is a technique that introduces randomness in I/O access patterns to reduce I/O amplification in LSM-based key-value stores on fast storage devices (NVMe). KV separation has a significant drawback that makes it less attractive: delete and, especially, update operations, which are important in modern workloads, result in frequent and expensive garbage collection (GC) in the value log. In this paper, we design and implement Parallax, which proposes a hybrid KV placement that significantly reduces GC overhead while maximizing the benefits of using a log. We first model the benefits of KV separation for different KV pair sizes. We use this model to classify KV pairs into three categories: small, medium, and large. Parallax then uses a different approach for each category: it always places large values in a log and small values in place. For medium values, it uses a mixed strategy that keeps the benefits of a log while eliminating GC overhead: it places medium values in a log for all but the last few (typically one or two) levels of the LSM structure, where it performs a full compaction, merges values in place, and reclaims log space without the need for GC. We evaluate Parallax against RocksDB, which places all values in place, and BlobDB, which always performs KV separation. We find that, compared to RocksDB and BlobDB respectively, Parallax increases throughput by up to 12.4x and 17.83x, decreases I/O amplification by up to 27.1x and 26x, and increases CPU efficiency by up to 18.7x and 28x, for all but scan-based YCSB workloads.
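
The placement policy described above can be illustrated with a minimal Python sketch. This is not Parallax's implementation: the size thresholds (`SMALL_MAX_BYTES`, `LARGE_MIN_BYTES`), the function names, and the 0-indexed level numbering are all assumptions made for illustration. The abstract only states that the classification comes from a model of KV separation benefits and that the last one or two levels hold medium values in place.

```python
from enum import Enum


class Category(Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


class Placement(Enum):
    IN_PLACE = "in_place"  # value stored together with its key in the LSM level
    LOG = "log"            # value stored in a separate value log


# Hypothetical thresholds, chosen only for illustration.
# The paper derives its categories from a model; these numbers are not from it.
SMALL_MAX_BYTES = 128
LARGE_MIN_BYTES = 1024


def classify(kv_size_bytes: int) -> Category:
    """Classify a KV pair by its size using the assumed thresholds."""
    if kv_size_bytes <= SMALL_MAX_BYTES:
        return Category.SMALL
    if kv_size_bytes >= LARGE_MIN_BYTES:
        return Category.LARGE
    return Category.MEDIUM


def place(category: Category, level: int, num_levels: int,
          in_place_levels: int = 1) -> Placement:
    """Decide where a value lives at a given LSM level (0 = first level).

    Small values are always in place and large values are always in the log.
    Medium values stay in the log except in the last `in_place_levels`
    levels, where they are merged in place.
    """
    if category is Category.SMALL:
        return Placement.IN_PLACE
    if category is Category.LARGE:
        return Placement.LOG
    first_in_place_level = num_levels - in_place_levels
    if level >= first_in_place_level:
        return Placement.IN_PLACE
    return Placement.LOG
```

For example, with five levels and the default of one in-place level, `place(classify(512), 0, 5)` yields `Placement.LOG`, while the same medium-sized pair at the last level, `place(classify(512), 4, 5)`, yields `Placement.IN_PLACE`. That last-level merge is the point at which the abstract says log space is reclaimed without a separate GC pass.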


