Lethe: A Tunable Delete-Aware LSM Engine (Updated Version)

06/08/2020
by Subhadeep Sarkar, et al.

Data-intensive applications have fueled the evolution of log-structured merge (LSM) based key-value engines that employ the out-of-place paradigm to support high ingestion rates with low read/write interference. These benefits, however, come at the cost of treating deletes as second-class citizens. A delete inserts a tombstone that invalidates older instances of the deleted key. State-of-the-art LSM engines do not provide guarantees as to how fast a tombstone will propagate to persist the deletion. Further, LSM engines only support deletion on the sort key. To delete on another attribute (e.g., timestamp), the entire tree is read and re-written. We highlight that fast persistent deletion without affecting read performance is key to supporting: (i) streaming systems operating on a window of data, (ii) privacy with latency guarantees on the right-to-be-forgotten, and (iii) en masse cloud deployment of data systems that makes storage a precious resource. To address these challenges, in this paper, we build a new key-value storage engine, Lethe, that uses a very small amount of additional metadata, a set of new delete-aware compaction policies, and a new physical data layout that weaves the sort and the delete key order. We show that Lethe supports any user-defined threshold for the delete persistence latency while offering higher read throughput (1.17-1.4×) and lower space amplification (2.1-9.8×), with a modest increase in write amplification (between 4% and 25%). In addition, Lethe supports efficient range deletes on a secondary delete key by dropping entire data pages, without sacrificing read performance or employing a costly full tree merge.
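The tombstone mechanism and the idea of a delete persistence threshold can be illustrated with a toy model. The sketch below is not Lethe's implementation or its compaction policies: `ToyLSM`, `Entry`, and `enforce_threshold` are hypothetical names, runs are plain in-memory dictionaries, and the threshold is enforced with a full merge purely for brevity. It only shows that a delete is logically visible at once, while the invalidated data stays physically present until a compaction purges the tombstone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

TOMBSTONE = object()  # sentinel value marking a deleted key


@dataclass
class Entry:
    value: object      # user value, or TOMBSTONE
    written_at: float  # logical time at which the entry was inserted


class ToyLSM:
    """Toy model (hypothetical): runs[0] is the write buffer, runs[-1] the oldest run."""

    def __init__(self) -> None:
        self.runs: List[Dict[str, Entry]] = [{}]

    def put(self, key: str, value: object, now: float) -> None:
        self.runs[0][key] = Entry(value, now)

    def delete(self, key: str, now: float) -> None:
        # Out-of-place delete: insert a tombstone instead of touching older runs.
        self.runs[0][key] = Entry(TOMBSTONE, now)

    def flush(self) -> None:
        # Freeze the current buffer as a run and start a fresh buffer.
        self.runs.insert(0, {})

    def get(self, key: str) -> Optional[object]:
        # Search newest to oldest; the first match (value or tombstone) wins.
        for run in self.runs:
            if key in run:
                entry = run[key]
                return None if entry.value is TOMBSTONE else entry.value
        return None

    def physical_entries(self) -> int:
        # Number of entries physically stored, including invalidated ones.
        return sum(len(run) for run in self.runs)

    def oldest_tombstone_age(self, now: float) -> Optional[float]:
        ages = [now - e.written_at
                for run in self.runs
                for e in run.values()
                if e.value is TOMBSTONE]
        return max(ages) if ages else None

    def compact_all(self) -> None:
        merged: Dict[str, Entry] = {}
        for run in reversed(self.runs):  # oldest first, so newer entries overwrite
            merged.update(run)
        # After a full merge nothing older remains, so tombstones can be dropped.
        live = {k: e for k, e in merged.items() if e.value is not TOMBSTONE}
        self.runs = [{}, live]

    def enforce_threshold(self, now: float, threshold: float) -> bool:
        # Assumption for illustration: persist deletes with a full merge once
        # any tombstone is at least `threshold` old.
        age = self.oldest_tombstone_age(now)
        if age is not None and age >= threshold:
            self.compact_all()
            return True
        return False


db = ToyLSM()
db.put("a", 1, now=0.0)
db.put("b", 2, now=0.0)
db.flush()
db.delete("a", now=1.0)
before = db.physical_entries()   # "a", "b", and the tombstone for "a"
db.enforce_threshold(now=11.0, threshold=10.0)
after = db.physical_entries()    # only "b" remains on "disk"
```

In this toy, the full merge is the expensive step; the abstract's point is that an engine should bound how long a tombstone can linger without paying for such a merge on every delete.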


