Nova-LSM: A Distributed, Component-based LSM-tree Key-value Store

04/03/2021
by Haoyu Huang, et al.

The cloud infrastructure motivates disaggregation of monolithic data stores into components that are assembled together based on an application's workload. This study investigates disaggregation of an LSM-tree key-value store into components that communicate using RDMA. These components separate storage from processing, enabling processing components to share storage bandwidth and space. The processing components scatter blocks of a file (SSTable) across an arbitrary number of storage components and balance load across them using power-of-d. They construct ranges dynamically at runtime to parallelize compaction and enhance performance. Each component has configuration knobs that control its scalability. The resulting component-based system, Nova-LSM, is elastic. It outperforms its monolithic counterparts, both LevelDB and RocksDB, by several orders of magnitude with workloads that exhibit a skewed pattern of access to data.
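The power-of-d placement mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each storage component's load is simply the number of blocks already assigned to it; the abstract does not say which load metric Nova-LSM uses, and the names `pick_storage_component` and `scatter_blocks` are hypothetical, not taken from the system. For each block, the sketch samples d storage components at random and writes to the least loaded of them.

```python
import random


def pick_storage_component(loads, d, rng):
    """Sample d distinct components at random and return the least-loaded one."""
    candidates = rng.sample(range(len(loads)), d)
    return min(candidates, key=lambda i: loads[i])


def scatter_blocks(num_blocks, num_components, d=2, seed=0):
    """Assign each block of a file to a storage component using power-of-d.

    Assumption: load is the count of blocks already placed on a component.
    Returns (placement, loads), where placement[i] is the component chosen
    for block i and loads[j] is the final block count on component j.
    """
    rng = random.Random(seed)
    loads = [0] * num_components
    placement = []
    for _ in range(num_blocks):
        target = pick_storage_component(loads, d, rng)
        loads[target] += 1
        placement.append(target)
    return placement, loads


if __name__ == "__main__":
    placement, loads = scatter_blocks(num_blocks=1000, num_components=10, d=2)
    print("blocks per storage component:", loads)
```

With d = 1 this degenerates to purely random placement, and with d equal to the number of components it always picks the globally least-loaded one. Small values such as d = 2 are the usual compromise, since they probe only a few components per block while still evening out load noticeably.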
