A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems

10/11/2018
by Liang Li, et al.

In-memory databases (IMDBs) are gaining popularity in big data applications, where clients commit updates intensively. IMDBs need efficient snapshot performance to support applications such as consistent checkpointing and HTAP. Formally, the in-memory consistent snapshot problem refers to taking a consistent point-in-time snapshot in memory under two constraints: 1) clients can read the latest data items, and 2) no data item in the snapshot is overwritten. Various snapshot algorithms have been proposed in academia to trade off throughput and latency, but industrial IMDBs such as Redis adhere to the simple fork algorithm. To understand this phenomenon, we conduct comprehensive performance evaluations of mainstream snapshot algorithms. Surprisingly, we observe that the simple fork algorithm indeed outperforms the state-of-the-art algorithms in update-intensive workload scenarios. On this basis, we identify the drawbacks of existing research and propose two lightweight improvements. Extensive evaluations on synthetic data and Redis show that our lightweight improvements yield better performance than fork, the current industrial standard, and than the representative snapshot algorithms from academia. Finally, we have open-sourced the implementation of all the above snapshot algorithms so that practitioners can benchmark each algorithm and select appropriate methods for different application scenarios.
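The two constraints in the problem statement can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: it uses a simple copy-on-write scheme over a Python dictionary, and it is not claimed to be the fork algorithm, the proposed improvements, or any other algorithm evaluated in the paper. The class `SnapshotStore` and all of its method names are hypothetical.

```python
from typing import Dict, Optional


class SnapshotStore:
    """Toy key-value store with one copy-on-write snapshot (hypothetical names)."""

    def __init__(self) -> None:
        self._live: Dict[str, str] = {}
        # Pre-snapshot values of keys written since the snapshot began.
        # None marks a key that did not exist at snapshot time.
        self._saved: Dict[str, Optional[str]] = {}
        self._snapshot_active = False

    def begin_snapshot(self) -> None:
        """Fix the snapshot point; nothing is copied yet."""
        self._saved = {}
        self._snapshot_active = True

    def end_snapshot(self) -> None:
        """Release the preserved old values."""
        self._saved = {}
        self._snapshot_active = False

    def put(self, key: str, value: str) -> None:
        # Constraint 2: before the first overwrite of a key during a
        # snapshot, preserve the value it had at snapshot time.
        if self._snapshot_active and key not in self._saved:
            self._saved[key] = self._live.get(key)
        self._live[key] = value

    def get(self, key: str) -> Optional[str]:
        # Constraint 1: clients always read the latest value.
        return self._live.get(key)

    def snapshot_view(self) -> Dict[str, str]:
        """Rebuild the state as it was when begin_snapshot() was called."""
        if not self._snapshot_active:
            raise RuntimeError("no snapshot in progress")
        view = dict(self._live)
        for key, old in self._saved.items():
            if old is None:
                view.pop(key, None)  # key was created after the snapshot point
            else:
                view[key] = old      # restore the pre-snapshot value
        return view


store = SnapshotStore()
store.put("a", "1")
store.put("b", "2")
store.begin_snapshot()
store.put("a", "10")  # update arriving while the snapshot is in progress
store.put("c", "3")   # insert arriving while the snapshot is in progress
latest_a = store.get("a")        # "10": the client sees the latest value
snapshot = store.snapshot_view() # {"a": "1", "b": "2"}: the snapshot is unchanged
```

In this sketch the extra work happens on the update path, once per key per snapshot, which hints at why update-intensive workloads are the interesting case for comparing snapshot algorithms.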

