Tutorial: The Ubiquitous Skiplist, its Variants, and Applications in Modern Big Data Systems

The skiplist, or skip list, was originally designed as an in-memory data structure and has attracted considerable attention in recent years as a main-memory component of many NoSQL, cloud-based, and big data systems. Unlike the B-tree, the skiplist needs no complex rebalancing mechanism, yet it still offers expected logarithmic performance. It supports a variety of operations, including inserts, point reads, and range queries. To make the skiplist more versatile, many optimizations have been applied to its node structure, construction algorithm, list structure, and concurrent access, to name a few. Many variants of the skiplist have been proposed and evaluated in a range of big data system scenarios. Beyond its role as a main-memory component, the skiplist also serves as a core index in systems that address problems such as write amplification, write stalls, sorting, and range query processing. In this tutorial, we present a comprehensive overview of the skiplist, its variants and optimizations, and the ways in which big data and NoSQL systems make use of skiplists. Throughout, we demonstrate the advantages of using a skiplist or skiplist-like structure in modern data systems.
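
As a rough illustration of the operations named above (insert, point read, range query), here is a minimal single-threaded sketch of a textbook skiplist. It is not taken from the tutorial: the class name `SkipList`, the parameters `max_level`, `p`, and `seed`, and the helper `_find_predecessors` are all assumptions chosen for this example, and none of the variants or concurrency optimizations the tutorial surveys are shown.

```python
import random


class _Node:
    """One skiplist node; forward[i] is the next node at level i."""
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level


class SkipList:
    """Hypothetical minimal skiplist (illustrative, not the tutorial's code)."""

    def __init__(self, max_level=16, p=0.5, seed=None):
        self.max_level = max_level          # assumed cap on tower height
        self.p = p                          # assumed promotion probability
        self.level = 1                      # highest level currently in use
        self.head = _Node(None, None, max_level)  # sentinel, holds no key
        self._rng = random.Random(seed)

    def _random_level(self):
        # Flip coins: each success promotes the node one level higher.
        level = 1
        while level < self.max_level and self._rng.random() < self.p:
            level += 1
        return level

    def _find_predecessors(self, key):
        # For each level, find the last node whose key is < key.
        update = [self.head] * self.max_level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._find_predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value         # overwrite an existing key
            return
        level = self._random_level()
        self.level = max(self.level, level)
        node = _Node(key, value, level)
        # Splice the new node in at every level it occupies; no rebalancing.
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        # Point read: descend the levels, then check the bottom-level successor.
        node = self._find_predecessors(key)[0].forward[0]
        if node is not None and node.key == key:
            return node.value
        return default

    def range(self, lo, hi):
        # Range query: seek to lo, then walk the sorted bottom level up to hi.
        node = self._find_predecessors(lo)[0].forward[0]
        while node is not None and node.key <= hi:
            yield node.key, node.value
            node = node.forward[0]
```

For instance, after inserting the keys 5, 1, 9, 3, and 7, `get(3)` returns the value stored under 3, and `range(3, 7)` yields the entries for 3, 5, and 7 in key order. The insert path only splices pointers at randomly chosen levels, which is the sense in which no rebalancing is required.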
