StreamBox-HBM: Stream Analytics on High Bandwidth Hybrid Memory

by Hongyu Miao, et al.

Stream analytics have an insatiable demand for memory and performance. Emerging hybrid memories combine commodity DDR4 DRAM with 3D-stacked High Bandwidth Memory (HBM) DRAM to meet such demands. However, achieving this promise is challenging because (1) HBM is capacity-limited and (2) HBM boosts performance best for workloads with sequential access and high parallelism. At first glance, stream analytics appear a particularly poor match for HBM because they have high capacity demands and because data grouping operations, their most demanding computations, use random access. This paper presents the design and implementation of StreamBox-HBM, a stream analytics engine that exploits hybrid memories to achieve scalable high performance. StreamBox-HBM performs data grouping with sequential-access sorting algorithms in HBM, in contrast to the random-access hashing algorithms commonly used in DRAM. StreamBox-HBM uses HBM solely to store Key Pointer Array (KPA) data structures, which contain only partial records (keys and pointers to full records) for grouping operations. It dynamically creates and manages prodigious data and pipeline parallelism, choosing when to allocate KPAs in HBM. It dynamically optimizes for both the high bandwidth and limited capacity of HBM and the limited bandwidth and high capacity of standard DRAM. StreamBox-HBM achieves 110 million records per second and 238 GB/s memory bandwidth while effectively utilizing all 64 cores of Intel's Knights Landing, a commercial server with hybrid memory. It outperforms stream engines with sequential-access algorithms without KPAs by 7x and stream engines with random-access algorithms by an order of magnitude in throughput. To the best of our knowledge, StreamBox-HBM is the first stream engine optimized for hybrid memories.
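
To make the KPA idea concrete, the following is a minimal sketch of sort-based grouping over a key/pointer array. It is an illustration only, not the StreamBox-HBM implementation: the function names (`build_kpa`, `group_sum_by_sort`), the record fields (`ad_id`, `clicks`, `payload`), and the use of list indices as "pointers" are all assumptions made for this example, and ordinary Python lists stand in for HBM and DRAM.

```python
from itertools import groupby
from operator import itemgetter


def build_kpa(records, key_field):
    """Extract compact (key, pointer) pairs from full records.

    Assumption: the 'pointer' is simply the record's list index.
    """
    return [(rec[key_field], idx) for idx, rec in enumerate(records)]


def group_sum_by_sort(records, key_field, value_field):
    """Group records by key by sorting the KPA, then scanning it once."""
    kpa = build_kpa(records, key_field)
    # Sort only the small (key, pointer) pairs, not the full records.
    kpa.sort(key=itemgetter(0))
    result = {}
    # After sorting, equal keys are adjacent, so one sequential pass
    # over the KPA finds every group.
    for key, run in groupby(kpa, key=itemgetter(0)):
        # Follow pointers back to the full records only when needed.
        result[key] = sum(records[idx][value_field] for _, idx in run)
    return result


# Hypothetical full records; in the paper's setting these would stay in DRAM.
records = [
    {"ad_id": 7, "clicks": 2, "payload": "a"},
    {"ad_id": 3, "clicks": 4, "payload": "b"},
    {"ad_id": 7, "clicks": 1, "payload": "c"},
    {"ad_id": 3, "clicks": 1, "payload": "d"},
]

totals = group_sum_by_sort(records, "ad_id", "clicks")
```

The point of the sketch is the access pattern: the sort and the scan touch only the compact key/pointer pairs in order, while the full records are dereferenced through pointers at the end. A hash-based grouping would instead probe a table at effectively random locations for each record.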


Analysis of Interference between RDMA and Local Access on Hybrid Memory System

We can use a hybrid memory system consisting of DRAM and Intel Optane DC...

BigSparse: High-performance external graph analytics

We present BigSparse, a fully external graph analytics system that picks...

Monarch: A Durable Polymorphic Memory For Data Intensive Applications

3D die stacking has often been proposed to build large-scale DRAM-based ...

Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM

Bitwise operations are an important component of modern day programming....

StreamBox-TZ: A Secure IoT Analytics Engine at the Edge

We present StreamBox-TZ, a stream analytics engine for an edge platform....

Semi-Asymmetric Parallel Graph Algorithms for NVRAMs

Emerging non-volatile main memory (NVRAM) technologies provide novel fea...

DimmWitted: A Study of Main-Memory Statistical Analytics

We perform the first study of the tradeoff space of access methods and r...
