
A Billion Updates per Second Using 30,000 Hierarchical In-Memory D4M Databases

02/03/2019
by Jeremy Kepner, et al.

Analyzing large-scale networks requires high-performance streaming updates of graph representations of these data. Associative arrays are mathematical objects combining properties of spreadsheets, databases, matrices, and graphs, and are well suited for representing and analyzing streaming network data. The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database. Associative arrays are designed for block updates, so streaming updates to a large associative array require a hierarchical implementation that makes efficient use of the memory hierarchy. Running 34,000 instances of hierarchical D4M associative arrays on 1,100 server nodes of the MIT SuperCloud achieved a sustained update rate of 1,900,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
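
The idea of a hierarchical associative array can be illustrated with a minimal Python sketch. It is not the D4M API: the class name `HierarchicalAssoc`, the `cuts` parameter, and the use of plain dictionaries keyed by (row, column) are assumptions made for illustration. The sketch assumes a cascade in which small blocks of updates land in a small first layer, and a layer is merged into the next, larger layer only when its entry count exceeds a threshold, so most updates touch only a small structure that is likely to stay in fast memory.

```python
from collections import defaultdict


class HierarchicalAssoc:
    """Toy hierarchical associative array: layers of sparse (row, col) -> value maps.

    Hypothetical illustration only; this is not the D4M library interface.
    """

    def __init__(self, cuts):
        # cuts[i] is the largest number of stored entries layer i may hold
        # before it is merged into layer i + 1. The last layer is unbounded.
        self.cuts = list(cuts)
        self.layers = [defaultdict(float) for _ in range(len(self.cuts) + 1)]

    def update(self, triples):
        """Add a block of (row, col, value) triples into the smallest layer."""
        first = self.layers[0]
        for row, col, val in triples:
            first[(row, col)] += val
        self._cascade()

    def _cascade(self):
        # Merge any over-full layer into the next, larger layer.
        # If a layer is within its cut, deeper layers were not touched, so stop.
        for i, cut in enumerate(self.cuts):
            if len(self.layers[i]) <= cut:
                break
            nxt = self.layers[i + 1]
            for key, val in self.layers[i].items():
                nxt[key] += val
            self.layers[i].clear()

    def total(self):
        """Sum all layers to recover the full associative array."""
        out = defaultdict(float)
        for layer in self.layers:
            for key, val in layer.items():
                out[key] += val
        return dict(out)


# Usage: two small blocks of streaming edge updates.
h = HierarchicalAssoc(cuts=[2, 4])
h.update([("a", "b", 1.0), ("a", "c", 1.0)])  # stays in layer 0
h.update([("a", "b", 1.0), ("d", "e", 1.0)])  # layer 0 exceeds its cut and is merged down
result = h.total()
```

Queries in this sketch must sum every layer, which is the cost traded for cheap updates; the thresholds in `cuts` would in practice be tuned to the sizes of the memory hierarchy levels.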


07/06/2019

Streaming 1.9 Billion Hypersparse Network Updates per Second with D4M

The Dynamic Distributed Dimensional Data Model (D4M) library implements ...

01/20/2020

75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS Matrices

The SuiteSparse GraphBLAS C-library implements high performance hyperspa...

08/15/2021

Vertical, Temporal, and Horizontal Scaling of Hierarchical Hypersparse GraphBLAS Matrices

Hypersparse matrices are a powerful enabler for a variety of network, he...

08/14/2016

Julia Implementation of the Dynamic Distributed Dimensional Data Model

Julia is a new language for writing data analysis programs that are easy...

09/24/2018

Optimality of Linear Sketching under Modular Updates

We study the relation between streaming algorithms and linear sketching ...

01/18/2020

AI Data Wrangling with Associative Arrays

The AI revolution is data driven. AI "data wrangling" is the process by ...

11/19/2021

Improving a High Productivity Data Analytics Chapel Framework

Most state of the art exploratory data analysis frameworks fall into one...