DeepAI AI Chat
Log In Sign Up

From NoSQL Accumulo to NewSQL Graphulo: Design and Utility of Graph Algorithms inside a BigTable Database

by   Dylan Hutchison, et al.
University of Washington

Google BigTable's scale-out design for distributed key-value storage inspired a generation of NoSQL databases. Recently the NewSQL paradigm emerged in response to analytic workloads that demand distributed computation local to data storage. Many such analytics take the form of graph algorithms, a trend that motivated the GraphBLAS initiative to standardize a set of matrix math kernels for building graph algorithms. In this article we show how it is possible to implement the GraphBLAS kernels in a BigTable database by presenting the design of Graphulo, a library for executing graph algorithms inside the Apache Accumulo database. We detail the Graphulo implementation of two graph algorithms and conduct experiments comparing their performance to two main-memory matrix math systems. Our results shed insight into the conditions that determine when executing a graph algorithm is faster inside a database versus an external system---in short, that memory requirements and relative I/O are critical factors.


page 1

page 2

page 3

page 4


System G Distributed Graph Database

Motivated by the need to extract knowledge and value from interconnected...

Distributed Triangle Counting in the Graphulo Matrix Math Library

Triangle counting is a key algorithm for large graph analysis. The Graph...

Slim Graph: Practical Lossy Graph Compression for Approximate Graph Processing, Storage, and Analytics

We propose Slim Graph: the first programming model and framework for pra...

An introduction to Graph Data Management

A graph database is a database where the data structures for the schema ...

Benchmarking the Graphulo Processing Framework

Graph algorithms have wide applicablity to a variety of domains and are ...

Modeling Corruption in Eventually-Consistent Graph Databases

We present a model and analysis of an eventually consistent graph databa...

Distributed graphs: in search of fast, low-latency, resource-efficient, semantics-rich Big-Data processing

Large graphs can be processed with single high-memory or distributed sys...