Streaming Temporal Graphs: Subgraph Matching

04/01/2020
by Eric L. Goodman, et al.

We investigate solutions to subgraph matching within a temporal stream of data. We present a high-level language for describing temporal subgraphs of interest, the Streaming Analytics Language (SAL). SAL programs are translated into C++ code that is run in parallel on a cluster. We call this implementation of SAL the Streaming Analytics Machine (SAM). SAL programs are succinct, requiring about 20 times fewer lines of code than using the SAM library directly or writing an implementation using Apache Flink. To benchmark SAM, we find temporal triangles within streaming netflow data and compare SAM to an implementation written for Flink. We find that SAM is able to scale to 128 nodes, or 2560 cores, while Apache Flink reaches maximum throughput at 32 nodes and degrades thereafter. Apache Flink has an advantage when triangles are rare: its maximum aggregate throughput at 32 nodes is greater than the maximum achievable rate of SAM. In our experiments, when triangles occurred at a rate greater than five per second per node, SAM performed better. Both frameworks may miss results due to latencies in network communication. SAM consistently reported an average of 93.7% of expected results as we increased to the maximum size of the cluster. Overall, SAM can obtain rates of 91.8 billion netflows per day.
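To make the benchmark task concrete, here is a minimal single-process sketch of temporal triangle matching over an edge stream. It is not SAL or SAM code, and it is not distributed. The triangle definition used here (three directed edges a -> b, b -> c, c -> a with strictly increasing timestamps, distinct vertices, and the whole pattern inside a fixed time window) is an assumption made for illustration, as are the function name `find_temporal_triangles`, the `(time, src, dst)` tuple layout, and the `window` parameter.

```python
from collections import defaultdict, deque


def find_temporal_triangles(edges, window):
    """Yield (e1, e2, e3) triangles from edges given in nondecreasing time order.

    Each edge is a (time, src, dst) tuple. A match is a -> b, b -> c, c -> a
    with e1.time < e2.time < e3.time and e3.time - e1.time <= window.
    (Assumed definition for illustration only.)
    """
    recent = deque()              # every live edge, oldest first
    by_src = defaultdict(deque)   # src -> deque of (time, dst), oldest first

    for t3, c, a in edges:
        # Expire edges that are too old to belong to a triangle closing now.
        # The globally oldest edge is also the oldest edge for its source,
        # so popping from the left of both deques keeps them in sync.
        while recent and recent[0][0] < t3 - window:
            _, old_src, _ = recent.popleft()
            by_src[old_src].popleft()

        # Treat the new edge as the closing edge c -> a.
        # Look for a first edge a -> b and a second edge b -> c.
        for t1, b in by_src.get(a, ()):
            for t2, d in by_src.get(b, ()):
                if d == c and t1 < t2 < t3 and len({a, b, c}) == 3:
                    yield (t1, a, b), (t2, b, c), (t3, c, a)

        # Store the new edge so it can act as e1 or e2 for later arrivals.
        recent.append((t3, c, a))
        by_src[c].append((t3, a))


if __name__ == "__main__":
    stream = [
        (1, "A", "B"),
        (2, "B", "C"),
        (3, "C", "A"),   # closes a triangle within the window
        (20, "A", "B"),
        (21, "B", "C"),
        (40, "C", "A"),  # too late: the first edge has expired
    ]
    for triangle in find_temporal_triangles(stream, window=10):
        print(triangle)
```

The sketch keeps only edges inside the window, which is what bounds memory on an unbounded stream. A distributed system additionally has to route partial matches between nodes, which is where the network latencies mentioned in the abstract can cause missed results.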

research
11/03/2019

A Streaming Analytics Language for Processing Cyber Data

We present a domain-specific language called SAL (the Streaming Analytics...
research
01/24/2018

A Chronological Edge-Driven Approach to Temporal Subgraph Isomorphism

Many real world networks are considered temporal networks, in which the ...
research
01/28/2018

Time Constrained Continuous Subgraph Search over Streaming Graphs

The growing popularity of dynamic applications such as social networks p...
research
06/20/2022

Mnemonic: A Parallel Subgraph Matching System for Streaming Graphs

Finding patterns in large highly connected datasets is critical for valu...
research
07/31/2019

Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark

Real-world data from diverse domains require real-time scalable analysis...
research
09/13/2021

Maximum Matching sans Maximal Matching: A New Approach for Finding Maximum Matchings in the Data Stream Model

The problem of finding a maximum size matching in a graph (known as the ...
research
06/12/2016

Automated Space/Time Scaling of Streaming Task Graph

In this paper, we describe a high-level synthesis (HLS) tool that automa...
