Approximate Integration of streaming data

09/13/2017
by Michel de Rougemont, et al.

We approximate analytic queries on streaming data using weighted reservoir sampling. For a stream of tuples from a data warehouse, we show how to approximate some OLAP queries. For a stream of graph edges from a social network, we approximate the communities as the large connected components of the edges in the reservoir. We show that, for a model of random graphs that follow a power-law degree distribution, the community detection algorithm is a good approximation. Given two streams of graph edges from two sources, we define the Community Correlation as the fraction of the nodes in communities in both streams. Although we do not store the edges of the streams, we can approximate the Community Correlation and define the Integration of two streams. We illustrate this approach with Twitter streams associated with TV programs.
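
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the abstract does not say how edges are weighted or which sampling scheme is used, so the sketch substitutes the Efraimidis-Spirakis key method (each edge gets the key u^(1/w) and the k largest keys are kept); "large" components are taken to be those with at least `min_size` nodes; and the Community Correlation is normalized Jaccard-style, as the number of nodes found in communities of both streams divided by the number found in communities of either. The function names `weighted_reservoir`, `large_components`, and `community_correlation` are hypothetical.

```python
import heapq
import random
from collections import defaultdict


def weighted_reservoir(edges, k, rng):
    """Keep at most k edges from a stream of (u, v, weight) triples.

    Assumption: Efraimidis-Spirakis keys, key = U ** (1 / weight),
    retaining the k edges with the largest keys. Weights must be > 0.
    """
    heap = []  # min-heap of (key, arrival index, edge)
    for i, (u, v, w) in enumerate(edges):
        key = rng.random() ** (1.0 / w)
        item = (key, i, (u, v))
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif key > heap[0][0]:
            # New edge beats the smallest key currently stored.
            heapq.heapreplace(heap, item)
    return [edge for _, _, edge in heap]


def large_components(edges, min_size):
    """Return connected components with at least min_size nodes (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return [g for g in groups.values() if len(g) >= min_size]


def community_correlation(comms_a, comms_b):
    """Assumed normalization: |A ∩ B| / |A ∪ B| over community nodes."""
    nodes_a = set().union(*comms_a)
    nodes_b = set().union(*comms_b)
    union = nodes_a | nodes_b
    return len(nodes_a & nodes_b) / len(union) if union else 0.0
```

To use it, run `weighted_reservoir` once per stream with a seeded `random.Random`, pass each reservoir to `large_components`, and compare the two community lists with `community_correlation`. Only the reservoirs are kept in memory, which matches the abstract's point that the stream edges themselves are not stored.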


research
12/24/2018

The content correlation of multiple streaming edges

We study how to detect clusters in a graph defined by a stream of edges,...
research
10/15/2020

Large Very Dense Subgraphs in a Stream of Edges

We study the detection and the reconstruction of a large very dense subg...
research
03/02/2022

Pattern Recognition and Event Detection on IoT Data-streams

Big data streams are possibly one of the most essential underlying notio...
research
11/30/2021

Connected Components for Infinite Graph Streams: Theory and Practice

Motivated by the properties of unending real-world cybersecurity streams...
research
12/22/2017

Estimating Node Similarity by Sampling Streaming Bipartite Graphs

Bipartite graph data increasingly occurs as a stream of edges that repre...
research
10/24/2019

Communication-Efficient (Weighted) Reservoir Sampling

We consider communication-efficient weighted and unweighted (uniform) ra...
research
08/19/2018

An incremental local-first community detection method for dynamic graphs

Community detections for large-scale real world networks have been more ...
