Counting Causal Paths in Big Times Series Data on Networks

05/27/2019
by Luka V. Petrovic, et al.

Graph or network representations are an important foundation for data mining and machine learning tasks in relational data. Many tools of network analysis, such as centrality measures, information ranking, and cluster detection, rest on the assumption that links capture direct influence and that paths represent possible indirect influence. This assumption is invalidated in time-stamped network data capturing, e.g., dynamic social networks, biological sequences, or financial transactions. In such data, for two time-stamped links (A,B) and (B,C), the chronological ordering and timing determine whether a causal path from node A via B to C exists. A number of works have shown that, for this reason, network analysis cannot be directly applied to time-stamped network data. Existing methods that address this issue require statistics on causal paths, which are computationally challenging to obtain for big data sets. Addressing this problem, we develop an efficient algorithm to count causal paths in time-stamped network data. Applying it to empirical data, we show that our method is more efficient than a baseline method implemented in an open-source data analytics package. Our method works efficiently for different values of the maximum time difference between consecutive links of a causal path and supports streaming scenarios. With it, we close a gap that hinders the efficient analysis of big time series data on complex networks.
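To make the notion of a causal path concrete, the following is a minimal brute-force sketch, not the algorithm developed in the paper. It assumes that links are given as (source, target, time) triples and that a link (B,C) at time t2 can continue a link (A,B) at time t1 only if 0 < t2 - t1 <= delta, where delta stands for the maximum time difference mentioned above. The function name `count_causal_paths` and the parameter `max_links` are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict


def count_causal_paths(links, delta, max_links=2):
    """Naively count causal path instances with up to `max_links` links.

    links: iterable of (source, target, time) triples.
    delta: assumed maximum time difference between consecutive links.
    Returns a Counter mapping node tuples, e.g. ("A", "B", "C"), to counts.
    """
    links = list(links)

    # Index links by source node so path extensions can be looked up quickly.
    out_links = defaultdict(list)
    for src, dst, t in links:
        out_links[src].append((dst, t))

    counts = Counter()
    # Each frontier entry is (path so far, time stamp of its last link).
    frontier = [((src, dst), t) for src, dst, t in links]
    for _ in range(max_links):
        next_frontier = []
        for path, t_last in frontier:
            counts[path] += 1
            for dst, t in out_links.get(path[-1], []):
                # The next link must occur strictly later, within delta.
                if 0 < t - t_last <= delta:
                    next_frontier.append((path + (dst,), t))
        frontier = next_frontier
    return counts


example_links = [("A", "B", 1), ("B", "C", 2), ("B", "C", 5), ("C", "A", 1)]
paths = count_causal_paths(example_links, delta=2)
print(paths[("A", "B", "C")])  # A->B at t=1 is followed by B->C at t=2 only
print(paths[("B", "C", "A")])  # C->A happens before either B->C link
```

In this toy data, raising delta to 4 also admits the link (B,C) at time 5 as a continuation of (A,B), which illustrates why the count depends on the chosen maximum time difference. The sketch enumerates every path instance explicitly, so its cost grows quickly with data size; it is intended only to illustrate the definition, not to be efficient.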

08/16/2019

Higher-Order Visualization of Causal Structures in Dynamic Graphs

Graph drawing and visualisation techniques are important tools for the e...
12/18/2019

Variable-lag Granger Causality for Time Series Analysis

Granger causality is a fundamental technique for causal inference in tim...
11/27/2019

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

We apply methods from randomized numerical linear algebra (RandNLA) to d...
03/16/2021

Rollage: Efficient Rolling Average Algorithm to Estimate ARMA Models for Big Time Series Data

We develop a new method to estimate an ARMA model in the presence of big...
06/24/2019

AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version

Recent development in computing, sensing and crowd-sourced data have res...