Parallelizing Convergent Cross Mapping Using Apache Spark

05/02/2019
by Bo Pu, et al.

Identifying causal relationships between subjects or variables remains an important problem across scientific fields. It is especially challenging in complex systems, such as those involving human behavior, sociotechnical contexts, and natural ecosystems. Convergent cross mapping (CCM) addresses this problem by exploiting state space reconstruction via lagged embedding of time series. While powerful, CCM is computationally costly, and its results are highly sensitive to several parameter values. Best practice entails exploring a range of parameter settings when assessing causal relationships, but the resulting computational burden can raise barriers to practical use, especially for long time series exhibiting weak causal linkages. We demonstrate here several means of accelerating CCM by harnessing the distributed Apache Spark platform. We characterize and report on the results of several experiments with parallelized solutions that demonstrate high scalability and a capacity for more than an order of magnitude performance improvement relative to the baseline configuration. Such economies in computation time can speed learning and robust identification of causal drivers in complex systems.
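To make the abstract's cost argument concrete, the following is a minimal, single-machine sketch of the cross-mapping step that CCM repeats for every parameter setting. It is not the authors' Spark implementation. It assumes the commonly described simplex-style estimator (E + 1 nearest neighbours with exponential distance weights), and the names `lagged_embedding`, `cross_map_skill`, `sweep`, and `coupled_logistic`, as well as the toy system's coefficients, are hypothetical choices for illustration. The convergence check over increasing library sizes is omitted for brevity.

```python
import math
from itertools import product


def lagged_embedding(x, E, tau):
    """Lag vectors (x[t], x[t-tau], ..., x[t-(E-1)*tau]) for every valid t."""
    start = (E - 1) * tau
    return [tuple(x[t - j * tau] for j in range(E)) for t in range(start, len(x))]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def cross_map_skill(x, y, E, tau):
    """Estimate y from the shadow manifold of x; return correlation with observed y."""
    manifold = lagged_embedding(x, E, tau)
    target = y[(E - 1) * tau:]  # y values aligned with the lag vectors
    estimates = []
    for i, point in enumerate(manifold):
        # E + 1 nearest neighbours on the manifold, excluding the point itself
        nearest = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(manifold) if j != i
        )[:E + 1]
        d_min = max(nearest[0][0], 1e-12)  # guard against zero distance
        weights = [math.exp(-d / d_min) for d, _ in nearest]
        total = sum(weights)
        estimates.append(
            sum(w * target[j] for w, (_, j) in zip(weights, nearest)) / total
        )
    return pearson(estimates, target)


def sweep(x, y, E_values, tau_values):
    """Evaluate every (E, tau) setting; each evaluation is independent of the others."""
    grid = list(product(E_values, tau_values))
    skills = map(lambda p: cross_map_skill(x, y, *p), grid)
    return dict(zip(grid, skills))


def coupled_logistic(n, x0=0.4, y0=0.2):
    """Toy pair of coupled logistic maps (illustrative coefficients only)."""
    xs, ys = [x0], [y0]
    for _ in range(n - 1):
        x, y = xs[-1], ys[-1]
        xs.append(x * (3.8 - 3.8 * x - 0.02 * y))
        ys.append(y * (3.5 - 3.5 * y - 0.1 * x))
    return xs, ys


if __name__ == "__main__":
    xs, ys = coupled_logistic(300)
    for (E, tau), skill in sweep(xs, ys, [2, 3, 4], [1, 2]).items():
        print(f"E={E} tau={tau} skill={skill:.3f}")
```

The neighbour search is quadratic in series length and is repeated for every parameter setting, which is where the cost grows. Because the settings in `sweep` do not depend on one another, the built-in `map` could in principle be swapped for a distributed one, for example PySpark's `SparkContext.parallelize(grid).map(...)`. Whether that matches the parallelization strategies evaluated in the paper is not stated in the abstract.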


research
06/22/2023

Causal discovery for time series from multiple datasets with latent contexts

Causal discovery from time series data is a typical problem setting acro...
research
05/26/2019

Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models

In many scientific fields, such as economics and neuroscience, we are of...
research
10/23/2021

Path Signature Area-Based Causal Discovery in Coupled Time Series

Coupled dynamical systems are frequently observed in nature, but often n...
research
09/07/2022

Causal discovery for time series with latent confounders

Reconstructing the causal relationships behind the phenomena we observe ...
research
05/05/2023

Causal Discovery with Stage Variables for Health Time Series

Using observational data to learn causal relationships is essential when...
research
09/05/2023

Causal Structure Recovery of Linear Dynamical Systems: An FFT based Approach

Learning causal effects from data is a fundamental and well-studied prob...
research
04/16/2021

Shadow-Mapping for Unsupervised Neural Causal Discovery

An important goal across most scientific fields is the discovery of caus...
