Generating Realistic Production Data for Benchmarking Causal Discovery

06/19/2023
by Konstantin Göbler, et al.

Algorithms for causal discovery have recently undergone rapid advances and increasingly draw on flexible nonparametric methods to process complex data. With these advances comes a need for adequate empirical validation of the causal relationships learned by different algorithms. However, for most real data sources the true causal relations remain unknown. This issue is further compounded by privacy concerns surrounding the release of suitable high-quality data. To help address these challenges, we gather a complex dataset comprising measurements from an assembly line in a manufacturing context. This line consists of numerous physical processes for which we are able to provide ground truth causal relationships on the basis of a detailed study of the underlying physics. We use the assembly line data and associated ground truth information to build a system for generating semisynthetic manufacturing data that supports benchmarking of causal discovery methods. To accomplish this, we employ distributional random forests to flexibly estimate and represent conditional distributions, which may then be combined into joint distributions that strictly adhere to a causal model over the observed variables. The estimated conditionals and tools for data generation are made available in our Python library. Using the library, we showcase how to benchmark several well-known causal discovery algorithms.
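The idea of combining estimated conditionals into a joint distribution that respects a known causal graph can be illustrated with a small sketch. This is not the authors' library or its API. It assumes a hypothetical three-variable graph (x, y, z) with toy data, and it swaps the distributional random forest for a much simpler kernel-weighted resampler. The two share one property: each represents a conditional distribution as weights over observed training rows. All function and variable names below are made up for illustration.

```python
import math
import random
from graphlib import TopologicalSorter


def fit_conditional(data, child, parents, bandwidth=0.3):
    """Return a sampler for child | parents.

    Stand-in for a distributional random forest (assumption): the
    conditional is represented as Gaussian-kernel weights over the
    observed rows, and a draw resamples an observed child value.
    """
    rows = [([r[p] for p in parents], r[child]) for r in data]
    values = [y for _, y in rows]

    def sample(parent_values, rng):
        if not parents:
            # Root node: resample from the empirical marginal.
            return rng.choice(values)
        dists = [sum((a - b) ** 2 for a, b in zip(x, parent_values))
                 for x, _ in rows]
        d_min = min(dists)  # shift so the largest weight is exactly 1
        weights = [math.exp(-(d - d_min) / (2 * bandwidth ** 2))
                   for d in dists]
        return rng.choices(values, weights=weights, k=1)[0]

    return sample


def generate(parents_of, samplers, n, seed=0):
    """Ancestral sampling: draw each node after all of its parents."""
    rng = random.Random(seed)
    order = list(TopologicalSorter(parents_of).static_order())
    out = []
    for _ in range(n):
        row = {}
        for node in order:
            parent_values = [row[p] for p in parents_of[node]]
            row[node] = samplers[node](parent_values, rng)
        out.append(row)
    return out


# Hypothetical ground-truth graph: node -> list of its parents.
parents_of = {"x": [], "y": ["x"], "z": ["x", "y"]}

# Toy "observed" data consistent with that graph (illustration only).
data_rng = random.Random(1)
observed = []
for _ in range(300):
    x = data_rng.gauss(0, 1)
    y = 2 * x + data_rng.gauss(0, 0.1)
    z = x - y + data_rng.gauss(0, 0.1)
    observed.append({"x": x, "y": y, "z": z})

samplers = {node: fit_conditional(observed, node, parents)
            for node, parents in parents_of.items()}
synthetic = generate(parents_of, samplers, n=200, seed=42)
```

Because every variable is drawn only from its parents' values in topological order, the generated joint distribution factorizes according to the assumed graph by construction. That property is what makes such data usable as a benchmark with a known ground truth: a discovery algorithm's output can be compared against `parents_of`.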


research
11/30/2020

RealCause: Realistic Causal Inference Benchmarking

There are many different causal effect estimators in causal inference. H...
research
08/02/2022

CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods

Causal relationships are commonly examined in manufacturing processes to...
research
05/05/2020

A Ladder of Causal Distances

Causal discovery, the task of automatically constructing a causal model ...
research
07/18/2023

Self-Compatibility: Evaluating Causal Discovery without Ground Truth

As causal ground truth is incredibly rare, causal discovery algorithms a...
research
06/04/2019

Neuropathic Pain Diagnosis Simulator for Causal Discovery Algorithm Evaluation

Discovery of causal relations from observational data is essential for m...
research
02/10/2023

On the Interventional Kullback-Leibler Divergence

Modern machine learning approaches excel in static settings where a larg...
research
11/29/2021

Encoding Causal Macrovariables

In many scientific disciplines, coarse-grained causal models are used to...
