Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation

02/27/2021
by Takeshi Teshima, et al.

Causal graphs (CGs) are compact representations of the data-generating processes behind data distributions. When a CG is available, e.g., from domain knowledge, we can infer the conditional independence (CI) relations that should hold in the data distribution. However, it is not straightforward to incorporate this knowledge into predictive modeling. In this work, we propose a model-agnostic data augmentation method that exploits the prior knowledge of the CI relations encoded in a CG for supervised machine learning. We theoretically justify the proposed method with an excess risk bound indicating that it suppresses overfitting by reducing the apparent complexity of the predictor hypothesis class. Using real-world data with CGs provided by domain experts, we experimentally show that the proposed method improves prediction accuracy, especially in the small-data regime.
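
The abstract does not spell out the augmentation procedure, but the core idea of turning a CI relation implied by a CG into extra training data can be illustrated with a minimal sketch. The code below is an assumed, simplified construction, not the authors' exact algorithm: given a relation X ⊥ Y | Z with discrete conditioning variables Z, the X-block of the samples can be permuted independently of the remaining columns within each Z-stratum without changing the joint distribution, yielding additional synthetic rows. The function name `ci_augment` and all variable names are hypothetical.

```python
# Illustrative sketch (not the paper's exact method): data augmentation from a
# conditional independence relation X ⊥ Y | Z implied by a causal graph.
import numpy as np
import pandas as pd

def ci_augment(df, x_cols, z_cols, n_copies=1, seed=0):
    """Append augmented rows in which the x_cols block is permuted independently
    of all other columns within each stratum of the (discrete) z_cols.

    Assumes the DataFrame holds the X block (x_cols), the conditioning block
    (z_cols), and the remaining columns (e.g., the target Y), with X ⊥ rest | Z.
    """
    rng = np.random.default_rng(seed)
    x_pos = [df.columns.get_loc(c) for c in x_cols]
    parts = [df]
    for _ in range(n_copies):
        new = df.copy()
        # groupby(...).indices maps each Z-stratum to the positional row indices it contains.
        for idx in df.groupby(z_cols).indices.values():
            perm = rng.permutation(idx)
            # Shuffle only the X block inside this stratum; Y and Z stay aligned.
            new.iloc[idx, x_pos] = df.iloc[perm, x_pos].to_numpy()
        parts.append(new)
    return pd.concat(parts, ignore_index=True)

# Toy example: the fork Z -> X, Z -> Y implies X ⊥ Y | Z.
rng = np.random.default_rng(1)
z = rng.integers(0, 3, size=200)
df = pd.DataFrame({"Z": z,
                   "X": z + rng.normal(size=200),
                   "Y": 2.0 * z + rng.normal(size=200)})
train_aug = ci_augment(df, x_cols=["X"], z_cols=["Z"], n_copies=3)  # 4x the rows
```

Because the augmented table is just a larger training set, any supervised learner can be fit on it unchanged, which is one way to read the "model-agnostic" claim in the abstract.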


Related research

- A Guide for Practical Use of ADMG Causal Data Augmentation (04/03/2023)
- Incorporating Causal Prior Knowledge as Path-Constraints in Bayesian Networks and Maximal Ancestral Graphs (06/27/2012)
- Domain Knowledge Uncertainty and Probabilistic Parameter Constraints (05/09/2012)
- On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints (09/30/2019)
- Learning Augmentation Distributions using Transformed Risk Minimization (11/16/2021)
- Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch (09/10/2023)
- Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge (01/26/2023)
