Testing Conditional Predictive Independence in Supervised Learning Algorithms

01/28/2019
by David S. Watson, et al.

We propose a general test of conditional independence. The conditional predictive impact (CPI) is a provably consistent and unbiased estimator of one or several features' association with a given outcome, conditional on a (potentially empty) reduced feature set. The measure can be calculated using any supervised learning algorithm and loss function. It relies on no parametric assumptions and applies equally well to continuous and categorical predictors and outcomes. The CPI can be efficiently computed for low- or high-dimensional data without any sparsity constraints. We establish PAC-Bayesian convergence rates for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be used in conjunction with causal discovery algorithms to identify underlying graph structures for multivariate systems. We test our method with a range of algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability. Our method is implemented in an R package, cpi, available at https://github.com/dswatson/cpi.
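The core idea of the CPI can be sketched in a few lines: fit any supervised learner, then compare its per-observation loss when a feature of interest is left intact versus when it is replaced by a synthetic copy that breaks its conditional association with the outcome; the mean loss increase is the CPI estimate, and a paired test gives a p-value. The minimal Python sketch below uses a marginal permutation as a simple stand-in for the knockoff sampler the paper relies on (so it tests marginal rather than strictly conditional independence), with scikit-learn and scipy; all variable names here are illustrative, not from the cpi package.

```python
import numpy as np
from scipy import stats
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated data; with shuffle=False the first 3 columns are informative.
X, y = make_regression(n_samples=500, n_features=5, n_informative=3,
                       noise=1.0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

j = 0  # feature of interest (informative by construction)

# Per-observation squared-error loss with the original feature.
loss_orig = (y_te - model.predict(X_te)) ** 2

# Replace feature j with a permuted copy: a marginal-permutation
# stand-in for the knockoff substitution used in the paper.
X_sub = X_te.copy()
X_sub[:, j] = rng.permutation(X_sub[:, j])
loss_sub = (y_te - model.predict(X_sub)) ** 2

# CPI estimate: mean increase in loss; one-sided paired t-test.
delta = loss_sub - loss_orig
cpi = delta.mean()
t_stat, p_two = stats.ttest_1samp(delta, 0.0)
p_one = p_two / 2 if t_stat > 0 else 1 - p_two / 2
print(f"CPI = {cpi:.3f}, one-sided p = {p_one:.4f}")
```

Substituting valid knockoffs for the permutation step (and any other per-observation loss for squared error) recovers the conditional version of the test described in the abstract.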

