Testing Independence under Biased Sampling

by Yaniv Tenzer, et al.

Testing for association or dependence between pairs of random variables is a fundamental problem in statistics. In some applications, data are subject to selection bias that causes dependence between observations even when it is absent from the population. An important example is the class of truncation models, in which observed pairs are restricted to a specific subset of the X-Y plane. Standard tests for independence are not suitable in such cases, and alternative tests that take the selection bias into account are required. To deal with this issue, we generalize the notion of quasi-independence with respect to the sampling mechanism, and study the problem of detecting any deviations from it. We develop a test motivated by the classical Hoeffding statistic, and use two approaches to compute its distribution under the null: (i) a bootstrap-based approach and (ii) an exact permutation test with non-uniform permutation probabilities. We prove the validity of the tests and show, using simulations, that they perform well in important special cases of the problem and achieve improved power compared to competing methods. The tests are applied to four datasets: two that are subject to truncation, one that is subject to length bias, and one with a special bias mechanism.
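
To illustrate why standard independence tests fail under truncation, the following is a minimal simulation sketch. It is not taken from the paper: the function name `truncated_sample`, the standard normal marginals, the truncation region X <= Y, and the sample size are all illustrative assumptions. It draws independent X and Y, keeps only pairs that fall in the truncation region, and compares the Pearson correlation before and after selection.

```python
import numpy as np


def truncated_sample(n, seed=0):
    """Draw n independent (X, Y) pairs and keep only those with X <= Y.

    Hypothetical helper for illustration: the marginals (standard normal)
    and the truncation region (X <= Y) are assumptions, not the paper's setup.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rng.normal(size=n)  # drawn independently of x
    keep = x <= y           # selection mechanism: one-sided truncation
    return x, y, x[keep], y[keep]


x, y, x_obs, y_obs = truncated_sample(20000)

# Correlation in the full (unobserved) population sample: close to zero.
r_full = np.corrcoef(x, y)[0, 1]

# Correlation among the observed (truncated) pairs: clearly positive,
# even though X and Y are independent in the population.
r_trunc = np.corrcoef(x_obs, y_obs)[0, 1]

print(f"full sample correlation:      {r_full:.3f}")
print(f"truncated sample correlation: {r_trunc:.3f}")
```

A naive test applied to the observed pairs would reject independence here because of the selection alone, which is why the null hypothesis has to be restated as quasi-independence relative to the sampling mechanism.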


A kernel test for quasi-independence

We consider settings in which the data of interest correspond to pairs o...

Multi-characteristic Subject Selection from Biased Datasets

Subject selection plays a critical role in experimental studies, especia...

Testing for publication bias in meta-analysis under Copas selection model

In meta-analyses, publication bias is a well-known, important and challe...

A Kernel Independence Test for Random Processes

A new non parametric approach to the problem of testing the independence...

BET on Independence

We study the problem of nonparametric dependence detection. Many existin...

Is there Anisotropy in Structural Bias?

Structural Bias (SB) is an important type of algorithmic deficiency with...

Graphical tests of independence for general distributions

We propose two model-free, permutation-based tests of independence betwe...
