Consistency of permutation tests for HSIC and dHSIC
The Hilbert–Schmidt Independence Criterion (HSIC) is a popular measure of the dependence between two random variables. The statistic dHSIC is an extension of HSIC that can be used to test joint independence of d random variables. Such hypothesis tests for (joint) independence are often carried out as permutation tests, which compare the observed data with randomly permuted datasets. The main contribution of this work is a proof that the power of such independence tests converges to 1 as the sample size tends to infinity. This answers a question posed in (Pfister, 2018). Additionally, this work proves that HSIC and dHSIC permutation tests have the correct type 1 error rate, and it provides guidance on how to select the number of permutations to use in practice. While the correct type 1 error rate was already proved in (Pfister, 2018), we provide a modified proof following (Berrett, 2019), which extends to the case of non-continuous data. The number of permutations to use was studied, e.g., by (Marozzi, 2004), but not in the context of HSIC, with a slightly different estimate of the p-value, and for permutations rather than vectors of permutations. While the last two points have limited novelty, we include them to give a complete overview of permutation testing in the context of HSIC and dHSIC.
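To make the procedure described above concrete, the following is a minimal sketch of an HSIC permutation test for two samples. It is an illustration under stated assumptions, not the construction analysed in the paper: the Gaussian kernel with a median-heuristic bandwidth, the biased (V-statistic) HSIC estimate, the p-value estimate (1 + count) / (1 + B), and all function and parameter names (`gaussian_gram`, `hsic_statistic`, `hsic_permutation_test`, `num_permutations`) are choices made here for the example.

```python
import numpy as np


def gaussian_gram(x, bandwidth=None):
    """Gaussian-kernel Gram matrix; bandwidth defaults to a median heuristic (assumption)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)  # treat each row as one observation
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        median_sq = np.median(sq_dists[np.triu_indices_from(sq_dists, k=1)])
        bandwidth = np.sqrt(median_sq / 2.0) if median_sq > 0 else 1.0
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def hsic_statistic(K, L):
    """Biased (V-statistic) HSIC estimate: trace(K H L H) / n^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2


def hsic_permutation_test(x, y, num_permutations=199, seed=0):
    """Return (observed HSIC, permutation p-value) for paired samples x and y."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x)
    L = gaussian_gram(y)
    n = K.shape[0]
    observed = hsic_statistic(K, L)

    count = 0
    for _ in range(num_permutations):
        perm = rng.permutation(n)
        # Permuting the y-sample is the same as permuting rows and columns of L.
        permuted = hsic_statistic(K, L[np.ix_(perm, perm)])
        if permuted >= observed:
            count += 1

    # The "+1" counts the observed data as one of the permutations (a common choice).
    p_value = (1 + count) / (1 + num_permutations)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y_dependent = x + 0.1 * rng.normal(size=100)
    y_independent = rng.normal(size=100)
    print("dependent:  ", hsic_permutation_test(x, y_dependent))
    print("independent:", hsic_permutation_test(x, y_independent))
```

With this p-value estimate the smallest attainable value is 1 / (1 + B), so the number of permutations B bounds how small a significance level the test can resolve. For dHSIC, one would instead draw a vector of independent permutations, one for each variable after the first; that extension is not shown here.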