USP: an independence test that improves on Pearson's chi-squared and the G-test

01/26/2021
by Thomas B. Berrett, et al.

We present the U-Statistic Permutation (USP) test of independence in the context of discrete data displayed in a contingency table. Either Pearson's chi-squared test of independence or the G-test is typically used for this task, but we argue that these tests have serious deficiencies, both in their inability to control the size of the test and in their power properties. By contrast, the USP test is guaranteed to control the size of the test at the nominal level for all sample sizes, has no issues with small (or zero) cell counts, and is able to detect distributions that violate independence in only a minimal way. The test statistic is derived from a U-statistic estimator of a natural population measure of dependence, and we prove that this is the unique minimum variance unbiased estimator of this population quantity. The practical utility of the USP test is demonstrated both on simulated data, where its power can be dramatically greater than that of Pearson's test and the G-test, and on real data. The USP test is implemented in the R package USP.
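To make the permutation idea concrete, here is a minimal Python sketch of a permutation test of independence on a contingency table. It is not the USP test itself: the abstract does not give the U-statistic, so the sketch assumes a dependence measure of the form sum over cells of (p_ij - p_i+ p_+j)^2 and uses its simple plug-in estimate as a stand-in. The function names `dependence_stat` and `permutation_test`, the number of permutations, and the seed are all hypothetical choices for illustration and do not reflect the API of the R package USP.

```python
import random


def dependence_stat(x, y):
    """Plug-in estimate of sum_ij (p_ij - p_i+ * p_+j)^2.

    ASSUMPTION: this is a simplified stand-in for the paper's U-statistic,
    whose exact form is not given in the abstract.
    """
    n = len(x)
    cell, row, col = {}, {}, {}
    for a, b in zip(x, y):
        cell[(a, b)] = cell.get((a, b), 0) + 1
        row[a] = row.get(a, 0) + 1
        col[b] = col.get(b, 0) + 1
    total = 0.0
    # Sorted iteration keeps the summation order fixed, so identical tables
    # always give bit-identical statistics (this matters for ties).
    for a in sorted(row):
        for b in sorted(col):
            diff = cell.get((a, b), 0) / n - (row[a] / n) * (col[b] / n)
            total += diff * diff
    return total


def permutation_test(table, num_perms=999, seed=0):
    """Permutation p-value for independence in a table of counts.

    `table` is a list of rows of non-negative integer counts. Zero cells
    need no special handling because nothing is divided by a cell count.
    """
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per observation.
    x, y = [], []
    for i, counts in enumerate(table):
        for j, c in enumerate(counts):
            x.extend([i] * c)
            y.extend([j] * c)
    observed = dependence_stat(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(num_perms):
        # Shuffling the column labels breaks any dependence while keeping
        # both sets of marginal counts fixed.
        rng.shuffle(y_perm)
        if dependence_stat(x, y_perm) >= observed:
            exceed += 1
    # Counting the observed statistic among the permutations (the "+1" terms)
    # is what keeps the test's size at or below the nominal level.
    return observed, (1 + exceed) / (1 + num_perms)


if __name__ == "__main__":
    stat, p_value = permutation_test([[10, 0], [0, 10]])
    print(stat, p_value)
```

Because the p-value is computed by ranking the observed statistic among statistics from relabelled data, the size guarantee holds for any sample size and any choice of statistic; the choice of statistic affects only the power, which is where the paper's U-statistic comes in.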

research
01/15/2020

Optimal rates for independence testing via U-statistic permutation tests

We study the problem of independence testing given independent and ident...
research
07/19/2022

A Normal Test for Independence via Generalized Mutual Information

Testing hypothesis of independence between two random elements on a join...
research
08/28/2018

Seven proofs of the Pearson Chi-squared independence test and its graphical interpretation

This paper revisits the Pearson Chi-squared independence test. After pre...
research
09/18/2020

An Independence Test Based on Recurrence Rates. An empirical study and applications to real data

In this paper we propose several variants to perform the independence te...
research
07/03/2022

Testing Homogeneity: The Trouble with Sparse Functional Data

Testing the homogeneity between two samples of functional data is an imp...
research
03/04/2022

On Uses of Van der Waerden Test: A Graphical Approach

Although several nonparametric tests are available for testing populatio...
research
06/08/2018

Hadamard Matrices, Quaternions, and the Pearson Chi-square Statistic

We present a symbolic decomposition of the Pearson chi-square statistic ...
