Generalized Kernel Two-Sample Tests

11/12/2020
by Hoseung Song, et al.

Kernel two-sample tests are widely used to test whether two multivariate samples come from the same distribution. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space perform poorly under some common alternatives when the dimension of the data is moderate to high, owing to the curse of dimensionality. We propose a new test statistic that exploits an informative pattern arising in moderate and high dimensions and achieves substantial power improvements over existing kernel two-sample tests for a wide range of alternatives. We also propose alternative testing procedures that maintain high power at low computational cost, offering easy off-the-shelf tools for large datasets. We illustrate these new approaches through an analysis of the New York City taxi data.
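For readers unfamiliar with the existing tests the abstract refers to, the following is a minimal sketch of a standard kernel two-sample test: the unbiased squared maximum mean discrepancy (MMD) with a Gaussian kernel and a permutation p-value. It is not the new statistic proposed in the paper, which the abstract does not specify. The function names, the median-heuristic bandwidth, and the permutation count are all assumptions made for illustration.

```python
import numpy as np


def pairwise_sq_dists(z):
    # Squared Euclidean distances between all rows of z.
    norms = np.sum(z ** 2, axis=1)
    sq = norms[:, None] + norms[None, :] - 2.0 * z @ z.T
    return np.maximum(sq, 0.0)  # clip tiny negatives from rounding


def gaussian_gram(z):
    # Gaussian kernel matrix on the pooled sample. The bandwidth is the
    # median pairwise distance (a common heuristic, assumed here).
    sq = pairwise_sq_dists(z)
    upper = sq[np.triu_indices_from(sq, k=1)]
    bandwidth = np.sqrt(np.median(upper))
    return np.exp(-sq / (2.0 * bandwidth ** 2))


def mmd2_unbiased(gram, idx_x, idx_y):
    # Unbiased estimate of squared MMD from a precomputed Gram matrix.
    m, n = len(idx_x), len(idx_y)
    kxx = gram[np.ix_(idx_x, idx_x)]
    kyy = gram[np.ix_(idx_y, idx_y)]
    kxy = gram[np.ix_(idx_x, idx_y)]
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, n_perm=200, seed=0):
    # Returns the observed statistic and a permutation p-value.
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    m, total = len(x), len(x) + len(y)
    gram = gaussian_gram(z)
    observed = mmd2_unbiased(gram, np.arange(m), np.arange(m, total))
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)  # reshuffle group labels
        if mmd2_unbiased(gram, perm[:m], perm[m:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(100, 5))
    y = rng.normal(1.0, 1.0, size=(100, 5))  # mean-shifted alternative
    stat, p = mmd_permutation_test(x, y)
    print(f"MMD^2 = {stat:.4f}, p-value = {p:.4f}")
```

Computing the Gram matrix once on the pooled sample and permuting indices keeps each permutation cheap, but the matrix itself still needs memory quadratic in the total sample size, which is one reason low-cost procedures matter for large datasets.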
