A Distribution-Free Test of Independence and Its Application to Variable Selection

01/31/2018
by Hengjian Cui, et al.

Motivated by the importance of measuring the association between the response and predictors in high-dimensional data, we propose a new mean variance (MV) test of independence between a categorical random variable and a continuous one, based on the mean variance index. The mean variance index is zero if and only if the two variables are independent. Under independence, we derive an explicit form of its asymptotic null distribution, which provides an efficient and fast way to compute the empirical p-value in practice. The number of classes of the categorical variable is allowed to diverge slowly to infinity. The test is essentially a rank test and thus distribution-free: no assumption on the distributions of the two random variables is required, and the test statistic is invariant under one-to-one transformations. It is resistant to heavy-tailed distributions and extreme values. We assess its performance by Monte Carlo simulations and demonstrate that the proposed test achieves higher power than existing tests. We apply the proposed MV test to high-dimensional colon cancer gene expression data to detect the significant genes associated with the tissue syndrome.
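To make the idea of a rank-based index concrete, here is a minimal sketch in Python. The abstract does not state the formula, so the code assumes the sample form commonly given for the mean variance index, MV = (1/n) Σ_r Σ_i p_r [F_r(X_i) − F(X_i)]², where F is the pooled empirical distribution function, F_r is the empirical distribution function within class r, and p_r is the class proportion. The function names `mv_index` and `mv_permutation_test` are hypothetical. The p-value below comes from a simple permutation scheme, used as a stand-in; it is not the asymptotic null distribution derived in the paper.

```python
import random
from bisect import bisect_right
from collections import defaultdict


def mv_index(x, y):
    """Sample mean variance index of continuous x given categorical y.

    Assumed form: (1/n) * sum_r sum_i p_r * (F_r(x_i) - F(x_i))**2.
    Only the ordering of x matters, so this is a rank-based quantity.
    """
    n = len(x)
    pooled = sorted(x)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[yi].append(xi)

    total = 0.0
    for vals in groups.values():
        vals.sort()
        p_r = len(vals) / n  # class proportion
        for xi in x:
            f_all = bisect_right(pooled, xi) / n        # pooled ECDF at xi
            f_r = bisect_right(vals, xi) / len(vals)    # within-class ECDF at xi
            total += p_r * (f_r - f_all) ** 2
    return total / n


def mv_permutation_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for n * MV (a stand-in, not the paper's asymptotic method)."""
    rng = random.Random(seed)
    n = len(x)
    observed = n * mv_index(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any dependence between x and y
        if n * mv_index(x, y_perm) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    labels = [i % 2 for i in range(100)]
    # Heavy-tailed location shift: class 1 is shifted by 2 on a Cauchy-like scale.
    values = [rng.gauss(0, 1) / max(abs(rng.gauss(0, 1)), 1e-12) + 2 * c for c in labels]
    stat, p_value = mv_permutation_test(values, labels)
    print(f"n * MV = {stat:.3f}, permutation p-value = {p_value:.3f}")
```

Because the sketch uses only empirical distribution functions, applying a strictly increasing transformation to `x` leaves the result unchanged, which illustrates the invariance and robustness to extreme values described in the abstract.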


