A Nonparametric Normality Test for High-dimensional Data

by Hao Chen, et al.
University of California-Davis
Fudan University

Many statistical methodologies for high-dimensional data assume population normality. Although a few multivariate normality tests have been proposed, they either suffer from low power or have serious size distortion when the dimension is high. In this work, we propose a novel nonparametric test that extends graph-based two-sample tests by utilizing nearest neighbor information. Theoretical results guarantee type I error control of the proposed test when the dimension grows with the number of observations. Simulation studies verify the empirical size of the proposed test when the dimension is larger than the sample size and, at the same time, show the superior power of the new test compared with alternative methods. We also illustrate our approach on a lung cancer data set widely used in the high-dimensional classification literature, where deviation from the normality assumption may lead to completely invalid conclusions.
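
To make the general idea concrete, the following is a minimal sketch of how nearest neighbor information from a two-sample comparison could be turned into a normality check. It is an illustration under stated assumptions and not the procedure proposed in the paper: the function names (`nn_same_sample_fraction`, `nn_normality_pvalue`), the choice of statistic (the fraction of pooled points whose nearest neighbor comes from the same sample), the one-sided rejection rule, and the Monte Carlo calibration are all hypothetical choices made for this example. The sketch compares the observed data with a synthetic normal sample that shares its sample mean and covariance, then repeats the same steps on truly normal data to obtain a reference distribution.

```python
import numpy as np


def nn_same_sample_fraction(x, y):
    """Fraction of pooled points whose nearest neighbor lies in the same sample."""
    pooled = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int),
                             np.ones(len(y), dtype=int)])
    # Pairwise squared Euclidean distances via the expansion |a-b|^2.
    sq = np.sum(pooled ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * pooled @ pooled.T
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nearest = np.argmin(dist, axis=1)
    return float(np.mean(labels == labels[nearest]))


def nn_normality_pvalue(x, n_mc=200, seed=0):
    """Monte Carlo p-value for a nearest-neighbor normality check (illustrative only).

    Assumption: a large same-sample fraction is treated as evidence against
    normality, and the null distribution is approximated by simulation.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]

    def draw_like(data):
        # Draw n points from N(mean, S), where S is the sample covariance of
        # `data`, without forming the p x p matrix (works when p > n).
        mean = data.mean(axis=0)
        centered = data - mean
        w = rng.standard_normal((n, n))
        return mean + w @ centered / np.sqrt(n - 1)

    def statistic(data):
        return nn_same_sample_fraction(data, draw_like(data))

    observed = statistic(x)
    null_stats = np.array([statistic(draw_like(x)) for _ in range(n_mc)])
    return float((1 + np.sum(null_stats >= observed)) / (1 + n_mc))
```

A call such as `nn_normality_pvalue(data)` takes an `n x p` array with `n >= 2` and returns a value in `(0, 1]`. Nothing here establishes the size or power properties claimed in the abstract; those rest on the authors' own statistic and theory.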


Testing Overidentifying Restrictions with High-Dimensional Data and Heteroskedasticity

This paper proposes a new test of overidentifying restrictions (called t...

Generalized Multivariate Signs for Nonparametric Hypothesis Testing in High Dimensions

High-dimensional data, where the dimension of the feature space is much ...

A More Powerful Two-Sample Test in High Dimensions using Random Projection

We consider the hypothesis testing problem of detecting a shift between ...

Classification with Ultrahigh-Dimensional Features

Although much progress has been made in classification with high-dimensi...

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years wit...

Nonparametric High-dimensional K-sample Comparison

High-dimensional k-sample comparison is a common applied problem. We con...

Limiting distributions of graph-based test statistics

Two-sample tests utilizing a similarity graph on observations are useful...