A Nonparametric Normality Test for High-dimensional Data

04/10/2019
by   Hao Chen, et al.

Many statistical methodologies for high-dimensional data assume population normality. Although a few multivariate normality tests have been proposed, they either suffer from low power or have serious size distortion when the dimension is high. In this work, we propose a novel nonparametric test that extends graph-based two-sample tests by utilizing nearest neighbor information. Theoretical results guarantee type I error control of the proposed test when the dimension grows with the number of observations. Simulation studies verify the empirical size of the proposed test when the dimension is larger than the sample size and, at the same time, show its superior power compared with alternative methods. We also illustrate our approach on a lung cancer data set widely used in the high-dimensional classification literature, where deviation from the normality assumption may lead to completely invalid conclusions.
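To make the idea of using nearest neighbor information in a two-sample comparison concrete, the following is a minimal sketch and not the procedure proposed in the paper. It assumes one simple way to set up the comparison: draw a reference sample from a normal distribution whose mean and covariance match the observed data, pool the two samples, and record how often a point's nearest neighbor comes from its own sample. The function name `nn_same_sample_fraction`, the Euclidean distance, and the single-nearest-neighbor choice are all assumptions made for illustration.

```python
import numpy as np


def nn_same_sample_fraction(x, seed=0):
    """Fraction of pooled points whose nearest neighbor shares their label.

    x: (n, d) data matrix; d may exceed n.
    Hypothetical helper for illustration, not the authors' test statistic.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.shape[0]

    # Fit a normal distribution by matching the sample mean and covariance.
    mean = x.mean(axis=0)
    centered = x - mean

    # Draw n reference points from N(mean, S), S being the sample covariance.
    # Mixing the centered rows with Gaussian weights avoids forming the
    # d x d matrix S, which matters when d is much larger than n.
    weights = rng.standard_normal((n, n))
    y = mean + weights @ centered / np.sqrt(n - 1)

    # Pool both samples and compute squared Euclidean distances.
    pooled = np.vstack([x, y])
    labels = np.repeat([0, 1], n)
    sq_norms = (pooled ** 2).sum(axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * pooled @ pooled.T
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor

    # Label agreement between each point and its nearest neighbor.
    nearest = dist2.argmin(axis=1)
    return float(np.mean(labels[nearest] == labels))


# Example call with n = 50 observations in d = 200 dimensions.
data = np.random.default_rng(1).standard_normal((50, 200))
stat = nn_same_sample_fraction(data)
```

The sketch returns only a descriptive statistic and computes no p-value. Because the reference sample is generated from parameters estimated on the same data, the two samples are not independent, so a standard two-sample permutation calibration would not be justified here. Controlling the type I error in that setting is the problem the abstract says the paper's theory addresses.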


research
04/30/2022

Testing Overidentifying Restrictions with High-Dimensional Data and Heteroskedasticity

This paper proposes a new test of overidentifying restrictions (called t...
research
07/02/2021

Generalized Multivariate Signs for Nonparametric Hypothesis Testing in High Dimensions

High-dimensional data, where the dimension of the feature space is much ...
research
06/24/2023

Robust Classification of High-Dimensional Data using Data-Adaptive Energy Distance

Classification of high-dimensional low sample size (HDLSS) data poses a ...
research
11/04/2016

Classification with Ultrahigh-Dimensional Features

Although much progress has been made in classification with high-dimensi...
research
09/17/2019

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years wit...
research
10/03/2018

Nonparametric High-dimensional K-sample Comparison

High-dimensional k-sample comparison is a common applied problem. We con...
research
12/04/2021

Revisiting k-Nearest Neighbor Graph Construction on High-Dimensional Data : Experiments and Analyses

The k-nearest neighbor graph (KNNG) on high-dimensional data is a data s...
