Computer Vision and Metrics Learning for Hypothesis Testing: An Application of Q-Q Plot for Normality Test

01/23/2019
by Ke-Wei Huang, et al.

This paper proposes a new procedure for constructing test statistics for hypothesis testing using computer vision and metric learning. The application highlighted here applies computer vision to Q-Q plots to construct a new test statistic for normality testing. Traditionally, there are two families of approaches for verifying the probability distribution of a random variable: researchers either assess the Q-Q plot subjectively or apply a formal test based on a mathematical formula, such as the Kolmogorov-Smirnov test. Graphical assessment by humans is not rigorous, whereas normality test statistics may not be accurate enough when a uniformly most powerful test does not exist. It may take statisticians decades to develop a new and more powerful test statistic. The first step of the proposed method is to apply computer vision techniques, such as a pre-trained ResNet, to convert a Q-Q plot into a numerical vector. The next step is to apply metric learning to find an appropriate distance function between a Q-Q plot and the centroid of all Q-Q plots under the null hypothesis, which assumes that the target variable is normally distributed. This distance is the new test statistic for the normality test. Our experimental results show that the machine-learning-based test statistics can outperform traditional normality tests in all cases, particularly when the sample size is small. This study provides convincing evidence that the proposed method can objectively create a powerful test statistic based on Q-Q plots, and that the method could be modified in the future to construct more powerful test statistics for many other applications.
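The shape of this procedure (plot to vector, centroid under the null, distance as test statistic) can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes two loudly simplifying assumptions: the pre-trained ResNet embedding of the plot image is replaced by the Q-Q plot's ordinates (the sorted, standardized sample, which determines the plot because the theoretical quantiles are fixed for a given sample size), and the learned metric is replaced by plain Euclidean distance. The function names `qq_vector`, `fit_null`, and `normality_statistic` are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def qq_vector(sample):
    """Return the Q-Q plot ordinates: the standardized sample, sorted ascending.

    Stand-in for the paper's image embedding (assumption, not the ResNet step).
    """
    m, s = mean(sample), stdev(sample)
    return sorted((x - m) / s for x in sample)


def fit_null(n, n_sims=2000, alpha=0.05, seed=0):
    """Simulate normal samples of size n; return (centroid, critical value)."""
    rng = random.Random(seed)

    def draw():
        return qq_vector([rng.gauss(0.0, 1.0) for _ in range(n)])

    # Centroid: coordinate-wise mean of the Q-Q vectors under the null.
    null_vectors = [draw() for _ in range(n_sims)]
    centroid = [sum(col) / n_sims for col in zip(*null_vectors)]

    # Calibrate the rejection threshold on fresh null draws, so the centroid
    # is not evaluated on the same samples that produced it.
    null_stats = sorted(math.dist(draw(), centroid) for _ in range(n_sims))
    critical = null_stats[math.ceil((1 - alpha) * n_sims) - 1]
    return centroid, critical


def normality_statistic(sample, centroid):
    """Distance from the sample's Q-Q vector to the null centroid.

    Euclidean distance is a stand-in for the learned metric (assumption).
    """
    return math.dist(qq_vector(sample), centroid)
```

Under these assumptions, normality is rejected at level `alpha` when `normality_statistic(sample, centroid)` exceeds the critical value returned by `fit_null`. The sample must have the same size `n` used to fit the null reference, since the centroid is specific to that sample size.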


research
01/29/2021

A Practical Two-Sample Test for Weighted Random Graphs

Network (graph) data analysis is a popular research topic in statistics ...
research
08/26/2018

Bayesian Hypothesis Testing: Redux

Bayesian hypothesis testing is re-examined from the perspective of an a ...
research
01/15/2018

Empirical L^2-distance test statistics for ergodic diffusions

The aim of this paper is to introduce a new type of test statistic for s...
research
11/26/2019

The spatiotemporal tau statistic: a review

Introduction The tau statistic is a recent second-order correlation fu...
research
03/14/2023

A Characterization of Most(More) Powerful Test Statistics with Simple Nonparametric Applications

Data-driven most powerful tests are statistical hypothesis decision-maki...
research
06/20/2022

Multiple Testing Framework for Out-of-Distribution Detection

We study the problem of Out-of-Distribution (OOD) detection, that is, de...
research
05/11/2017

Negative Results in Computer Vision: A Perspective

A negative result is when the outcome of an experiment or a model is not...
