Hypothesis testing procedures for two sample means with applications to gene expression data

12/29/2018
by Michail Tsagris, et al.

In bioinformatics, the number of available variables is usually in the order of tens of thousands, while only a few tens of subjects are observed. Gene expression data are an example, where usually two groups of subjects exist: cases and controls, or subjects with and without a disease. The detection of differentially expressed genes between the two groups takes place using many two independent samples (Welch) t-tests, one test for each variable (probeset). Motivated by this, the present research examines the empirical and exponential empirical likelihood asymptotically and provides some useful results revealing their relationship with James's and Welch's t-tests. By exploiting this relationship, a simple calibration based on the t distribution, applicable to both techniques, is proposed. This calibration is then compared to the classical Welch t-test. A third, better known, non-parametric test subject to comparison is the Wilcoxon-Mann-Whitney test. As an extra step, bootstrap calibration of the aforementioned tests is performed and the exact p-value of the Wilcoxon-Mann-Whitney test is computed. The main goal is to examine the size and power behaviour of these testing procedures when applied to small to medium sized datasets. Based on extensive simulation studies, we provide strong evidence in favour of the Welch t-test. We show, numerically, that the Welch t-test has the same power as all other testing procedures, yet it outperforms them in terms of attaining the nominal type I error. Further, it is computationally extremely efficient.
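To make the per-probeset testing concrete, here is a minimal sketch of the Welch t statistic and its Welch-Satterthwaite degrees of freedom, applied once per variable. It is not the authors' code: the function name `welch_t`, the probeset labels, and the toy expression values are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(x), len(y)
    # Squared standard errors of the two sample means (unequal variances allowed).
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical toy data: one list of expression values per probeset and group.
cases = {"probe_a": [5.1, 4.9, 5.6, 5.3], "probe_b": [2.0, 2.2, 1.9, 2.1]}
controls = {"probe_a": [4.2, 4.0, 4.5, 4.1], "probe_b": [2.1, 2.0, 2.2, 1.9]}

# One Welch test per variable (probeset), as described in the abstract.
results = {probe: welch_t(cases[probe], controls[probe]) for probe in cases}

for probe, (t, df) in results.items():
    print(f"{probe}: t = {t:.3f}, df = {df:.2f}")
```

A two-sided p-value would then follow from the t distribution with `df` degrees of freedom, for instance `2 * scipy.stats.t.sf(abs(t), df)` if SciPy is available. With tens of thousands of probesets, a vectorised implementation over a matrix would be the more practical choice.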


research
04/02/2020

A Monte Carlo comparison of categorical tests of independence

The X^2 and G^2 tests are the most frequently applied tests for testing ...
research
11/12/2020

A Bootstrap Based Between-Study Heterogeneity Test in Meta-Analysis

Meta-analysis combines pertinent information from existing studies to pr...
research
05/17/2021

A powerful test for differentially expressed gene pathways via graph-informed structural equation modeling

A major task in genetic studies is to identify genes related to human di...
research
03/18/2018

Testing for equal correlation matrices with application to paired gene expression data

We present a novel method for testing the hypothesis of equality of two ...
research
11/22/2022

Optimal design of the Wilcoxon-Mann-Whitney-test

In scientific research, many hypotheses relate to the comparison of two ...
research
05/17/2021

What makes you unique?

This paper proposes a uniqueness Shapley measure to compare the extent t...
research
06/26/2011

A New General Method to Generate Random Modal Formulae for Testing Decision Procedures

The recent emergence of heavily-optimized modal decision procedures has ...
