Multiple multi-sample testing under arbitrary covariance dependency

02/24/2022
by   Vladimir Vutov, et al.

Modern high-throughput biomedical devices routinely produce data on a large scale, and the analysis of high-dimensional datasets has become commonplace in biomedical studies. However, with thousands or tens of thousands of measured variables in these datasets, extracting meaningful features is challenging. In this article, we propose a procedure to evaluate the strength of the associations between a nominal (categorical) response variable and multiple features simultaneously. Specifically, we propose a framework for large-scale multiple testing under arbitrary correlation dependency among test statistics. First, a marginal multinomial regression is fitted for each feature individually. Second, we use the multiple marginal models approach for each baseline-category pair to establish asymptotic joint normality of the stacked vector of the marginal multinomial regression coefficients. Third, we estimate the (limiting) covariance matrix between the estimated coefficients from all marginal models. Finally, our approach approximates the realized false discovery proportion of a thresholding procedure for the marginal p-values, for each baseline-category pair. The proposed approach offers a sensible trade-off between the expected numbers of true and false rejections. Furthermore, we demonstrate a practical application of the method on hyperspectral imaging data obtained with a matrix-assisted laser desorption/ionization (MALDI) instrument. MALDI shows considerable potential for clinical diagnosis, particularly in cancer research. In our application, the nominal response categories represent cancer subtypes.
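The quantity targeted in the final step, the realized false discovery proportion (FDP) of a p-value threshold, can be illustrated with a toy simulation. The sketch below is not the authors' estimator: it assumes a hypothetical one-factor equicorrelated model for the z-statistics, and every function name and parameter value is invented for illustration. The true null status is known here only because the data are simulated; in practice the realized FDP is unobservable and has to be approximated.

```python
import math
import numpy as np


def two_sided_p(z):
    """Two-sided normal p-values: P(|N(0,1)| >= |z|) = erfc(|z| / sqrt(2))."""
    flat = np.ravel(np.asarray(z, dtype=float))
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in flat])


def realized_fdp(p_values, is_null, t):
    """Realized FDP of the rule 'reject if p <= t'.

    Returns (false rejections / max(rejections, 1), number of rejections).
    """
    rejected = np.asarray(p_values) <= t
    n_rejected = int(rejected.sum())
    n_false = int((rejected & np.asarray(is_null, dtype=bool)).sum())
    return n_false / max(n_rejected, 1), n_rejected


def simulate_z(m, m_signal, effect, rho, rng):
    """Hypothetical dependence model (an assumption, not from the article):
    Z_j = mu_j + sqrt(rho) * W + sqrt(1 - rho) * e_j,
    so every pair of statistics has correlation rho.
    """
    mu = np.zeros(m)
    mu[:m_signal] = effect          # the first m_signal features are non-null
    shared = rng.standard_normal()  # common factor W
    noise = rng.standard_normal(m)  # idiosyncratic errors e_j
    z = mu + math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise
    return z, mu == 0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Repeat the experiment to see how the realized FDP varies across datasets.
    for rep in range(5):
        z, is_null = simulate_z(m=2000, m_signal=100, effect=4.0, rho=0.5, rng=rng)
        fdp, n_rej = realized_fdp(two_sided_p(z), is_null, t=0.001)
        print(f"replicate {rep}: rejections={n_rej}, realized FDP={fdp:.3f}")
```

Because the shared factor shifts all null statistics together, the realized FDP at a fixed threshold can differ considerably from one simulated dataset to the next, which is the motivation for approximating the realized proportion rather than only controlling its expectation.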


research
08/18/2021

Multiple two-sample testing under arbitrary covariance dependency with an application in imaging mass spectrometry

Large-scale hypothesis testing has become a ubiquitous problem in high-d...
research
08/18/2022

A Decorrelating and Debiasing Approach to Simultaneous Inference for High-Dimensional Confounded Models

Motivated by the simultaneous association analysis with the presence of ...
research
06/17/2021

Large-Scale Multiple Testing for Matrix-Valued Data under Double Dependency

High-dimensional inference based on matrix-valued data has drawn increas...
research
12/04/2022

Inferring on joint associations from marginal associations and a reference sample

We present a method to infer on joint regression coefficients obtained f...
research
03/14/2023

Robust Multiple Testing under High-dimensional Dynamic Factor Model

Large-scale multiple testing under static factor models is commonly used...
research
11/11/2021

Simulating High-Dimensional Multivariate Data using the bigsimr R Package

It is critical to accurately simulate data when employing Monte Carlo te...
research
11/21/2022

Neural Dependencies Emerging from Learning Massive Categories

This work presents two astonishing findings on neural networks learned f...
