Robust Model Selection with Application in Single-Cell Multiomics Data

05/09/2023
by Zhanrui Cai, et al.

Model selection is a critical task in modern statistics and machine learning. However, most existing methods do not apply to heavy-tailed data, which are common in real applications such as single-cell multiomics. In this paper, we propose a rank-sum based approach that outputs a confidence set containing the optimal model with guaranteed probability. Motivated by conformal inference, we develop a general method that requires no moment or tail assumptions on the data. We demonstrate the advantages of the proposed method through extensive simulations and a real application to a COVID-19 genomics dataset (Stephenson et al., 2021). To perform inference on the rank-sum statistics, we derive a general Gaussian approximation theory for high-dimensional two-sample U-statistics, which may be of independent interest to the statistics and machine learning community.
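To make the idea of a rank-sum based confidence set concrete, here is a minimal sketch in Python. It is a simplified illustration and not the paper's procedure: it compares held-out losses of candidate models with pairwise Mann-Whitney (rank-sum) statistics and uses a Bonferroni correction, where the paper instead relies on a Gaussian approximation for high-dimensional two-sample U-statistics. The function names `rank_sum_z` and `confidence_set`, the loss dictionary, and the assumption that each model's losses come from independent held-out samples are all hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist


def rank_sum_z(a, b):
    """Standardized two-sample rank-sum (Mann-Whitney) statistic.

    Large positive values indicate that values in `a` tend to exceed
    values in `b`. Uses the no-ties null variance as an approximation.
    """
    n, m = len(a), len(b)
    # Fraction of pairs where a's loss exceeds b's loss (ties count 1/2).
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    u = wins / (n * m)
    var = (n + m + 1) / (12 * n * m)
    return (u - 0.5) / math.sqrt(var)


def confidence_set(losses, alpha=0.05):
    """Keep every model that is not significantly worse than some other model.

    `losses` maps a model name to a list of held-out losses (assumed to be
    computed on independent samples). Bonferroni correction is an
    assumption of this sketch, not the paper's calibration.
    """
    names = list(losses)
    k = len(names) - 1
    if k == 0:
        return names
    crit = NormalDist().inv_cdf(1 - alpha / k)
    kept = []
    for r in names:
        worst = max(rank_sum_z(losses[r], losses[s]) for s in names if s != r)
        if worst <= crit:
            kept.append(r)
    return kept


def abs_cauchy(rng):
    """Absolute value of a standard Cauchy draw (heavy-tailed, no finite mean)."""
    return abs(math.tan(math.pi * (rng.random() - 0.5)))


if __name__ == "__main__":
    rng = random.Random(0)
    demo = {
        "model_a": [abs_cauchy(rng) for _ in range(200)],
        "model_b": [abs_cauchy(rng) + 3.0 for _ in range(200)],
    }
    print(confidence_set(demo))
```

Because the statistic depends only on the ordering of the losses, a few extreme values do not dominate the comparison, which is why no moment or tail conditions are needed for this kind of test.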


