Variable screening based on Gaussian Centered L-moments

08/29/2019
by Hyowon An, et al.

An important challenge in big data is the identification of important variables. In this paper, we propose methods for discovering variables with non-standard univariate marginal distributions. Conventional moment-based summary statistics are well suited to that purpose, but their sensitivity to outliers can lead to selection based on a few outliers rather than on distributional shape, such as bimodality. To address this type of non-robustness, we consider the L-moments. Using these in practice, however, has a limitation: they do not take the value zero at Gaussian distributions, to which the shape of a marginal distribution is most naturally compared. As a remedy, we propose Gaussian Centered L-moments, which share the advantages of the L-moments but are zero at Gaussian distributions. The strength of Gaussian Centered L-moments over conventional moments is shown both theoretically and practically, for example through their performance in screening important genes in cancer genetics data.
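
The limitation noted in the abstract can be illustrated with a minimal sketch. It computes the first four sample L-moments with the standard unbiased probability-weighted-moment estimators (Hosking's construction, which is an assumption here and not taken from this paper) and shows that the L-kurtosis ratio of Gaussian data is roughly 0.1226 rather than zero. The function name `sample_l_moments` is hypothetical, and the sketch does not implement the Gaussian Centered L-moments, whose definition is not given in this abstract.

```python
import numpy as np


def sample_l_moments(x):
    """Return the first four sample L-moments (l1, l2, l3, l4).

    Uses unbiased probability-weighted moments b_r of the sorted sample.
    Hypothetical helper for illustration only.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n < 4:
        raise ValueError("need at least 4 observations")
    i = np.arange(1, n + 1, dtype=float)  # ranks 1..n of the order statistics

    # Unbiased estimators of the probability-weighted moments.
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum(
        (i - 1) * (i - 2) * (i - 3) / ((n - 1) * (n - 2) * (n - 3)) * x
    ) / n

    # L-moments as linear combinations of the b_r.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3, l4


# Simulated standard Gaussian data (seed chosen arbitrarily).
rng = np.random.default_rng(0)
gaussian = rng.standard_normal(200_000)
l1, l2, l3, l4 = sample_l_moments(gaussian)

# L-skewness is near zero for Gaussian data, but L-kurtosis is not:
# its population value is about 0.1226.
print("L-skewness:", l3 / l2)
print("L-kurtosis:", l4 / l2)
```

Because the Gaussian L-kurtosis is a non-zero constant, a raw L-kurtosis value cannot be read directly as a departure from Gaussianity. That is the gap the proposed Gaussian Centered L-moments are meant to close.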
