Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants

04/17/2020
by Lama B. Niyazi, et al.

Datasets from the fields of bioinformatics, chemometrics, and face recognition are typically characterized by small samples of high-dimensional data. Among the many variants of linear discriminant analysis that have been proposed to address the problems that classification faces in such a setting, the classifier in [1], composed of an ensemble of randomly projected linear discriminants, seems especially promising: it is computationally efficient and, with the optimal setting of the projection dimension parameter, is competitive with the state of the art. In this work, we seek to further understand the behavior of this classifier through asymptotic analysis. Under the assumption of a growth regime in which the dataset and projection dimensions grow at constant rates relative to each other, we use random matrix theory to derive asymptotic misclassification probabilities that show the effect of the ensemble as a regularization of the data sample covariance matrix. The asymptotic errors further help to identify situations in which the ensemble offers a performance advantage. We also develop a consistent estimator of the misclassification probability as an alternative to the computationally costly cross-validation estimator, which is conventionally used for parameter tuning. Finally, we demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
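
To make the object of study concrete, the following is a minimal sketch of one way an ensemble of randomly projected linear discriminants could be formed. It is not taken from [1]: the function names `fit_rp_lda_ensemble` and `predict`, the Gaussian projection matrices, the equal class priors, and the choice to average the discriminant directions over the ensemble are all assumptions made for illustration.

```python
import numpy as np


def fit_rp_lda_ensemble(X0, X1, d, n_proj, rng):
    """Average n_proj LDA directions, each computed in a random d-dim subspace.

    Assumptions (illustrative, not from the source): Gaussian projections,
    equal class priors, pooled sample covariance, d <= n0 + n1 - 2.
    """
    p = X0.shape[1]
    n0, n1 = len(X0), len(X1)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

    # Pooled sample covariance; singular when p > n0 + n1 - 2.
    Xc = np.vstack([X0 - m0, X1 - m1])
    S = Xc.T @ Xc / (n0 + n1 - 2)

    w = np.zeros(p)
    for _ in range(n_proj):
        R = rng.standard_normal((d, p))  # random projection, d x p
        # LDA direction in the projected space, mapped back to p dimensions.
        w += R.T @ np.linalg.solve(R @ S @ R.T, R @ (m1 - m0))
    w /= n_proj

    b = -w @ (m0 + m1) / 2.0  # intercept for equal priors
    return w, b


def predict(X, w, b):
    """Return 1 where the averaged discriminant score is positive, else 0."""
    return (X @ w + b > 0).astype(int)


def make_data(n, p, shift, rng):
    """Synthetic two-class Gaussian data; class 1 is shifted in 20 coordinates."""
    delta = np.zeros(p)
    delta[:20] = shift
    return rng.standard_normal((n, p)), rng.standard_normal((n, p)) + delta


rng = np.random.default_rng(0)
p, d = 200, 10  # p exceeds the training sample size, so S is singular
X0, X1 = make_data(30, p, 2.0, rng)
w, b = fit_rp_lda_ensemble(X0, X1, d=d, n_proj=100, rng=rng)

T0, T1 = make_data(500, p, 2.0, rng)
accuracy = np.mean(np.concatenate([predict(T0, w, b) == 0,
                                   predict(T1, w, b) == 1]))
print(f"test accuracy: {accuracy:.3f}")
```

Although the pooled covariance `S` cannot be inverted when `p` exceeds the sample size, each projected matrix `R @ S @ R.T` is only `d x d` and is invertible with probability one when `d` is small enough. Averaging over many projections therefore acts on `S` much like a regularized inverse, which is the effect the asymptotic analysis characterizes. In this sketch `d` is fixed by hand; in practice it would be tuned, for example by cross-validation or by an estimator of the misclassification probability.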


research
02/03/2016

High-Dimensional Regularized Discriminant Analysis

Regularized discriminant analysis (RDA), proposed by Friedman (1989), is...
research
10/01/2021

Weight Vector Tuning and Asymptotic Analysis of Binary Linear Classifiers

Unlike its intercept, a linear classifier's weight vector cannot be tune...
research
01/03/2022

On randomized sketching algorithms and the Tracy-Widom law

There is an increasing body of work exploring the integration of random ...
research
10/05/2021

Classification of high-dimensional data with spiked covariance matrix structure

We study the classification problem for high-dimensional data with n obs...
research
02/26/2021

TEC: Tensor Ensemble Classifier for Big Data

Tensor (multidimensional array) classification problem has become very p...
research
03/08/2021

Large-Sample Properties of Blind Estimation of the Linear Discriminant Using Projection Pursuit

We study the estimation of the linear discriminant with projection pursu...
research
06/11/2020

Improved Design of Quadratic Discriminant Analysis Classifier in Unbalanced Settings

The use of quadratic discriminant analysis (QDA) or its regularized vers...
