Population-Guided Large Margin Classifier for High-Dimension Low-Sample-Size Problems

01/05/2019
by   Qingbo Yin, et al.

Applications in many fields, such as gene expression analysis and computer vision, involve high-dimension low-sample-size (HDLSS) data sets, which pose significant challenges for standard statistical and modern machine learning methods. In this paper, we propose a novel linear binary classifier, the population-guided large margin classifier (PGLMC), which is applicable to any sort of data, including HDLSS data. PGLMC chooses its projection direction w by jointly considering the local structural information of the hyperplane and the statistics of the training samples. The proposed model has several advantages over widely used approaches. First, it is not sensitive to the intercept term b. Second, it handles imbalanced data well. Third, it is relatively simple to implement using Quadratic Programming. Fourth, it is robust to the model specification across various real applications. The theoretical properties of PGLMC are proven. We conduct a series of evaluations on two simulated and six real-world benchmark data sets, including DNA classification, digit recognition, medical image analysis, and face recognition. PGLMC outperforms state-of-the-art classification methods in most cases, and obtains comparable results in the rest.
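To make the setting concrete, the sketch below shows what a linear binary classifier with a projection direction w and an intercept b looks like on simulated HDLSS data. It is not PGLMC: the abstract does not give the PGLMC objective, so a plain mean-difference direction is used as a stand-in. The function names, the sample sizes, the dimension, and the Gaussian data model are all assumptions made for illustration.

```python
import numpy as np


def make_hdlss(n_per_class=20, dim=500, shift=1.0, seed=0):
    """Simulate two Gaussian classes with far more features than samples.

    Hypothetical data model: only the first 10 coordinates carry signal.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    mu[:10] = shift
    x_pos = rng.normal(size=(n_per_class, dim)) + mu
    x_neg = rng.normal(size=(n_per_class, dim)) - mu
    X = np.vstack([x_pos, x_neg])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return X, y


def fit_mean_difference(X, y):
    """Stand-in linear rule (NOT PGLMC): w is the unit-norm difference
    of class means, and b places the boundary at their midpoint."""
    m_pos = X[y == 1].mean(axis=0)
    m_neg = X[y == -1].mean(axis=0)
    w = m_pos - m_neg
    w = w / np.linalg.norm(w)
    b = -float(w @ (m_pos + m_neg)) / 2.0
    return w, b


def predict(X, w, b):
    """Linear decision rule: sign(w^T x + b), with ties mapped to +1."""
    return np.where(X @ w + b >= 0, 1.0, -1.0)


# 40 training samples in 500 dimensions, then a fresh draw for evaluation.
X_train, y_train = make_hdlss(seed=0)
X_test, y_test = make_hdlss(n_per_class=200, seed=1)

w, b = fit_mean_difference(X_train, y_train)
train_acc = float((predict(X_train, w, b) == y_train).mean())
test_acc = float((predict(X_test, w, b) == y_test).mean())
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Any linear method, PGLMC included, ends in the same decision rule; the methods differ only in how w and b are chosen. Swapping `fit_mean_difference` for another fitting routine leaves `predict` unchanged.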

research
06/21/2020

The classification for High-dimension low-sample size data

Huge amount of applications in various fields, such as gene expression a...
research
09/10/2020

Population structure-learned classifier for high-dimension low-sample-size class-imbalanced problem

The Classification on high-dimension low-sample-size data (HDLSS) is a c...
research
10/11/2013

Flexible High-dimensional Classification Machines and Their Asymptotic Properties

Classification is an important topic in statistics and machine learning ...
research
08/30/2020

diproperm: An R Package for the DiProPerm Test

High-dimensional low sample size (HDLSS) data sets emerge frequently in ...
research
08/13/2019

Comparison theorems on large-margin learning

This paper studies binary classification problem associated with a famil...
research
05/19/2023

A Foray into Parallel Optimisation Algorithms for High Dimension Low Sample Space Generalized Distance Weighted Discrimination problems

In many modern data sets, High dimension low sample size (HDLSS) data is...
research
01/08/2020

On a Generalization of the Average Distance Classifier

In high dimension, low sample size (HDLSS) settings, the simple average d...
