A new classification framework for high-dimensional data

06/27/2023
by Xiangbo Mo, et al.

Classification is a classic problem, but it becomes challenging when the number of features is large, as is common in many modern applications such as identifying tumor sub-types from genomic data or categorizing customer attitudes from online reviews. We propose a new framework that uses the ranks of pairwise distances among observations and identifies a common pattern in moderate to high dimensions that has previously been overlooked. The proposed method exhibits superior classification power over existing methods under a variety of scenarios. Furthermore, because it relies only on pairwise distances, the method can be applied to non-Euclidean data objects, such as network data. We illustrate the method through an analysis of Neuropixels data, in which neurons are classified based on their firing activities. Additionally, we explore a related approach that is simpler to understand and that investigates key quantities playing essential roles in the proposed method.
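
To make the basic ingredient concrete, the following is a minimal sketch of how the ranks of pairwise distances among observations might be computed. It is not the authors' classifier: the function name `pairwise_distance_ranks`, the choice of Euclidean distance, and the tie-breaking rule are all assumptions made for illustration, and the abstract does not specify how the ranks are subsequently used.

```python
import numpy as np

def pairwise_distance_ranks(X):
    """Return an n-by-n symmetric matrix of distance ranks (hypothetical helper).

    Entry (i, j) holds the rank of the distance between observations i and j
    among all n*(n-1)/2 pairwise distances (1 = smallest). The diagonal is 0.
    Assumptions: Euclidean distance; ties are broken by pair order.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Full pairwise Euclidean distance matrix.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Rank only the distinct pairs i < j.
    iu = np.triu_indices(n, k=1)
    d = D[iu]
    order = np.argsort(d, kind="stable")
    ranks = np.empty(len(d), dtype=int)
    ranks[order] = np.arange(1, len(d) + 1)

    # Fill a symmetric rank matrix.
    R = np.zeros((n, n), dtype=int)
    R[iu] = ranks
    return R + R.T


# Three one-dimensional observations at 0, 1 and 3.
R = pairwise_distance_ranks([[0.0], [1.0], [3.0]])
print(R)
```

Because only the distance matrix enters the ranking step, any dissimilarity (for example, a distance between networks) could be substituted for the Euclidean one in this sketch.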
