Fast Multi-Class Probabilistic Classifier by Sparse Non-parametric Density Estimation

01/04/2019
by   Wan-Ping Nicole Chen, et al.
0

The model interpretation is essential in many application scenarios and to build a classification model with a ease of model interpretation may provide useful information for further studies and improvement. It is common to encounter with a lengthy set of variables in modern data analysis, especially when data are collected in some automatic ways. This kinds of datasets may not collected with a specific analysis target and usually contains redundant features, which have no contribution to a the current analysis task of interest. Variable selection is a common way to increase the ability of model interpretation and is popularly used with some parametric classification models. There is a lack of studies about variable selection in nonparametric classification models such as the density estimation-based methods and this is especially the case for multiple-class classification situations. In this study we study multiple-class classification problems using the thought of sparse non-parametric density estimation and propose a method for identifying high impacts variables for each class. We present the asymptotic properties and the computation procedure for the proposed method together with some suggested sample size. We also repost the numerical results using both synthesized and some real data sets.

READ FULL TEXT
research
05/07/2022

Determination of class-specific variables in nonparametric multiple-class classification

As technology advanced, collecting data via automatic collection devices...
research
04/26/2023

High stakes classification with multiple unknown classes based on imperfect data

High stakes classification refers to classification problems where erron...
research
01/05/2023

Screening Methods for Classification Based on Non-parametric Bayesian Tests

Feature or variable selection is a problem inherent to large data sets. ...
research
03/02/2019

Sequential estimation for GEE with adaptive variables and subject selection

Modeling correlated or highly stratified multiple-response data becomes ...
research
03/10/2020

Modeling Multiscale Variable Renewable Energy and Inflow Scenarios in Very Large Regions with Nonparametric Bayesian Networks

In this paper, we propose a non-parametric Bayesian network method to ge...
research
07/08/2021

Moment-based density and risk estimation from grouped summary statistics

Data on a continuous variable are often summarized by means of histogram...
research
10/07/2019

Where to find needles in a haystack?

In many existing methods in multiple comparison, one starts with either ...

Please sign up or login with your details

Forgot password? Click here to reset