Probabilistic Feature Selection and Classification Vector Machine

09/18/2016
by   Bingbing Jiang, et al.

Sparse Bayesian learning is a state-of-the-art family of machine learning algorithms that can make stable and reliable probabilistic predictions. However, some of these algorithms, e.g. the probabilistic classification vector machine (PCVM) and the relevance vector machine (RVM), cannot eliminate irrelevant and redundant features, which can degrade performance. To tackle this problem, we propose a sparse Bayesian classifier that simultaneously selects the relevant samples and features. We name this classifier the probabilistic feature selection and classification vector machine (PFCVM); it employs truncated Gaussian distributions as both sample and feature priors. To derive an analytical solution for the proposed algorithm, we use the Laplace approximation to calculate approximate posteriors and marginal likelihoods. We then obtain the optimized parameters and hyperparameters by the type-II maximum likelihood method. Experiments on a synthetic data set, benchmark data sets and high-dimensional data sets validate the performance of PFCVM under two criteria: classification accuracy and efficacy of the selected features. Finally, we analyze the generalization performance of PFCVM and derive a generalization error bound for it. By tightening the bound, we demonstrate the significance of sparseness for the model.
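
To make the Laplace approximation step concrete, the following is a minimal sketch and not the PFCVM algorithm itself. It assumes a single weight w, a logistic likelihood and a plain zero-mean Gaussian prior with precision alpha, whereas the abstract specifies truncated Gaussian priors over both sample and feature weights. The names `sigmoid`, `laplace_approx`, `xs`, `ys` and `alpha` are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def laplace_approx(xs, ys, alpha=1.0, iters=50, tol=1e-10):
    """Return (w_map, var) of a Gaussian approximation to the posterior of w.

    Assumed toy model (not from the paper):
      y_i ~ Bernoulli(sigmoid(w * x_i)),  w ~ N(0, 1 / alpha).
    """
    w = 0.0
    for _ in range(iters):
        ps = [sigmoid(w * x) for x in xs]
        # First and second derivatives of the log posterior with respect to w.
        grad = sum((y - p) * x for x, y, p in zip(xs, ys, ps)) - alpha * w
        hess = -sum(p * (1.0 - p) * x * x for x, p in zip(xs, ps)) - alpha
        step = grad / hess
        w -= step  # Newton update towards the posterior mode
        if abs(step) < tol:
            break
    # The curvature at the mode gives the variance of the Gaussian approximation.
    ps = [sigmoid(w * x) for x in xs]
    hess = -sum(p * (1.0 - p) * x * x for x, p in zip(xs, ps)) - alpha
    return w, -1.0 / hess


if __name__ == "__main__":
    xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    ys = [0, 0, 1, 0, 1, 1]
    w_map, var = laplace_approx(xs, ys, alpha=1.0)
    print("posterior mode:", w_map, "approximate variance:", var)
```

In a type-II maximum likelihood scheme, a hyperparameter such as alpha would in turn be updated by maximizing the approximate marginal likelihood obtained from this Gaussian fit; that outer loop is omitted here.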

research
08/07/2018

Mixed Integer Linear Programming for Feature Selection in Support Vector Machine

This work focuses on support vector machine (SVM) with feature selection...
research
01/10/2013

Heteroscedastic Relevance Vector Machine

In this work we propose a heteroscedastic generalization to RVM, a fast ...
research
09/12/2018

But How Does It Work in Theory? Linear SVM with Random Features

We prove that, under low noise assumptions, the support vector machine w...
research
06/15/2021

Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features

Fitting models with high predictive accuracy that include all relevant b...
research
03/27/2019

Stable prediction with radiomics data

Motivation: Radiomics refers to the high-throughput mining of quantitati...
research
10/19/2012

A Distance-Based Branch and Bound Feature Selection Algorithm

There is no known efficient method for selecting k Gaussian features fro...
research
07/26/2020

Fully Bayesian Analysis of the Relevance Vector Machine Classification for Imbalanced Data

Relevance Vector Machine (RVM) is a supervised learning algorithm extend...
