Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data

05/04/2010
by   Mohak Shah, et al.

One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. Performance guarantees become crucial for tasks such as microarray data analysis, where very small sample sizes limit empirical evaluation. To the best of our knowledge, algorithms that give theoretical bounds on future performance have not previously been proposed in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable classification tasks. We apply the proposed approaches to gene identification from DNA microarray data and compare our results to those of well-known successful approaches proposed for the task. We show that our algorithm not only finds hypotheses with a much smaller number of genes while giving competitive classification accuracy, but also provides tight risk guarantees on future performance, unlike other approaches. The proposed approaches are general and extensible in terms of both designing novel algorithms and application to other domains.
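To make the hypothesis class concrete, the following is a minimal sketch of a conjunction of decision stumps built with a simple greedy covering heuristic. It is not the algorithm of the paper: the names `Stump`, `predict`, `greedy_conjunction`, and the `penalty` parameter are hypothetical, the thresholds are simply taken from observed feature values, and no Occam's Razor, Sample Compression, or PAC-Bayes bound is computed. It only illustrates how such a classifier ends up depending on a small subset of attributes.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Stump:
    """A single-attribute threshold test (hypothetical helper for illustration)."""
    feature: int      # index of the attribute, e.g. one gene
    threshold: float
    direction: int    # +1: true when x[feature] > threshold; -1: true when x[feature] <= threshold

    def __call__(self, x: Sequence[float]) -> bool:
        above = x[self.feature] > self.threshold
        return above if self.direction == 1 else not above


def predict(stumps: List[Stump], x: Sequence[float]) -> bool:
    """Conjunction: the example is positive only if every stump is true."""
    return all(s(x) for s in stumps)


def greedy_conjunction(X, y, max_stumps=3, penalty=1.0):
    """Greedy covering heuristic (an assumption of this sketch, not the paper's method).

    At each step, pick the stump that rejects the most remaining negatives,
    minus `penalty` times the number of remaining positives it would reject.
    """
    positives = [x for x, label in zip(X, y) if label]
    negatives = [x for x, label in zip(X, y) if not label]
    candidates = [
        Stump(f, t, d)
        for f in range(len(X[0]))
        for t in sorted({x[f] for x in X})
        for d in (1, -1)
    ]
    chosen: List[Stump] = []
    while negatives and len(chosen) < max_stumps:
        def usefulness(s: Stump) -> float:
            rejected = sum(1 for x in negatives if not s(x))
            lost = sum(1 for x in positives if not s(x))
            return rejected - penalty * lost

        best = max(candidates, key=usefulness)
        if usefulness(best) <= 0:
            break  # no remaining stump helps
        chosen.append(best)
        # Keep only the examples the conjunction still labels positive.
        negatives = [x for x in negatives if best(x)]
        positives = [x for x in positives if best(x)]
    return chosen


# Toy data: three attributes, of which only the first two matter.
X = [[5.0, 1.0, 0.2], [6.0, 2.0, 0.9], [5.5, 1.5, 0.1],
     [1.0, 1.2, 0.8], [5.2, 9.0, 0.3], [0.5, 8.0, 0.7]]
y = [True, True, True, False, False, False]

stumps = greedy_conjunction(X, y)
selected_features = sorted({s.feature for s in stumps})
```

On this toy data the learned conjunction classifies every training example correctly, and `selected_features` lists the attributes the classifier actually depends on. A real microarray setting would have thousands of attributes and would need the risk bounds discussed in the abstract to justify the selection.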


