Gene selection for cancer classification using a hybrid of univariate and multivariate feature selection methods

06/05/2015
by Min Xu, et al.

The literature offers various approaches to gene selection for cancer classification based on microarray data, and they fall into two categories: univariate methods and multivariate methods. Univariate methods examine each gene in isolation, measuring its contribution to the classification without considering the other genes. In contrast, multivariate methods measure the relative contribution of a gene to the classification by taking the other genes in the data into account. Multivariate methods generally select fewer genes. However, their selection process may be sensitive to irrelevant genes, noise in the expression data, and outliers in the training data, and their computational cost is high. To overcome the disadvantages of both types of approach, we propose a hybrid method that obtains gene sets that are small and highly discriminative. We build the hybrid from the univariate Maximum Likelihood method (LIK) and the multivariate Recursive Feature Elimination method (RFE). We analyze the properties of these methods and systematically test the proposed method on two cancer microarray datasets: a leukemia dataset and a small round blue cell tumor (SRBCT) dataset. On both, the hybrid method discovers gene sets smaller than those reported in the literature while achieving the same or better prediction accuracy.
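The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how LIK scores genes or which classifier drives RFE, so the code below substitutes a simple signal-to-noise score for the univariate stage and a ridge-regularized linear classifier for the RFE stage. The function names, the parameters `n_prefilter`, `n_final`, and `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def univariate_scores(X, y):
    """Score each gene on its own (stand-in for LIK, which is not detailed here):
    absolute difference of class means over the sum of class standard deviations."""
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + 1e-12)

def ridge_weights(X, y, lam=1.0):
    """Fit a ridge-regularized linear model on standardized genes, labels coded -1/+1.
    This is an assumed stand-in for the classifier used inside RFE."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    t = np.where(y == 1, 1.0, -1.0)
    t = t - t.mean()
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(X.shape[1]), Xs.T @ t)

def hybrid_select(X, y, n_prefilter=50, n_final=5, lam=1.0):
    """Univariate pre-filter followed by recursive feature elimination."""
    # Stage 1 (univariate): keep the n_prefilter genes with the highest scores.
    scores = univariate_scores(X, y)
    keep = np.argsort(scores)[::-1][:n_prefilter]
    # Stage 2 (multivariate): refit on the surviving genes and repeatedly
    # drop the one with the smallest absolute weight.
    while keep.size > n_final:
        w = ridge_weights(X[:, keep], y, lam)
        keep = np.delete(keep, np.argmin(np.abs(w)))
    return np.sort(keep)

# Synthetic toy data (not from the paper): 40 samples, 200 genes,
# of which only genes 0, 1, and 2 differ between the two classes.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(size=(40, 200))
X[y == 1, :3] += 4.0

scores = univariate_scores(X, y)
selected = hybrid_select(X, y, n_prefilter=50, n_final=5)
print("selected gene indices:", selected)
```

The pre-filter keeps the expensive multivariate stage away from most irrelevant genes, which is the motivation the abstract gives for combining the two families of methods. The RFE stage then removes genes that add little once the remaining genes are considered jointly.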

research
05/04/2023

Fuzzy Gene Selection and Cancer Classification Based on Deep Learning Model

Machine learning (ML) approaches have been used to develop highly accura...
research
04/29/2021

Genotype-Guided Radiomics Signatures for Recurrence Prediction of Non-Small-Cell Lung Cancer

Non-small cell lung cancer (NSCLC) is a serious disease and has a high r...
research
05/20/2019

A Comparative Analysis of Feature Selection Methods for Biomarker Discovery in Study of Toxicant-treated Atlantic Cod (Gadus morhua) Liver

Univariate and multivariate feature selection methods can be used for bi...
research
11/03/2021

Multivariate feature ranking of gene expression data

Gene expression datasets are usually of high dimensionality and therefor...
research
02/07/2013

Feature Selection for Microarray Gene Expression Data using Simulated Annealing guided by the Multivariate Joint Entropy

In this work a new way to calculate the multivariate joint entropy is pr...
research
07/12/2012

Biogeography-Based Informative Gene Selection and Cancer Classification Using SVM and Random Forests

Microarray cancer gene expression data comprise of very high dimensions....
research
02/09/2019

Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition

Sparse representation based classification (SRC) methods have achieved r...
