A Stable Combinatorial Particle Swarm Optimization for Scalable Feature Selection in Gene Expression Data
Evolutionary computation (EC) algorithms, such as discrete and multi-objective versions of particle swarm optimization (PSO), have been applied to the feature selection (FS) problem, tackling the combinatorial explosion of search spaces that are peppered with local minima. High-dimensional FS problems, such as finding a small set of biomarkers to make a diagnostic call, pose an additional challenge: the ability of such methods to pick out the most important features must not degrade as the dimension of the decision space and the number of irrelevant features increase. We developed a combinatorial PSO algorithm, called COMB-PSO, that scales up to high-dimensional gene expression data while still selecting the smallest subsets of genes that allow reliable classification of samples. In particular, COMB-PSO enhances the encoding, speed of convergence, control of divergence, and diversity of the conventional PSO algorithm, balancing exploration and exploitation of the search space. Applied to real gene expression data from different cancers, COMB-PSO finds gene sets of the smallest size that allow reliable classification of the underlying disease classes.
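To make the setting concrete, the following is a minimal sketch of a conventional binary PSO for feature selection, the kind of baseline that a method like COMB-PSO builds on. It is not the COMB-PSO algorithm itself, whose encoding and update rules are not given in this abstract. The function names `binary_pso` and `toy_fitness`, the parameter values, and the set of "informative" feature indices are all assumptions made for illustration; in practice the fitness would be a cross-validated classification score with a penalty on subset size.

```python
import math
import random

# Hypothetical ground truth for the toy problem: only these features matter.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward informative features,
    penalize every selected feature to favor small subsets."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Conventional binary PSO with a sigmoid transfer function.
    Each particle is a 0/1 mask over features; fitness is maximized."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp velocity so bit probabilities never saturate fully.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # Sigmoid maps velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

if __name__ == "__main__":
    best_mask, best_fit = binary_pso(toy_fitness, n_features=12)
    selected = [i for i, bit in enumerate(best_mask) if bit]
    print("selected features:", selected, "fitness:", best_fit)
```

The size penalty in `toy_fitness` is what pushes the swarm toward small subsets; the velocity clamp `v_max` is one simple way to keep some exploration alive, which is the exploration/exploitation balance the abstract refers to.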