Surrogate-Assisted Genetic Algorithm for Wrapper Feature Selection

Feature selection is an intractable problem; practical algorithms therefore often trade solution accuracy against computation time. In this paper, we propose a novel multi-stage feature selection framework that uses multiple levels of approximations, or surrogates. Such a framework makes wrapper approaches much more computationally efficient, significantly increasing the quality of the feature selection solutions achievable, especially on large datasets. We design and evaluate a Surrogate-Assisted Genetic Algorithm (SAGA) that uses this concept to guide the evolutionary search during the early phase of exploration. SAGA switches to evaluating the original function only in the final exploitation phase. We prove that the run-time upper bound of the SAGA surrogate-assisted stage is at worst equal to that of the wrapper GA, and that it scales better for induction algorithms with high-order complexity in the number of instances. We demonstrate, using 14 datasets from the UCI ML repository, that in practice SAGA significantly reduces computation time compared to a baseline wrapper Genetic Algorithm (GA), while converging to solutions of significantly higher accuracy. Our experiments show that SAGA can arrive at near-optimal solutions three times faster than a wrapper GA, on average. We also showcase the importance of an evolution control approach designed to prevent surrogates from misleading the evolutionary search towards false optima.
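The two-phase idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the surrogate is the same wrapper evaluation run on a small random sample of the training instances, uses a toy nearest-centroid classifier on synthetic data, and omits the evolution control mechanism. All names (`make_data`, `accuracy`, `evolve`, `saga_sketch`) and all parameter values are hypothetical.

```python
import random

def make_data(n, n_informative=3, n_noise=7, seed=0):
    """Synthetic binary dataset; only the first n_informative features carry signal."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(2.0 * y, 1.0) for _ in range(n_informative)]
        x += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append((x, y))
    return rows

def accuracy(mask, train, test):
    """Wrapper fitness: hold-out accuracy of a nearest-centroid classifier
    restricted to the features selected by the 0/1 mask."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        if pts:
            centroids[label] = [sum(p[i] for p in pts) / len(pts) for i in idx]
    correct = 0
    for x, y in test:
        dist = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, cen))
                for c, cen in centroids.items()}
        correct += min(dist, key=dist.get) == y
    return correct / len(test)

def evolve(pop, fitness, generations, rng, p_mut=0.1):
    """Generational GA: elitism, binary tournament, uniform crossover, bit-flip mutation."""
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in pop),
                        key=lambda t: t[0], reverse=True)
        nxt = [ind for _, ind in scored[:2]]  # keep the two best unchanged
        while len(nxt) < len(pop):
            a = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            b = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            child = [g if rng.random() < 0.5 else h for g, h in zip(a, b)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return pop

def saga_sketch(train, valid, n_features, seed=0):
    """Explore on a cheap surrogate, then exploit on the original function."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(20)]
    sample = rng.sample(train, max(2, len(train) // 10))  # assumed surrogate: 10% sample

    def surrogate(mask):
        return accuracy(mask, sample, valid)

    def original(mask):
        return accuracy(mask, train, valid)

    pop = evolve(pop, surrogate, generations=15, rng=rng)  # early exploration
    pop = evolve(pop, original, generations=3, rng=rng)    # final exploitation
    best = max(pop, key=original)
    return best, original(best)

if __name__ == "__main__":
    data = make_data(600, seed=1)
    best_mask, best_acc = saga_sketch(data[:400], data[400:], n_features=10)
    print(best_mask, best_acc)
```

In this sketch most generations are spent on the cheap surrogate, and only the last few evaluate candidates on the full training set. This is the trade-off the abstract describes, although the sample size and generation counts here are arbitrary.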

