Why is Differential Evolution Better than Grid Search for Tuning Defect Predictors?

09/08/2016
by   Wei Fu, et al.
0

Context: One of the black arts of data mining is learning the magic parameters which control the learners. In software analytics, at least for defect prediction, several methods, like grid search and differential evolution (DE), have been proposed to learn these parameters, which has been proved to be able to improve the performance scores of learners. Objective: We want to evaluate which method can find better parameters in terms of performance score and runtime cost. Methods: This paper compares grid search to differential evolution, which is an evolutionary algorithm that makes extensive use of stochastic jumps around the search space. Results: We find that the seemingly complete approach of grid search does no better, and sometimes worse, than the stochastic search. When repeated 20 times to check for conclusion validity, DE was over 210 times faster than grid search to tune Random Forests on 17 testing data sets with F-Measure Conclusions: These results are puzzling: why does a quick partial search be just as effective as a much slower, and much more, extensive search? To answer that question, we turned to the theoretical optimization literature. Bergstra and Bengio conjecture that grid search is not more effective than more randomized searchers if the underlying search space is inherently low dimensional. This is significant since recent results show that defect prediction exhibits very low intrinsic dimensionality-- an observation that explains why a fast method like DE may work as well as a seemingly more thorough grid search. This suggests, as a future research direction, that it might be possible to peek at data sets before doing any optimization in order to match the optimization algorithm to the problem at hand.

READ FULL TEXT
research
07/29/2018

While Tuning is Good, No Tuner is Best

Hyperparameter tuning is the black art of automatically finding a good c...
research
07/29/2018

Is One Hyperparameter Optimizer Enough?

Hyperparameter tuning is the black art of automatically finding a good c...
research
08/01/2014

Memetic Search in Differential Evolution Algorithm

Differential Evolution (DE) is a renowned optimization stratagem that ca...
research
08/14/2020

Simpler Hyperparameter Optimization for Software Analytics: Why, How, When?

How to make software analytics simpler and faster? One method is to matc...
research
06/21/2011

Symmetry-Based Search Space Reduction For Grid Maps

In this paper we explore a symmetry-based search space reduction techniq...
research
01/16/2023

Optimizing Predictions for Very Small Data Sets: a case study on Open-Source Project Health Prediction

When learning from very small data sets, the resulting models can make m...
research
03/13/2018

Building Better Quality Predictors Using "ε-Dominance"

Despite extensive research, many methods in software quality prediction ...

Please sign up or login with your details

Forgot password? Click here to reset