Simpler Hyperparameter Optimization for Software Analytics: Why, How, When?

08/14/2020
by Amritanshu Agrawal, et al.

How can software analytics be made simpler and faster? One method is to match the complexity of the analysis to the intrinsic complexity of the data being explored. For example, hyperparameter optimizers find the control settings for data miners that improve the predictions generated via software analytics. Sometimes, very fast hyperparameter optimization can be achieved by just DODGE-ing away from things tried before. But when is it wise to use DODGE, and when must we use more complex (and much slower) optimizers? To answer this, we applied hyperparameter optimization to 120 SE data sets that explored bad smell detection, predicting GitHub issue close time, bug report analysis, defect prediction, and dozens of other non-SE problems. We find that DODGE works best for data sets with low "intrinsic dimensionality" (D = 3) and very poorly for higher-dimensional data (D over 8). Nearly all the SE data seen here was intrinsically low-dimensional, indicating that DODGE is applicable for many SE analytics tasks.
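To make "intrinsic dimensionality" concrete, the following is a minimal sketch of one common estimator, the correlation dimension: the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. Whether this matches the paper's exact procedure is an assumption; the function name `intrinsic_dimension`, its parameters, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import math
import random
from itertools import combinations


def intrinsic_dimension(points, q_lo=0.01, q_hi=0.10, steps=10):
    """Estimate the correlation dimension of a point cloud.

    Illustrative sketch only: q_lo, q_hi and steps are assumed defaults,
    not values taken from the paper.
    """
    # All pairwise Euclidean distances, sorted so quantiles are easy to read off.
    dists = sorted(math.dist(a, b) for a, b in combinations(points, 2))
    xs, ys = [], []
    for k in range(steps):
        # Geometrically spaced pair fractions C(r), from q_lo up to q_hi.
        q = q_lo * (q_hi / q_lo) ** (k / (steps - 1))
        # Radius r at which roughly a fraction q of the pairs is closer than r.
        r = dists[int(q * (len(dists) - 1))]
        if r > 0:  # skip duplicate points, where log(r) is undefined
            xs.append(math.log(r))
            ys.append(math.log(q))
    # Least-squares slope of log C(r) versus log r.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the demo is repeatable

# Synthetic data: three raw columns, but only one or two underlying degrees of freedom.
line = [(t, 2 * t, -t) for t in (rng.random() for _ in range(200))]
plane = [(u, v, 0.5 * (u - v))
         for u, v in ((rng.random(), rng.random()) for _ in range(200))]

line_dim = intrinsic_dimension(line)
plane_dim = intrinsic_dimension(plane)
print(f"line:  D is about {line_dim:.2f}")
print(f"plane: D is about {plane_dim:.2f}")
```

Both synthetic data sets have three columns, yet the estimate tracks the number of underlying degrees of freedom rather than the column count. That is the sense in which a data set with many attributes can still be intrinsically low-dimensional.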

research
12/09/2019

Is AI different for SE?

What AI tools are needed for SE? Ideally, we should have simple rules th...
research
01/16/2023

Optimizing Predictions for Very Small Data Sets: a case study on Open-Source Project Health Prediction

When learning from very small data sets, the resulting models can make m...
research
07/29/2018

Is One Hyperparameter Optimizer Enough?

Hyperparameter tuning is the black art of automatically finding a good c...
research
04/27/2018

Can You Explain That, Better? Comprehensible Text Analytics for SE Applications

Text mining methods are used for a wide range of Software Engineering (S...
research
09/08/2016

Why is Differential Evolution Better than Grid Search for Tuning Defect Predictors?

Context: One of the black arts of data mining is learning the magic para...
research
09/04/2023

Model Review: A PROMISEing Opportunity

To make models more understandable and correctable, I propose that the P...
research
02/03/2023

Less, but Stronger: On the Value of Strong Heuristics in Semi-supervised Learning for Software Analytics

In many domains, there are many examples and far fewer labels for those ...
