Learning GENERAL Principles from Hundreds of Software Projects

11/06/2019
by Suvodeep Majumder, et al.

When one exemplar project, which we call the "bellwether", offers the best advice, it can be used to guide many other projects. Such bellwethers can be used to make quality predictions about new projects, even before there is much experience with those new projects. But existing methods for bellwether transfer are very slow. When applied to the 697 projects studied here, they took 60 days of CPU time to find and certify the bellwethers. Hence, we propose GENERAL, a novel bellwether detection algorithm based on hierarchical clustering. At each level within a tree of clusters, one bellwether is computed from sibling projects, then promoted up the tree. This hierarchical method is a scalable approach to learning effective models from very large data sets. For example, for nearly 700 projects, the defect prediction models generated from GENERAL's bellwether were just as good as those found via standard methods.
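The promotion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nested-list tree, the names `pick_bellwether`, `promote`, `toy_score`, and `toy_strength`, and the made-up scores are all assumptions chosen for illustration. In practice the score would come from training a defect predictor on one project and testing it on another.

```python
from statistics import mean

def pick_bellwether(candidates, score):
    """Return the candidate whose model scores best, on average, on its peers."""
    if len(candidates) == 1:
        return candidates[0]

    def average_score(candidate):
        return mean(score(candidate, target)
                    for target in candidates if target != candidate)

    return max(candidates, key=average_score)

def promote(tree, score):
    """Pick one bellwether per cluster, then promote it to the parent cluster."""
    if isinstance(tree, str):  # a leaf is a single project name
        return tree
    winners = [promote(child, score) for child in tree]
    return pick_bellwether(winners, score)

# Hypothetical data: how well a model trained on each project transfers.
toy_strength = {"a": 0.6, "b": 0.8, "c": 0.7, "d": 0.5}

def toy_score(source, target):
    # Assumption for the sketch: transfer quality depends only on the source.
    return toy_strength[source]

# A two-level tree of clusters: two sibling groups of two projects each.
toy_tree = [["a", "b"], ["c", "d"]]
result = promote(toy_tree, toy_score)
```

In this toy setting each cluster only compares its own siblings, so the four projects are never all compared against each other at once; only the two promoted winners meet at the root.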


research
11/26/2020

Early Life Cycle Software Defect Prediction. Why? How?

Many researchers assume that, for software analytics, "more data is bett...
research
03/22/2018

Pando: a Volunteer Computing Platform for the Web

Volunteer computing is currently successfully used to make hundreds of t...
research
05/24/2021

The Early Bird Catches the Worm: Better Early Life Cycle Defect Predictors

Before researchers rush to reason across all available data, they should...
research
09/28/2022

Feature Sets in Just-in-Time Defect Prediction: An Empirical Evaluation

Just-in-time defect prediction assigns a defect risk to each new change ...
research
04/30/2021

Participatory Budgeting with Donations and Diversity Constraints

Participatory budgeting (PB) is a democratic process where citizens join...
research
04/06/2018

Bayesian Hierarchical Modelling for Tailoring Metric Thresholds

Software is highly contextual. While there are cross-cutting `global' le...
research
08/21/2020

Revisiting Process versus Product Metrics: a Large Scale Analysis

Numerous methods can build predictive models from software data. But wha...
