D&C: A Divide-and-Conquer Approach to IR-based Bug Localization

02/07/2019
by Anil Koyuncu, et al.

Many automated tasks in software maintenance rely on information retrieval (IR) techniques to identify specific information within unstructured data. Bug localization is a typical example: the text of a bug report is analyzed to identify the source code files that can be associated with the reported bug. Despite promising results, the performance of IR-based bug localization tools is still not sufficient for wide adoption. We argue that one reason could be the attempt to build a one-size-fits-all approach. In this paper, we extensively study the performance of state-of-the-art bug localization tools, focusing on query formulation and its importance for localization performance. Building on insights from this study, we propose a new learning approach in which multiple classifier models are trained on clear-cut sets of bug-location pairs. Concretely, we apply gradient boosting, a supervised learning approach, to various sets of bug reports whose localization appears to succeed with specific types of features. The training scenario builds on our finding that the various state-of-the-art localization tools can perform well on specific sets of bug reports. We implement D&C, which computes appropriate weights to assign to the similarity measurements between pairs of information token types. Experimental results on large and up-to-date datasets reveal that D&C outperforms state-of-the-art tools. On average, the experiments yield a MAP score of 0.52 and an MRR score of 0.63 on a curated dataset, a substantial improvement over all tools: MAP improves by between 4 and 10 percentage points, and MRR by between 1 and 12. Finally, we note that D&C is stable in its localization performance: around 50 Top10.
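For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how MAP, MRR, and Top-k are commonly computed for ranked file lists. It is not taken from the D&C implementation; the function names, the file names, and the choice to normalize average precision by the number of buggy files are illustrative assumptions.

```python
from typing import List, Sequence, Set, Tuple

# One query = (files ranked by the tool, set of files actually fixed for the bug).
Query = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first buggy file, or 0.0 if none is ranked."""
    for position, file_name in enumerate(ranked, start=1):
        if file_name in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k over the ranks k that hold a buggy file.

    Assumption: normalized by the total number of buggy files.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, file_name in enumerate(ranked, start=1):
        if file_name in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def top_k_hit(ranked: Sequence[str], relevant: Set[str], k: int) -> bool:
    """True if at least one buggy file appears in the first k positions."""
    return any(file_name in relevant for file_name in ranked[:k])


def mean_reciprocal_rank(queries: List[Query]) -> float:
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def mean_average_precision(queries: List[Query]) -> float:
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical rankings for two bug reports.
example_queries: List[Query] = [
    (["A.java", "B.java", "C.java", "D.java"], {"B.java", "D.java"}),
    (["X.java", "Y.java"], {"X.java"}),
]

if __name__ == "__main__":
    print("MRR:", mean_reciprocal_rank(example_queries))
    print("MAP:", mean_average_precision(example_queries))
```

In this toy example the first report has its buggy files at ranks 2 and 4, and the second has its single buggy file at rank 1, so both MRR and MAP come out to 0.75.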

research
04/22/2021

An Extensive Study on Smell-Aware Bug Localization

Bug localization is an important aspect of software maintenance because ...
research
08/01/2018

Improving IR-Based Bug Localization with Context-Aware Query Reformulation

Recent findings suggest that Information Retrieval (IR)-based bug locali...
research
05/09/2023

RLocator: Reinforcement Learning for Bug Localization

Software developers spend a significant portion of time fixing bugs in t...
research
08/08/2018

A Case Study on the Impact of Similarity Measure on Information Retrieval based Software Engineering Tasks

Information Retrieval (IR) plays a pivotal role in diverse Software Engi...
research
06/20/2018

The Impact of IR-based Classifier Configuration on the Performance and the Effort of Method-Level Bug Localization

Context: IR-based bug localization is a classifier that assists develope...
research
02/28/2023

Large-Scale Evaluation of Method-Level Bug Localization with FinerBench4BL

Bug localization is an important aspect of software maintenance because ...
research
03/19/2021

Locating Faulty Methods with a Mixed RNN and Attention Model

IR-based fault localization approaches achieve promising results when l...
