EALink: An Efficient and Accurate Pre-trained Framework for Issue-Commit Link Recovery

08/21/2023
by   Chenyuan Zhang, et al.

Issue-commit links, as a type of software traceability links, play a vital role in various software development and maintenance tasks. However, they are typically deficient, as developers often forget or fail to create tags when making commits. Existing studies have deployed deep learning techniques, including pre-trained models, to improve automatic issue-commit link recovery. Despite their promising performance, we argue that previous approaches have four main problems, hindering them from recovering links in large software projects. To overcome these problems, we propose an efficient and accurate pre-trained framework called EALink for issue-commit link recovery. EALink requires far fewer model parameters than existing pre-trained methods, bringing efficient training and recovery. Moreover, we design various techniques to improve the recovery accuracy of EALink. We construct a large-scale dataset and conduct extensive experiments to demonstrate the power of EALink. Results show that EALink outperforms the state-of-the-art methods by a large margin (15.23…), and its training and inference overhead is orders of magnitude lower than that of existing methods.

research
11/01/2022

LinkFormer: Automatic Contextualised Link Recovery of Software Artifacts in both Project-based and Transfer Learning Settings

Software artifacts often interact with each other throughout the softwar...
research
08/10/2021

Issue Link Label Recovery and Prediction for Open Source Software

Modern open source software development heavily relies on the issue trac...
research
01/29/2023

Boosting Automated Patch Correctness Prediction via Pre-trained Language Model

Automated program repair (APR) aims to fix software bugs automatically w...
research
05/18/2023

CCT5: A Code-Change-Oriented Pre-Trained Model

Software is constantly changing, requiring developers to perform several...
research
07/05/2021

Automated Recovery of Issue-Commit Links Leveraging Both Textual and Non-textual Data

An issue documents discussions around required changes in issue-tracking...
research
12/02/2018

Link Delay Estimation Using Sparse Recovery for Dynamic Network Tomography

When the scale of communication networks has been growing rapidly in the...
research
05/18/2020

Improving the Effectiveness of Traceability Link Recovery using Hierarchical Bayesian Networks

Traceability is a fundamental component of the modern software developme...
