Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We?

08/17/2023
by   Antonio Mastropaolo, et al.
0

Upon evolving their software, organizations and individual developers have to spend a substantial effort to pay back technical debt, i.e., the fact that software is released in a shape not as good as it should be, e.g., in terms of functionality, reliability, or maintainability. This paper empirically investigates the extent to which technical debt can be automatically paid back by neural-based generative models, and in particular models exploiting different strategies for pre-training and fine-tuning. We start by extracting a dateset of 5,039 Self-Admitted Technical Debt (SATD) removals from 595 open-source projects. SATD refers to technical debt instances documented (e.g., via code comments) by developers. We use this dataset to experiment with seven different generative deep learning (DL) model configurations. Specifically, we compare transformers pre-trained and fine-tuned with different combinations of training objectives, including the fixing of generic code changes, SATD removals, and SATD-comment prompt tuning. Also, we investigate the applicability in this context of a recently-available Large Language Model (LLM)-based chat bot. Results of our study indicate that the automated repayment of SATD is a challenging task, with the best model we experimented with able to automatically fix  2 number of attempts it is allowed to make. Given the limited size of the fine-tuning dataset ( 5k instances), the model's pre-training plays a fundamental role in boosting performance. Also, the ability to remove SATD steadily drops if the comment documenting the SATD is not provided as input to the model. Finally, we found general-purpose LLMs to not be a competitive approach for addressing SATD.

READ FULL TEXT

page 1

page 9

page 10

research
02/08/2023

Automating Code-Related Tasks Through Transformers: The Impact of Pre-training

Transformers have gained popularity in the software engineering (SE) lit...
research
06/17/2022

Using Transfer Learning for Code-Related Tasks

Deep learning (DL) techniques have been used to support several code-rel...
research
01/28/2019

Wait For It: Identifying "On-Hold" Self-Admitted Technical Debt

Self-admitted technical debt refers to situations where a software devel...
research
05/05/2022

Declaration-based Prompt Tuning for Visual Question Answering

In recent years, the pre-training-then-fine-tuning paradigm has yielded ...
research
02/10/2023

Impact of Code Language Models on Automated Program Repair

Automated program repair (APR) aims to help developers improve software ...
research
10/18/2022

Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

In this report, we present our submission to the WMT 2022 Metrics Shared...
research
08/16/2023

Boosting Commit Classification with Contrastive Learning

Commit Classification (CC) is an important task in software maintenance,...

Please sign up or login with your details

Forgot password? Click here to reset