Project-Level Encoding for Neural Source Code Summarization of Subroutines

03/22/2021
by   Aakash Bansal, et al.

Source code summarization of a subroutine is the task of writing a short, natural language description of that subroutine. The description usually serves in documentation aimed at programmers, where even a brief phrase (e.g. "compresses data to a zip file") can help readers rapidly comprehend what a subroutine does without resorting to reading the code itself. Techniques based on neural networks (and encoder-decoder model designs in particular) have established themselves as the state of the art. Yet a widely recognized problem with these models is that they assume the information needed to create a summary is present within the code being summarized - an assumption that is at odds with the program comprehension literature. Thus a current research frontier lies in the question of how to encode source code context into neural models of summarization. In this paper, we present a project-level encoder to improve models of code summarization. By project-level, we mean that we create a vectorized representation of selected code files in a software project, and use that representation to augment the encoder of state-of-the-art neural code summarization techniques. We demonstrate how our encoder improves several existing models, and provide guidelines for maximizing improvement while controlling time and resource costs in model size.
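To make the idea of "a vectorized representation of selected code files" that augments an encoder more concrete, here is a minimal sketch. It is not the architecture from the paper: the hash-based token embedding, the mean pooling, the "first N files" selection rule, and every name in it (`embed_token`, `encode_project`, `augmented_encoding`, `DIM`) are assumptions chosen only for illustration, standing in for components that a real model would learn.

```python
import hashlib
from typing import List

DIM = 8  # hypothetical embedding width, chosen only for this sketch


def embed_token(token: str, dim: int = DIM) -> List[float]:
    """Toy deterministic embedding: hash bytes scaled into [0, 1].

    Assumption: a real model would use learned embeddings instead.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def mean_vectors(vectors: List[List[float]], dim: int = DIM) -> List[float]:
    """Average a list of vectors; return a zero vector for an empty list."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def encode_text(text: str, dim: int = DIM) -> List[float]:
    """Encode whitespace-separated tokens by mean pooling their embeddings."""
    return mean_vectors([embed_token(t, dim) for t in text.split()], dim)


def encode_project(files: List[str], max_files: int = 3,
                   dim: int = DIM) -> List[float]:
    """Project-level vector: pool the encodings of a few selected files.

    Assumption: "selected" here simply means the first max_files files.
    """
    selected = files[:max_files]
    return mean_vectors([encode_text(f, dim) for f in selected], dim)


def augmented_encoding(subroutine: str, project_files: List[str],
                       max_files: int = 3, dim: int = DIM) -> List[float]:
    """Concatenate the subroutine encoding with the project-level vector."""
    return encode_text(subroutine, dim) + encode_project(
        project_files, max_files, dim)


if __name__ == "__main__":
    files = ["class Zip { void open ( ) { } }",
             "class Util { int size ( ) { return 0 ; } }"]
    vec = augmented_encoding("void compress ( File f ) { }", files)
    print(len(vec))  # twice DIM: subroutine part plus project part
```

The `max_files` cap mirrors the trade-off mentioned in the abstract between improvement and resource cost: including more files gives the encoder more context, but increases the amount of input that has to be processed.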


research
09/05/2023

Revisiting File Context for Source Code Summarization

Source code summarization is the task of writing natural language descri...
research
07/21/2023

Statement-based Memory for Neural Source Code Summarization

Source code summarization is the task of writing natural language descri...
research
07/23/2021

Ensemble Models for Neural Source Code Summarization of Subroutines

A source code summary of a subroutine is a brief description of that sub...
research
10/21/2022

Low-Resources Project-Specific Code Summarization

Code summarization generates brief natural language descriptions of sour...
research
04/26/2022

GypSum: Learning Hybrid Representations for Code Summarization

Code summarization with deep learning has been widely studied in recent ...
research
08/28/2023

Distilled GPT for Source Code Summarization

A code summary is a brief natural language description of source code. S...
research
03/28/2023

Label Smoothing Improves Neural Source Code Summarization

Label smoothing is a regularization technique for neural networks. Norma...
