Revisiting File Context for Source Code Summarization

09/05/2023
by Aakash Bansal, et al.

Source code summarization is the task of writing natural language descriptions of source code. A typical use case is generating short summaries of subroutines for use in API documentation. The heart of almost all current research into code summarization is the encoder-decoder neural architecture, and the encoder input is almost always a single subroutine or other short code snippet. The problem with this setup is that the information needed to describe the code is often not present in the code itself – that information often resides in other nearby code. In this paper, we revisit the idea of “file context” for code summarization. File context is the idea of encoding select information from other subroutines in the same file. We propose a novel modification of the Transformer architecture that is purpose-built to encode file context and demonstrate its improvement over several baselines. We find that file context helps on a subset of challenging examples where traditional approaches struggle.

research
07/21/2023

Statement-based Memory for Neural Source Code Summarization

Source code summarization is the task of writing natural language descri...
research
04/10/2020

Improved Automatic Summarization of Subroutines via Attention to File Context

Software documentation largely consists of short, natural language summa...
research
03/22/2021

Project-Level Encoding for Neural Source Code Summarization of Subroutines

Source code summarization of a subroutine is the task of writing a short...
research
08/29/2018

Mapping Language to Code in Programmatic Context

Source code is rarely written in isolation. It depends significantly on ...
research
08/28/2023

Distilled GPT for Source Code Summarization

A code summary is a brief natural language description of source code. S...
research
07/05/2021

CoCoSum: Contextual Code Summarization with Multi-Relational Graph Neural Network

Source code summaries are short natural language descriptions of code sn...
research
09/17/2021

Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

Statistical language modeling and translation with transformers have fou...
