Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

05/18/2023
by Tong Ye, et al.

Source code summarization aims to automatically generate human-readable text describing the functionality of a program. Although neural language models achieve strong performance in this field, an emerging trend is to combine neural models with external knowledge. Most previous approaches rely on a sentence-level retrieval-and-combination paradigm on the encoder side: they retrieve similar code snippets and use the corresponding code-summary pairs. However, this paradigm is coarse-grained and cannot directly exploit high-quality retrieved summary tokens on the decoder side. In this paper, we explore a fine-grained, token-level retrieval-augmented mechanism on the decoder side to help the vanilla neural model generate better code summaries. Furthermore, because token-level retrieval is limited in capturing contextual code semantics, we propose integrating code semantics into the summary tokens. Extensive experiments and human evaluation show that our token-level retrieval-augmented approach significantly improves performance and is more interpretable.
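To make the idea of decoder-side, token-level retrieval more concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify how the datastore is built, which distance is used, or how retrieved tokens are fused with the model's prediction. The sketch assumes a nearest-neighbor scheme in which each datastore entry pairs a decoder state vector with the summary token that followed it, and the retrieved distribution is linearly interpolated with the model's distribution. The names `knn_token_distribution`, `interpolate`, `lam`, and `temperature` are hypothetical.

```python
import math
from collections import defaultdict


def knn_token_distribution(query, datastore, k=2, temperature=1.0):
    """Turn the k nearest datastore entries into a distribution over tokens.

    Assumption: each datastore entry is (key_vector, next_token).
    """
    # Squared Euclidean distance between the query and every stored key.
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Softmax over negative distances: closer neighbours receive more mass.
    weights = [math.exp(-dist / temperature) for dist, _ in nearest]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, nearest):
        probs[token] += weight / total
    return dict(probs)


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the model's next-token distribution with the retrieved one."""
    vocab = set(model_probs) | set(knn_probs)
    return {
        tok: lam * knn_probs.get(tok, 0.0) + (1 - lam) * model_probs.get(tok, 0.0)
        for tok in vocab
    }


# Toy data (made up for illustration): 2-d "decoder states" and next tokens.
datastore = [
    ((1.0, 0.0), "returns"),
    ((0.9, 0.1), "returns"),
    ((0.0, 1.0), "sets"),
]
model_probs = {"returns": 0.4, "sets": 0.6}  # hypothetical model output

knn_probs = knn_token_distribution((1.0, 0.0), datastore, k=2)
final_probs = interpolate(model_probs, knn_probs, lam=0.5)
best_token = max(final_probs, key=final_probs.get)
```

In this toy case both nearest neighbors vote for the same token, so retrieval shifts probability mass toward it even though the hypothetical model alone preferred the other token. Because each retrieved neighbor can be inspected, a scheme of this kind also exposes which stored examples influenced a given output token.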


