A Topic Guided Pointer-Generator Model for Generating Natural Language Code Summaries

07/04/2021
by   Xin Wang, et al.

Code summarization is the task of generating natural language descriptions of source code, which is important for program understanding and maintenance. Existing approaches treat the task as a machine translation problem (e.g., from Java to English) and apply Neural Machine Translation (NMT) models to solve it. These approaches consider only a given code unit (e.g., a method) without its broader context. This lack of context may prevent the NMT model from gathering sufficient information for code summarization. Furthermore, existing approaches use a fixed vocabulary and do not fully consider the words in the code, even though many words in a code summary may come from the code itself. In this work, we present a neural network model named ToPNN for code summarization, which uses the topics in a broader context (e.g., the enclosing class) to guide neural networks that combine the generation of new words with the copying of existing words from the code. Based on this model, we present an approach for generating natural language code summaries at the method level (i.e., method comments). We evaluate our approach using a dataset of 4,203,565 commented Java methods. The results show significant improvement over state-of-the-art approaches and confirm the positive effect of class topics and the copy mechanism.
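The combination of generating new words and copying words from the code can be illustrated with the generic pointer-generator mixture, in which a gate `p_gen` blends a distribution over a fixed vocabulary with the attention weights over the source tokens. The sketch below is a minimal illustration under that assumption and is not ToPNN's actual formulation: the function name `final_distribution`, the toy probabilities, and the example tokens are all hypothetical, and the topic guidance is omitted because the abstract does not specify how it enters the model.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Blend a generation distribution with a copy distribution.

    p_gen         -- probability of generating from the fixed vocabulary (0..1)
    vocab_probs   -- dict mapping vocabulary word -> generation probability
    attention     -- attention weights over source positions (sums to 1)
    source_tokens -- code tokens, aligned one-to-one with `attention`

    Hypothetical helper: generic pointer-generator mixing, not ToPNN itself.
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    mixed = defaultdict(float)
    # Generation path: scale every vocabulary probability by p_gen.
    for word, prob in vocab_probs.items():
        mixed[word] += p_gen * prob
    # Copy path: each source position contributes its attention weight,
    # so a token that occurs several times accumulates probability mass.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


# Toy example: "user" is not in the fixed vocabulary but appears in the code.
vocab_probs = {"returns": 0.5, "the": 0.3, "value": 0.2}
source_tokens = ["get", "user", "name", "user"]
attention = [0.1, 0.4, 0.3, 0.2]

dist = final_distribution(0.6, vocab_probs, attention, source_tokens)
for word, prob in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {prob:.2f}")
```

In this toy setting, a token such as `user` that is absent from the fixed vocabulary can still receive probability through the copy path, which is the motivation the abstract gives for not relying on a fixed vocabulary alone.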


research
02/05/2019

A Neural Model for Generating Natural Language Summaries of Program Subroutines

Source code summarization -- creating natural language descriptions of s...
research
04/06/2020

Improved Code Summarization via a Graph Neural Network

Automatic source code summarization is the task of generating natural la...
research
05/16/2023

Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization

Neural source code summarization is the task of generating natural langu...
research
11/18/2022

A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation

Neural Machine Translation (NMT) models from English to SPARQL are a pro...
research
08/29/2018

Mapping Language to Code in Programmatic Context

Source code is rarely written in isolation. It depends significantly on ...
research
03/31/2020

DeepSumm – Deep Code Summaries using Neural Transformer Architecture

Source code summarizing is a task of writing short, natural language des...
research
02/09/2016

A Convolutional Attention Network for Extreme Summarization of Source Code

Attention mechanisms in neural networks have proved useful for problems ...
