Improved Code Summarization via a Graph Neural Network

04/06/2020
by Alexander LeClair, et al.

Automatic source code summarization is the task of generating natural language descriptions for source code. It is a rapidly expanding research area, especially as the community has taken greater advantage of advances in neural network and AI technologies. In general, source code summarization techniques take source code as input and output a natural language description. Yet a strong consensus is developing that using structural information as input leads to improved performance. The first approaches to use structural information flattened the AST into a sequence. Recently, more complex approaches based on random AST paths or graph neural networks have improved on the models using flattened ASTs. However, the literature still does not describe using a graph neural network together with the source code sequence as separate inputs to a model. Therefore, in this paper, we present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries. We evaluate our technique using a data set of 2.1 million Java method-comment pairs and show improvement over four baseline techniques: two from the software engineering literature and two from the machine learning literature.
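To make the distinction between a flattened AST and a graph-structured AST concrete, here is a minimal sketch. It is not the paper's pipeline: it assumes Python's standard `ast` module in place of the Java tooling behind the data set, and the helper name `ast_to_graph` is hypothetical. The pre-order list of node types is the kind of sequence a flattening approach would consume, while the node list plus parent-child edges is the kind of input a graph neural network could consume alongside the raw source tokens.

```python
import ast


def ast_to_graph(source):
    """Return (labels, edges) for the AST of `source`.

    labels[i] is the type name of node i, numbered in pre-order.
    edges is a list of (parent_index, child_index) pairs.
    Hypothetical helper for illustration only.
    """
    tree = ast.parse(source)
    labels = []
    edges = []

    def visit(node, parent):
        idx = len(labels)
        labels.append(type(node).__name__)
        if parent is not None:
            edges.append((parent, idx))
        for child in ast.iter_child_nodes(node):
            visit(child, idx)

    visit(tree, None)
    return labels, edges


source = "def add(a, b):\n    return a + b\n"
labels, edges = ast_to_graph(source)

# Flattened view: the pre-order label sequence alone, with edges discarded.
flattened = list(labels)

# Graph view: the same nodes plus explicit parent-child edges.
print(flattened)
print(edges)
```

The flattened sequence loses which node is the parent of which, whereas the edge list keeps that structure explicit.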

research
02/05/2019

A Neural Model for Generating Natural Language Summaries of Program Subroutines

Source code summarization -- creating natural language descriptions of s...
research
07/05/2021

CoCoSum: Contextual Code Summarization with Multi-Relational Graph Neural Network

Source code summaries are short natural language descriptions of code sn...
research
07/04/2021

A Topic Guided Pointer-Generator Model for Generating Natural Language Code Summaries

Code summarization is the task of generating natural language descriptio...
research
03/28/2023

Label Smoothing Improves Neural Source Code Summarization

Label smoothing is a regularization technique for neural networks. Norma...
research
04/10/2020

Improved Automatic Summarization of Subroutines via Attention to File Context

Software documentation largely consists of short, natural language summa...
research
03/31/2021

HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks

Many data scientists use Jupyter notebook to experiment code, visualize ...
research
03/31/2020

DeepSumm – Deep Code Summaries using Neural Transformer Architecture

Source code summarizing is a task of writing short, natural language des...
