CodeSum: Translate Program Language to Natural Language

08/06/2017
by Xing Hu, et al.

During software maintenance, programmers spend much of their time comprehending code. Reading comments is an effective way to reduce the time spent reading and navigating source code. Code summarization, a critical task in software engineering, therefore aims to generate brief natural language descriptions of source code. In this paper, we propose a new code summarization model named CodeSum. CodeSum combines an attention-based sequence-to-sequence (Seq2Seq) neural network with Structure-based Traversal (SBT) of Abstract Syntax Trees (ASTs). The sequences that SBT generates represent the structure of the AST better and remain unambiguous. We conduct experiments on three large-scale corpora in different programming languages, namely Java, C#, and SQL; the Java corpus is a new dataset of industry code that we extracted from GitHub. Experimental results show that CodeSum significantly outperforms the state of the art.
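The abstract does not spell out how SBT linearizes a tree, so the following is only a minimal sketch under an assumption: each subtree is written as an opening bracket and the node label, then the traversal of its children, then a closing bracket and the node label again. The `Node` class, the `sbt` and `preorder` function names, and the example labels are hypothetical and chosen for illustration. The sketch also shows why a bracketed sequence can stay unambiguous where a plain pre-order traversal cannot.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A hypothetical AST node: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def sbt(node: Node) -> List[str]:
    """Assumed SBT form: '(' label, children's sequences, ')' label."""
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(sbt(child))
    tokens.extend([")", node.label])
    return tokens


def preorder(node: Node) -> List[str]:
    """Plain pre-order traversal, shown only for comparison."""
    tokens = [node.label]
    for child in node.children:
        tokens.extend(preorder(child))
    return tokens


# Two different trees over the same labels.
nested = Node("a", [Node("b", [Node("c")])])   # c is a child of b
flat = Node("a", [Node("b"), Node("c")])       # b and c are siblings

# Pre-order traversal cannot tell the two trees apart ...
same_preorder = preorder(nested) == preorder(flat)
# ... while the bracketed sequences differ.
different_sbt = sbt(nested) != sbt(flat)

print(" ".join(sbt(flat)))
```

Because every subtree is delimited by a matching pair of brackets carrying its label, the original tree can be rebuilt from the token sequence, which is the property a Seq2Seq encoder input needs if structure is to be preserved.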


research
04/04/2019

Recommendations for Datasets for Source Code Summarization

Source Code Summarization is the task of writing short, natural language...
research
08/17/2022

ASTRO: An AST-Assisted Approach for Generalizable Neural Clone Detection

Neural clone detection has attracted the attention of software engineeri...
research
01/27/2019

CRAQL: A Composable Language for Querying Source Code

This paper describes the design and implementation of CRAQL (Composable ...
research
03/18/2022

M2TS: Multi-Scale Multi-Modal Approach Based on Transformer for Source Code Summarization

Source code summarization aims to generate natural language descriptions...
research
03/31/2020

DeepSumm – Deep Code Summaries using Neural Transformer Architecture

Source code summarizing is a task of writing short, natural language des...
research
08/28/2020

CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis

Large scale analysis of source code, and in particular scientific source...
research
08/30/2021

CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

Code summarization aims to generate concise natural language description...
