
CodeSum: Translate Program Language to Natural Language

by Xing Hu, et al.

During software maintenance, programmers spend much of their time comprehending code. Reading comments is an effective way to reduce the time spent reading and navigating source code. Code summarization, a critical task in software engineering, therefore aims to generate brief natural language descriptions of source code. In this paper, we propose a new code summarization model named CodeSum. CodeSum combines an attention-based sequence-to-sequence (Seq2Seq) neural network with Structure-based Traversal (SBT) of Abstract Syntax Trees (ASTs). The AST sequences generated by SBT better represent the structure of the ASTs and remain unambiguous. We conduct experiments on three large-scale corpora in different programming languages, i.e., Java, C#, and SQL; the Java corpus is a new dataset of industrial code that we extracted from GitHub. Experimental results show that CodeSum significantly outperforms the state of the art.
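
To illustrate why a structure-aware traversal can keep an AST sequence unambiguous, here is a minimal Python sketch. The abstract does not spell out the traversal, so the bracketed form used here (wrap each subtree as `( label ... ) label`) is an assumption about how SBT works, not the authors' exact implementation. The `Node` class, the `sbt` and `preorder` functions, and the toy trees are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical minimal AST node: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def preorder(node: Node) -> List[str]:
    """Plain pre-order traversal: emits labels only, so nesting is lost."""
    tokens = [node.label]
    for child in node.children:
        tokens.extend(preorder(child))
    return tokens


def sbt(node: Node) -> List[str]:
    """Assumed structure-based traversal.

    Each subtree is emitted as "(" label <children> ")" label, so the
    nesting of the tree can be recovered from the token sequence alone.
    """
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(sbt(child))
    tokens.extend([")", node.label])
    return tokens


# Two different toy trees over the same labels.
chain = Node("A", [Node("B", [Node("C")])])    # A -> B -> C
siblings = Node("A", [Node("B"), Node("C")])   # A -> B and A -> C

print(" ".join(preorder(chain)))      # same output for both trees
print(" ".join(preorder(siblings)))
print(" ".join(sbt(chain)))           # the bracketed sequences differ
print(" ".join(sbt(siblings)))
```

In this sketch the two trees share the pre-order sequence `A B C`, while their bracketed sequences differ. That is the sense in which such a sequence stays unambiguous when fed to a Seq2Seq encoder.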



