Automatic Code Summarization via Multi-dimensional Semantic Fusing in GNN

06/09/2020
by   Shangqing Liu, et al.

Source code summarization aims to generate natural language summaries from structured code snippets to help readers understand code functionality. Recent works encode programs into graphs to learn program semantics and yield promising results. However, these methods use only simple code representations (e.g., AST), which limits their capacity to learn the rich semantics of complex programs. Furthermore, these models rely primarily on graph-based message passing, which captures only local neighborhood relations. To this end, in this paper, we combine diverse representations of the source code (i.e., AST, CFG and PDG) into a joint code property graph. To better learn semantics from the joint graph, we propose a retrieval-augmented mechanism that augments source code semantics with external knowledge. Furthermore, we propose a novel attention-based dynamic graph to capture global interactions among nodes in the static graph, followed by a hybrid message-passing GNN that incorporates both the static and dynamic graphs. To evaluate our proposed approach, we release a new challenging benchmark, crawled from 200+ diversified large-scale open-source C/C++ projects. Our method achieves state-of-the-art performance, improving on existing methods by 1.66, 2.38 and 2.22 in terms of the BLEU-4, ROUGE-L and METEOR metrics, respectively.
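The abstract names three ingredients: a joint graph built from AST, CFG and PDG edges, an attention-based dynamic graph over all node pairs, and a hybrid message-passing step that uses both. The sketch below illustrates one way these pieces could fit together. It is not the authors' implementation. Every function name, the undirected treatment of edges, the plain mean aggregation (in place of a learned GNN layer), the unparameterized dot-product attention, and the `alpha` mixing weight are assumptions made for illustration only.

```python
import math

def merge_edges(*typed_edge_lists):
    """Union typed edge lists into one joint graph: {(src, dst): {edge types}}."""
    joint = {}
    for edge_type, edges in typed_edge_lists:
        for src, dst in edges:
            joint.setdefault((src, dst), set()).add(edge_type)
    return joint

def static_adjacency(joint, num_nodes):
    """Row-normalized adjacency with self-loops (edges treated as undirected: an assumption)."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for src, dst in joint:
        adj[src][dst] = 1.0
        adj[dst][src] = 1.0
    for i in range(num_nodes):
        adj[i][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]

def dynamic_adjacency(feats):
    """Dense attention graph: row-wise softmax of scaled dot products between all node pairs."""
    dim = len(feats[0])
    rows = []
    for query in feats:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in feats]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def propagate(adj, feats):
    """One round of weighted-sum message passing: each node aggregates its neighbors' features."""
    n, dim = len(feats), len(feats[0])
    return [[sum(adj[i][j] * feats[j][k] for j in range(n)) for k in range(dim)]
            for i in range(n)]

def hybrid_step(static_adj, feats, alpha=0.5):
    """Mix local (static graph) and global (attention graph) messages; alpha is hypothetical."""
    local = propagate(static_adj, feats)
    global_ = propagate(dynamic_adjacency(feats), feats)
    return [[alpha * a + (1 - alpha) * b for a, b in zip(row_l, row_g)]
            for row_l, row_g in zip(local, global_)]

# Toy program graph with three nodes; the edge lists are made up for the example.
joint = merge_edges(("AST", [(0, 1), (0, 2)]), ("CFG", [(1, 2)]), ("PDG", [(0, 2)]))
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = hybrid_step(static_adjacency(joint, 3), feats)
```

The static matrix connects only nodes that share an AST, CFG or PDG edge, so one step of it moves information between direct neighbors. The attention matrix is dense, which is what lets a single step relate nodes that are far apart in the static graph.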


research
12/29/2020

SIT3: Code Summarization with Structure-Induced Transformer

Code summarization (CS) is becoming a promising area in recent natural l...
research
06/17/2019

Learning Execution through Neural Code Fusion

As the performance of computer systems stagnates due to the end of Moore...
research
05/22/2018

Generative Code Modeling with Graphs

Generative models for source code are an interesting structured predicti...
research
05/18/2020

Learning Semantic Program Embeddings with Graph Interval Neural Network

Learning distributed representations of source code has been a challengi...
research
01/28/2022

HEAT: Hyperedge Attention Networks

Learning from structured data is a core machine learning task. Commonly,...
research
01/28/2022

Compositionality-Aware Graph2Seq Learning

Graphs are a highly expressive data structure, but it is often difficult...
