GypSum: Learning Hybrid Representations for Code Summarization

04/26/2022
by Yu Wang, et al.

Code summarization with deep learning has been widely studied in recent years. Current deep learning models for code summarization generally follow the principles of neural machine translation and adopt the encoder-decoder framework, where the encoder learns semantic representations from source code and the decoder transforms the learned representations into human-readable text that describes the functionality of code snippets. Although these models achieve state-of-the-art performance, we notice that they often either generate less fluent summaries or fail to capture the core functionality, since they usually focus on a single type of code representation. We therefore propose GypSum, a new deep learning model that learns hybrid representations using graph attention neural networks and a pre-trained programming and natural language model. We introduce particular edges related to the control flow of a code snippet into the abstract syntax tree for graph construction, and design two encoders to learn from the graph and the token sequence of source code, respectively. We modify the encoder-decoder sublayer in the Transformer's decoder to fuse the representations and propose a dual-copy mechanism to facilitate summary generation. Experimental results demonstrate the superior performance of GypSum over existing code summarization models.
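The graph construction step mentioned in the abstract (an abstract syntax tree augmented with control-flow-related edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which edges GypSum adds, so the sketch assumes a single hypothetical edge type, `next_stmt`, linking consecutive statements in the same block, and uses Python's standard `ast` module on Python source purely for illustration. The function name `build_graph` and the edge labels are likewise assumptions.

```python
import ast


def build_graph(source):
    """Return (nodes, edges) for a code snippet.

    nodes: list of AST node type names, indexed by node id.
    edges: list of (src, dst, edge_type) triples.
    Edge types here are illustrative assumptions, not GypSum's actual schema.
    """
    nodes, edges = [], []

    def visit(node):
        idx = len(nodes)
        nodes.append(type(node).__name__)

        # Ordinary AST edges: parent -> child.
        child_idx = {}
        for child in ast.iter_child_nodes(node):
            c = visit(child)
            child_idx[id(child)] = c
            edges.append((idx, c, "ast_child"))

        # Assumed control-flow-style edges: each statement in a block
        # points to the statement that follows it in the same block.
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list):
                for prev, nxt in zip(stmts, stmts[1:]):
                    edges.append(
                        (child_idx[id(prev)], child_idx[id(nxt)], "next_stmt")
                    )
        return idx

    visit(ast.parse(source))
    return nodes, edges


if __name__ == "__main__":
    snippet = "def f(x):\n    y = x + 1\n    return y\n"
    nodes, edges = build_graph(snippet)
    for src, dst, kind in edges:
        if kind == "next_stmt":
            print(nodes[src], "->", nodes[dst])
```

A graph encoder such as a graph attention network would then consume the node list and the typed edge list. How the paper encodes node features and edge types is not stated in the abstract.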
