AST-MHSA: Code Summarization using Multi-Head Self-Attention

08/10/2023
by Yeshwanth Nagaraj, et al.

Code summarization aims to generate concise natural language descriptions of source code. Prevailing approaches adopt transformer-based encoder-decoder architectures, where the Abstract Syntax Tree (AST) of the source code is used to encode structural information. However, ASTs are much longer than the corresponding source code, and existing methods ignore this size constraint by feeding the entire linearized AST directly into the encoder. This makes it difficult to extract truly valuable dependency relations from the overlong input sequence and incurs significant computational overhead, since self-attention is applied to every node in the AST. To address this issue, we present AST-MHSA, a model that uses multi-head attention to extract the important semantic information from the AST. The model consists of two main components: an encoder and a decoder. The encoder takes the AST of the code as input and generates a sequence of hidden states; the decoder then takes these hidden states and generates a natural language summary of the code. The multi-head attention mechanism allows the model to learn different representations of the input code, which can be combined to produce a more comprehensive summary. The model is trained on a dataset of code paired with summaries, and its parameters are optimized to minimize the loss between the generated summaries and the ground-truth summaries.

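For concreteness, here is a minimal PyTorch sketch of the encoder-decoder pattern the abstract describes: a multi-head self-attention encoder turns the linearized AST node sequence into hidden states, and a decoder attends over those states to generate the summary, trained by minimizing cross-entropy against ground-truth summaries. Everything below, including the `ASTMHSASummarizer` name, dimensions, layer counts, and the use of stock Transformer layers, is an illustrative assumption, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

class ASTMHSASummarizer(nn.Module):
    """Minimal sketch of an AST-to-summary encoder-decoder.

    An assumed architecture for illustration, not the paper's exact model.
    """

    def __init__(self, ast_vocab, summary_vocab,
                 d_model=256, n_heads=8, n_layers=2):
        super().__init__()
        self.ast_embed = nn.Embedding(ast_vocab, d_model)
        self.tok_embed = nn.Embedding(summary_vocab, d_model)
        # Encoder: multi-head self-attention over the linearized AST nodes.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            n_layers)
        # Decoder: cross-attends to the encoder's hidden states.
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
            n_layers)
        self.out = nn.Linear(d_model, summary_vocab)

    def forward(self, ast_ids, summary_ids):
        # Hidden states for every AST node (positional/structural encodings
        # are omitted here for brevity; a real model would add them).
        memory = self.encoder(self.ast_embed(ast_ids))
        # Causal mask: each summary position attends only to earlier ones.
        T = summary_ids.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        h = self.decoder(self.tok_embed(summary_ids), memory, tgt_mask=causal)
        return self.out(h)  # per-token logits over the summary vocabulary

# Toy usage: 2 linearized ASTs of 20 nodes each, 8-token target summaries.
# (A real training loop would shift the targets by one for teacher forcing.)
model = ASTMHSASummarizer(ast_vocab=500, summary_vocab=1000)
ast_ids = torch.randint(0, 500, (2, 20))
tgt = torch.randint(0, 1000, (2, 8))
logits = model(ast_ids, tgt)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 1000), tgt.reshape(-1))
print(logits.shape, loss.item())  # torch.Size([2, 8, 1000]) and a scalar loss
```

Stock `nn.TransformerEncoder`/`nn.TransformerDecoder` layers are used here only to keep the sketch self-contained; the abstract's point is that multi-head attention over AST nodes yields several complementary representations that the decoder can combine into the summary.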