A Transformer-based Approach for Source Code Summarization

05/01/2020
by Wasi Uddin Ahmad, et al.

Source code summarization is the task of generating a readable natural language summary that describes the functionality of a program. In this task, it is crucial to learn a code representation that models the pairwise relationships between code tokens and thereby captures their long-range dependencies. To learn such a representation for summarization, we explore the Transformer model, which relies on a self-attention mechanism and has been shown to be effective at capturing long-range dependencies. In this work, we show that despite its simplicity, this approach outperforms state-of-the-art techniques by a significant margin. Extensive analysis and ablation studies reveal several important findings; for example, absolute encoding of source code token positions hinders summarization performance, while relative encoding significantly improves it. We have made our code publicly available to facilitate future research.
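
The central modeling choice the abstract highlights is the position encoding: absolute positions hurt, relative positions help. As a minimal illustration (not the authors' implementation; see the NeuralCodeSum repository below for that), the following PyTorch sketch adds learned relative position embeddings in the spirit of Shaw et al. (2018) to single-head self-attention. The class and parameter names (RelativeSelfAttention, max_rel_dist) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativeSelfAttention(nn.Module):
    """Single-head self-attention with learned relative position
    embeddings (Shaw et al., 2018 style). Illustrative sketch only."""

    def __init__(self, d_model: int, max_rel_dist: int = 32):
        super().__init__()
        self.d_model = d_model
        self.max_rel_dist = max_rel_dist
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One embedding per clipped relative distance in [-k, k].
        self.rel_emb = nn.Embedding(2 * max_rel_dist + 1, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Pairwise relative distances j - i, clipped to [-k, k] and
        # shifted to non-negative indices for the embedding lookup.
        seq_len = x.size(1)
        pos = torch.arange(seq_len, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(
            -self.max_rel_dist, self.max_rel_dist) + self.max_rel_dist
        rel_k = self.rel_emb(rel)  # (seq_len, seq_len, d_model)

        # Attention logits = content-content term + content-position term.
        scores = q @ k.transpose(-2, -1)                # (B, T, T)
        scores = scores + torch.einsum("btd,tsd->bts", q, rel_k)
        attn = F.softmax(scores / self.d_model ** 0.5, dim=-1)
        return attn @ v                                 # (B, T, d_model)


# Example: a batch of 8 embedded code-token sequences of length 150.
layer = RelativeSelfAttention(d_model=512)
out = layer(torch.randn(8, 150, 512))  # -> (8, 150, 512)
```

The intuition matches the abstract's finding: what matters when attending between code tokens is their mutual distance, not their absolute offsets, since the same code fragment can appear at any position within a function.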


Related Research

02/14/2022
Source Code Summarization with Structural Relative Position Guided Transformer
Source code summarization aims at generating concise and clear natural l...

02/09/2016
A Convolutional Attention Network for Extreme Summarization of Source Code
Attention mechanisms in neural networks have proved useful for problems ...

11/17/2021
GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization
As opposed to natural languages, source code understanding is influenced...

12/01/2021
Graph Conditioned Sparse-Attention for Improved Source Code Understanding
Transformer architectures have been successfully used in learning source...

04/19/2021
Code Structure Guided Transformer for Source Code Summarization
Source code summarization aims at generating concise descriptions of giv...

12/29/2020
SIT3: Code Summarization with Structure-Induced Transformer
Code summarization (CS) is becoming a promising area in recent natural l...

02/09/2021
Demystifying Code Summarization Models
The last decade has witnessed a rapid advance in machine learning models...

Code Repositories

NeuralCodeSum

Official implementation of our work, A Transformer-based Approach for Source Code Summarization [ACL 2020].

CoDesc

A large dataset of 4.2M Java source code snippets paired with natural language descriptions, drawn from code search and code summarization studies.

source-code-summarization

Transformer-based approaches for efficient docstring generation from Python code.