StructCoder: Structure-Aware Transformer for Code Generation

06/10/2022
by Sindhu Tipirneni, et al.

There has been a recent surge of interest in automating software engineering tasks using deep learning. This work addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code's syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are trained to recognize the syntax and data flow in the source and target code, respectively. We not only make the encoder structure-aware by leveraging the source code's syntax tree and data flow graph, but also ensure that our decoder preserves the syntax and data flow of the target code by introducing two auxiliary tasks: AST (Abstract Syntax Tree) path prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder that enhances the quality of generated code by modeling target syntax and data flow. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark.
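To make the two kinds of structure concrete, the following minimal sketch (not the paper's implementation, which operates on model outputs during training) shows what root-to-leaf AST paths and variable def-use edges look like for a small Python snippet, using only the standard `ast` module. The function names `ast_root_to_leaf_paths` and `def_use_edges` are illustrative choices, and the def-use extraction here is a deliberately simplified, scope-unaware version of a data flow graph.

```python
import ast

def ast_root_to_leaf_paths(source: str):
    """Return root-to-leaf paths of node type names in the syntax tree."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + [type(node).__name__]
        children = list(ast.iter_child_nodes(node))
        if not children:          # leaf node: record the full path to it
            paths.append(prefix)
        for child in children:
            walk(child, prefix)

    walk(ast.parse(source), [])
    return paths

def def_use_edges(source: str):
    """Pair each variable use with the most recent assignment to it.

    A crude stand-in for a data flow graph: ignores scopes and branches.
    """
    last_def, edges = {}, []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                last_def[node.id] = node.lineno
            elif node.id in last_def:
                edges.append((node.id, last_def[node.id], node.lineno))
    return edges

code = "x = 1\ny = x + 2\n"
print(ast_root_to_leaf_paths(code)[0])  # path from Module down to a leaf
print(def_use_edges(code))              # (variable, def line, use line)
```

StructCoder's auxiliary losses ask the decoder to predict exactly this kind of information about the *target* code it is generating, which is what pushes the model to keep the output syntactically and semantically well-formed.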

