TreeGen: A Tree-Based Transformer Architecture for Code Generation

11/22/2019
by   Zeyu Sun, et al.
A code generation system generates programming language code from an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems. One is the long-dependency problem, where a code element often depends on another far-away code element. A variable reference, for example, depends on its definition, which may appear many lines earlier. The other problem is structure modeling, as programs contain rich structural information. In this paper, we propose a novel tree-based neural architecture, TreeGen, for code generation. TreeGen uses the attention mechanism of Transformers to alleviate the long-dependency problem, and introduces a novel AST reader (encoder) to incorporate grammar rules and AST structures into the network. We evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art approach by 4.5 percentage points on HearthStone, and achieved the best accuracy among neural network-based approaches on ATIS (89.1%). We also conducted an ablation test to better understand each component of our model.
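
To make the idea of "grammar rules and AST structures" more concrete, the following is a minimal sketch that uses Python's standard `ast` module to turn a short program into a pre-order sequence of rule-like strings. It is not TreeGen's actual preprocessing: the function name `grammar_rules` and the `Parent -> children` string format are assumptions chosen for illustration, and the node types come from Python's own abstract grammar rather than from the grammar used in the paper.

```python
import ast
from typing import List


def grammar_rules(source: str) -> List[str]:
    """Return a pre-order list of 'Parent -> Child1 Child2 ...' strings.

    Hypothetical helper for illustration only; each string names an AST
    node type and the types of its direct children.
    """
    tree = ast.parse(source)
    rules: List[str] = []

    def visit(node: ast.AST) -> None:
        children = list(ast.iter_child_nodes(node))
        if children:
            # Record how this node expands into its children.
            rhs = " ".join(type(child).__name__ for child in children)
            rules.append(f"{type(node).__name__} -> {rhs}")
        for child in children:
            visit(child)

    visit(tree)
    return rules


# The reference to x on the second line depends on its definition on the first.
rules = grammar_rules("x = 1\ny = x + 2\n")
for rule in rules:
    print(rule)
```

In this toy program the use of `x` and its definition sit in different subtrees of the AST. That is a small instance of the dependency between distant code elements that the abstract says attention is meant to alleviate.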
