Modeling Programs Hierarchically with Stack-Augmented LSTM

by Fang Liu, et al.

Programming language modeling has attracted extensive attention in recent years and plays an essential role in program processing. Statistical language models, which were originally designed for natural languages, have been widely used for modeling programming languages. However, unlike natural languages, programming languages contain explicit, hierarchical structure that is hard for traditional statistical language models to learn. To address this challenge, we propose a novel Stack-Augmented LSTM neural network for programming language modeling. Adding a stack memory component to the LSTM network enables our model to capture the hierarchical information of programs through PUSH and POP operations, which in turn allows it to capture long-term dependencies in programs. We evaluate the proposed model on three program analysis tasks: code completion, program classification, and code summarization. Evaluation results show that our model outperforms baseline models on all three tasks, indicating that by capturing the structural information of programs with a stack, it can represent programs more precisely.
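To make the PUSH and POP idea concrete, here is a minimal sketch in plain Python. It is not the architecture from the paper: the abstract gives no equations, so everything below is an illustrative assumption. The function `toy_update` is a hypothetical stand-in for a recurrent cell, the choice of which tokens trigger PUSH and POP is assumed, and the state is a single number rather than a hidden vector. The sketch only shows how a stack lets a sequential model save the enclosing context when a nested block opens and restore it when the block closes.

```python
from typing import List, Tuple

# Assumption: brackets mark where a nested scope opens and closes.
PUSH_TOKENS = {"{", "(", "["}
POP_TOKENS = {"}", ")", "]"}


def toy_update(state: float, token: str) -> float:
    """Hypothetical stand-in for a recurrent cell: decay the old state, add a token feature."""
    return 0.5 * state + (sum(map(ord, token)) % 7)


def run_with_stack(tokens: List[str]) -> List[Tuple[str, float, int]]:
    """Return (token, state, stack depth) after processing each token."""
    state = 0.0
    stack: List[float] = []
    trace = []
    for tok in tokens:
        if tok in PUSH_TOKENS:
            stack.append(state)             # PUSH: save the enclosing context
            state = toy_update(state, tok)  # then continue inside the new scope
        elif tok in POP_TOKENS and stack:
            state = stack.pop()             # POP: restore the enclosing context
        else:
            # Ordinary tokens (and unmatched closers) just update the state.
            state = toy_update(state, tok)
        trace.append((tok, state, len(stack)))
    return trace


trace = run_with_stack(["a", "{", "b", "c", "}", "d"])
for tok, state, depth in trace:
    print(f"{tok!r}: state={state}, depth={depth}")
```

In this toy run, the state after the closing brace is exactly the state that was saved before the opening brace, regardless of how many tokens the nested block contained. That is the intuition behind using a stack to carry long-range context across nested structure; a real model would store and combine hidden vectors rather than overwrite a scalar.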




On the Applicability of Language Models to Block-Based Programs

Block-based programming languages like Scratch are increasingly popular ...

Convolutional Neural Networks over Tree Structures for Programming Language Processing

Programming language processing (similar to natural language processing)...

Solving the Funarg Problem with Static Types

The difficulty associated with storing closures in a stack-based environ...

Autoencoders as Tools for Program Synthesis

Recently there have been many advances in research on language modeling ...

A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning

Code completion, one of the most useful features in the integrated devel...

Coder Reviewer Reranking for Code Generation

Sampling diverse programs from a code language model and reranking with ...

A General Path-Based Representation for Predicting Program Properties

Predicting program properties such as names or expression types has a wi...
