
Convolutional Neural Networks over Tree Structures for Programming Language Processing

by Lili Mou, et al.
Peking University
Stanford University

Programming language processing (similar to natural language processing) is an active research topic in software engineering; it has also attracted growing interest in the artificial intelligence community. However, unlike a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets that match certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.
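
To make the idea of a convolution kernel over an abstract syntax tree concrete, here is a minimal sketch. It is not the authors' implementation: it assumes Python's standard `ast` module as the parser, random untrained weights, a single shared child weight matrix, and max pooling over all windows. The names `DIM`, `FEATS`, `W_TOP`, `W_CHILD`, `embed`, `convolve`, and `encode` are hypothetical and chosen only for illustration.

```python
import ast
import math
import random

DIM = 4    # embedding size per AST node type (illustrative assumption)
FEATS = 3  # number of convolution feature detectors (illustrative assumption)

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random (untrained) weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Simplifying assumption: one weight matrix for the window's top node and one
# shared matrix for all of its children.
W_TOP = rand_matrix(FEATS, DIM)
W_CHILD = rand_matrix(FEATS, DIM)
BIAS = [0.0] * FEATS

_embeddings = {}


def embed(node):
    """Look up (or lazily create) a random vector for the node's type name."""
    name = type(node).__name__
    if name not in _embeddings:
        _embeddings[name] = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]
    return _embeddings[name]


def convolve(node):
    """Apply the kernel to one window: a node plus its direct children."""
    out = matvec(W_TOP, embed(node))
    children = list(ast.iter_child_nodes(node))
    for child in children:
        contrib = matvec(W_CHILD, embed(child))
        # Average the children so windows of different sizes stay comparable.
        out = [o + c / len(children) for o, c in zip(out, contrib)]
    return [math.tanh(o + b) for o, b in zip(out, BIAS)]


def encode(source):
    """Slide the kernel over every AST node, then max-pool each feature."""
    tree = ast.parse(source)
    windows = [convolve(node) for node in ast.walk(tree)]
    return [max(column) for column in zip(*windows)]


vec = encode("def add(a, b):\n    return a + b\n")
print(vec)  # a fixed-size vector of FEATS values, regardless of program size
```

Because pooling collapses all windows into a vector of fixed length, programs of any size map to the same dimensionality, which is what would let a downstream classifier consume the result. In a trained model the embeddings and weights would be learned rather than sampled at random.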



