Exploring Representation of Horn Clauses using GNNs (technique report)

06/14/2022
by   Chencheng Liang, et al.

Learning program semantics from raw source code is challenging for two reasons: the syntax of real-world programming languages is complex, and long-distance relational information, which programs represent only implicitly through identifiers, is difficult to reconstruct. To address the first challenge, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, providing a simple and programming-language-independent syntax. For the second challenge, we explore graph representations of CHCs and propose a new Relational Hypergraph Neural Network (R-HyGNN) architecture to learn program features. We introduce two graph representations of CHCs. The first, the constraint graph (CG), emphasizes syntactic information by translating the symbols of CHCs and their relations into typed nodes and binary edges, respectively, and by representing the constraints as abstract syntax trees. The second, the control- and data-flow hypergraph (CDHG), emphasizes semantic information by representing control and data flow through ternary hyperedges. R-HyGNN extends Relational Graph Convolutional Networks to handle hypergraphs. To evaluate the ability of R-HyGNN to extract semantic information from programs, we train R-HyGNN models on both graph representations and on five proxy tasks of increasing difficulty, using benchmarks from CHC-COMP 2021 as training data. The most difficult proxy task requires the model to predict the occurrence of clauses in counter-examples, which subsumes satisfiability of CHCs. CDHG achieves 90.59% accuracy in this task. Furthermore, R-HyGNN makes perfect predictions on one of the graphs consisting of more than 290 clauses. Overall, our experiments indicate that R-HyGNN can capture intricate program features for guiding verification problems.
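To make the idea of extending relational graph convolution to hyperedges more concrete, the following is a minimal sketch of one message-passing step over typed hyperedges. It is not the authors' implementation: the update rule (one weight matrix per edge type and node position, applied to the concatenated features of all nodes in the hyperedge, summed and passed through ReLU) is an assumption about how such a layer could work, and the node names, the edge type name `control_flow`, and the helper names are hypothetical.

```python
import random


def matvec(weight, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vector]


def init_weights(edge_arity, dim, seed=0):
    """Create one (dim x arity*dim) matrix per (edge type, position) pair.

    edge_arity maps a hypothetical edge type name to its number of nodes.
    """
    rng = random.Random(seed)
    return {
        (etype, pos): [
            [rng.uniform(-0.5, 0.5) for _ in range(arity * dim)]
            for _ in range(dim)
        ]
        for etype, arity in edge_arity.items()
        for pos in range(arity)
    }


def hypergraph_layer(features, hyperedges, weights):
    """One assumed message-passing step over typed, ordered hyperedges.

    features:   {node: feature vector}
    hyperedges: list of (edge_type, tuple_of_nodes)
    weights:    {(edge_type, position): matrix}
    """
    dim = len(next(iter(features.values())))
    aggregated = {node: [0.0] * dim for node in features}
    for etype, nodes in hyperedges:
        # Every node in the hyperedge sees the features of all its members.
        concat = [x for node in nodes for x in features[node]]
        for pos, node in enumerate(nodes):
            # The weight matrix depends on the edge type and on the
            # position the receiving node occupies in the hyperedge.
            message = matvec(weights[(etype, pos)], concat)
            aggregated[node] = [a + m for a, m in zip(aggregated[node], message)]
    return {node: relu(vec) for node, vec in aggregated.items()}


# Toy example: one ternary hyperedge linking a body predicate, a head
# predicate, and a guard node (all names are illustrative only).
toy_features = {
    "body_pred": [1.0, 0.0],
    "head_pred": [0.0, 1.0],
    "guard": [0.5, 0.5],
    "unused": [1.0, 1.0],
}
toy_edges = [("control_flow", ("body_pred", "head_pred", "guard"))]
toy_weights = init_weights({"control_flow": 3}, dim=2)
toy_output = hypergraph_layer(toy_features, toy_edges, toy_weights)
print(toy_output)
```

With binary edges only, this reduces to a plain relational graph convolution in which each edge type has separate weights for its two endpoints, which is why the same layer can in principle serve both a binary-edge graph such as CG and a hypergraph such as CDHG.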


research
02/25/2022

Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

Program representation, which aims at converting program source code int...
research
12/08/2020

Learning to Represent Programs with Heterogeneous Graphs

Program source code contains complex structure information, which can be...
research
05/07/2023

Heterogeneous Directed Hypergraph Neural Network over abstract syntax tree (AST) for Code Classification

Code classification is a difficult issue in program understanding and au...
research
02/20/2020

Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree

Code clones are semantically similar code fragments pairs that are synta...
research
09/04/2023

Code Representation Pre-training with Complements from Program Executions

Large language models (LLMs) for natural language processing have been g...
research
01/24/2020

Comparison of Syntactic and Semantic Representations of Programs in Neural Embeddings

Neural approaches to program synthesis and understanding have proliferat...
research
01/28/2022

HEAT: Hyperedge Attention Networks

Learning from structured data is a core machine learning task. Commonly,...
