A Library for Representing Python Programs as Graphs for Machine Learning

08/15/2022
by David Bieber et al.

Graph representations of programs are a central element of much research on machine learning for code. We introduce an open-source Python library, python_graphs, that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. The library supports constructing control-flow graphs, data-flow graphs, and composite “program graphs” that combine control-flow, data-flow, syntactic, and lexical information about a program. We present the capabilities and limitations of the library, perform a case study applying it to millions of competitive programming submissions, and showcase its utility for machine learning research.
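
To make the idea of a composite program graph concrete, here is a minimal sketch that uses only Python's standard `ast` module. It is not the python_graphs API: the function name `build_program_graph` and the edge labels `"child"` and `"last_write"` are hypothetical, chosen for illustration. It also assumes straight-line, top-level code, so it ignores control flow entirely.

```python
import ast


def build_program_graph(source):
    """Build a toy graph with syntactic and data-flow edges.

    Assumption: `source` is straight-line code (no branches or loops),
    so "the last write to a variable" is well defined per statement.
    """
    tree = ast.parse(source)
    nodes = []   # nodes[i] is the AST class name of node i
    edges = []   # (source id, target id, edge type)
    index = {}   # id(ast node) -> node id

    def visit(node):
        node_id = len(nodes)
        index[id(node)] = node_id
        nodes.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store context markers; they carry no structure.
            if isinstance(child, ast.expr_context):
                continue
            edges.append((node_id, visit(child), "child"))
        return node_id

    visit(tree)

    # Data-flow edges: link each variable read to the most recent write.
    last_write = {}
    for stmt in tree.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads in a statement see writes from earlier statements only.
        for n in names:
            if isinstance(n.ctx, ast.Load) and n.id in last_write:
                edges.append((last_write[n.id], index[id(n)], "last_write"))
        for n in names:
            if isinstance(n.ctx, ast.Store):
                last_write[n.id] = index[id(n)]
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_program_graph("x = 1\ny = x + 2\n")
    for src, dst, kind in edges:
        print(f"{nodes[src]}[{src}] -{kind}-> {nodes[dst]}[{dst}]")
```

A node list plus typed edge triples is a common input format for graph neural networks. A real analysis would also need control-flow edges so that reads after a branch or loop can see several possible writes.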

