DockerizeMe: Automatic Inference of Environment Dependencies for Python Code Snippets

05/27/2019
by Eric Horton, et al.

Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of this shared Python code cannot be directly executed. When run in a clean environment, over 50% of gists fail due to an import error for a missing library. We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies.
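
To make the pipeline described above concrete, here is a minimal sketch of the general idea, not DockerizeMe's actual implementation. It extracts the top-level imports from a snippet, maps them to packages through a small lookup table, follows dependency edges, and emits a Dockerfile. The two tables, the function names, the `python:3` base image, and the `snippet.py` file name are all assumptions made for illustration. The abstract says only that the real knowledge is acquired offline from PyPI and does not detail the inference procedure.

```python
import ast

# Hypothetical, hand-written knowledge base (illustrative entries only):
# import name -> PyPI package, and package -> packages it depends on.
IMPORT_TO_PACKAGE = {
    "yaml": "PyYAML",
    "bs4": "beautifulsoup4",
    "requests": "requests",
}
PACKAGE_DEPENDS_ON = {"beautifulsoup4": ["soupsieve"]}


def top_level_imports(source):
    """Return the sorted top-level module names imported by a snippet."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which are local to the snippet
            names.add(node.module.split(".")[0])
    return sorted(names)


def resolve_packages(imports):
    """Map imports to packages, then walk dependency edges depth-first."""
    ordered, seen = [], set()

    def visit(package):
        if package in seen:
            return
        seen.add(package)
        for dep in PACKAGE_DEPENDS_ON.get(package, []):
            visit(dep)
        ordered.append(package)  # dependencies land before their dependents

    for name in imports:
        # Names missing from the table (e.g. standard library modules) are skipped
        if name in IMPORT_TO_PACKAGE:
            visit(IMPORT_TO_PACKAGE[name])
    return ordered


def dockerfile_for(source, base_image="python:3"):
    """Build Dockerfile text that installs the inferred packages."""
    packages = resolve_packages(top_level_imports(source))
    lines = [f"FROM {base_image}"]
    lines += [f"RUN pip install {package}" for package in packages]
    lines += ["COPY snippet.py /snippet.py", 'CMD ["python", "/snippet.py"]']
    return "\n".join(lines)


if __name__ == "__main__":
    snippet = "import os\nimport yaml\nfrom bs4 import BeautifulSoup\n"
    print(dockerfile_for(snippet))
```

Note that `ast.parse` only accepts code valid for the interpreter running the sketch, so a snippet written for a different Python version may fail to parse before any dependency is inferred.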

