A Static Analysis Framework for Data Science Notebooks

10/15/2021
by Pavle Subotić, et al.

Notebooks provide an interactive environment in which programmers can develop code, analyse data and inject interleaved visualizations. Despite their flexibility, a major pitfall that data scientists encounter is unexpected behaviour caused by the unique out-of-order execution model of notebooks. As a result, data scientists face challenges spanning notebook correctness, reproducibility and cleaning. In this paper, we propose a framework that performs static analysis on notebooks, incorporating their unique execution semantics. Our framework is general in the sense that it accommodates a wide range of analyses, useful for various notebook use cases. We have instantiated our framework on a diverse set of analyses and have evaluated them on 2211 real-world notebooks. Our evaluation demonstrates that the vast majority (98.7 well within the time frame required by interactive notebook clients


