Teaching machines to understand data science code by semantic enrichment of dataflow graphs

07/16/2018
by Evan Patterson, et al.

Your computer is continuously executing programs, but does it really understand them? Not in any meaningful sense. That burden falls upon human knowledge workers, who are increasingly asked to write and understand code. They would benefit greatly from intelligent tools that reveal the connections between their code and its subject matter. Towards this prospect, we develop an AI system that forms semantic representations of computer programs, using techniques from knowledge representation and program analysis. We focus on code written for data science, although our method is more generally applicable. The semantic representations are created through a novel algorithm for the semantic enrichment of dataflow graphs. This algorithm is undergirded by a new ontology language for modeling computer programs and a new ontology about data science, written in this language.


