EDAssistant: Supporting Exploratory Data Analysis in Computational Notebooks with In-Situ Code Search and Recommendation

12/15/2021
by Xingjun Li, et al.

Using computational notebooks (e.g., Jupyter Notebook), data scientists rationalize their exploratory data analysis (EDA) based on their prior experience and external knowledge such as online examples. For novices or data scientists who lack specific knowledge about the dataset or problem to investigate, effectively obtaining and understanding the external information is critical to carrying out EDA. This paper presents EDAssistant, a JupyterLab extension that supports EDA with in-situ search of example notebooks and recommendation of useful APIs, powered by a novel interactive visualization of search results. The code search and recommendation are enabled by state-of-the-art machine learning models, trained on a large corpus of EDA notebooks collected online. A user study was conducted to investigate both EDAssistant and data scientists' current practice (i.e., using external search engines). The results demonstrate the effectiveness and usefulness of EDAssistant, and participants appreciated its smooth and in-context support of EDA. We also report several design implications regarding code recommendation tools.

research
01/26/2023

On the Design of AI-powered Code Assistants for Notebooks

AI-powered code assistants, such as Copilot, are quickly becoming a ubiq...
research
08/08/2023

Dead or Alive: Continuous Data Profiling for Interactive Data Science

Profiling data by plotting distributions and analyzing summary statistic...
research
06/13/2018

Towards Semantically Enhanced Data Understanding

In the field of machine learning, data understanding is the practice of ...
research
02/22/2022

StickyLand: Breaking the Linear Presentation of Computational Notebooks

How can we better organize code in computational notebooks? Notebooks ha...
research
01/11/2023

Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis

Jupyter notebooks enable developers to interleave code snippets with ric...
research
10/02/2017

Accelerating Scientific Data Exploration via Visual Query Systems

The increasing availability of rich and complex data in a variety of sci...
research
02/02/2021

NBSearch: Semantic Search and Visual Exploration of Computational Notebooks

Code search is an important and frequent activity for developers using c...
