AI Assistants: A Framework for Semi-Automated Data Wrangling

10/31/2022
by Tomas Petricek, et al.

Data wrangling tasks such as obtaining and linking data from various sources, transforming data formats, and correcting erroneous records can constitute up to 80% of typical data engineering work. Despite the rise of machine learning and artificial intelligence, data wrangling remains a tedious and manual task. We introduce AI assistants, a class of semi-automatic interactive tools to streamline data wrangling. An AI assistant guides the analyst through a specific data wrangling task by recommending a suitable data transformation that respects the constraints obtained through interaction with the analyst. We formally define the structure of AI assistants and describe how existing tools that treat data cleaning as an optimization problem fit the definition. We implement AI assistants for four common data wrangling tasks and, by leveraging the common structure they follow, make them easily accessible to data analysts in an open-source notebook environment for data science. We evaluate our AI assistants both quantitatively and qualitatively through three example scenarios. We show that the unified and interactive design makes it easy to perform tasks that would be difficult to do manually or with a fully automatic tool.
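The recommend-under-constraints idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's formal definition or implementation: the candidate transformations, the `score` function, and the `recommend` helper are all hypothetical names chosen for illustration. The sketch assumes an assistant that scores a fixed set of candidate transformations and recommends the best one whose output satisfies every constraint the analyst has supplied so far.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]
Table = List[Row]
Transform = Callable[[Table], Table]
Constraint = Callable[[Table], bool]


def drop_empty_rows(table: Table) -> Table:
    # Remove rows in which every cell is blank after trimming.
    return [row for row in table if any(v.strip() for v in row.values())]


def strip_whitespace(table: Table) -> Table:
    # Trim leading and trailing whitespace in every cell.
    return [{k: v.strip() for k, v in row.items()} for row in table]


def strip_then_drop(table: Table) -> Table:
    # Trim cells first, then remove rows that are entirely blank.
    return drop_empty_rows(strip_whitespace(table))


# Hypothetical pool of candidate transformations (an assumption of this sketch).
CANDIDATES: Dict[str, Transform] = {
    "drop_empty_rows": drop_empty_rows,
    "strip_whitespace": strip_whitespace,
    "strip_then_drop": strip_then_drop,
}


def score(table: Table) -> int:
    # Hypothetical quality measure: fewer empty or padded cells is better.
    return -sum(1 for row in table for v in row.values() if v == "" or v != v.strip())


def recommend(table: Table, constraints: List[Constraint]) -> Optional[str]:
    # Return the name of the best-scoring candidate whose output satisfies
    # every analyst-supplied constraint, or None if no candidate qualifies.
    best_name: Optional[str] = None
    best_score: Optional[int] = None
    for name, transform in CANDIDATES.items():
        result = transform(table)
        if not all(check(result) for check in constraints):
            continue
        s = score(result)
        if best_score is None or s > best_score:
            best_name, best_score = name, s
    return best_name


table: Table = [
    {"name": " Ada ", "city": "London"},
    {"name": "", "city": ""},
    {"name": "Bob", "city": " "},
]


def keep_all_rows(result: Table) -> bool:
    # Example analyst constraint: no row may be removed.
    return len(result) == len(table)


first = recommend(table, [])                # no constraints yet
second = recommend(table, [keep_all_rows])  # after the analyst adds a constraint
```

With no constraints the sketch recommends the combined clean-up; once the analyst insists that every row be kept, the recommendation changes to a transformation that respects that constraint. The interaction is the point: each constraint narrows the set of acceptable transformations rather than requiring the analyst to write the transformation by hand.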

research
05/01/2017

A System for Accessible Artificial Intelligence

While artificial intelligence (AI) has become widespread, many commercia...
research
07/01/2022

FAIR principles for AI models, with a practical application for accelerated high energy diffraction microscopy

A concise and measurable set of FAIR (Findable, Accessible, Interoperabl...
research
02/16/2021

VIEW: a framework for organization level interactive record linkage to support reproducible data science

Objective: To design and evaluate a general framework for interactive re...
research
03/17/2022

SemTUI: a Framework for the Interactive Semantic Enrichment of Tabular Data

The large availability of datasets fosters the use of ml and ai technolo...
research
09/12/2023

Commands as AI Conversations

Developers and data scientists often struggle to write command-line inpu...
research
07/20/2022

Automated Kantian Ethics: A Faithful Implementation

As we grant artificial intelligence increasing power and independence in...
research
03/21/2019

Towards Standardization of Data Licenses: The Montreal Data License

This paper provides a taxonomy for the licensing of data in the fields o...
