How Much Automation Does a Data Scientist Want?

by Dakuo Wang, et al.

Data science and machine learning (DS/ML) are at the heart of recent advances in many Artificial Intelligence (AI) applications. There is an active research thread in AI, AutoML, that aims to develop systems for automating the DS/ML lifecycle end to end. However, do DS and ML workers really want to automate their DS/ML workflow? To answer this question, we first synthesize a human-centered AutoML framework with 6 User Roles/Personas, 10 Stages and 43 Sub-Tasks, 5 Levels of Automation, and 5 Types of Explanation, by reviewing research literature and marketing reports. Second, we use the framework to guide the design of an online survey study with 217 DS/ML workers who had varying degrees of experience and whose user roles matched our 6 roles/personas. We found that different user personas participated in distinct stages of the lifecycle – but not all stages. Their desired levels of automation and types of explanation for AutoML also varied significantly depending on the DS/ML stage and the user persona. Based on the survey results, we argue that user needs provide no rationale for complete automation of the end-to-end DS/ML lifecycle. We propose next steps for user-controlled DS/ML automation.




AutoDS: Towards Human-Centered Automation of Data Science

Data science (DS) projects often follow a lifecycle that consists of lab...

Whither AutoML? Understanding the Role of Automation in Machine Learning Workflows

Efforts to make machine learning more widely accessible have led to a ra...

Imagining Future Digital Assistants at Work: A Study of Task Management Needs

Digital Assistants (DAs) can support workers in the workplace and beyond...

Artificial Intelligence and Machine Learning in 5G Network Security: Opportunities, advantages, and future research trends

Recent technological and architectural advancements in 5G networks have ...

Augmented Data Science: Towards Industrialization and Democratization of Data Science

Conversion of raw data into insights and knowledge requires substantial ...

Propheticus: Generalizable Machine Learning Framework

Due to recent technological developments, Machine Learning (ML), a subfi...

On the Evaluation of Intelligence Process Automation

Intelligent Process Automation (IPA) is emerging as a sub-field of AI to...