How do Data Science Workers Collaborate? Roles, Workflows, and Tools

01/18/2020
by Amy X. Zhang, et al.

Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.


