Orienting, Framing, Bridging, Magic, and Counseling: How Data Scientists Navigate the Outer Loop of Client Collaborations in Industry and Academia

by Sean Kross, et al.

Data scientists often collaborate with clients to analyze data to meet a client's needs. What does the end-to-end workflow of a data scientist's collaboration with clients look like throughout the lifetime of a project? To investigate this question, we interviewed ten data scientists (5 female, 4 male, 1 non-binary) in diverse roles across industry and academia. We discovered that they work with clients in a six-stage outer-loop workflow, which involves 1) laying groundwork by building trust before a project begins, 2) orienting to the constraints of the client's environment, 3) collaboratively framing the problem, 4) bridging the gap between data science and domain expertise, 5) the inner loop of technical data analysis work, and 6) counseling to help clients emotionally cope with analysis results. This novel outer-loop workflow contributes to CSCW by expanding the notion of what collaboration means in data science beyond the widely known inner-loop technical workflow stages of acquiring, cleaning, analyzing, modeling, and visualizing data. We conclude by discussing the implications of our findings for data science education, parallels to design work, and unmet needs for tool development.




The Content of Statistics and Data Science Collaborations: the QQQ Framework

For today's applied statisticians and data scientists, collaboration is ...

client2vec: Towards Systematic Baselines for Banking Applications

The workflow of data scientists normally involves potentially inefficien...

PIXLISE-C: Exploring The Data Analysis Needs of NASA Scientists for Mineral Identification

NASA JPL scientists working on the micro x-ray fluorescence (microXRF) s...

Facilitating team-based data science: lessons learned from the DSC-WAV project

While coursework provides undergraduate data science students with some ...

A Static Analysis Framework for Data Science Notebooks

Notebooks provide an interactive environment for programmers to develop ...

Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the Loop

AutoML systems can speed up routine data science work and make machine l...

Towards Human Centered AutoML

Building models from data is an integral part of the majority of data sc...