How Data Scientists Work Together With Domain Experts in Scientific Collaborations: To Find The Right Answer Or To Ask The Right Question?

09/08/2019 ∙ by Yaoli Mao, et al. ∙ IBM ∙ Columbia University

In recent years, data scientists and domain experts have increasingly worked together to tackle complex scientific questions. However, such collaborations often face challenges. In this paper, we aim to decipher this collaboration complexity through a semi-structured interview study with 22 interviewees from teams of bio-medical scientists collaborating with data scientists. In the analysis, we adopt the Olsons' four-dimension framework proposed in Distance Matters to code interview transcripts. Our findings suggest that, besides the glitches in the collaboration readiness, technology readiness, and coupling of work dimensions, the tensions that exist in the common ground building process influence the collaboration outcomes and then persist in the actual collaboration process. In contrast to prior works' general account of building a high level of common ground, we find that breakdowns of content common ground, together with the strengthening of process common ground, are more beneficial for scientific discovery. We discuss why that is and what the design suggestions are, and conclude the paper with future directions and limitations.




1. Introduction

Thanks to the advancement of Information Technology and Cloud Computing infrastructure in recent years, a huge amount of data has been generated in the scientific discovery process (CERN, 2018; GenBank, 2019) and is shared more broadly (Bos et al., 2007). For example, the European Organization for Nuclear Research (CERN) generated 70 petabytes of data from particle physics experiments in their Large Hadron Collider (LHC) in 2017 alone, and they distributed and processed the data in laboratories around the world (CERN, 2018). GenBank in the Human Genome Project (HGP) released 212,260,377 sequences of human genome data in February 2019 (GenBank, 2019).

Such huge and complex data collection in scientific projects has gone beyond the analytic capability of a local research team in a single domain of expertise, and calls for new ways of conducting scientific research. The Open Science movement that started in recent years has transformed traditional science research practices to embrace more openness and reproducibility (Woelfle et al., 2011; Peng, 2015). It advocates for transparency and accessibility in knowledge, data, tools, analytic processes, and interdisciplinary collaboration in the scientific discovery process (Vicente-Sáez and Martínez-Fuentes, 2018). Because of their data-centric nature, most open science projects attract data scientists to collaborate with the domain experts. In this paper, we do not make fine-grained distinctions among data workers (Muller et al., 2019); we refer to all these data experts, who often have no prior domain knowledge, as “data scientists”.

Many of these interdisciplinary collaborations have shown promising progress in solving hard scientific problems. For example, Critical Assessment of Structure Prediction (CASP), a biennial competition aimed at predicting the 3D structure of proteins, has attracted tens of thousands of models submitted by approximately 100 research groups worldwide, and its top rank went to a data science research team, Google DeepMind’s AlphaFold (CASP13, 2018). The success of these interdisciplinary collaborations also appeals to Human-Computer Interaction (HCI) researchers, and a few papers have been published in recent years (e.g., on an offline data hackathon for civic issues (Hou and Wang, 2017), or on online data challenges such as in (Carpenter, 2011)).

However, alongside these success stories, there is also turbulence in these collaborations. Even in the case study reporting a successful offline data hackathon event, Hou and Wang (2017) described a tension between the expectations of the NPOs (domain experts) and those of the data volunteers (data scientists), which they called a “dual goal” dilemma. In the more general open science and cyberinfrastructure contexts, tensions and challenges are not uncommon. They have been attributed to the interdisciplinary nature of the team (Velden, 2013), related motivational factors (Spencer Jr et al., 2008) and cultural differences (Birnholtz and Finholt, 2013), the remote and cross-culture team structure (Lawrence, 2006; Luo et al., 2010), the data-centric practice (Rolland and Lee, 2013), or the lack of technology and infrastructure support (Olson et al., 2002).

These tensions are not new in the Computer-Supported Cooperative Work (CSCW) field. In their landmark paper “Distance Matters” (Olson and Olson, 2000), published 20 years ago, Olson and Olson developed a coherent framework for describing whether a collaboration is likely to succeed. It has four dimensions: Common Ground, Coupling of Work, Collaboration Readiness, and Technology Readiness. Though they were primarily looking at remote, not necessarily data-centric, scientific collaborations at that time (which they referred to as collaboratories (Wulf, 1989)), their framework has proven effective in analyzing more general collaborations beyond “remote” settings (Olson et al., 2008; Olson and Olson, 2013, 2016; Jirotka et al., 2013; Olson et al., 2017).

In this paper, we continue this line of research on analyzing interdisciplinary collaborations using the Olsons’ framework. We focus on data science projects and we use the bio-medical scientific domain as a case study. Bio-medical research has been one of the most active fields to embrace the open science movement, because bio-medical projects often curate and integrate many large data sets. Yet the data-centric projects in this domain also face unique challenges, partially because human lives are at stake and mistakes in analyzing data or interpreting results could lead to catastrophic consequences.

We aim to systematically explore the unique challenges that exist in the collaborations between data scientists and bio-medical scientists. Thus, we conducted this semi-structured interview study with 22 data scientists and bio-medical scientists who are involved in various open science collaborations. We have no intention to test the applicability of the Olsons’ framework; rather we use it as an analytic lens to guide our coding of the interview transcripts. Specifically, the research question is: What are the challenges in collaborations between data scientists and domain experts (i.e., bio-medical scientists) in data-centric open science projects, along each of the four dimensions in the Olsons’ framework (Common Ground, Coupling of Work, Technology Readiness and Collaboration Readiness)?

2. Related Work

2.1. The Olsons’ Framework and Remote Scientific Collaborations

Olson and Olson’s framework for remote scientific collaboration (Olson and Olson, 2000) brings together four major concepts that are critical to successful distributed scientific collaborations. The first concept, the coupling of the work (or the nature of work), refers to the structure and organization of the work. Ambiguous and tightly coupled tasks require higher interdependence among collaborators than loosely coupled ones, and should therefore be modularized so that each is handled at a single location. The second concept, common ground (Gray, 1989), refers to how much common knowledge and awareness (Dourish and Bellotti, 1992) collaborators have about the task and about each other. The third concept, collaboration readiness, refers to the extent to which collaborators are motivated and willing to collaborate with each other, trust each other, and align their goals. The fourth concept, technology readiness, concerns difficulties in adopting and adapting supporting technologies to fit collaborators’ current habits of use and infrastructures.

Previous HCI studies have used this framework to examine distributed collaborations and to design features to support those collaborations in various fields. One exemplary study was conducted in the international HIV/AIDS research field (Olson et al., 2002). It investigated two collaborations in South Africa through case studies, and found that their success was constrained by limited collaboration readiness, by imbalanced technology readiness to adopt and learn advanced tools across different geographic locations, and by inadequate bandwidth and unstable network infrastructure.

A more recent study re-examined this framework in globally-distributed software development teams  (Bjørn et al., 2014). They examined four ethnographic cases of international software development using comparative analysis to explore if distance still mattered with the rapid development of collaboration technologies and people’s growing familiarity and experience with these technologies and remote work over the last decade. Their findings highlighted common ground and collaboration readiness as critical factors for data- and programming-intensive collaborations, and also indicated that collaborators in this context had much higher technology readiness, and that they preferred closely coupled work even when working remotely with each other.

In this work, we extend the Olsons’ analysis of interdisciplinary collaboration to a new genre: data-centric open science projects. We use this framework to guide our coding of the interview data, and pay particular attention to aspects that may not match the Olsons’ “best practices of collaboration” suggestions (e.g., that successful teams should have a high level of common ground).

2.2. Teams and Infrastructures in Data-Centric Open Science Projects

Building upon the advanced computing tools and high-speed networks for collaboration and resource sharing in e-science (i.e., supporting collaborative research via electronic networks, also known as e-research, cyberinfrastructure, e-infrastructure, the Grid, etc.) (Jirotka et al., 2013), the open science initiatives advocate for open access to, communication around, and contribution to huge numbers of data sets, analytic tools, work practices and processes (Schroeder, 2007).

In this context, novel forms of teams and ways of collaborating have emerged over recent years, transforming what was merely a base for sharing data and resources into ecosystem-like communities of communication, practices and contributions (Bos et al., 2007). Accordingly, teams may be small or large and highly distributed geographically, and they self-organize across traditional disciplines in the greater research community over time.

One example of the new form of team collaboration is the PRO-ACT database (Atassi et al., 2014), which was initially developed to pool and integrate different data sources (clinical trials, patient records, medical and family history) relating to Amyotrophic Lateral Sclerosis (ALS). In addition to integrating these data, PRO-ACT has launched two public crowdsourcing competitions since 2012 that use its database to promote computational ALS research (ProACT, 2015). According to the official statistics, the 2012 competition attracted 1073 solvers from 64 countries, and the 2015 one drew 288 participants and 80 final submissions by 32 teams from 15 countries within a period of three months. The winning algorithms outperformed methods designed by challenge organizers as well as predictions by ALS clinicians, leading to major research publications (Küffner et al., 2015).

One example of new ways of working is the adoption of Jupyter Notebook (Kluyver et al., 2016). It allows interactive coding, visualization, and the building of code narratives in the same UI (Rule et al., 2018). Many extensions built on top of the Jupyter Notebook system have significantly improved data scientists’ efficiency, such as the Voyager project (Zhang, 2018) for data wrangling tasks (Kandel et al., 2011), which replicates the capabilities of Trifacta (Trifacta, 2019). GitHub (Dabbish et al., 2012) is another popular platform, used for code sharing and version control. It supports various types of user access, so that a user can make a repository public or private. Many data scientists use it to host their code (often in Jupyter Notebook) and manage projects (Rule et al., 2018).

Furthermore, components of machine learning and artificial intelligence have also entered the picture, in collaboration with human experts in the research fields (Gil et al., 2014). Building on open code sharing with common standards, open analytics platforms such as OpenML (Vanschoren et al., 2014) help users to quickly search for relevant analytical methods and reuse previous code in the community. Data analyses can be automatically processed and annotated in dataflow pipelines, from how data is loaded, pre-processed and transformed to how it is analyzed, and can thus promote mutual learning opportunities for human experts (Patterson et al., 2017, 2018). Very recently, DataRobot (datarobot, 2019), Google (Google, 2019), H2O (H2O, 2019), and IBM (Filla, 2018) have each released a new AutoML solution, which aims to automatically complete simple low-level machine learning tasks so that data scientists can save time and focus more on higher-level tasks.

Novel forms of teams and ways of collaborations in the open science context can bring new opportunities and challenges at various steps of the data-centric collaboration process, including retrieving, preparing, and interpreting data (Muller et al., 2019), selecting methods for analysis (Patel et al., 2008), and evaluating correctness of results  (Kandogan et al., 2014). Hou and Wang (Hou and Wang, 2017) studied the data science process in an offline Civic Data Hackathon event. Through observation and interview research methods, they found that the broker theory is applicable to explain the tensions of collaboration between the NPO stakeholders and the data workers. Hill and his colleagues  (Hill et al., 2016) looked at the common collaboration barriers, such as communication challenges, between multiple stakeholders, and they found that non-expert collaborators have to treat the data science process as a black box, due to the lack of timely communication.

However, the above-mentioned studies focus either on only a subset of steps of the data-centric collaboration workflow (e.g., on data sharing (Birnholtz and Bietz, 2003)), or on building a system or feature for a particular data science task (e.g., for data wrangling only (Zhang, 2018)). The one study that tried to provide a systematic account of the whole process did not generalize its findings to different forms of projects (e.g., (Hou and Wang, 2017) only looked at small teams in a data hackathon, and their unit of analysis always consisted of data volunteers working with NPOs). Thus, in this paper, we contribute a comprehensive understanding of the collaborations in data-centric open science projects, covering both small and large teams.

2.3. Interdisciplinary Collaboration Teams

Olson and Olson’s framework for remote collaboration mostly addresses homogeneous teams with similar expertise or experience (i.e. software engineers, or bio-medical HIV/AIDS researchers in the examples above), without a direct focus on heterogeneous teams with diverse experts. In our context, the data-centric open science projects often consist of interdisciplinary teams with distinct expertise and roles, including data scientists as the analytics experts and bio-medical scientists as the domain content experts.

These teams have been a research focus in various research domains including HCI (e.g., (Bietz et al., 2012)) and cognitive science (e.g., (Hutchins, 1995; Gorman, 2002; Derry et al., 2014)). Despite some common understanding shared within the teams, a substantial portion of the domain knowledge and task understanding is distributed among different experts within the teams (Gorman, 2002). Team performance depends on how diverse knowledge is shared and integrated (Derry et al., 2014; Van Knippenberg and Schippers, 2007). What to share and how much to share have always been critical issues, and studies have yielded mixed results. On the one hand, groups should be fully informed of different and unique perspectives in order to discover an optimal solution (and thus the more sharing the better). Stasser and Titus (Stewart and Stasser, 1998) found that in group decision-making, even though each person has unique knowledge, group members tend to discuss already shared information rather than novel, unshared information. This is known as the “shared information bias” (Stasser and Titus, 1985) and often prevents the group from finding an alternative solution, which is usually the ideal or optimal one (Fiedler et al., 2006; Mesmer-Magnus and DeChurch, 2009). On the other hand, comprehensive information sharing incurs pooling, exchange, and integration costs and is inefficient. Gorman (Gorman, 2008) argued that individuals do not need to become fully knowledgeable about each other’s domain of expertise; they only need to share enough of a language to facilitate and evaluate team work.

In this section, we review related theories and research that address sharing and integrating diversity in interdisciplinary teams. We start with the third space theory, which advocates pooling different perspectives in a separate common zone, and move on to the common ground theory, which supports the integration and management of differences.

2.3.1. Third Space and Hybridity

When collaborators from different disciplines work with each other, there is often a “boundary” between the two disciplines or communities. HCI researchers have proposed various theories to explain this phenomenon, and these theories have guided the design of systems that support it. One notable theory that best fits our context is the “third space” that exists “at the boundary of two disciplines” (Thackara, 2000; Muller and Druin, 2010).

Note that this concept is different from the “third place” concept in (Oldenburg and Brissett, 1982). It emerges from Bhabha’s critique of colonialism, where he described that a zone of “hybridity” between two distinct cultures often came into existence spontaneously (Bhabha, 1994). If each distinct culture was a “space,” then the zone of hybridity, combining attributes of each culture, became something new, a “third space” that separated but also mixed those cultures.

Warr (Warr, 2006) extended this notion to interaction between different disciplines, suggesting that the situated nature of each participant’s own world be preserved while a common space is created for resolving differences. Muller and Druin (Muller and Druin, 2010) advocated the deliberate construction of a third space as part of the democratic agenda of participatory design. According to them, a third space is usually not “owned” by anyone, and consequently diverse voices can speak and be heard in such a hybrid environment, where people can compare, negotiate, and integrate goals, perspectives and vocabularies, as well as discuss shared meanings and protocols. In line with this notion, they argued that in addition to building common ground across disciplines, differences should be adequately examined through “the mutual validation of diverse perspectives” and should become mutual learning opportunities (Bødker et al., 1988).

Within HCI, this concept of “hybridity” has been used mostly in the participatory design literature, where users and designers work together across each other’s disciplines to embark on a journey of negotiation, shared construction and collective discovery. We argue that the data scientists and the bio-medical scientists collaborating in our context also construct a third space. As such, we expect that their behavior and their motivation in that space may differ from what they were before stepping into that zone. If so, the various effective techniques already known for studying and supporting collaborations in this space (e.g., spaces and places, narrative structures, games, and prototypes (Muller and Druin, 2010)) may be transferable to our context.

2.3.2. Common Ground: Content and Process

With diverse knowledge, perspectives and roles richly distributed across interdisciplinary teams, common ground is required to bridge these differences, which in turn enables more efficient sharing and communication (Beers et al., 2006). This is especially important for teams of diverse experts collaborating on complex problems such as scientific research.

Common ground originally stems from the concept of grounding in the language and communication literature  (Clark, 1996) and has been extensively discussed in studies of Computer-Mediated Communication (CMC)  (Monk, 2003). It is defined as the sum of mutual, common, or joint knowledge, beliefs, suppositions and protocols shared between people when they are engaged in communications. And it is incrementally built on the history of joint actions between communicators.

In CSCW, where communication becomes part of and instrumental to work activity, common ground is divided into two types of coordination: content and process (Clark et al., 1991). This distinction further delineates the Olsons’ general notion of common ground. Content common ground depends on an abundant shared understanding of the subject and focus of work (know that), while process common ground depends on a shared understanding and a continual updating of the rules, procedures, timing and manner by which the interaction will be conducted (know how).

Convertino and his colleagues studied the development of both types of common ground in an emergency management planning task that involved small teams of diverse experts. Their findings indicated that process common ground increased over time, as shown by decreasing information queries and strategy discussions about how to organize activities, whereas content common ground was created and tested through concept clarification and revision (Convertino et al., 2008). Furthermore, to coordinate multiple roles within teams, they suggest that a multiple-view approach, which differentiates a shared team view from role-specific details, enables teams to filter out detailed differences and construct team strategies, and allows serendipitous learning about knowledge and expertise within the team (Convertino et al., 2005). This lends support to our previous account of the third space in interdisciplinary teams.

In our interdisciplinary teams, there is a natural distinction between content domain expertise (i.e., bio-medical experts) and analytics process expertise (i.e., data scientists) when the two groups come into collaboration with each other. We argue that the delineation of content and process common ground exists in these bio-medical research collaborations. Moreover, what constitutes content and process common ground here may differ from the aforementioned communication and emergency management scenarios, which usually have a better-defined shared purpose and sometimes shared conventions and procedures as well. Additionally, over time, both parties within a team will develop content and process common ground in different ways, and each type will need different support.

3. Methods

3.1. Participants

Participants were recruited through snowball sampling via recruiting emails. Snowball sampling has the major advantage of efficiently locating targeted participants with adequate research expertise, who may be remote. As bio-medical scientists are not common informants in HCI studies, it is hard to find a lot of them locally. We also acknowledge the limitations of snowball sampling, such as selection bias  (Atkinson and Flint, 2001), and we include more discussion in the Limitation section.

In total, 22 informants from 2 large enterprises (12 out of 22) and 10 research institutions (10 out of 22) in the U.S. were interviewed, reporting on a variety of 26 research projects (see Table 1). Among them, 16 identified their major role in the project as data scientist and 6 as bio-medical scientist, and a few of them had a secondary role as a project manager or organizer. We have more data scientists because, as participants reported, in real-world practice one bio-medical scientist often worked with multiple data scientists, or a small domain expert panel consulted with a crowd of data scientists. The informants were quite experienced: they reported on average 5 years of experience working in their domain of expertise (ranging from 3 years to 19 years). The projects they reported also covered a wide range of topics and team structures (from small teams with local and remote collaborators to large crowdsourcing collaborations). More details about informants and projects can be found in Table 1. Throughout this paper, data scientists are denoted as “DS” and bio-medical scientists as “BMS”.

3.2. Semi-structured Interview

Semi-structured interviews were conducted during a 3-month period in the summer of 2017 as the main research method for this study, including 19 face-to-face interviews and 3 remote interviews using Skype audio chat and telephone. All the interviews were recorded and later transcribed into text. We asked the informants why they collaborated with the other domain, what data sets and tools they used, how they analyzed the data, how they communicated with each other, and what outcomes they achieved. In particular, we encouraged them to recall their experience from one recent project, and we followed their storytelling with prompt questions. During the interview, informants were also asked to provide artifacts, such as source links to data sets, team meeting notes, project agendas, working documents, data analysis results and publications, presentation slides, questions and answers in community forums, and so on.

3.3. Data Analysis and Verification

The interview transcripts were first segmented into the four dimensions of the Olsons’ framework (common ground, coupling of work, collaboration readiness, and technology readiness), and segments in the common ground dimension were further assigned to content versus process sub-dimensions, using a deductive coding approach (Crabtree and Miller, 1999). Then, for each dimension, inductive coding (Boyatzis, 1998) was conducted to discover salient themes regarding data, tools, processes and people. Two coders iteratively coded the transcripts and discussed descriptive memos about emerging themes from the data, and then developed axial codes that captured relationships within and across dimensions. New codes were added when necessary until theoretical saturation was reached (Creswell and Creswell, 2017). In the end, the two coders cross-checked and compared their codes. If there was a disagreement, they revisited and discussed the theoretical framework and transcripts, and then decided whether to keep or discard the codes.


4. Findings

Guided by the Olsons’ framework, we organize the findings in the following order: coupling of work, collaboration readiness, technology readiness, and common ground.

4.1. The Coupling of Work

The coupling of work, as introduced in the related work section, is often related to the nature of the project topic. The projects reported by informants cover a wide range of topics from the fundamental scientific research such as exploring the cause of disease with cell or animal experiments, to the translational and applied research that aimed to develop new diagnostics, treatments, and other related applications.

4.1.1. Common Workflow

Despite the variety of project topics, most of the reported projects follow a common high-level workflow. Figure 1 shows an ideal and trouble-free process. The bio-medical scientists collected or curated a data set, asked a research question, and discussed it with the data scientists. Then the bio-medical research question was translated into a data science question, and the data scientists implemented a solution to this DS question in modeling algorithms. There was a final evaluation step in which the data scientists synchronized result interpretation and model evaluation with the bio-medical scientists. This workflow of formulating bio-medical questions, translating them into DS questions, implementing algorithms, and evaluating and sometimes revising the research questions is indivisible and highly iterative (see the Common Ground section for more results).

“We brainstorm together and propose in the slack channel whenever someone has some new idea to test, try different models and quickly ITERATE prototypes in experiments to see if ideas work.” (I3, BMS, P3)

Figure 1. A simplified ideal version of common workflow

4.1.2. Team Structure and Coupling of Work

In terms of organizational structure, all the small-group teams are managed by a researcher in the team, while in large crowdsourcing collaboration projects, a management team of organizers or project managers is responsible for structuring, monitoring, consulting and managing sub-teams throughout the process.

The small teams often work in a closely-coupled work style and the common understanding about how to facilitate closely-coupled work is also applicable here. For example, timely communication and coordination are pointed out as essential for the success of these collaborations.

“… we have a lot of iterations, in deep, frequent conversations…we have weekly video meetings and frequent email checkups.” (I19, DS, P20)

In large-scale crowdsourcing collaborations, the aforementioned management team often helps to divide the bio-medical research question into various sub-questions, so that each of the sub-teams in the collaboration can focus on one clearly specified problem space at a time and can collaborate with other sub-teams in a loosely coupled manner. Additionally, the management team makes efforts to regulate the level of coupling throughout the process in order to clarify questions and engage participants.

“We also track forum questions [from other sub-teams], and provide feedback to clarify if anything [is] unclear about our data or questions…[we have] as well as webinar coaching sessions and expert advisory boards to engage participants [from other sub-teams] in learning.” (I22, DS, P5)

4.2. Collaboration Readiness

Collaboration readiness refers to the collaborators’ willingness and engagement level in a collaboration. Informants were asked why they collaborate, how they start to collaborate, and how their involvement proceeds over the process. From their answers, we can extract and identify the commonality of motivations in each of the two stakeholder groups (BMS, DS). We are also interested in whether their motivation and the level of engagement in the project remain the same while the project proceeds. We leave the findings about the mismatch of the motivations and engagements to the Common Ground section (See Section 4.4.4).

4.2.1. Challenges of Maintaining Motivation

At the beginning, people are all motivated to collaborate, because reciprocal skills and resources serve as “a natural attraction for collaborations” in the data-centric bio-medical projects (I12, DS, P12). However, the motivation and engagement levels of different experts in a team change dynamically over time. Informants in small teams reported that their project tended to become heavily dependent on a few core members to manage the progress and divide the work, which can be very frustrating and can reduce motivation and engagement in continuing the project.

”I sometimes feel others are too much dependent on me [as both project manager and domain expert]…The team can be paralyzed…stagnant without moving forward.” (I4, BMS)

In comparison, sub-teams in large crowdsourcing collaborations do not suffer from heavy management overhead in the short term, thanks to the separate management teams. However, these informants reported challenges in sustaining motivation in the longer term. These projects usually last 3 to 4 months, and for many informants they are a one-time deal. The collaborations rarely develop into a next collaboration, especially if the team's solution did not come out as a winner of the internal competition or lead to a concrete publication as the final credit. The short life span (a few months) of collaborations in these large crowdsourcing projects contrasts sharply with the long life cycle (years or decades) of traditional bio-medical research projects.

”Only the top winners have the opportunity to collaborate on publications after the challenge … It is difficult to navigate to find collaborators in [large-crowd] challenge as we barely know each other.” (I8, DS, P8)

4.2.2. Reward Attribution and Over-Competing with Other Teams

Being the first and finding the best result is in the nature of scientific research and encourages a culture of competition, which was reported by many informants. Sometimes this competition prevents collaborations from scaling up and thus limits innovative scientific discoveries. Researchers in small teams see themselves as competing with other teams, so they do not want to share data, processes, or tools with other teams in the research community.

”We are not comfortable with sharing data or analyses before publication…[even if you share,] your work will not necessarily be acknowledged.” (I10, DS, P10)

In large crowdsourcing collaborations that involve multiple sub-teams, over-competition is also seen as a main factor hindering real scientific discovery. The leaderboard type of evaluation, where each sub-team submits a solution and all the solutions are ranked on a test data set with one metric (e.g., prediction accuracy), is problematic for scientific discovery. It motivates every team to work towards a higher ranking on the leaderboard, instead of focusing on the bio-medical scientific discovery (e.g., whether DS results are meaningful to the current BMS question) or finding new insights from the data outside the given DS question space. After all, scientific discovery is not only about finding incremental improvements as the right answer; it is also about asking the right and sometimes disruptive questions inspired by the data.

”everyone copies and tweaks the best solution a bit to win a little, there is very limited innovation…but full of repetitive solutions.” (I5, BMS)
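The leaderboard evaluation described above can be sketched in a few lines. This is only an illustrative sketch, not the scoring code of any challenge in the study: the team names, labels, and predictions are hypothetical, and accuracy is assumed as the single ranking metric. It shows how a submission that merely tweaks another one can tie with it on the leaderboard while contributing no new question or insight.

```python
from typing import Dict, List, Tuple


def accuracy(predictions: List[int], labels: List[int]) -> float:
    """Fraction of test examples where the prediction matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def rank_leaderboard(
    submissions: Dict[str, List[int]], labels: List[int]
) -> List[Tuple[str, float]]:
    """Rank sub-teams by a single metric (accuracy), highest first."""
    scores = [(team, accuracy(preds, labels)) for team, preds in submissions.items()]
    # Sort by score descending; ties are broken alphabetically by team name.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical held-out labels and three hypothetical sub-team submissions.
test_labels = [1, 0, 1, 1, 0, 0, 1, 0]
submissions = {
    "team_a": [1, 0, 1, 1, 0, 0, 0, 0],  # 7 of 8 correct
    "team_b": [1, 0, 1, 1, 0, 0, 1, 1],  # 7 of 8 correct (a small tweak of team_a)
    "team_c": [0, 0, 1, 0, 0, 1, 1, 0],  # 5 of 8 correct
}
leaderboard = rank_leaderboard(submissions, test_labels)
for team, score in leaderboard:
    print(f"{team}: {score:.3f}")
```

Because the ranking sees only one number per team, it cannot distinguish a novel approach from a near-copy of the current leader, which is the concern the informants raised.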

4.3. Technology Readiness

Informants reported using various technologies in the research process to support both content and process common ground. These technologies can be categorized into Co-Editing systems, Communication systems, Co-Creation systems with version control, Data and code repositories, and Expertise systems (see Table 2). Co-Editing systems include Google Docs, Google Sheets and other online editors, which informants used to plan or moderate project progress and to organize project descriptions or progress summaries. Communication systems such as Slack, email, and Skype are useful for exchanging information quickly and tracking discussion threads. Git version control systems help with organizing the data and code, and they are often integrated with a shared Data or Code repository system. Finally, Expertise systems consist of domain-specific knowledge (e.g., a bio-medical ontology) that the DS collaborators can learn from and query.

The challenges with teams’ technology readiness are intertwined with the collaborators’ backgrounds (being a DS or a BMS) and change over time.

4.3.1. Information Needs and Tool Preferences

Informants in different roles reported different information needs, which resulted in different preferences over technology selections. BMSs have a focus on transparency and interpretability regarding the BMS problem, the data, the general process and the results, whereas DSs prioritize the performance, generalizability and efficiency of the DS model.

”It would be helpful to see a written documentation of pre-processing and if any transformations, any alternative methods considered or compared…These decision points can be seen clearly…lead to a trustworthy result interpretation.” (I4, BMS, P4)

”I would like to search for previous examples with similar data structures more efficiently… I also hope to extend my model built on the asthma data set as a recipe to cancer and other disease domains” (I12, DS, P12)

In addition, the informants’ personal habits and the social norms of their respective backgrounds lead to different tool preferences. When the two backgrounds work together, they tend to look for overlapping tools that both parties can handle. This often results in the team selecting the most familiar tools that all members are comfortable with, rather than trying out more advanced new tools. When asked why commonly used DS technologies, such as Jupyter notebooks and other cloud platforms, were not used in the team, informants explained that ”persuasion cost is high”(I10, DS) and ”training takes time”(I3, BMS). One BMS informant (I5), who also serves as an organizer in a large crowdsourcing project, reported that he once tried to unify the programming tools for all the sub-teams (a particular version of Python and a runtime environment), and that this decision significantly reduced the sub-teams’ engagement and outcomes.

”We specified everyone to use python and provide written documentation using specified format in one challenge… but participation rate was much lower compared to previous challenges…And we never ask to use a unified tool again.” (I5, BMS, P5).

4.3.2. Fragmented Information

Informants struggled with information fragmented across the different systems and tools in a research project, especially in small-group collaborations where no specialized management role tracks and synthesizes information from tools used for different purposes and at different stages. This becomes more difficult when the two types of common ground are sometimes managed with the same tools and at other times with different tools over the course of the process.

”we conduct analysis on local computers using our preferred coding tools and languages, use google docs to summarize project progress internally, present slides to share progress with other stakeholders, shoot quick thoughts to each other in emails or slack messages …”(I9, DS, P9)

4.4. Common Ground

Most informants reported that the major challenge in their collaboration was establishing common ground at the beginning of the project and maintaining it throughout the process.

4.4.1. Formulating the Initial BMS and DS Research Questions

At the very beginning, building common ground in formulating research questions means that BMSs and DSs work together to define a bio-medical domain-specific research question and transform it into a computable DS question.

This is less challenging in small teams than in large crowdsourcing projects because, as mentioned above, collaborators in small teams are quite motivated to work together at the beginning. The two research questions (the BMS one and the DS one) converge. The BMSs believe that they want to find an answer to the BMS question, and the DSs believe that they have translated what the BMSs want into a DS question and that their job is to find an answer to it.

”We are working on classification of disease progressive stages…We understand from our [BMS] collaborators … many rare disease lack proper measurement metrics to [indicate whether it is] cured or improved, and thus we have to see how clusters emerge from the data [before building classification model].” (I15, DS, P16)

In large-crowdsourcing projects, this is more complicated because there is more ground to align. Often an expert panel consisting of organizers, BMSs and DSs is assembled to propose problems that are meaningful and impactful for BMS, as well as feasible and manageable in time for DS. Sometimes this expert panel even needs to conduct dry runs, in which it simulates a team working on the project, to confirm that the question is resolvable within the given period of time. I5 (BMS) and I22 (DS), who have served on such a panel, reported that planning such a large-crowdsourcing collaboration could take months.

”In question formulation, we involve different disciplines to ask proper questions. We consult a pool of experts to ensure the problem is important and feasible as well as clear to operate on.” (I5, BMS)

”when designing a data challenge, we would arrange a dry run internally, with 1 or 2 people proposing and running 2 or 3 algorithms individually, this serves as a baseline for participants” (I22, DS)

4.4.2. New Research Questions Emerge During the Project Process

It may not surprise readers that scientific research questions keep evolving quickly as a project progresses, but it was a surprise and a frustration to some of our informants. Many of them reported starting with one particular question and ending up with ”a set of totally different questions” (I19, DS, P20), or sometimes ”better questions”(I3, BMS, P3). In small-group collaborations, new questions emerge more frequently throughout the project process, while in large-crowd collaborations, new questions often emerge at the end and point to future research directions.

”our question evolves from what is addiction, to a set of very different questions like what is overdose, to what is abuse, to what is dependence? [This] depends on the ground truth we actually have from the data … we later decide to focus on morphine and hypothesize about differences between natural versus herbal ones and synthesized.” (I19, DS, P20)

Sometimes the evolved research question is a better question, and the ”right question”(I3, BMS, P3) to ask when compared to the original one. Thus, finding an answer to the original question is less important.

”we started out to ask what is stress, which context causes stress, how to measure stress … over time we decided to focus on disease-related stress and how to build applications to monitor and design interventions…a much better question…more impactful.”(I3, BMS, P3)

Overall, such evolution of questions breaks the initial common ground and requires dynamically building new common ground. In the earlier stage, the BMSs think they want to find the right answer, and the DSs agree to find the right answer to the initial DS question. Then, as the project unfolds, the BMSs may or may not realize that their true interest has changed from finding the right answer to finding the right question by asking more possible questions. If this change is not clearly expressed, the common ground is broken. A low level of common ground, though to some extent good for allowing scientific discoveries to evolve over time, causes confusion in the team.

In I9 (DS)’s case, the initial problem raised by their BMS colleague was to design and arrange treatment resources for sepsis patients, and this was transformed into a DS problem of predicting patients’ life span before mortality. As the DSs worked on defining mortality and dealing with missing information in the data set, the BMSs came up with more questions regarding the types and progression of disease severity, which allowed them to focus on understanding patients at different stages with various symptoms. As this DS commented,

”We were lost in which model to build and which outcome we should focus on…”(I9, DS, P9)

4.4.3. Obscure Data

Informants stated two causes for such evolution of research questions: Obscure Data and the BMSs’ intention to ”Ask the Right Question”. The open science context provides much easier access to raw data curated and collected by other researchers in the community, but it does not guarantee that the data are easy to understand. Ambiguity, bias and potentially missing information in bio-medical variables are particularly troublesome. Contextual information such as medical practices, clinical trial routines, regulations and direct impacts on patients does not come with the meta-data or protocols but is essential for making sense of the data and asking the right question. This is a critical issue for both small-group and large-crowdsourcing projects.

”I have to check across a lot of sources to clarify the implications and rule out ambiguity and biases, including standard diagnosis codes like ICD-9, pharmacy diagnosis, enrollment insurance types, typical patient demographics specific to the disease.” (I4, BMS, P4)

DSs reported that it is important for BMSs to communicate the ”data structure” to them so that they can understand the features and relationships in the data sets. However, there is no consistent definition or understanding of the data structure, and no common language to communicate and discuss it.

”[Such information] is a hidden knowledge, a sense, and mostly gained from experience and becomes your routine” (I7, DS, P7).

From project to project, data structures appear in different forms, jargon and routines; the term is a composite concept of experiential knowledge that covers:

”data types and distributions like if cross-sectional or longitudinal or matrix and if there’s seasonality or skew; whether there is a clear binary or continuous outcome for analysis or it is high dimensional multivariate data” (I13, DS, P13)

Difficulties in communicating data structures could lead to further challenges in evaluating the methods and interpreting the results, and cause BMSs’ frustration and distrust around this ”big black box”, as quoted from I4 (BMS, P4) and I17 (DS, P18).

4.4.4. Ask The Right Question

What are alternative ways to ask questions? BMS informants often reported their intention to ask the right question by asking more alternative research questions besides the initial one. They are also frustrated that they do not know whether the translated DS question is a good one. DSs are trained to abstract and simplify a realistic problem into an analyzable and computable one; thus BMS problems are most often translated into prediction problems, in which the outcome is well-defined, the model and algorithm are ”mature and well developed”, and the evaluation is standardized by ”a mathematical loss function” (I20, DS, P22&P23).

”In our bio-medical training, alternative hypotheses are important ways to conduct research. I conducted a lot of literature review to understand what has been established and what is the gap in reasoning. However, when translating a bio-medical question into a data science one, I often wonder what are alternatives. The process seems to be very intransparent.”(I4, BMS, P4)

DSs’ prone-to-predict tendency could be explained by the different interests and evaluation criteria that are valued and rewarded in the BMS and DS fields. It adds to the misalignments in the common ground, as DSs are partially instrumental to BMSs. BMSs are mostly interested in results that are meaningful for interpretation and useful for interventions; DSs are driven by developing competitive, innovative and sophisticated methods, such as ones that ”no one has tried before” (I12, DS, P12), that ”beat existing methods in accuracy” (I8, DS, P8), or that use ”complex mathematical models” (I15, DS).

”discovery [instead of prediction] that can be useful to provide actionable insights for high-stake life or death issues … we are always reproducing predictive models with higher predictive capabilities in the field. However, bio-medical problems rarely have a clear outcome to make predictions… we are more interested in what intervention can be done rather than whether a prediction is accurate.” (I4, BMS, P4)

In small project teams, this prone-to-predict tendency seems more severe; while in large-crowdsourcing collaborations, wisdom of the crowd is able to pool diverse perspectives and considerations to look at the same problem.

”At later stage of the data challenge, an ensemble method, which is a linear combination, was applied to aggregate across the winning teams’ individual models, to learn from different focuses and merits in different approaches and for discovering new insight.” (I20, DS, P23)
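The linear-combination ensemble mentioned in this quote can be illustrated with a minimal sketch. The informant did not describe how the weights were chosen or what the models predicted, so the risk scores and weights below are purely hypothetical, and the function name `linear_ensemble` is introduced here only for illustration.

```python
from typing import List, Sequence


def linear_ensemble(
    model_predictions: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model predictions as a weighted sum, one value per sample."""
    if not model_predictions:
        raise ValueError("need at least one model")
    if len(model_predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    n_samples = len(model_predictions[0])
    if any(len(preds) != n_samples for preds in model_predictions):
        raise ValueError("all models must predict the same samples")
    return [
        sum(w * preds[i] for w, preds in zip(weights, model_predictions))
        for i in range(n_samples)
    ]


# Hypothetical predicted risk scores from three winning teams on four patients.
team_scores = [
    [0.9, 0.2, 0.6, 0.1],
    [0.7, 0.4, 0.8, 0.3],
    [0.8, 0.0, 0.4, 0.2],
]
# Assumed weights that sum to 1; in practice they might be fit on validation data.
weights = [0.5, 0.25, 0.25]
combined = linear_ensemble(team_scores, weights)
print(combined)
```

Each combined score blends the emphases of the individual models, which is one simple way a challenge organizer could pool the different approaches of the winning teams.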

5. Discussion

5.1. Successful Collaborations In Scientific Discovery

These reported open science projects are characterized by their team sizes (small or large scale), their distinct and complementary interdisciplinary nature (bio-medical as the content domain and data science as the solution domain), their tight collaboration process (rather than simple resource sharing), and the long-term and transformational nature of scientific discovery. We organize our results according to the Olsons’ four-dimensions framework for successful distributed collaborations in scientific research (coupling of work, common ground, collaboration readiness, and technology readiness), and focus mostly on the common ground dimension as the major challenge. Particularly in the diverse contexts of data-centric open science projects, we take into account the contrasts between small and large teams and the dynamically evolving nature of scientific discovery.

5.1.1. Coupling of Work, Collaboration Readiness, and Technology Readiness

Our findings regarding the coupling of work echo what the Olsons’ framework suggests: the tight coupling within small teams requires timely communication and coordination, and loose coupling is a pre-requisite for successful distributed collaborations, such as the large-scale crowdsourcing projects. However, most of these open science projects were non-divisible and highly iterative, which made it impossible to assign modular work to each location and to set up routines. Similar to the result of a previous study (Bjørn et al., 2014), tight coupling under proper management was not hampered by remote technologies but rather helped to enhance common ground and collaboration readiness.

Our findings suggest that collaboration readiness is challenging within small project teams as well as in sub-teams in large-scale projects. In the Olsons’ original framework, collaboration readiness was seen as how team members were motivated to engage with each other. However, in these reported open science projects, more aspects of organizational structure came into play, including dependence between different kinds of expertise within teams, relationships between teams, and changes in both over time.

Similar to previous research on domain experts collaborating with computer scientists in cyberinfrastructure (Lee et al., 2010) or in civic data hackathons (Hou and Wang, 2017), each party comes in with a different research agenda, which has been analyzed as ”the dual-goal dilemma”. This tension also exists between BMSs and DSs in our study, and manifests itself in common ground as the tension between asking the right question and finding the answer. It is important to carefully weigh both sides’ interests in the organizational structure of the team; otherwise one side will become ”merely” instrumental to the other, as consultants and implementers (Atkins et al., 2003).

We could also learn from existing successful experiences. The introduction of a broker role to serve as the bridge between domain experts and data scientists to translate one stakeholder’s goals to the other proved useful in civic data hackathons (Hou and Wang, 2017) and large-scale collaborations (Wenger, 2010; Paepcke, 1996; Pawlowski and Robey, 2004). Thus, we expect to see a smoother and more successful collaboration if someone in the collaboration can play the broker role.

In terms of technology readiness, informants reported a wide range of tools, from Co-editing systems to communication systems. The reported use practices are consistent with prior literature (e.g., Wang, 2016; Wang et al., 2019) and are thus not listed. At the same time, BMSs and DSs have different information needs and tool preferences, and when they collaborate as a team, they usually choose the tools most familiar to all members (mostly aligning with the BMSs’ comfort level) rather than trying out new advanced tools. This is similar to prior findings on Co-editing technologies (Wang et al., 2017), and a National Science Foundation report warned that if domain experts are weighted too heavily in the organization, procurement of existing technologies will be overemphasized compared to development or adoption of new technologies (Atkins et al., 2003). Furthermore, our informants expressed concerns about managing multiple tools as well as trying out new tools to meet the needs of quickly-evolving common ground. In particular, tool interoperability between team members and across the research process was critical. Compared to project management in general workplaces, managing interdisciplinary research projects can be more difficult due to their ambiguous and ever-evolving nature, and to the lack of awareness of, and resources allocated to, management (Kirkman et al., 2012). Thus, training in project management and new tool adoption may be helpful.

5.1.2. Ever-Evolving Common Ground and Better Scientific Discovery

We found that common ground continued to be a key issue for both small and large-scale project teams in open science. In our findings, a ”third space” (Muller and Druin, 2010) naturally came into being when BMSs and DSs started collaborating. In this shared space, separate from each of their own domains, BMSs and DSs initiated a concrete common ground of what the BMS and DS research questions are, built dialogues and terms around the ”data structure” with hybrid languages and training from their distinct domains, negotiated tools shared by the entire group, and showed promise in constructing new understandings of the initial problem. In particular, the boundaries between BMS and DS in this ”third space” continued to blur, and thus new possibilities for asking questions emerged. Many informants in our study reported unexpected turns in their research projects, starting from one question and ending by answering another, better question or by coming up with more alternative questions.

This echoes Convertino’s previous findings in group emergency management (Convertino et al., 2008). In both cases, process common ground regarding know-how seems to keep increasing through joint activities within the team, while content common ground keeps being re-articulated, broken and revised throughout the process. Unlike in general workplace teams, which are driven efficiently towards clear business goals and specific performance evaluations that match one optimal solution to a well-specified problem (Olson and Olson, 2000), in these scientific teams the differentiation of content and process common ground, and their development and interaction with each other over time, become more salient and critical. In our context of bio-medical research collaboration, the content common ground takes the form of research questions that encapsulate a complicated composite of variables and relationships in ambiguous data sets, together with research goals and the training and nature of bio-medical research to ”Ask the Right Question”, whereas in the emergency management context it consists of new concepts and terms (Convertino et al., 2008). This is consistent with the account that scientific discovery teams operate on the foundation of alternative explanations and different voices, exploring and ruling out many possibilities rather than exploiting a set of existing successful solutions (Sijtsma, 2016).

Moreover, the increasing process common ground is in fact what makes the breaking and updating of content common ground possible. Specifically, the need for a new communication protocol around what is ”the right question” rises over the research process. Further effort is needed to recognize changes in both types of common ground in both the BMS and DS communities. Failing to do so may cause confusion, low productivity, and less ideal scientific discovery. For example, without the support of increasing process common ground, teams could get confused about what the current content common ground is, get ”frozen” with the established content common ground (Kruglanski and Webster, 1996), be ”seized” by shared information bias (Stasser and Titus, 1985), and settle on ”premature consensus” or ”early closure” (Kerr and Tindale, 2004) of less optimal questions or solutions instead of advancing to the next stage of scientific discovery.

In order to examine the validity of this preliminary finding and understand detailed needs of BMS and DS, further research is necessary to devise measurements for both content and process common ground specific to bio-medical research collaboration in the wild compared to in controlled experimental settings (Convertino et al., 2007, 2008).

5.2. Principles for Technology Design

From our findings, the biggest challenge in open science projects seemed to be the quickly-evolving common ground with a purpose to advance scientific discovery by asking the right question, instead of finding answers within a constrained space. It affects the other three dimensions in the Olsons’ framework: coupling of work, collaboration readiness, and technology readiness. It is also related to the theme of integration of heterogeneity from the seven common themes for designing and researching current and future e-Research cyberinfrastructures, articulated by Ribes and Lee in a theoretical summary  (Ribes and Lee, 2010). We refer to the related literature and discuss principles and potential designs to address this issue.

For both small-group and large-crowdsourcing collaborations, asking the right question depends on steadily developing process common ground in terms of conventions and procedures, while constantly re-establishing content common ground through more and better questions as the research focus. Consistent with the ”third space” in interdisciplinary collaborations, a multiple-view approach that differentiates a shared team view from role-specific details has been found to be effective for group tasks (Convertino et al., 2005). In terms of what is to be shared in the common view, we suggest two principles. First, a divergent-to-convergent two-stage path (Paletz and Schunn, 2010) can help structure the tightly coupled communication in the ”third space”. This path starts from pooling and sharing different perspectives to obtain more questions, and proceeds to comparing and evaluating them to obtain better questions. The communication systems reported in Table 2 may be further improved to support and keep track of this divergent-to-convergent model by explicitly enabling users to brainstorm ideas, then summarize them, and later evaluate them. Second, it would be helpful to differentiate the two types of common ground, as they develop differently over time and affect team members and their roles differently. For example, team members could see not only the current status of shared objects, but also the changes in historical states (Greenberg, 1990). This would be similar to how today’s co-editing systems (Table 2) integrate with version control systems (Wang et al., 2015), raising awareness of changes over time in separate views for common content knowledge and process protocols.

5.3. Project Management Guideline

On the other hand, a non-technical solution may complement the technical ones for small teams without a specialized project manager, which face challenges in managing fragmented and repetitive information or in maintaining collaboration readiness over time. These challenges might be due to the lack of awareness, expertise and resources allocated to project management compared to the conduct of the scientific research (Olson and Olson, 2013). Training workshops could help researchers learn about good team leadership, facilitation, and process management. Leveraging existing technology, a shared vocabulary wiki page and data documentation could help the DSs and BMSs keep their understanding and collaboration awareness in sync: what questions the BMSs are interested in right now, and what questions the DSs are working on. Furthermore, specialized project management tools that interoperate with other tools could be developed to address these issues.

5.4. AI as a Partner in the Future of Data-Centric Scientific Discovery

We have seen a gap between the BMSs and DSs in our study in asking questions, translating the BMS question into a correct DS question, and interpreting the DS results. BMSs sometimes distrust the results, and DSs sometimes prioritize methodologies and solutions that might over-simplify the question. More importantly, as shown in our results, BMSs need an iterative loop with many redundant DS attempts in order to be inspired by the data, the models, and the results generated by DSs.

These differences, if not properly shared, communicated and integrated within the group, could become hidden biases that hold back the progress of scientific discovery. The work of Tversky and Kahneman (Tversky and Kahneman, 1974; Kahneman, 2011) argues that people, even scientists and data scientists who analyze data professionally, have trouble thinking statistically and reasoning about data. This contributes to the growing reproducibility crisis of recent years, in which the results of many scientific studies are difficult or impossible to replicate in subsequent investigations (Peng, 2015; Staddon, 2017). It can also have a significant impact on judgments and decisions around data, and can even reverse decisions. It has been a robust phenomenon in the bio-medical field, affecting diagnosis, treatment and lifesaving, and medical resource allocation and management (Kühberger, 1998; Gamliel and Peer, 2010; Armstrong et al., 2002).

In recent years we have seen a fast-growing and extensive research effort on using one special group of machine learning techniques to design another machine learning algorithm (Nargesian et al., 2017; Liu et al., 2019). In particular, AutoML (automated machine learning) refers to a type of technology that requires only minimal effort from users: they upload the data set and specify the target and the DS method type (e.g., regression or binary classification), and the AI then automatically generates new features, selects features, searches alternative models and tunes the models’ parameters to reach an optimal solution (often quantified by an accuracy metric) (Khurana et al., 2016). With these systems, non-data-scientist users like the BMSs in this paper may have the capability to directly build machine learning models for their domain-specific research questions. In a potential future of AI-human collaboration, BMSs and DSs can leverage AutoML systems to quickly generate many ways to ask questions (including predictions and open discoveries) at different stages of the research process, and the machine may make less biased judgments despite the DSs’ or BMSs’ competing interests. AutoML may never fully liberate the human DSs, but we expect it could work as a partner in human DS teams (e.g., as conversational agents illustrated in (Shamekhi et al., 2018)) and help the BMSs in this Right Question formulation process. Certainly it is hard to achieve because, in addition to technical development, many non-technical aspects (e.g., anthropomorphism (Tan et al., 2018)) need to be taken into account. But we choose to work toward this future because it is hard.
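To make the AutoML idea of automatically searching alternative models and tuning their parameters against an accuracy metric more concrete, here is a deliberately tiny sketch. It is not how any real AutoML system works internally: the candidate "models" are assumed to be single-feature threshold rules, the data are hypothetical, and accuracy is measured on the same rows used for the search, whereas a real system would use held-out data.

```python
from typing import List, Sequence, Tuple


def threshold_accuracy(
    values: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Accuracy of the rule 'predict 1 when value >= threshold, else 0'."""
    hits = sum(
        1 for v, y in zip(values, labels) if (1 if v >= threshold else 0) == y
    )
    return hits / len(labels)


def search_best_rule(
    rows: List[List[float]], labels: List[int]
) -> Tuple[int, float, float]:
    """Try every feature and every observed value as a threshold.

    Returns (feature index, threshold, accuracy) of the best rule found.
    """
    best = (-1, 0.0, -1.0)
    n_features = len(rows[0])
    for j in range(n_features):  # feature selection
        column = [row[j] for row in rows]
        for threshold in sorted(set(column)):  # parameter tuning
            acc = threshold_accuracy(column, labels, threshold)
            if acc > best[2]:  # keep the first rule with the highest accuracy
                best = (j, threshold, acc)
    return best


# Hypothetical data set: two numeric features per patient and a binary target.
rows = [
    [5.0, 1.2],
    [3.0, 0.4],
    [4.0, 1.5],
    [6.0, 0.3],
    [2.0, 1.1],
    [7.0, 0.2],
]
labels = [1, 0, 1, 0, 1, 0]
best_feature, best_threshold, best_accuracy = search_best_rule(rows, labels)
print(best_feature, best_threshold, best_accuracy)
```

The user only supplies the data and the target; the search decides which feature and which parameter to use. The same loop also shows the limitation discussed in this paper: it optimizes one predefined metric and cannot by itself propose a different question.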

6. Limitations

One limitation of this study is the snowball sampling method, which might introduce selection bias (Atkinson and Flint, 2001). The informants within the reach of our social network might be more active than average in participating in open science collaborations and might report more positive experiences. Additionally, all our informants are based in the U.S., and thus do not necessarily represent the diverse cultural differences and the wide range of geographical distances in open science collaborations.

The semi-structured interview method is also limited in that it relies on informants’ self-reports, which are subjective, single-sided, and probably over-simplified. To understand the details of the dynamic interaction between experts from different disciplines, it is important to design specific measurements for both content and process common ground, observe contextual interaction within teams in real scenarios, and conduct longitudinal case studies to track their processes along the research pipeline.

Lastly, we picked bio-medical research as our target domain, and it is yet to be studied how these challenges vary across other domains involved in data-centric collaborations in open science, such as physics, geology, and psychology.

7. Conclusion

This work reports the challenges that emerged from scientific collaborations between data scientists and bio-medical scientists, based on interviews with 22 participants. Our study contributes to the existing literature by providing a systematic account of different stakeholders’ practices in scientific collaborations. In particular, we differentiate content common ground from process common ground as a finer-grained view of the common ground concept. We found that scientific collaborations require constantly breaking the content common ground while accumulating process common ground, in contrast to most decision-making or problem-solving scenarios, where a single decision or solution is the final product. Our results shed light on better practices for future interdisciplinary scientific collaborations. The system design suggestions are also actionable for developers and designers who build data analytic tools and cloud sharing platforms.


We thank all the interviewees who shared their research stories and resources. This work was conducted under the auspices of the IBM Science for Social Good initiative.


  • K. Armstrong, J. S. Schwartz, G. Fitzgerald, M. Putt, and P. A. Ubel (2002) Effect of framing as gain versus loss on understanding and hypothetical treatment choices: survival and mortality curves. Medical Decision Making 22 (1), pp. 76–83. Cited by: §5.4.
  • N. Atassi, J. Berry, A. Shui, N. Zach, A. Sherman, E. Sinani, J. Walker, I. Katsovskiy, D. Schoenfeld, and M. Cudkowicz (2014) The PRO-ACT database Design, initial analyses, and predictive features. Neurology 83 (19), pp. 1719–1725. Cited by: §2.2.
  • D. E. Atkins, K. K. Droegemeier, S. I. Feldman, H. Garcia-Molina, M. L. Klein, D. G. Messerschmitt, P. Messina, J. P. Ostriker, and M. H. Wright (2003) Revolutionizing science and engineering through cyberinfrastructure. Report of the National Science Foundation blue-ribbon advisory panel on cyberinfrastructure 1. Cited by: §5.1.1, §5.1.1.
  • R. Atkinson and J. Flint (2001) Accessing hidden and hard-to-reach populations: snowball research strategies. Social research update 33 (1), pp. 1–4. Cited by: §3.1, §6.
  • P. J. Beers, H. P. Boshuizen, P. A. Kirschner, and W. H. Gijselaers (2006) Common ground, complex problems and decision making. Group Decision and Negotiation 15 (6), pp. 529–556. Cited by: §2.3.2.
  • H. Bhabha (1994) The location of culture. London: Routledge. Cited by: §2.3.1.
  • M. J. Bietz, S. Abrams, D. M. Cooper, K. R. Stevens, F. Puga, D. I. Patel, G. M. Olson, and J. S. Olson (2012) Improving the odds through the collaboration success wizard. Translational behavioral medicine 2 (4), pp. 480–486. Cited by: §2.3.
  • J. P. Birnholtz and M. J. Bietz (2003) Data at work: supporting sharing in science and engineering. In Proceedings of the 2003 international ACM SIGGROUP conference on Supporting group work, pp. 339–348. Cited by: §2.2.
  • J. P. Birnholtz and T. A. Finholt (2013) Cultural challenges to leadership in cyberinfrastructure development. Leadership at a distance: research in technologically-supported work, pp. 195. Cited by: §1.
  • P. Bjørn, M. Esbensen, R. E. Jensen, and S. Matthiesen (2014) Does Distance Still Matter? Revisiting the CSCW Fundamentals on Distributed Collaboration. ACM Transactions on Computer-Human Interaction 21 (5), pp. 1–26 (en). External Links: ISSN 10730516, Document Cited by: §2.1, §5.1.1.
  • S. Bødker, P. Ehn, J. Knudsen, M. Kyng, and K. Madsen (1988) Computer support for cooperative design. In Proceedings of the 1988 ACM conference on Computer-supported cooperative work, pp. 377–394. Cited by: §2.3.1.
  • N. Bos, A. Zimmerman, J. Olson, J. Yew, J. Yerkie, E. Dahl, and G. Olson (2007) From shared databases to communities of practice: A taxonomy of collaboratories. Journal of Computer-Mediated Communication 12 (2), pp. 652–672. Cited by: §1, §2.2.
  • R. E. Boyatzis (1998) Transforming qualitative information: thematic analysis and code development. sage. Cited by: §3.3.
  • J. Carpenter (2011) May the best analyst win. American Association for the Advancement of Science. Cited by: §1.
  • CASP13 (2018) 13th community wide experiment on the critical assessment of techniques for protein structure prediction. External Links: Link Cited by: §1.
  • CERN (2018) CERN annual report 2017. External Links: Link Cited by: §1.
  • H. H. Clark, S. E. Brennan, et al. (1991) Grounding in communication. Perspectives on socially shared cognition 13 (1991), pp. 127–149. Cited by: §2.3.2.
  • H. H. Clark (1996) Using language. Cambridge university press. Cited by: §2.3.2.
  • G. Convertino, C. H. Ganoe, W. A. Schafer, B. Yost, and J. M. Carroll (2005) A multiple view approach to support common ground in distributed and synchronous geo-collaboration. In Coordinated and Multiple Views in Exploratory Visualization (CMV’05), pp. 121–132. Cited by: §2.3.2, §5.2.
  • G. Convertino, H. M. Mentis, M. B. Rosson, J. M. Carroll, A. Slavkovic, and C. H. Ganoe (2008) Articulating common ground in cooperative work: content and process. In Proceedings of the SIGCHI conference on human factors in computing systems, pp. 1637–1646. Cited by: §2.3.2, §5.1.2, §5.1.2.
  • G. Convertino, H. M. Mentis, A. Y. Ting, M. B. Rosson, and J. M. Carroll (2007) How does common ground increase?. In Proceedings of the 2007 international ACM conference on Supporting group work, pp. 225–228. Cited by: §5.1.2.
  • B. F. Crabtree and W. L. Miller (1999) Doing qualitative research. sage publications. Cited by: §3.3.
  • J. W. Creswell and J. D. Creswell (2017) Research design: qualitative, quantitative, and mixed methods approaches. Sage publications. Cited by: §3.3.
  • L. Dabbish, C. Stuart, J. Tsay, and J. Herbsleb (2012) Social coding in GitHub: transparency and collaboration in an open software repository. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, pp. 1277–1286. Cited by: §2.2.
  • (2019) Data.gov datasets. External Links: Link Cited by: §1.
  • datarobot (2019) Datarobot. External Links: Link Cited by: §2.2.
  • S. J. Derry, C. D. Schunn, and M. A. Gernsbacher (2014) Interdisciplinary collaboration: an emerging cognitive science. Psychology Press. Cited by: §2.3.
  • P. Dourish and V. Bellotti (1992) Awareness and coordination in shared workspaces.. In CSCW, Vol. 92, pp. 107–114. Cited by: §2.1.
  • K. Fiedler, P. Juslin, et al. (2006) Information sampling and adaptive cognition. Cambridge University Press. Cited by: §2.3.
  • G. Filla (2018) What’s new with watson machine learning?. External Links: Link Cited by: §2.2.
  • E. Gamliel and E. Peer (2010) Attribute framing affects the perceived fairness of health care allocation principles. Judgment and Decision Making 5 (1), pp. 11. Cited by: §5.4.
  • GenBank (2019) GenBank statistics. External Links: Link Cited by: §1.
  • Y. Gil, M. Greaves, J. Hendler, and H. Hirsh (2014) Amplify scientific discovery with artificial intelligence. Science 346 (6206), pp. 171–172. Cited by: §2.2.
  • Google (2019) Cloud automl. External Links: Link Cited by: §2.2.
  • M. E. Gorman (2002) Expanding the trading zones for convergent technologies. Converging Technologies for Improving Human Performance, pp. 424. Cited by: §2.3.
  • M. E. Gorman (2008) Scientific and technological expertise.. Journal of psychology of science and technology. Cited by: §2.3.
  • B. Gray (1989) Collaborating: finding common ground for multiparty problems. Cited by: §2.1.
  • S. Greenberg (1990) Sharing views and interactions with single-user applications. In ACM SIGOIS Bulletin, Vol. 11, pp. 227–237. Cited by: §5.2.
  • H2O (2019) External Links: Link Cited by: §2.2.
  • C. Hill, R. Bellamy, T. Erickson, and M. Burnett (2016) Trials and tribulations of developers of intelligent systems: A field study. In Visual Languages and Human-Centric Computing (VL/HCC), 2016 IEEE Symposium On, pp. 162–170. Cited by: §2.2.
  • Y. Hou and D. Wang (2017) Hacking with npos: collaborative analytics and broker roles in civic data hackathons. Proceedings of the ACM on Human-Computer Interaction 1 (CSCW), pp. 53. Cited by: §1, §1, §2.2, §2.2, §5.1.1, §5.1.1.
  • E. Hutchins (1995) How a cockpit remembers its speeds. Cognitive science 19 (3), pp. 265–288. Cited by: §2.3.
  • M. Jirotka, C. P. Lee, and G. M. Olson (2013) Supporting scientific collaboration: methods, tools and concepts. Computer Supported Cooperative Work (CSCW) 22 (4-6), pp. 667–715. Cited by: §1, §2.2.
  • D. Kahneman (2011) Thinking, fast and slow. Allen Lane, Penguin. Cited by: §5.4.
  • S. Kandel, A. Paepcke, J. Hellerstein, and J. Heer (2011) Wrangler: interactive visual specification of data transformation scripts. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 3363–3372. Cited by: §2.2.
  • E. Kandogan, A. Balakrishnan, E. M. Haber, and J. S. Pierce (2014) From Data to Insight: Work Practices of Analysts in the Enterprise. IEEE Computer Graphics and Applications 34 (5), pp. 42–50 (en). External Links: ISSN 0272-1716, Document Cited by: §2.2.
  • N. L. Kerr and R. S. Tindale (2004) Group performance and decision making. Annu. Rev. Psychol. 55, pp. 623–655. Cited by: §5.1.2.
  • U. Khurana, D. Turaga, H. Samulowitz, and S. Parthasrathy (2016) Cognito: automated feature engineering for supervised learning. In 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), pp. 1304–1307. Cited by: §5.4.
  • B. L. Kirkman, C. B. Gibson, and K. Kim (2012) Across borders and technologies: advancements in virtual teams research. In The Oxford Handbook of Organizational Psychology, Volume 2, Cited by: §5.1.1.
  • T. Kluyver, B. Ragan-Kelley, F. Pérez, B. E. Granger, M. Bussonnier, J. Frederic, K. Kelley, J. B. Hamrick, J. Grout, S. Corlay, et al. (2016) Jupyter notebooks-a publishing format for reproducible computational workflows.. In ELPUB, pp. 87–90. Cited by: §2.2.
  • A. W. Kruglanski and D. Webster (1996) Motivated closing of the mind: its cognitive and social effects. Psychological Review 103 (2), pp. 263–283. Cited by: §5.1.2.
  • R. Küffner, N. Zach, R. Norel, J. Hawe, D. Schoenfeld, L. Wang, G. Li, L. Fang, L. Mackey, O. Hardiman, et al. (2015) Crowdsourced analysis of clinical trial data to predict amyotrophic lateral sclerosis progression. Nature biotechnology 33 (1), pp. 51. Cited by: §2.2.
  • A. Kühberger (1998) The influence of framing on risky decisions: a meta-analysis. Organizational behavior and human decision processes 75 (1), pp. 23–55. Cited by: §5.4.
  • K. A. Lawrence (2006) Walking the Tightrope: The Balancing Acts of a Large e-Research Project. Computer Supported Cooperative Work (CSCW) 15 (4), pp. 385–411 (en). External Links: ISSN 0925-9724, 1573-7551, Document Cited by: §1.
  • C. P. Lee, M. J. Bietz, and A. Thayer (2010) Research-driven stakeholders in cyberinfrastructure use and development. In 2010 International Symposium on Collaborative Technologies and Systems, pp. 163–172. Cited by: §5.1.1.
  • S. Liu, P. Ram, D. Bouneffouf, D. Vijaykeerthy, G. Bramble, H. Samulowitz, D. Wang, A. R. Conn, and A. Gray (2019) A formal method for automl via admm. External Links: 1905.00424 Cited by: §5.4.
  • A. Luo, D. Ng’ambi, and T. Hanss (2010) Towards building a productive, scalable and sustainable collaboration model for open educational resources. In Proceedings of the 16th ACM international conference on Supporting group work, pp. 273–282. Cited by: §1.
  • J. R. Mesmer-Magnus and L. A. DeChurch (2009) Information sharing and team performance: a meta-analysis.. Journal of Applied Psychology 94 (2), pp. 535. Cited by: §2.3.
  • A. Monk (2003) Common ground in electronically mediated communication: clark’s theory of language use. HCI models, theories, and frameworks: Toward a multidisciplinary science, pp. 265–289. Cited by: §2.3.2.
  • M. J. Muller and A. Druin (2010) Participatory design: the third space in HCI. In Human-Computer Interaction: Development Process, J. Jacko and A. Sears (Eds.), Handbook of HCI. Cited by: §2.3.1, §2.3.1, §2.3.1, §5.1.2.
  • M. Muller, I. Lange, D. Wang, D. Piorkowski, J. Tsay, Q. V. Liao, C. Dugan, and T. Erickson (2019) How data science workers work with data: discovery, capture, curation, design, creation. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 126. Cited by: §1, §2.2.
  • F. Nargesian, H. Samulowitz, U. Khurana, E. B. Khalil, and D. S. Turaga (2017) Learning feature engineering for classification.. In IJCAI, pp. 2529–2535. Cited by: §5.4.
  • R. Oldenburg and D. Brissett (1982) The third place. Qualitative sociology 5 (4), pp. 265–284. Cited by: §2.3.1.
  • G. M. Olson and J. Olson (2016) Converging on theory from four sides. Theory development in the Information Sciences. Ed. D. Sonnenwald. Univ. Of Texas, Austin, pp. 87–100. Cited by: §1.
  • G. M. Olson and J. S. Olson (2000) Distance matters. Human–computer interaction 15 (2-3), pp. 139–178. Cited by: §1, §2.1, §5.1.2.
  • G. M. Olson, S. Teasley, M. J. Bietz, and D. L. Cogburn (2002) Collaboratories to support distributed science: the example of international hiv/aids research. In Proceedings of the 2002 annual research conference of the South African institute of computer scientists and information technologists on enablement through technology, pp. 44–51. Cited by: §1, §2.1.
  • G. M. Olson, A. Zimmerman, and N. Bos (2008) Scientific collaboration on the internet. The MIT Press. Cited by: §1.
  • J. S. Olson and G. M. Olson (2013) Working together apart: collaboration over the internet. Synthesis Lectures on Human-Centered Informatics 6 (5), pp. 1–151. Cited by: §1, §5.3.
  • J. S. Olson, D. Wang, G. M. Olson, and J. Zhang (2017) How people write together now: beginning the investigation with advanced undergraduates in a project course. ACM Transactions on Computer-Human Interaction (TOCHI) 24 (1), pp. 4. Cited by: §1.
  • A. Paepcke (1996) Information needs in technical work settings and their implications for the design of computer tools. Computer Supported Cooperative Work (CSCW) 5 (1), pp. 63–92. Cited by: §5.1.1.
  • S. B. Paletz and C. D. Schunn (2010) A social-cognitive framework of multidisciplinary team innovation. Topics in Cognitive Science 2 (1), pp. 73–95. Cited by: §5.2.
  • K. Patel, J. Fogarty, J. A. Landay, and B. L. Harrison (2008) Examining Difficulties Software Developers Encounter in the Adoption of Statistical Machine Learning.. In AAAI, pp. 1563–1566. Cited by: §2.2.
  • E. Patterson, I. Baldini, A. Mojsilovic, and K. R. Varshney (2018) Semantic representation of data science programs.. In IJCAI, pp. 5847–5849. Cited by: §2.2.
  • E. Patterson, R. McBurney, H. Schmidt, I. Baldini, A. Mojsilović, and K. R. Varshney (2017) Dataflow representation of data analyses: toward a platform for collaborative data science. IBM Journal of Research and Development 61 (6), pp. 9–1. Cited by: §2.2.
  • S. D. Pawlowski and D. Robey (2004) Bridging user organizations: knowledge brokering and the work of information technology professionals. MIS quarterly, pp. 645–672. Cited by: §5.1.1.
  • R. Peng (2015) The reproducibility crisis in science: a statistical counterattack. Significance 12 (3), pp. 30–32. Cited by: §1, §5.4.
  • ProACT (2015) The dream phil bowen als prediction prize4life challenge, the dream als stratification prize4life challenge. External Links: Link Cited by: §2.2.
  • D. Ribes and C. P. Lee (2010) Sociotechnical studies of cyberinfrastructure and e-research: current themes and future trajectories. Computer Supported Cooperative Work (CSCW) 19 (3-4), pp. 231–244. Cited by: §5.2.
  • B. Rolland and C. P. Lee (2013) Beyond trust and reliability: reusing data in collaborative cancer epidemiology research. In Proceedings of the 2013 conference on Computer supported cooperative work, pp. 435–444. Cited by: §1.
  • A. Rule, A. Tabard, and J. D. Hollan (2018) Exploration and explanation in computational notebooks. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 32. Cited by: §2.2.
  • R. Schroeder (2007) E-research infrastructures and open science: towards a new system of knowledge production?. Prometheus 25 (1), pp. 1–17. Cited by: §2.2.
  • A. Shamekhi, Q. V. Liao, D. Wang, R. K. Bellamy, and T. Erickson (2018) Face value? exploring the effects of embodiment for a group facilitation agent. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 391. Cited by: §5.4.
  • K. Sijtsma (2016) Playing with data or how to discourage questionable research practices and stimulate researchers to do things right. Psychometrika 81 (1), pp. 1–15. Cited by: §5.1.2.
  • B. Spencer Jr, R. Butler, K. Ricker, D. Marcusiu, T. A. Finholt, I. Foster, C. Kesselman, and J. P. Birnholtz (2008) NEESgrid: lessons learned for future cyberinfrastructure development. Scientific Collaboration on the Internet, pp. 331. Cited by: §1.
  • J. Staddon (2017) Scientific method: how science works, fails to work, and pretends to work. Routledge. Cited by: §5.4.
  • G. Stasser and W. Titus (1985) Pooling of unshared information in group decision making: biased information sampling during discussion.. Journal of personality and social psychology 48 (6), pp. 1467. Cited by: §2.3, §5.1.2.
  • D. D. Stewart and G. Stasser (1998) The sampling of critical, unshared information in decision-making groups: the role of an informed minority. European Journal of Social Psychology 28 (1), pp. 95–113. Cited by: §2.3.
  • H. Tan, D. Wang, and S. Sabanovic (2018) Projecting life onto robots: the effects of cultural factors and design type on multi-level evaluations of robot anthropomorphism. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 129–136. Cited by: §5.4.
  • J. Thackara (2000) Edge effects: the design challenge of the pervasive interface. In CHI’00 Extended Abstracts on Human Factors in Computing Systems, pp. 199–200. Cited by: §2.3.1.
  • Trifacta (2019) Trifacta. External Links: Link Cited by: §2.2.
  • A. Tversky and D. Kahneman (1974) Judgment under uncertainty: heuristics and biases. Science 185 (4157), pp. 1124–1131. Cited by: §5.4.
  • D. Van Knippenberg and M. C. Schippers (2007) Work group diversity. Annual review of psychology 58. Cited by: §2.3.
  • J. Vanschoren, J. N. Van Rijn, B. Bischl, and L. Torgo (2014) OpenML: networked science in machine learning. ACM SIGKDD Explorations Newsletter 15 (2), pp. 49–60. Cited by: §2.2.
  • T. Velden (2013) Explaining field differences in openness and sharing in scientific communities. In Proceedings of the 2013 Conference on Computer Supported Cooperative Work, pp. 445–458. Cited by: §1.
  • R. Vicente-Sáez and C. Martínez-Fuentes (2018) Open science now: a systematic literature review for an integrated definition. Journal of business research 88, pp. 428–436. Cited by: §1.
  • D. Wang, J. S. Olson, J. Zhang, T. Nguyen, and G. M. Olson (2015) DocuViz: visualizing collaborative writing. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 1865–1874. Cited by: §5.2.
  • D. Wang, H. Tan, and T. Lu (2017) Why users do not want to write together when they are writing together: users’ rationales for today’s collaborative writing practices. Proceedings of the ACM on Human-Computer Interaction 1 (CSCW), pp. 107. Cited by: §5.1.1.
  • D. Wang, H. Wang, M. Yu, Z. Ashktorab, and M. Tan (2019) Slack channels ecology in enterprises: how employees collaborate through group chat. arXiv preprint arXiv:1906.01756. Cited by: §5.1.1.
  • D. Wang (2016) How people write together now: exploring and supporting today’s computer-supported collaborative writing. In Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work and Social Computing Companion, pp. 175–179. Cited by: §5.1.1.
  • A. Warr (2006) Situated and distributed design. In NordiCHI Workshop on Distributed Participatory Design, Oslo, Norway, Cited by: §2.3.1.
  • E. Wenger (2010) Communities of practice and social learning systems: the career of a concept. In Social learning systems and communities of practice, pp. 179–198. Cited by: §5.1.1.
  • M. Woelfle, P. Olliaro, and M. H. Todd (2011) Open science is a research accelerator. Nature Chemistry 3 (10), pp. 745. Cited by: §1.
  • W. A. Wulf (1989) The national collaboratory-a white paper. Cited by: §1.
  • J. Zhang (2018) JupyterLab_Voyager: a data visualization enhancement in JupyterLab. Cited by: §2.2, §2.2.