Slack Channels Ecology in Enterprises: How Employees Collaborate Through Group Chat

06/04/2019
by   Dakuo Wang, et al.
ibm

Despite the long history of studying instant messaging usage in organizations, we know very little about how today's people participate in group chat channels and interact with others. In this short note, we aim to update the existing knowledge on how group chat is used in the context of today's organizations. We have the privilege of collecting a total of 4300 publicly available group chat channels in Slack from an R&D department in a multinational IT company. Through qualitative coding of 100 channels, we identified 9 channel categories such as project based channels and event channels. We further defined a feature metric with 21 features to depict the group communication style for these group chat channels, with which we successfully trained a machine learning model that can automatically classify a given group channel into one of the 9 categories. In addition, we illustrated how these communication metrics could be used for analyzing teams' collaboration activities. We focused on 117 project teams as we have their performance data, and further collected 54 out of the 117 teams' Slack group data and generated the communication style metrics for each of them. With these data, we are able to build a regression model to reveal the relationship between these group communication styles and one indicator of the project team performance.

1. Introduction and Background

Instant messaging (IM) systems emerged in the 1980s and have slowly been adopted by individuals and organizations over the past few decades. After years of research and development effort in CSCW (e.g., (Bradner et al., 1999; Erickson et al., 1999; Nardi et al., 2000; Dourish and Bellotti, 1992; Olson and Olson, 2000; McDaniel et al., 1996)), modern IMs (e.g., Slack and Skype) are much more advanced. More and more companies and organizations take it for granted that their workers exchange information and files efficiently through these IMs. Slack is one such tool that has been widely adopted in offices and workplaces. Compared to the previous generation of workplace IMs (e.g., Lotus Sametime), it places more emphasis on the group chat feature and provides a much better user experience for multi-party chats. It is estimated that Slack reduces emails by 32% and meetings by 23%, and that it helps new employees reach full productivity 25% sooner (Slack, 2018).

In parallel with the recent advancement of IM technologies, text-based AI programs that reside in IMs (i.e., chatbots) have attracted much interest from researchers and businesses. Chatbots are often built on a language model that simulates how one human chats with another. Many HCI and natural language processing (NLP) researchers have invested considerable effort in building better chatbots, but have often failed to do so (Boutin and Boutin, 2017), partially because many such efforts have been based on a misunderstanding of how people chat with each other and of their communication styles (e.g., (Jiang et al., 2018; Aoki et al., 2006) assumed that a regular conversation thread has more than 10 turns and fewer than 100 turns). Thus, there is an urgent need to update the existing understanding of how people chat in groups at a finer-grained level, such as how many conversation threads on different topics occur in a group chat channel and how deep each thread is.

Another dimension of fine-grained understanding is the group chat ecosystem. AI researchers often focus on only a single context of group chat. For example, the best-known human-human group chat datasets used by NLP researchers are Ubuntu-IRC (Elsner and Charniak, 2010) and the Twitch dataset (Hamilton et al., 2014), which focus on IT development and gaming respectively and thus have limited generalizability. Another group of research relies on synthetic datasets, often generated from Reddit (Wang et al., 2019) or StackOverflow, but synthetic datasets built from online forums are not guaranteed to match the communication styles found in real IM datasets.

To address these challenges, we focus on a total of 4,300 group chat channels created and used by 8,000 employees in an R&D division of a large IT company. We collect both the metadata and the raw messages of these channels, spanning from March 2016 (when Slack was first introduced) to March 2019 (when this paper was written). To better understand the group chat ecology, we first randomly selected 100 group channels and manually coded them into 9 different categories. Among these categories, we dive into one specific kind of channel, Project-based groups, and cross-reference the project team performance dataset to investigate how communication styles in a group's Slack channel may predict the project team's performance. The rest of the paper is organized as follows. We first review the history of IM research in CSCW, focusing particularly on recent HCI and NLP research on modeling human conversation patterns or building multi-turn chatbots in groups. After presenting the research methods and datasets, we split the results section into two subsections, one reporting the communication styles of the 9 manually coded categories and the other reporting the influence of communication style on team performance. We conclude the paper with a discussion and limitations.

2. Related Work

2.1. Instant Messaging Systems and Research in CSCW

There are decades of CSCW literature on IM system design and user studies, and we do not have space to cover all of it. We only want to point out one particular genre of IM research, which concerns multi-party group chat. Back in 1994, McDaniel et al. compared how people chat with each other in a Face-to-Face (FTF) setting versus in text-based Computer-Mediated Communication (CMC) systems, the early name for IM and videoconferencing systems (McDaniel et al., 1996). They found that a group of people chat on multiple concurrent threads in a conversation (2 to 3 for FTF and 4 to 6 for CMC) and that the threads have different timespans (2.8 mins for FTF and 23.3 mins for CMC). These results are intriguing, but the data corpus and the analysis method were quite primitive: they analyzed 6 chatting groups, labeled the timestamp of each message, put the messages on the same timeline, and manually counted the threads and their sizes.

Another notable piece of research on early IM usage is from Bonnie Nardi and Steve Whittaker (Nardi et al., 2000) in 2000. They focused on the early adopters of an IM system in a workplace setting and reported how those users used, or would like to use, IMs. For example, they reported that users rely on IMs not only for work-related question-and-answer activities ("Interaction"), but also for informal purposes such as checking whether someone is available for a chat. They also reported that those early IM systems were designed around a dyadic "call" model, in which users treat the system like a phone to find another individual to chat with, and that users preferred not to use "chat room" features.

Together with other research efforts, the design implications that stemmed from these findings have guided the following two decades of IM system design, e.g., workplace IMs that have a status feature but lack a group chat feature. Now that new tools exist and people are using them, it is time to revisit this research topic after 20 years.

2.2. Communication Patterns Research in HCI and in NLP

In addition to the HCI research effort to understand human chat behavior, NLP researchers have also investigated techniques to automatically analyze text-based human conversations. However, most of these NLP studies focus on one-on-one conversation scenarios (i.e., conversation between a chatbot and a human user), in which traditional information retrieval methods, syntactic/semantic parsing techniques, and neural sequence-to-sequence generation models are integrated into the chatbot (Higashinaka et al., 2014; Yan et al., 2016; Zhou et al., 2018). In the domain of analyzing group chat conversations, there are a few works on disentangling interleaved conversations into threads that each discuss a single topic (Elsner and Charniak, 2010; Mayfield et al., 2012; Jiang et al., 2018) and on extracting knowledge from conversational dialogues (Hixon et al., 2015). These works are limited in that they neglect the richness of conversation patterns across different conversational categories. For example, the work of (Jiang et al., 2018) was conducted on particular interest-based Reddit forums, and the work of (Hixon et al., 2015) focused only on education-related topics; thus, they have limited generalizability to other domains or contexts.

This work focuses on workplace group chat. We believe that better categorization of conversation groups and more careful analysis of the characteristics of each category are necessary for follow-up NLP tasks and CSCW system building tasks.

2.3. Communication Styles Affecting Team Performance

Another implication of extracting communication styles for groups is that we may be able to use them to predict other team-oriented behaviors or even the performance of the team. There has been extensive research on this topic. For example, Zhang et al. applied topic model analysis to chat messages to investigate the evolution of team dynamics over a long-term project (Zhang et al., 2018). They describe common behaviors and team cohesion dynamics. It is well known in the CSCW community that people who work in teams may not put in as much effort as they would if working individually (i.e., "Slackers") (Karau and Williams, 1993). Furthermore, coordinating individuals' contributions through communication is challenging and critical (Diehl and Stroebe, 1987), and many CSCW systems have been proposed to address some of these challenges (Kraut, 2003). In the current work, we follow the trend of adopting machine learning methods to extract group behaviors and then predict group performance.

3. Methods

In this section, we describe our datasets, the open coding analysis we used to identify the 9 categories of group chat channels, the methods we used to pre-process the datasets and extract the feature metrics, and the regression method we used to analyze the relationship between the feature metrics and team performance in a subset of 54 project teams.

3.1. Datasets

In total, two datasets are used in this study: a Slack message dataset with 4300 public channels from an R&D department in a company, and a dataset of 117 project teams with team composition and performance information.

From the 4300-channel dataset, we randomly selected 100 channels as a subset for manual coding analysis. For the 117-team performance dataset, we cross-referenced the 4300 channels and identified that 54 project teams have a designated and publicly accessible group chat channel. We therefore crawled the messages of these 54 teams and prepared a sub-dataset containing both the project performance and the Slack group chat channel messages of each team.

Here, we provide a few more details about the project team performance dataset. In this R&D department of the multinational IT company, a re-organization occurred in November 2017 and 117 project teams (not the same as organizational teams) were formed. For a period of a few months after November 2017, these teams were all encouraged to submit papers to an academic conference as their primary goal, and 146 submissions were generated by the conference submission deadline.

Leveraging an internal project management portal, we collected information about each project team, such as the project description and team member information. We use whether a project team generated a paper submission as the final outcome to reflect its performance. If a team has one or more submissions to the conference, we set the outcome variable to 1, and otherwise to 0.

We acknowledge that this way of describing team performance has many limitations, and we elaborate on them in the limitations section at the end of the paper. In addition, each project team is required to have a Slack group chat channel, but many of those channels and code repositories are private to the team members for confidentiality reasons. In the end, we were able to collect only 54 Slack channels for the 117 teams, and this serves as the second dataset.

3.2. Manual Coding Slack Group Chat Channels to Identify Categories

We observed that group chat channels can vary tremendously in the number of members and in other characteristics. Thus, we decided to first conduct a qualitative analysis to identify the different types of Slack groups. Since it is difficult to code all 4,300 channels, we randomly selected 100 channels for manual labeling. We were prepared to code more channels if new categories kept emerging, but the resulting 9 categories suggested that our coding had reached saturation, so we stopped. In particular, two authors of this paper independently conducted thematic content analysis of each of the 100 Slack channels by reading the content of the channel and various metadata such as the channel description and the number of members. The two authors independently coded each Slack group and took notes on their reasoning. They then discussed their notes and coding schemas (without revealing the code assigned to each channel) and finalized a list of 9 categories (see Table 1). Finally, the two authors re-coded the 100 Slack channels with the agreed code list, reaching a Cohen's kappa of approximately 0.83.

| Channel Category | N | Description |
| --- | --- | --- |
| Project | 32 | This category of channels consists of discussions around projects |
| Social Group | 10 | This category involves discussions on non-work-related social activities |
| IT Support | 8 | This category of channels is often used as a help desk for internal systems or tools |
| Employee Support | 3 | Channels used as a help desk for answering HR or other logistics-related questions |
| Tech Enthusiasts | 10 | Often consists of a group of members discussing a new technology that is not necessarily related to their work project |
| Event | 9 | Scheduling, planning, and discussion around a temporary event at work |
| Bot | 8 | A type of channel where most of the messages are from a Slack bot, often used to monitor a particular system's maintenance log |
| Test | 18 | Users mistakenly created a group channel or tested the channel creation function; often no message is posted |
| Announcement | 2 | A channel for department-level or even organization-level announcements |

Table 1. The 9 Categories of 100 Randomly Selected Slack Group Chat Channels Identified by Manual Coding, Each with a Short Description
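
The inter-coder agreement reported in Section 3.2 is a Cohen's kappa, which compares the observed agreement between two coders with the agreement expected by chance. The following is a minimal sketch of that computation; the function name and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both coders chose the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two coders' marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy labels (hypothetical, not the study's data).
coder_1 = ["Project", "Project", "Test", "Bot", "Event", "Project", "Test", "Social Group"]
coder_2 = ["Project", "Project", "Test", "Bot", "Project", "Project", "Test", "Social Group"]
kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which is why it is a stricter measure than the raw percentage of matching codes.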

3.3. Feature Metrics to Represent Communication Styles of a Slack Group Chat Channel

As shown in Table 2, we generated 21 features to represent a group chat channel.

Many of the features are self-explanatory, so here we elaborate only on the less obvious ones.

#current_members is the number of people who have joined the channel, and #active_members is the number of people with at least one message in the channel. The number of active members can be larger than the number of current members because people may leave the channel after posting messages.

active_timespan captures the number of days from the day the first message was posted in the channel to the day the last message was posted. word_per_message is the average number of words per message. Because a single user sometimes dominates a whole channel, we also measure the maximum number of messages posted by one user (#message_top_user).

A very common interactive function in Slack is to notify a specific user with @username or the entire group with @channel or @here. We thus include the number of messages of each of these three types as #at_messages, #channel_messages, and #here_messages. A more complicated interaction is that many people explicitly form a "thread" in a Slack channel for a small discussion on one topic; the feature #threads represents the number of threads within the channel. We also use max_turn_thread and avg_turn_thread to capture the maximum and average number of turns in a thread. Another commonly used function in Slack is to "react" to a specific message with a simple emoji. We use the features max_count_reaction and avg_count_reaction for the maximum and average number of reactions on a message.

#emoji_messages captures the number of messages that contain emojis. #pinned_messages represents the number of messages that users "pinned" in the channel because they considered them important. Since people may share code snippets in a channel, we introduce #code_messages for the number of messages with code snippets.

We also use #url_messages and #git_messages to represent the number of messages that contain a URL and the number of messages that contain a URL pointing to a Github page. People can also easily share files within a channel, and we capture this behavior with #file_messages, the number of messages that share a file. In Slack, the owner of a channel can introduce a "Slack bot" that interacts with people in different ways, such as a general question-answering bot or an alert notification bot. We count the number of messages generated by Slack bots as #bot_messages.

| Feature | Definition | Project (N=32) | Social Group (N=10) | IT Support (N=8) |
| --- | --- | --- | --- | --- |
| #current_members | #current members in the channel | 11.1 | 84.0 | 177.0 |
| #active_members | #members with at least one message | 14.4 | 92.7 | 139.9 |
| active_timespan | #days between first message and last message | 281.3 | 478.6 | 506.5 |
| #messages | #messages in total in this Slack channel | 355.9 | 3059.0 | 4537.4 |
| word_per_message | avg length of message in #words | 16.4 | 12.8 | 23.0 |
| #message_top_user | the max #messages posted by one user | 90.2 | 554.9 | 1819.9 |
| #at_messages | #messages with @user mentions | 79.1 | 337.0 | 953.3 |
| #channel_messages | #messages with @channel broadcast mention | 0.0 | 0.1 | 0.1 |
| #here_messages | #messages with @here broadcast mention | 0.1 | 8.1 | 1.6 |
| #threads | #threads in the channel | 15.3 | 152.0 | 185.8 |
| max_turn_thread | max #messages in a thread | 6.7 | 28.6 | 34.9 |
| avg_turn_thread | avg #messages in a thread | 2.0 | 2.18 | 4.1 |
| #reacted_messages | #messages with reaction emoji | 19.5 | 549.0 | 155.0 |
| avg_count_reaction | avg #reactions in a reacted message | 0.9 | 2.2 | 1.5 |
| #emoji_messages | #messages with emoji | 11.9 | 284.7 | 102.4 |
| #pinned_messages | #pinned messages | 0.44 | 1.9 | 5.5 |
| #code_messages | #messages with code segments | 6.5 | 1.4 | 81.8 |
| #url_messages | #messages with URL | 34.2 | 188.0 | 329.5 |
| #git_messages | #messages with Github URL | 8.7 | 0.7 | 59.6 |
| #file_messages | #messages with files | 60.4 | 180.5 | 58.9 |
| #bot_messages | #messages sent by Slack chatbot | 52.1 | 64.6 | 1396.5 |

Table 2. List of 21 Features Depicting Slack Group Chat Channels and Exemplar Metrics of Three Categories
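
To make the feature definitions in Table 2 concrete, the sketch below computes a subset of them from a list of messages. The message schema (keys `user`, `text`, `ts`, `thread_ts`, `is_bot`, `reactions`) and the plain-text patterns for mentions, URLs, and code are simplifying assumptions made for illustration; a real Slack export encodes some of these differently, and this is not the authors' extraction pipeline.

```python
import re
from collections import Counter

# Simplified patterns (assumptions): plain "@name" mentions, http(s) URLs,
# and code delimited by backticks.
MENTION_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")
CODE_RE = re.compile(r"`[^`]+`")


def channel_features(messages):
    """Compute a subset of the Table 2 features from a list of message dicts.

    Hypothetical schema: {"user": str, "text": str, "ts": seconds as float,
    "thread_ts": float or missing, "is_bot": bool, "reactions": int}.
    """
    if not messages:
        return {}
    per_user = Counter(m["user"] for m in messages)
    timestamps = [m["ts"] for m in messages]
    # Messages sharing a thread_ts are counted as turns of the same thread.
    threads = Counter(m["thread_ts"] for m in messages if m.get("thread_ts") is not None)
    reacted = [m["reactions"] for m in messages if m.get("reactions", 0) > 0]
    mentions = [set(MENTION_RE.findall(m["text"])) for m in messages]
    urls = [URL_RE.findall(m["text"]) for m in messages]
    return {
        "#active_members": len(per_user),
        "active_timespan": (max(timestamps) - min(timestamps)) / 86400.0,
        "#messages": len(messages),
        "word_per_message": sum(len(m["text"].split()) for m in messages) / len(messages),
        "#message_top_user": max(per_user.values()),
        "#at_messages": sum(1 for s in mentions if s - {"here", "channel"}),
        "#channel_messages": sum(1 for s in mentions if "channel" in s),
        "#here_messages": sum(1 for s in mentions if "here" in s),
        "#threads": len(threads),
        "max_turn_thread": max(threads.values(), default=0),
        "avg_turn_thread": sum(threads.values()) / len(threads) if threads else 0.0,
        "#reacted_messages": len(reacted),
        "avg_count_reaction": sum(reacted) / len(reacted) if reacted else 0.0,
        "#code_messages": sum(1 for m in messages if CODE_RE.search(m["text"])),
        "#url_messages": sum(1 for u in urls if u),
        "#git_messages": sum(1 for u in urls if any("github.com" in x for x in u)),
        "#bot_messages": sum(1 for m in messages if m.get("is_bot", False)),
    }


# Toy channel (hypothetical messages).
demo = [
    {"user": "alice", "text": "@bob can you review https://github.com/example/repo/pull/1 ?",
     "ts": 0.0, "thread_ts": 0.0, "reactions": 2},
    {"user": "bob", "text": "Sure, running `make test` now", "ts": 3600.0, "thread_ts": 0.0},
    {"user": "alice", "text": "@here standup moved to 10am", "ts": 172800.0},
]
features = channel_features(demo)
print(features["#messages"], features["#threads"], features["active_timespan"])
```

The mention pattern is deliberately naive (it would also match the domain part of an email address), so a production version would need stricter parsing.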

3.4. Machine Learning Model to Automatically Identify Categories

In order to verify the effectiveness of the feature representation for each group, we conduct an automatic Slack group classification task using all features. After extracting the feature vector for each labeled Slack channel, we build an ensemble tree-based classifier from (Geurts et al., 2006) to predict the category of each channel. The rationale for choosing this model is twofold: first, given our features, it is straightforward for a decision-tree-based algorithm to learn good rules that are non-linear combinations of features; second, the number of coded Slack groups is limited, and an ensemble-based approach can help prevent over-fitting.

For evaluation, we focus on the overall classification accuracy as well as the precision and recall for each label. Because the number of coded channels is limited, we follow the leave-one-out cross-validation method from (Kearns and Ron, 1999), which is widely used for model evaluation on small datasets, to measure the overall classification accuracy.
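
Leave-one-out cross-validation holds out each labeled channel once, trains on the remaining channels, and averages the resulting hits. The sketch below illustrates only the evaluation loop. To stay dependency-free it swaps in a 1-nearest-neighbor stand-in instead of the tree ensemble used in the paper, and the function names and toy vectors are assumptions.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier: label of the closest training vector (Euclidean)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def leave_one_out_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Hold out each channel once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += predict(train_x, train_y, features[i]) == labels[i]
    return hits / len(features)


# Toy feature vectors (hypothetical): [#messages, #current_members]
toy_x = [[350.0, 11.0], [400.0, 12.0], [300.0, 9.0],
         [3000.0, 80.0], [3200.0, 90.0], [2900.0, 85.0]]
toy_y = ["Project", "Project", "Project",
         "Social Group", "Social Group", "Social Group"]
loo_accuracy = leave_one_out_accuracy(toy_x, toy_y)
print(loo_accuracy)
```

Any classifier with the same `(train_x, train_y, query)` interface can be passed as `predict`, so an ensemble of randomized trees could be plugged into the same loop.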

It is worth mentioning that the machine learning algorithm works better when we feed in more features. Thus, for each count-based feature, we also compute three normalized features by dividing it by #messages, #active_members, and active_timespan, respectively. In total, we end up with 60 features representing each Slack group in the machine learning model.
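
The normalization step can be sketched as follows. The subset of count features, the naming scheme for the derived features, and the zero-denominator guard are assumptions made for illustration; the input values are the Project-category averages from Table 2, used only as example input.

```python
# Illustrative subset of count-based features (assumption, not the full list).
COUNT_FEATURES = ["#at_messages", "#threads", "#emoji_messages", "#code_messages"]
# The three denominators named in the text.
DENOMINATORS = ["#messages", "#active_members", "active_timespan"]


def add_normalized_features(channel):
    """Return a copy of `channel` with each count divided by each denominator."""
    out = dict(channel)
    for name in COUNT_FEATURES:
        for denom in DENOMINATORS:
            value = channel[denom]
            # Guard against empty channels (e.g., the Test category), whose
            # denominators can be zero.
            out[f"{name}/{denom}"] = channel[name] / value if value else 0.0
    return out


# Project-category averages from Table 2, used here only as example input.
project = {
    "#messages": 355.9, "#active_members": 14.4, "active_timespan": 281.3,
    "#at_messages": 79.1, "#threads": 15.3, "#emoji_messages": 11.9, "#code_messages": 6.5,
}
normalized = add_normalized_features(project)
print(round(normalized["#at_messages/#messages"], 3))
```

Normalizing by message volume, active members, and time span lets a small but busy channel be compared with a large but quiet one on the same scale.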

3.5. Recursive Logistic Regression Model Method to Examine Relationship between a project team’s Communication Style in Slack and Performance

We conducted a logistic regression on the dataset with the 54 project teams' Slack channel features and the binary publication response variable. Among the 54 project teams in our dataset, 35 had published papers and 19 had not. For our explanatory variables, we used the features extracted from conversations in these teams' Slack channels (Table 2). We also used a recursive feature elimination algorithm (Yan and Zhang, 2015) to identify the most predictive features in the model.
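
For reference, the standard form of a logistic regression model is given below. The notation is introduced here for illustration and is not taken from the paper: $y_i \in \{0, 1\}$ is the outcome for team $i$ (1 if the team has a submission), $x_{ij}$ is the value of feature $j$ for that team, and $\beta_j$ are the fitted coefficients reported as B in Table 3.

```latex
\Pr(y_i = 1 \mid \mathbf{x}_i)
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}\right)\right)},
\qquad
\log \frac{\Pr(y_i = 1 \mid \mathbf{x}_i)}{1 - \Pr(y_i = 1 \mid \mathbf{x}_i)}
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
```

Under this form, a positive $\beta_j$ raises the log-odds of a submission as feature $j$ increases, and a negative $\beta_j$ lowers it. Recursive feature elimination, in its usual formulation, repeatedly fits the model and drops the weakest feature until a target number of features remains.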

4. Results

4.1. Communication Styles for the 9 Categories

Through the qualitative coding of 100 randomly sampled channels, we identified 9 different categories. The category names and descriptions are listed in Table 1, so we do not repeat them here.

Based on the manual coding of the 100 channels, the machine learning algorithm (Geurts et al., 2006) can also pick up the differences between categories. In our experiment, the model for identifying categories achieved 66% overall accuracy on the 100 coded channels. For the Project category, it achieved a precision of 79.4% and a recall of 87.1%, which we believe is reasonably good for our downstream task (Section 4.2) on this specific category. It also achieved reasonable performance for the other labels except Employee Support and Announcement, simply because we have only 2 to 3 data points for each of those categories.
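
Per-category precision and recall, as reported above for the Project category, can be computed from the held-out predictions as in the sketch below. The function name and the toy labels are illustrative assumptions and do not reproduce the study's predictions.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall of one category, treating it as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(t == target and p == target for t, p in pairs)
    false_pos = sum(t != target and p == target for t, p in pairs)
    false_neg = sum(t == target and p != target for t, p in pairs)
    # Precision: of the channels predicted as `target`, how many really are.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of the channels that really are `target`, how many were found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Toy labels (hypothetical).
truth = ["Project", "Project", "Project", "Project", "Event", "Event"]
predicted = ["Project", "Project", "Project", "Event", "Project", "Event"]
project_precision, project_recall = precision_recall(truth, predicted, "Project")
print(project_precision, project_recall)
```

Reporting both values per category matters here because the categories are imbalanced, so overall accuracy alone can hide poor performance on the rare ones.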

By examining the features for each category, we notice that there are several different communication styles. For example, some categories have more informal communications while several other categories have more technical communications.

In Table 2, we provide the average value of each feature for three different categories (please refer to Appendix Table 4 for the features of all categories, including Project, Social Group, and IT Support). The differences between categories are quite intuitive when looking at several representative features:

  • Project Category: From the results in Table 2, we find that a Project Slack channel usually has fewer messages (355.9) and fewer members (11.1) than the other two groups, which have more than 3,000 messages and roughly 80 to 180 members on average. We believe this is natural for a Project Slack channel, since it consists of a few people working on the project who form a centralized communication. Even though the total number of messages is limited, as members may also communicate in person, the percentage of #file_messages is substantially higher than in the other categories; we believe members of such channels tend to share files for productive collaboration on a project.

    If we examine active_timespan for this category, we also notice that the active time span is shorter than for the other two categories, because a project is supposed to finish within a fixed period. We notice a similar characteristic for the Event group, where the active time span is even shorter (144 days), as people tend to quickly form a Slack group to discuss a specific event and then become inactive once the event finishes.

  • Social Group Category: A Social group usually has a large number of messages, comparable to the IT Support group discussed below. However, Social groups can be differentiated from the other types of groups, including IT Support groups, based on features like #emoji_messages and #reacted_messages.

    For this category, we observe more frequent use of emojis (284.7) because of the more casual communication style in such channels. Similarly, we notice that emojis are widely used in Bot channels, as people tend to build Slack bots that use emojis in the conversation to make the bots friendlier. For the Social groups, another feature with higher values than in other categories is #reacted_messages, the number of messages with emoji reactions such as thumbs-up and thumbs-down. This mirrors the number of emoji messages, as people tend to form a casual communication style. Members of these channels also tend to use @here more frequently than in other channels. We hypothesize that this is because people are eager to share content with all other members.

  • IT Support Category: An IT Support channel usually involves messages that try to solve a specific technical issue, so the number of messages containing code snippets (81.8) is much higher than in the other two groups, which have fewer than 10 such messages.

    Some other features also help differentiate this category. We notice that the average number of turns in threaded messages (4.1) for this category is higher than in all the other categories. We believe the reason is that people need multiple turns of conversation to solve a technical issue. If we look at the feature #message_top_user, we notice that the value for IT Support (1819.9) is substantially higher than for the others, which we think suggests that the top user is the one who actively provides solutions to most of the IT problems. As for #bot_messages, this type of channel has more messages sent by Slack bots. This is in accordance with our finding that many IT Support channels use Slack bots to handle basic and frequently asked questions (FAQ) about service status or product updates. We also notice a much higher usage (953.3) of @username, which we believe people use to tag specific users who can solve a specific technical issue.

4.2. What Features in Communication Predict Better Team Collaboration

The recursive logistic regression method selects the following features for predicting whether a team has a submission: active time span in days, number of bot messages, number of @here messages, normalized messages for the top user, normalized threads, normalized emoji, and normalized code. These features yielded a goodness-of-fit value of 0.61. The results of our logistic regression are in Table 3. Below we briefly discuss these features.

| Feature | B | S.E. | z | Pr(>\|z\|) |
| --- | --- | --- | --- | --- |
| Active Time Span | 0.0023 | 0.002 | 1.446 | 0.148 |
| Number of Bot Messages | 20.9500 | 5.04e+04 | 0 | 1.0 |
| Normalized Messages Per Top User | 21.3561 | 6.07e+04 | 0 | 1.0 |
| Normalized Threads | 0.0099 | 0.011 | 0.891 | 0.373 |
| Normalized Code | -1.2670 | 1.442 | -0.879 | 0.380 |
| Normalized Emoji | -0.1345 | 0.228 | -0.589 | 0.556 |
| Normalized @Here | -1.4987 | 5.136 | 0.292 | 0.770 |

Table 3. Logistic Regression Results for the 7 Features Predicting Team Performance
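
The z and p columns of Table 3 follow from the coefficient and its standard error under the usual Wald test, where z = B / S.E. and the p-value is the two-sided tail probability of a standard normal distribution. The sketch below recomputes them for one row as a worked example; the function name is an assumption, and small differences from the table are expected because the published values are rounded.

```python
import math


def wald_test(coefficient, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = coefficient / std_error
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# "Normalized Code" row of Table 3: B = -1.2670, S.E. = 1.442.
z_code, p_code = wald_test(-1.2670, 1.442)
print(round(z_code, 3), round(p_code, 3))

# exp(B) is the multiplicative change in the odds of a submission
# per one-unit increase in the feature.
odds_ratio_code = math.exp(-1.2670)
print(round(odds_ratio_code, 2))
```

The same calculation shows why the two rows with standard errors around 5e+04 have z close to 0 and p close to 1: the coefficient is tiny relative to its uncertainty.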

Active time span is measured as the number of days from the first message to the last message. The positive coefficient suggests that the longer members work together, the more likely they are to produce a submission, although the effect does not reach conventional significance (p = 0.148). We found that the number of bot messages is positively, but not significantly, associated with a successful outcome. As the most used bot across these channels is the Github bot, this may indicate that teams that are more active in Github-related activities have better outcomes.

The remaining coefficients in Table 3 can be read directly from their signs: a team is more likely to have a paper submission when a single user posts a large share of the messages in the channel and when more conversation threads are created, and less likely when more code, more emojis, and more @here messages are used. None of these coefficients is statistically significant.

5. Discussion

In this section, we describe the implications of our results. We contribute to updating the understanding of communication styles in various context-specific categories. We trained a machine learning model (to be released together with the paper) that can automatically categorize workplace Slack group channels with reasonably high accuracy. By looking closely at one category (the Project category), we further discuss how this communication style feature metric could be useful for predicting a project team's performance.

5.1. Updating The Understanding of Human-Human Conversation Pattern

Through our manual coding, we identified 9 different categories. Based on these data points and the feature metrics we defined, we also built an ML model to identify the category of a channel. We suggest that HCI and NLP researchers who conduct content analysis on group chat message corpora first use our approach to categorize their group chat channels before conducting any downstream tasks. The current practice of treating all types of group channels equally and reporting one single average across all categories is misleading. Researchers who build synthetic datasets for group chats should also consider the particular domain and context of their target audience and refer to our Table 4 when constructing the data corpus.

Our results could also be used as a guideline for researchers who build chatbots. The chatbot should behave differently for different categories of channels. For example, a chatbot for a social group should adopt a more casual style to encourage engagement, whereas a chatbot for an IT Support channel should be adaptable to a multi-turn, problem-solving communication style.

We also demonstrated that this feature metric could be used to predict team performance. Though no individual feature is statistically significant in the regression model (Table 3), the goodness of fit of the full model (0.61, Section 4.2) hints at a promising future for this line of research. If we can build a dashboard or a system that actively tracks the group conversations in a project team's channels, we may be able to provide a real-time meter of the project team's performance and detect early signs of project failure.

5.2. Limitations

Our study has a couple limitations. First, the context is within a R&D department of a multinational IT company, and we use a publication as a proxy for success in this study. It is important to note that success can be defined in other ways in projects (patents, product impact). However, within the context of this time-bounded re-organization even in the R&D department, it is sufficient to use publications as a proxy for success. Thus, the reader should be warned that some of the results from this study, such as what communication styles lead to higher group performance, may not be generalizable to other contexts.

Second, the pre-trained machine learning model for category identification may not generalize well to group chat datasets from platforms other than Slack; the model will need to be retrained on the new dataset to fit a different feature distribution.

6. Conclusion

In this short note, we provide a set of three analyses for understanding communication styles in Slack group chat channels in today’s workplace settings. We manually coded channels into 9 categories and defined a communication style metric with 21 features. Based on those features, we built a machine learning model that can automatically categorize group chat channels. We further illustrated that the 21-feature metric could be used to unveil the relation between communication styles and the success of a project.

Acknowledgements.
Blinded for Review.

References

  • Aoki et al. (2006) Paul M Aoki, Margaret H Szymanski, Luke Plurkowski, James D Thornton, Allison Woodruff, and Weilie Yi. 2006. Where’s the party in multi-party?: Analyzing the structure of small-group sociable talk. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work. ACM, 393–402.
  • Boutin and Boutin (2017) Paul Boutin and Paul Boutin. 2017. Why Most Chatbots Fail. https://chatbotsmagazine.com/why-most-chatbots-fail-1c085b74d6ad
  • Bradner et al. (1999) Erin Bradner, Wendy A Kellogg, and Thomas Erickson. 1999. The adoption and use of ‘Babble’: A field study of chat in the workplace. In ECSCW’99. Springer, 139–158.
  • Diehl and Stroebe (1987) Michael Diehl and Wolfgang Stroebe. 1987. Productivity loss in brainstorming groups: Toward the solution of a riddle. Journal of personality and social psychology 53, 3 (1987), 497.
  • Dourish and Bellotti (1992) Paul Dourish and Victoria Bellotti. 1992. Awareness and coordination in shared workspaces. In CSCW, Vol. 92. 107–114.
  • Elsner and Charniak (2010) Micha Elsner and Eugene Charniak. 2010. Disentangling chat. Computational Linguistics 36, 3 (2010), 389–409.
  • Erickson et al. (1999) Thomas Erickson, David N Smith, Wendy A Kellogg, Mark Laff, John T Richards, and Erin Bradner. 1999. Socially translucent systems: social proxies, persistent conversation, and the design of “babble”. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems. ACM, 72–79.
  • Geurts et al. (2006) Pierre Geurts, Damien Ernst, and Louis Wehenkel. 2006. Extremely randomized trees. Machine learning 63, 1 (2006), 3–42.
  • Hamilton et al. (2014) William A Hamilton, Oliver Garretson, and Andruid Kerne. 2014. Streaming on twitch: fostering participatory communities of play within live mixed media. In Proceedings of the 32nd annual ACM conference on Human factors in computing systems. ACM, 1315–1324.
  • Higashinaka et al. (2014) Ryuichiro Higashinaka, Kenji Imamura, Toyomi Meguro, Chiaki Miyazaki, Nozomi Kobayashi, Hiroaki Sugiyama, Toru Hirano, Toshiro Makino, and Yoshihiro Matsuo. 2014. Towards an open-domain conversational system fully based on natural language processing. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 928–939.
  • Hixon et al. (2015) Ben Hixon, Peter Clark, and Hannaneh Hajishirzi. 2015. Learning knowledge graphs for question answering through conversational dialog. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 851–861.
  • Jiang et al. (2018) Jyun-Yu Jiang, Francine Chen, Yan-Ying Chen, and Wei Wang. 2018. Learning to disentangle interleaved conversational threads with a siamese hierarchical network and similarity ranking. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), Vol. 1. 1812–1822.
  • Karau and Williams (1993) Steven J Karau and Kipling D Williams. 1993. Social loafing: A meta-analytic review and theoretical integration. Journal of personality and social psychology 65, 4 (1993), 681.
  • Kearns and Ron (1999) Michael Kearns and Dana Ron. 1999. Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural computation 11, 6 (1999), 1427–1453.
  • Kraut (2003) Robert E Kraut. 2003. Applying social psychological theory to the problems of group work. HCI models, theories and frameworks: Toward a multidisciplinary science (2003), 325–356.
  • Mayfield et al. (2012) Elijah Mayfield, David Adamson, and Carolyn Penstein Rosé. 2012. Hierarchical conversation structure prediction in multi-party chat. In Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Association for Computational Linguistics, 60–69.
  • McDaniel et al. (1996) Susan E McDaniel, Gary M Olson, and Joseph C Magee. 1996. Identifying and analyzing multiple threads in computer-mediated and face-to-face conversations. In Proceedings of the 1996 ACM conference on Computer supported cooperative work. ACM, 39–47.
  • Nardi et al. (2000) Bonnie A Nardi, Steve Whittaker, and Erin Bradner. 2000. Interaction and outeraction: instant messaging in action. In Proceedings of the 2000 ACM conference on Computer supported cooperative work. ACM, 79–88.
  • Olson and Olson (2000) Gary M Olson and Judith S Olson. 2000. Distance matters. Human–computer interaction 15, 2-3 (2000), 139–178.
  • Slack (2018) Slack. 2018. The Business Value of Slack. Retrieved April 3, 2019 from https://a.slack-edge.com/eaf4e/marketing/downloads/resources/IDC_The_Business_Value_of_Slack.pdf
  • Wang et al. (2019) Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, and Saloni Potdar. 2019. Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers. arXiv preprint arXiv:1902.01030 (2019).
  • Yan and Zhang (2015) Ke Yan and David Zhang. 2015. Feature selection and analysis on correlated gas sensor data with recursive feature elimination. Sensors and Actuators B: Chemical 212 (2015), 353–363.
  • Yan et al. (2016) Zhao Yan, Nan Duan, Junwei Bao, Peng Chen, Ming Zhou, Zhoujun Li, and Jianshe Zhou. 2016. Docchat: An information retrieval approach for chatbot engines using unstructured documents. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vol. 1. 516–525.
  • Zhang et al. (2018) Yanxia Zhang, Jeffrey Olenick, Chu-Hsiang Chang, Steve WJ Kozlowski, and Hayley Hung. 2018. The I in Team: Mining Personal Social Interaction Routine with Topic Models from Long-Term Team Data. In 23rd International Conference on Intelligent User Interfaces. ACM, 421–426.
  • Zhou et al. (2018) Li Zhou, Jianfeng Gao, Di Li, and Heung-Yeung Shum. 2018. The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. arXiv preprint arXiv:1812.08989 (2018).

Appendix A. The Feature Metrics For The 9 Categories

| Feature | Announcement (N=2) | Bot (N=8) | Employee Support (N=3) | Event (N=9) | IT Support (N=8) | Project (N=32) | Social Group (N=10) | Tech Enthusiasts (N=10) | Test (N=18) |
|---|---|---|---|---|---|---|---|---|---|
| #members | 937.0 | 60.0 | 78.3 | 32.8 | 177.0 | 11.1 | 84.0 | 87.6 | 2.7 |
| #active_users | 117.0 | 72.5 | 70.3 | 34.1 | 139.9 | 14.4 | 92.7 | 70.8 | 5.4 |
| active_timespan | 633.5 | 443.8 | 290.7 | 144.1 | 506.5 | 281.2 | 478.6 | 411.9 | 200.7 |
| #messages | 390.0 | 5908.1 | 267.7 | 54.2 | 4537.4 | 355.9 | 3059.0 | 659.3 | 7.9 |
| word_per_message | 20.7 | 18.5 | 20.8 | 9.8 | 23.0 | 16.4 | 12.8 | 17.1 | 5.8 |
| #messages_top_user | 63.5 | 285.9 | 52.0 | 10.4 | 1819.9 | 90.2 | 554.9 | 83.7 | 2.4 |
| #at_messages | 209.0 | 305.6 | 108.7 | 39.2 | 953.2 | 79.2 | 337.6 | 217.7 | 6.0 |
| #channel_messages | 0.0 | 0.5 | 0.0 | 0.0 | 0.1 | 0.0 | 0.1 | 0.2 | 0.0 |
| #here_messages | 0.0 | 47.5 | 0.7 | 0.1 | 1.6 | 0.2 | 8.1 | 0.3 | 0.0 |
| #thread_messages | 75.5 | 42.4 | 15.3 | 1.4 | 185.8 | 15.3 | 152.5 | 53.4 | 0.1 |
| max_turn_thread | 6.5 | 4.1 | 6.3 | 1.2 | 34.9 | 6.7 | 28.6 | 9.6 | 0.1 |
| avg_turn_thread | 1.3 | 0.4 | 2.8 | 0.8 | 4.1 | 2.0 | 2.8 | 2.2 | 0.1 |
| max_reaction_count | 15.5 | 4.2 | 9.3 | 2.9 | 9.6 | 2.3 | 13.6 | 9.1 | 0.1 |
| avg_reaction_count | 4.5 | 0.5 | 2.2 | 1.5 | 1.5 | 0.9 | 2.2 | 1.5 | 0.1 |
| #pinned_messages | 3.0 | 3.2 | 1.0 | 1.0 | 5.5 | 0.4 | 1.9 | 2.3 | 0.0 |
| #emoji_messages | 13.5 | 284.8 | 20.0 | 1.8 | 102.4 | 12.0 | 284.7 | 22.9 | 0.3 |
| #code_messages | 0.0 | 0.4 | 0.3 | 0.2 | 81.8 | 6.6 | 1.4 | 2.6 | 0.0 |
| #url_messages | 14.5 | 559.8 | 57.0 | 6.1 | 329.5 | 34.2 | 188.8 | 80.0 | 0.2 |
| #git_messages | 0.5 | 0.5 | 0.0 | 0.1 | 59.6 | 9.0 | 0.7 | 20.6 | 0.0 |
| #file_messages | 7.0 | 3345.0 | 17.3 | 2.1 | 58.9 | 60.4 | 180.5 | 25.1 | 0.1 |
| #bot_messages | 0.0 | 3477.0 | 2.0 | 0.0 | 1396.5 | 52.2 | 64.6 | 0.3 | 0.1 |
Table 4. The Feature Metrics For The 9 Categories Identified by Manually Coding 100 Channels Randomly Selected from 4300 Public Slack Channels in A Company
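
For readers who want a quick baseline built on Table 4, the sketch below assigns a channel to the category whose Table 4 profile is closest in log-scaled feature space. This is a nearest-centroid stand-in chosen for brevity; it is not the trained model described in this note. It uses only six of the 21 features and treats the table entries as category centroids. The names `FEATURES`, `CENTROIDS`, and `nearest_category`, and the example channel, are hypothetical.

```python
import math

FEATURES = ["#members", "#messages", "word_per_message",
            "#thread_messages", "#code_messages", "#bot_messages"]

# Per-category values copied from Table 4, in the order of FEATURES.
CENTROIDS = {
    "Announcement":     [937.0, 390.0, 20.7, 75.5, 0.0, 0.0],
    "Bot":              [60.0, 5908.1, 18.5, 42.4, 0.4, 3477.0],
    "Employee Support": [78.3, 267.7, 20.8, 15.3, 0.3, 2.0],
    "Event":            [32.8, 54.2, 9.8, 1.4, 0.2, 0.0],
    "IT Support":       [177.0, 4537.4, 23.0, 185.8, 81.8, 1396.5],
    "Project":          [11.1, 355.9, 16.4, 15.3, 6.6, 52.2],
    "Social Group":     [84.0, 3059.0, 12.8, 152.5, 1.4, 64.6],
    "Tech Enthusiasts": [87.6, 659.3, 17.1, 53.4, 2.6, 0.3],
    "Test":             [2.7, 7.9, 5.8, 0.1, 0.0, 0.1],
}


def log_scale(values):
    """Compress heavy-tailed counts so large features do not dominate distances."""
    return [math.log1p(v) for v in values]


def nearest_category(channel):
    """Return the category whose Table 4 profile is closest to `channel`."""
    x = log_scale(channel)
    return min(CENTROIDS, key=lambda name: math.dist(x, log_scale(CENTROIDS[name])))


# Hypothetical channel: small membership, moderate traffic, some code and bot posts.
example_channel = [12, 400, 15.0, 20, 5, 40]
print(nearest_category(example_channel))
```

A baseline of this kind can serve as a sanity check when retraining a classifier on data from another platform, since large disagreements between the two may point to a shifted feature distribution.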