Jump on the Bandwagon? – Characterizing Bandwagon Phenomenon in Online NBA Fan Communities

by Yichen Wang, et al.
University of Colorado Boulder

Understanding user dynamics in online communities has become an active research topic and can provide valuable insights for human behavior analysis and community management. In this work, we investigate the "bandwagon fan" phenomenon, a special case of user dynamics, to provide a large-scale characterization of online fan loyalty in the context of professional sports teams. We leverage the existing structure of NBA-related discussion forums on Reddit, investigate the general bandwagon patterns, and trace the behavior of bandwagon fans to capture latent behavioral characteristics. We observe that better teams attract more bandwagon fans, but they do not necessarily come from weak teams. Our analysis of bandwagon fan flow also shows different trends for different teams, as the playoff season progresses. Furthermore, we compare bandwagon users with non-bandwagon users in terms of their activity and language usage. We find that bandwagon users write shorter comments but receive better feedback, and use words that show less attachment to their affiliated teams. Our observations allow for more effective identification of bandwagon users and prediction of users' future bandwagon behavior in a season, as demonstrated by the significant improvement over the baseline method in our evaluation results.




1 Introduction

The proliferation of online social networks has led to the increasing popularity of online communities, which allow like-minded people to connect and discuss their shared interests, values, and goals without geographic constraints. For instance, on Reddit, a popular online community platform, users can participate in a large number of communities [10, 24], which are referred to as subreddits and denoted by “r/” followed by the community’s topic, such as r/politics. The plethora of online communities provides a great opportunity to understand human interactions across multiple communities. Some “loyal” users may commit to one particular community and maintain a stable engagement [8], some may jump across several communities [24], while others may change their community affiliation over time. Such phenomena are fundamental examples of user dynamics in online communities, and understanding how users migrate across communities is an important problem. Analyzing and characterizing these dynamics in terms of users’ behavior patterns can provide useful insights for designing better communities, guide community managers in providing better services to improve user engagement, and help community-related stakeholders (e.g., sports teams, celebrities, advertisers) better promote their business.

The bandwagon phenomenon is widespread in online sports communities. By definition, “bandwagon fans” are sports fans who start following a team only because of its recent success and who leave as soon as the team performs poorly. Reddit officially introduced the bandwagon mechanism in some sports-related discussion groups (e.g., r/NBA and r/NFL) several years ago, which allows users to change their team affiliation and self-identify as bandwagon fans during playoffs. For instance, during the playoffs in NBA season 2016-17, 17.9% of Cavaliers’ fans in r/NBA were bandwagon fans. Our work focuses on examining this specific phenomenon, the “bandwagon fan”, in online NBA fan communities. We leverage the existing structure of NBA-related discussion forums on Reddit to study users’ bandwagon behavior in the context of professional sports, a domain that is understudied yet closely connected to people’s daily life. We choose online fan groups of professional sports teams as a testbed for the following reasons. First, professional sports play a significant role in modern life and a large population is actively engaged [7, 17]. Second, professional sports teams are unambiguously competitive in nature and users affiliated with different teams have clearly different preferences [30].

Present work. Specifically, using users’ posts and comments in r/NBA and individual team subreddits across three recent NBA seasons, we aim to answer three research questions: (1) What is the general pattern of the bandwagon phenomenon and its relationship with team performance? (2) Are there behavioral features that differentiate bandwagon users from non-bandwagon users? and (3) How effectively can we identify bandwagon users and predict future bandwagon behavior using these features? Our large-scale study reveals that better teams attract more bandwagon fans and bandwagon fans typically switch to better-performing teams, but not all bandwagon fans come from weak teams. Also, as the playoffs progress, bandwagon fans from different teams show different team change flow patterns. We also identify clear behavioral differences between bandwagon and non-bandwagon users in terms of their activity and language usage after applying a matching technique, e.g., bandwagon users tend to leave shorter comments but receive better feedback, and they are less attached to their affiliated teams. Using the features we identify, we are able to improve the bandwagon fan classification and prediction results over the bag-of-words baseline method, with 18.9% and 47.6% relative improvement, respectively.

Our work makes the following contributions. First, to the best of our knowledge, this is the first large-scale analysis of bandwagon behavior in sports communities, and it reveals clear behavioral differences between bandwagon fans and non-bandwagon fans. Second, using the observed behavioral characteristics, we can better distinguish bandwagon users and predict future bandwagon behaviors. Third, our work offers new insights for user loyalty research and online community management.

2 Related Work

User engagement and loyalty in online communities. Online community engagement has been a topic of active research [1, 13, 24, 4, 21, 27, 28]. Danescu-Niculescu-Mizil et al. [4] build a framework to track users’ linguistic change in an intra-community scenario and find they follow a determined two-stage lifecycle: an innovative learning phase and a conservative phase. Tan et al. [24] study users’ multi-community engagement on Reddit through community trajectory, language usage and feedback received. They find that over time, users span more communities, “jump” more and concentrate less; users’ language seems to continuously grow closer to the community’s; frequent posters’ feedback is continually growing more positive; and departing users can be predicted from their initial behavioral features. Loyalty is a fundamental measure of users maintaining or changing affiliation with single or multiple communities. Users’ loyalty to a single community is studied via churn on question-answering platforms [5, 19], where gratitude, popularity-related features, and temporal features of posts are shown to be predictive. Multi-community loyalty at both the community and user levels is studied by Hamilton et al. [8], where loyalty is defined as users making the majority or minority of their posts to a certain forum in a multi-forum site at a specific time. They find loyal communities tend to be smaller, less assertive, less clustered and have denser interaction networks, and user loyalty can be predicted from their first three comments to the community. Different from prior work, our study focuses on the self-defined bandwagon status in sub-communities (different teams) within a large, single community (r/NBA).

Bandwagon phenomenon. The bandwagon phenomenon is found in many fields such as politics [15], information diffusion [16], sports [29], and business applications [23]. Sundar et al. [23] conduct an experiment using fake products with different star ratings, numbers of customer reviews, and sales ranks on an e-commerce site, and provide preliminary support for the bandwagon effect on users’ intention and attitude toward products. Zhu et al. [31] also find that other people’s opinions significantly sway people’s own choices via an experiment in an online recommender system. Wang et al. [26] predict article life cycles in Facebook discussion groups using the bandwagon effect. Voting is another behavior that is related to this phenomenon, where voters may not follow their own opinion when making a voting decision but instead follow the majority [12, 25]. Most of these research efforts stop at observing the bandwagon phenomenon at the application level. Further study is needed to analyze its specific characteristics, especially in the context of online communities.

3 Dataset

We focus on the professional sports context derived from NBA-related discussion forums (r/NBA and individual team subreddits) on Reddit, an active community-driven platform where users can submit posts and make comments. The r/NBA and team subreddits provide an ideal testbed for observing and understanding bandwagon fans, because a user’s team affiliation can be acquired directly through a mechanism known as “flair”. Flair appears as a short text next to the username in posts and comments (e.g., Lakers). In r/NBA, users can use flairs to indicate support, and a user cannot have multiple flairs at the same time. After a specific date (referred to as the bandwagon date in the rest of this paper), which is usually shortly before playoffs begin in each season, each user is given the option to change his/her flair to a bandwagon flair of a different team (e.g., Warriors Bandwagon). A user can change the flair as many times as he/she wants. We obtain 0.6M posts and 30M comments as well as their received feedback in NBA-related subreddits from https://pushshift.io [2], where flair is used to determine the user’s bandwagon status. As pointed out by Zhang et al. [29], offline NBA seasons are reflected in users’ behavior in these NBA-related subreddits. As such, we organize our data according to the timeline of NBA seasons and focus on three seasons: 2015-16, 2016-17, and 2017-18.

4 General Patterns of Bandwagon Phenomenon

Our first research question aims to identify the general pattern of bandwagon behavior. We extract all flair changes from team A to team B bandwagon, where A ≠ B. A general user flow network is shown in Fig. 3.

4.0.1 Observation 1: Better teams attract more bandwagon fans, but not all of them come from weak teams.

We first investigate which teams bandwagon fans switch to (i.e., the target team). Intuitively, we expect bandwagon fans to move to better teams and abandon the weak ones. Here, we consider the number of bandwagon fans who switched to team B (the target team) and its correlation with either team B’s standing (i.e., rank) or the difference between team B’s standing and team A’s (the source team’s) standing. Table 1 shows the correlations computed for each scenario and each season. We can see strong correlations in all but one case. The correlations are negative because lower standing means better performance (e.g., the best team ranks 1), and if team B is better than team A then the standing difference (B’s standing - A’s standing) should be negative. Based on the results, we can conclude that better teams tend to attract more bandwagon fans.

Season | Correlation between team B’s standing and user count | Correlation between standing difference and user count
2017-18 0.188 (not significant)
Table 1: Correlation results for team standing and standing difference with user count. Throughout this paper, the number of stars indicates the p-value significance level.
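The correlation analysis above can be illustrated with a short sketch. The paper does not state which correlation coefficient is used, so the Spearman rank correlation below is an assumption, and the toy standings and fan counts are made up for illustration only.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: standing 1 is the best team.
standing = [1, 2, 3, 4, 5]
incoming_fans = [120, 80, 60, 20, 10]
rho = spearman(standing, incoming_fans)  # negative: better teams attract more fans
```

A negative coefficient here matches the sign convention in the text, since a lower standing number means a better team.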

We further study where the bandwagon fans switch from (i.e., the source team A). We examine the correlation of team A’s standing with user count. To our surprise, we do not observe a significant correlation in any of the three seasons. This indicates that many bandwagon fans also come from strong teams. To better understand this, we calculate each team’s bandwagon fan ratio and select the teams with a ratio above the median of all teams. We find that in the three seasons, there are 8, 7, and 8 teams, respectively, that are in the playoffs but have an above-median ratio of fans leaving. For instance, the Raptors and Spurs are top 3 in their conference in all three seasons but still have a high percentage of fans leaving. Although bandwagon is only one aspect of loyalty, this result supplements the finding in Zhang et al.’s work on fan loyalty [29]: top teams tend to have lower fan loyalty, in terms of user retention. As can be seen in Fig. 3, not all bandwagon fans come from weak teams and strong teams can also lose a large number of bandwagon fans.

Figure 1: Bandwagon fan flair change trend in season 2015-16.

4.0.2 Observation 2: Bandwagon flair changes exhibit different stage-wise trends across teams.

NBA playoffs are elimination tournaments which include 4 rounds: conference first round, conference semi-final, conference final and overall final. To keep our analysis consistent with the temporal structure of NBA playoffs, we divide the period after the bandwagon date into 5 stages: the pre-playoffs stage (stage 0) and the other 4 stages corresponding to the 4 rounds in playoffs, to examine the temporal dynamics of users’ bandwagon behaviors. As shown in Fig. 1, we examine the number of bandwagon fans in the 5 stages. We find three types of representative trends: (1) Most teams’ bandwagon fans abandon their original teams in stage 0, and there are still fans leaving in later stages but in decreasing numbers. For instance, bottom teams such as the Lakers and Knicks have a large number of vagrant fans, and better teams such as the Clippers and Heat also fall into this category due to some vagrant fans. (2) Some teams have very few bandwagon fans in the whole period. Such teams include top ones (e.g., Cavaliers and Warriors) and bottom ones (e.g., Grizzlies). Those teams have a strong fan base. (3) Some teams have a spike during the playoffs, which means most of their bandwagon fans abandon the team in a specific stage and very few leave before or after that stage. Three teams fall into this category in season 2015-16: Thunder, Raptors, and Celtics. These teams have a relatively strong fan base compared with the teams in (1), but some fans lose interest partway through (e.g., when the team is eliminated) and choose to support a stronger team. Please note that the teams falling into these three trend categories can vary from season to season depending on each team’s performance in a season.
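As a minimal sketch of the stage-wise bucketing described above, the following code assigns each flair-change date to one of the 5 stages and counts changes per stage. The round start dates are illustrative placeholders rather than values taken from the paper, and the function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date

# Placeholder start dates of the 4 playoff rounds (assumed for illustration).
ROUND_STARTS = [
    date(2016, 4, 16),  # conference first round
    date(2016, 4, 30),  # conference semi-final
    date(2016, 5, 17),  # conference final
    date(2016, 6, 2),   # overall final
]


def stage_of(change_date, round_starts=ROUND_STARTS):
    """Return 0 for the pre-playoffs stage, or 1-4 for the round in progress."""
    return bisect_right(round_starts, change_date)


def stage_histogram(change_dates):
    """Count flair changes in each of the 5 stages (index = stage number)."""
    counts = Counter(stage_of(d) for d in change_dates)
    return [counts.get(stage, 0) for stage in range(5)]
```

Plotting `stage_histogram` per source team would give a per-team trend of the kind shown in Fig. 1; a team in category (3) would show one dominant bucket after stage 0.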

5 Behavioral Features of Bandwagon Users

Our second research question aims to identify behavioral features that help differentiate bandwagon fans from non-bandwagon fans. To do this, we apply a matching technique to make the two groups comparable. We first give a formal definition of bandwagon users and identify all these users. Then we match each bandwagon user with a non-bandwagon user who has a similar activity level. This allows us to directly compare the behavioral features of bandwagon and non-bandwagon fans.

5.1 Identify Bandwagon Users for Behavioral Comparison

To identify bandwagon users within one season, we first define active users in r/NBA as those who have at least five activities before the bandwagon date and one activity after, where an activity refers to either submitting a post or making a comment.

We view posting/commenting with a team’s flair in r/NBA as an indication of support towards that team. We define an active user as a fan of a team during a specific period of time if the user indicates support (flair) only for that team and such support sustains over all activities during that time period. Note that this flair does not contain the word “bandwagon”. If the user only uses flair with “bandwagon” for a team, we call that user a bandwagon fan of that team.

We further define bandwagon user and non-bandwagon user based on whether a fan changes to a bandwagon fan of another team. To summarize, we consider the following two groups of users:

  • Non-bandwagon user: Fan of a team A throughout the whole season.

  • Bandwagon user: Fan of a team A till a time point, and after that point, becomes bandwagon fan of team B (B ≠ A) for a period of time, regardless of any bandwagon changes thereafter.

The terms “fan”, “bandwagon fan”, “bandwagon users” and “non-bandwagon users” in this section all refer to the definition above.

There are also a number of users who are not in the aforementioned two groups. For instance, one user can be a fan for different teams during the whole season, while others may not be fans of any team. We do not consider those cases in our analysis since they do not link directly to the bandwagon phenomenon that our work focuses on. Statistics of the bandwagon and non-bandwagon users for our analysis are shown in Table 2.

Season #Bandwagon users #Non-bandwagon users
2015-16 2,562 23,165
2016-17 1,526 29,955
2017-18 1,163 36,053
Table 2: Number of Bandwagon and Non-bandwagon Users
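The user definitions in this section can be sketched as a small labeling function. This is an illustration under stated assumptions, not the authors' implementation: the tuple layout, the function name, and the choice to ignore all activity after the first bandwagon flair are assumptions made here.

```python
def label_user(activities, bandwagon_ts):
    """Label one user's season.

    activities: list of (timestamp, team, is_bandwagon_flair), sorted by time.
    bandwagon_ts: timestamp of the bandwagon date.
    Returns "bandwagon", "non-bandwagon", or None when the user is excluded.
    """
    before = [a for a in activities if a[0] < bandwagon_ts]
    after = [a for a in activities if a[0] >= bandwagon_ts]
    # Active users need at least five activities before the date and one after.
    if len(before) < 5 or len(after) < 1:
        return None

    # Index of the first activity carrying a bandwagon flair, if any.
    first_bw = next((i for i, a in enumerate(activities) if a[2]), None)
    plain = activities if first_bw is None else activities[:first_bw]

    # A fan supports exactly one team across all plain-flair activities.
    plain_teams = {team for _, team, _ in plain}
    if len(plain_teams) != 1:
        return None
    home_team = next(iter(plain_teams))

    if first_bw is None:
        return "non-bandwagon"
    # The bandwagon target must differ from the original team (B != A).
    if activities[first_bw][1] == home_team:
        return None
    # Assumption: later flair changes are ignored once the user has switched.
    return "bandwagon"
```

For example, a user with five Lakers-flair comments before the bandwagon date and one "Warriors Bandwagon" comment after it would be labeled `"bandwagon"`, while a user who mixes plain flairs of two teams would be excluded.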

5.2 User Matching

To make a fair comparison between the two user groups, we need to rule out the influence of activity level. Specifically, for each bandwagon user who is a fan of team A at the beginning, we find a matching non-bandwagon user who has a similar activity level and supports the same team. As a result, we have 2562, 1526, and 1163 user pairs after user matching in the three seasons, respectively.
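A minimal sketch of such a matching step is shown below. The paper does not specify the matching algorithm, so greedy nearest-neighbor matching on activity count without replacement is an assumption, as are the function name and the tuple layout.

```python
def match_users(bandwagon, candidates):
    """Pair each bandwagon user with a same-team non-bandwagon user.

    bandwagon, candidates: lists of (user_id, team, n_activities).
    Returns a list of (bandwagon_id, control_id) pairs. Each control user is
    used at most once; bandwagon users with no remaining candidate are skipped.
    """
    pool = {}
    for user_id, team, n_activities in candidates:
        pool.setdefault(team, []).append((n_activities, user_id))

    pairs = []
    for user_id, team, n_activities in bandwagon:
        options = pool.get(team, [])
        if not options:
            continue
        # Closest activity count among unused same-team candidates.
        best = min(options, key=lambda c: abs(c[0] - n_activities))
        options.remove(best)
        pairs.append((user_id, best[1]))
    return pairs
```

Greedy matching depends on the order in which bandwagon users are processed; an optimal assignment would need a different method, which is outside this sketch.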

To evaluate the result of our matching procedure, we check distributional differences in terms of the number of activities between the treatment group (bandwagon users) and the control group (non-bandwagon users). We compare their empirical cumulative distributions before and after matching, using the Mann-Whitney U test [14]. Prior to matching, the p-value for this feature is very close to 0.0, indicating a significant difference. After matching, we find no difference between the treatment group and the matched control group at the 5% significance level in all three seasons (p = 0.485, 0.489, 0.49 for the three seasons), indicating that the data is balanced in terms of activity level after user matching. Fig. 4 in the Appendix shows the CDF plots before and after matching. Using the matched user groups, we then characterize how bandwagon users behave differently in terms of their posting/commenting activities and language usage. For consistency, we only consider user activities that occur before the bandwagon date. Since users cannot change their flairs to bandwagon flairs prior to that date, the differential features we observe can also be used for predicting future bandwagon behavior in playoffs.
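The balance check can be sketched with a small Mann-Whitney U implementation. This version uses the normal approximation for the two-sided p-value and omits the tie correction and continuity correction, so it is a simplified illustration rather than the exact test configuration used in the paper.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value via normal approximation)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(combined):
        j = i
        # Tied values share the average rank of their positions.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            if combined[k][1] == 0:
                rank_sum_x += avg_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value
```

After a successful matching, the activity counts of the two groups should yield a p-value well above 0.05, as reported in the text.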

5.3 Activity Features

5.3.1 Observation 1: Bandwagon users are less active than non-bandwagon users in individual team subreddits.

We compare users’ activity level in the same individual team subreddit after matching in r/NBA, and find that bandwagon users have fewer activities (i.e., are quieter) than non-bandwagon users in terms of posting/commenting count (19.91 vs. 29.40, 23.55 vs. 35.63, 23.51 vs. 32.4 for the three seasons, respectively). This is reasonable since individual team subreddits tend to attract fans who are more loyal/dedicated to their teams and participate more actively in their subcommunities.

5.3.2 Observation 2: Bandwagon users write shorter comments but receive better feedback in r/NBA.

Here, we compute a score for each comment (#upvotes - #downvotes) as a measure of received feedback. The higher the score, the better the feedback it receives. Our results show that bandwagon users write significantly shorter comments (18.71 vs. 20.75, 18.34 vs. 20.30, 18.26 vs. 19.55 for the three seasons, respectively), but receive better feedback (11.08 vs. 9.88, 13.11 vs. 11.80, 16.64 vs. 16.32 for the three seasons, respectively). This observation is consistent with the general pattern that users who “wander around” across diverse communities are more likely to receive better feedback [24].
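The two activity features can be computed per user as in the sketch below. The field names (`body`, `ups`, `downs`) are hypothetical, and measuring comment length in whitespace-separated tokens is an assumption, since the paper does not state the unit of length.

```python
def activity_features(comments):
    """Average comment length and average feedback score for one user.

    comments: list of dicts with keys "body", "ups", and "downs".
    """
    if not comments:
        return {"avg_length": 0.0, "avg_score": 0.0}
    lengths = [len(c["body"].split()) for c in comments]
    # Feedback score as defined in the text: #upvotes - #downvotes.
    scores = [c["ups"] - c["downs"] for c in comments]
    return {
        "avg_length": sum(lengths) / len(lengths),
        "avg_score": sum(scores) / len(scores),
    }
```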

5.4 Language Usage Features

Although we use both posting and commenting as indicators of user activity level, we focus more on comments in language usage analysis since posts are more about news and game reports, and less reflective of users’ personal characteristics.

5.4.1 Observation 3: Bandwagon users talk less about specific players and teams.

To analyze the content of users’ comments, we first conduct preprocessing steps including lowercasing words and removing stop words and hyperlinks from comments. After that, Latent Dirichlet Allocation (LDA) [3], a widely-used topic modeling method, is applied to extract keywords and topics. In our case the perplexity score drops significantly when the number of topics increases from 5 to 10, but remains stable afterwards (from 10 to 30). Therefore, we use 10 as our topic number in this analysis.
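The topic-number choice described above amounts to finding where the perplexity curve flattens. A minimal sketch of that selection rule follows; the tolerance value and the perplexity numbers in the usage note are made up for illustration, and the scores themselves would come from whichever LDA implementation is used.

```python
def pick_num_topics(perplexity_by_k, tol=0.02):
    """Return the smallest topic count after which perplexity stops improving.

    perplexity_by_k: dict mapping number of topics to perplexity score.
    tol: minimum relative drop that still counts as an improvement (assumed).
    """
    ks = sorted(perplexity_by_k)
    for prev_k, next_k in zip(ks, ks[1:]):
        prev_score = perplexity_by_k[prev_k]
        relative_drop = (prev_score - perplexity_by_k[next_k]) / prev_score
        if relative_drop < tol:
            return prev_k
    return ks[-1]
```

With hypothetical scores such as `{5: 1500.0, 10: 1200.0, 15: 1195.0, 20: 1190.0, 30: 1188.0}`, the large drop from 5 to 10 topics followed by a flat curve leads the function to select 10.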

We compare the topics of bandwagon comments with those of non-bandwagon ones. While they share many similar topics (e.g., game strategy related words: defense, offense; emotional expression related words: shit, lol), there still exist some differences. Non-bandwagon users talk more about specific teams/players (e.g., Harden, spursgame), even when they are talking about similar topics as bandwagon users. This suggests that bandwagon users are less concerned about the details of teams/players, indicating a relatively indifferent attitude towards their affiliated teams.

5.4.2 Observation 4: Bandwagon users are less dedicated to discussions in terms of word usage.

Inspired by Hamilton et al. [8] and previous observations, we examine the two groups’ word usage. To capture the esoteric content and users’ attachment to the teams, we calculate the proportion or summary variable of different types of words in their comments using LIWC word categories [18], a well-known set of word categories that were created to capture people’s social and psychological states. We find that bandwagon users’ comments score significantly lower in five word categories: clout [11] (a high clout value suggests that the author writes from a perspective of high expertise and confidence), social process (words that suggest human interaction), cognitive process (words that suggest cognitive activity), drives (an overarching dimension that captures words representing needs, motives and drives, including achievement, power, reward, etc.), and future focus (future tense verbs and references to future events/times). The results are shown in Table 3. Please note that the results show the average value of bandwagon users versus that of non-bandwagon users. All five lexicon categories show that non-bandwagon users have a closer attachment to their affiliated teams and a more proactive attitude towards discussions.

Word category Season 2015-16 Season 2016-17 Season 2017-18
clout 52.36 vs. 53.76 52.24 vs. 53.07 52.13 vs. 53.39
social process 9.26 vs. 9.42 9.45 vs. 9.70 9.48 vs. 9.74
cognitive process 10.52 vs. 10.86 10.52 vs. 10.90 10.40 vs. 10.64
drives 7.44 vs. 7.58 7.59 vs. 7.77 7.41 vs. 7.59
future focus 1.09 vs. 1.13 1.10 vs. 1.18 1.12 vs. 1.19
Table 3: LIWC Word Categories Analysis Results
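The word-proportion values in Table 3 are percentages of tokens falling in a category. The sketch below shows that computation with a tiny made-up lexicon; it is not the LIWC dictionary, which is proprietary, and it does not cover clout, which LIWC reports as a summary variable rather than a simple proportion.

```python
import re

# Hypothetical toy lexicon for illustration only; not the LIWC word lists.
TOY_LEXICON = {
    "social": {"we", "they", "talk", "team"},
    "future": {"will", "soon", "tomorrow"},
}


def category_percentages(text, lexicon=TOY_LEXICON):
    """Percentage of tokens in `text` that belong to each lexicon category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}
    return {
        category: 100.0 * sum(token in words for token in tokens) / len(tokens)
        for category, words in lexicon.items()
    }
```

Averaging these per-comment percentages over each user's comments would give per-user values comparable in form to the entries of Table 3.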

6 Bandwagon User Classification and Prediction

To demonstrate the differential power of the behavioral features we have identified in the previous section, we formulate a classification task and a prediction task to investigate how the activity and language usage features can be used for identifying bandwagon users and inferring future bandwagon behavior.

6.1 Experimental Setup

6.1.1 Tasks.

Our first task is a classification task that aims to distinguish between bandwagon and non-bandwagon users. We take all users (after matching) across the three seasons, and randomly select 80% of the users’ data as the training set and the remaining 20% as the testing set.

Our second task is to predict whether a user will become a bandwagon user (i.e., change his/her flair to a bandwagon flair) during a season, based on his/her behavior before the bandwagon date in that season. We take users’ data in season 2015-16 and 2016-17 to train our model, and apply it to the data in season 2017-18 to predict if a user will jump on the bandwagon in that season.
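The two tasks differ mainly in how the data is split, which the following sketch makes explicit. The `season` key, the function names, and the fixed random seed are assumptions introduced for illustration.

```python
import random


def classification_split(users, train_frac=0.8, seed=0):
    """Random split across all seasons, as in the classification task."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def prediction_split(users, test_season="2017-18"):
    """Temporal split: train on earlier seasons, test on the held-out season."""
    train = [u for u in users if u["season"] != test_season]
    test = [u for u in users if u["season"] == test_season]
    return train, test
```

The temporal split means the prediction model never sees any user from the test season during training, which is relevant to the recall discussion in the results.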

Features. Based on previous observations, we extract two types of features to train our classification and prediction models.

  • Activity features: This set of features includes average comment length and average comment feedback score as discussed in the previous section.

  • Language features: This set of features includes the average summary variable of clout [11], average word proportion in terms of social process, cognitive process, drives, and future focus, as discussed in the previous section.

We use Bag-of-words (BOW) features as a strong baseline since BOW not only effectively captures the content of users’ comments [9], but also requires no pre-observational study. Please note that all the aforementioned features are extracted from the period between season beginning and bandwagon date.

Evaluation procedure. As in the preceding analyses, we first conduct the same user matching process. We label bandwagon users as positive examples. To evaluate the effectiveness of the classification and prediction tasks using the behavioral features we have identified, we deploy a standard regularized logistic regression classifier, and use grid search to find the best parameters. All the results in the classification task are derived after 5-fold cross-validation. We consider both precision and recall as the evaluation metrics.
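For reference, the evaluation metrics can be written out as below, with bandwagon users as the positive class (label 1). The relative-improvement helper assumes the usual definition (new minus baseline, divided by baseline); the paper does not spell out its formula, so this is an assumption.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def relative_improvement(baseline, improved):
    """Relative change of a metric over a baseline value (assumed definition)."""
    return (improved - baseline) / baseline
```

Under this definition, a precision that rises from 0.50 to 0.60 corresponds to a 20% relative improvement.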

6.2 Results

Fig. 2 summarizes the classification and prediction performance when using different feature sets. As shown in the figure, the two sets of behavioral features that we identify (activity and language) improve precision for both tasks. When combined with BOW features, there is further improvement (18.9% for classification and 47.6% for prediction, as compared with the baseline result). These results indicate that the activity and language features we identify complement BOW text representations well and help reduce false positives.

Figure 2: Classification and prediction performance using different feature sets.

However, we do notice that our activity and language features are less effective at improving recall in both tasks. Although they improve the classification recall when combined with BOW features, the recall is lower than in the BOW-only scenario. One possible reason is that some non-bandwagon users actually behave as bandwagon ones but fail to report their bandwagon flairs on Reddit. We find some non-bandwagon users who start to follow another stronger team’s games and news after their original affiliated teams are eliminated, but do not change their flair, especially in season 2017-18. These “fake” non-bandwagon users can confuse our model, resulting in missed bandwagon detections. The recall of using all features improves for the classification task because we include all three seasons’ users in our training set and the combined features can catch some good indicators of “fake” non-bandwagon users for each season, while for the prediction task, the training data does not contain any user information from the third season, 2017-18.

7 Concluding Discussion

In this work, we have analyzed the bandwagon phenomenon (a common case of user dynamics) using NBA-related subreddits data from Reddit. We find that better teams attract more bandwagoners, but bandwagoners do not necessarily come from weak teams. Most teams’ vagrant fans leave their teams and jump on another team’s bandwagon at the beginning, while some teams have a relatively stronger fan base and their bandwagon fans leave when the teams are eliminated. In the comparison after user matching, we find that bandwagon users write shorter comments but receive better feedback, and use words that show less attachment to their affiliated teams. These features can effectively help classify bandwagon users and predict users’ future bandwagon behavior.

7.0.1 Implications for user loyalty.

Our results show that loyalty plays an important role in online communities. Bandwagoners have clear behavioral differences from non-bandwagoners. It is crucial for community managers to identify loyal and vagrant fans with the goals of maintaining and growing their user base. To this end, our classification and prediction models show the feasibility of automating these identification processes, and demonstrate the great potential of incorporating such capabilities as a more standard pipeline.

Our work also complements research on user and multi-community engagement, and offers insights on how users behave across sub-communities. The bandwagon phenomenon in our study is about user’s preference change in sub-communities, where users share the common interest (basketball), but have different preferences towards the teams.

In addition, we find that bandwagoning in r/NBA is different from general loyalty. We notice that around 80% of the bandwagoners in seasons 2015-16 and 2016-17 change their flairs back to their original teams in the following season, which means that bandwagoning is a “temporary” non-loyal behavior for most users. Thus, one possible future direction is to investigate what factors account for their choice of bandwagoning and willingness to stay with the new team.

7.0.2 Implications for sports community management.

Our findings on bandwagon users’ characteristics can be useful for sports team management. Firstly, our observations reveal that bandwagon fans are not necessarily from weak teams, which suggests that some higher-ranked teams also need to pay close attention to maintaining their fan base. As mentioned earlier, the Spurs are a good example. Furthermore, since bandwagon users tend to move “up” to higher-ranked teams, a strategy to gain some temporary support for the strong-but-not-top teams is to attract more “travelers” during the playoffs. For example, during the 2016 Western Conference Finals between the Thunder and the Warriors, the Thunder acted as a challenger and highlighted their two star players, Kevin Durant and Russell Westbrook. These actions brought them many fans from other teams, especially from the teams that had been defeated by the Warriors.

Secondly, it is important to keep fans engaged in online discussion during off-season, especially when certain fans’ affiliated teams are eliminated. Prior work has shown that incorporating group identity can help strengthen member attachment in online communities [20]. Our results show that this bandwagon mechanism, i.e., allowing users to switch team affiliation, does have some effective impact on not only encouraging some weak teams’ fans to change their flairs and participate in other teams’ discussion, but also encouraging certain strong teams’ fans to go “up” to a better team when their teams are eliminated during playoffs.

7.0.3 Limitations and future work.

One key limitation of our work is the representativeness of our dataset. Although Goldschein [6] suggests that /r/NBA now plays an important role among fans, the NBA fan communities on Reddit may not be representative of NBA fans as a whole. Another limitation is that the bandwagon identity relies on users’ self-identification. As discussed earlier, recall is low because some “fake” loyal users do not use bandwagon flairs but behave like bandwagon users. We also notice that the number of bandwagon users is decreasing, which means that fewer users are “serious” about using the bandwagon flair mechanism. Future directions to address these limitations include designing better strategies in online communities to promote user behavior diversity, and designing better metrics and algorithms to identify real vagrant fans.

Another important question is why users choose to bandwagon. In our analysis, we find that some fans jump on the bandwagon because their original teams are eliminated and they turn to another team just to have something to watch. We also find that some fans jump to teams that are opponents of their “enemy” team, i.e., “the enemy of my enemy is my friend”. Answering this question will help provide a fundamental explanation of bandwagon behavior.

Beyond online communities, the bandwagon effect also plays an important role in information diffusion [16] and affects the propagation of fake news [22], where the popularity of a news item allows users to bypass the responsibility of verifying information. One future direction is to investigate how bandwagoning affects users and helps fake news spread, and how to identify the affected users.


  • [1] K. K. Aldous, J. An, and B. J. Jansen (2019) Predicting audience engagement across social media platforms in the news domain. In SocInfo, pp. 173–187. Cited by: §2.
  • [2] J. Baumgartner (2018) Reddit dataset. Note: https://files.pushshift.io/reddit/ Cited by: §3.
  • [3] D. M. Blei, A. Y. Ng, and M. I. Jordan (2003) Latent dirichlet allocation. JMLR 3 (Jan), pp. 993–1022. Cited by: §5.4.1.
  • [4] C. Danescu-Niculescu-Mizil, R. West, D. Jurafsky, J. Leskovec, and C. Potts (2013-05) No country for old members: user lifecycle and linguistic change in online communities. In WWW, WWW ’13, New York, NY, USA, pp. 307–318. Cited by: §2.
  • [5] G. Dror, D. Pelleg, O. Rokhlenko, and I. Szpektor (2012-04) Churn prediction in new users of yahoo! answers. In WWW, WWW ’12 Companion, New York, NY, USA, pp. 829–834. Cited by: §2.
  • [6] E. Goldschein (2015) It’s time to give /r/nba the respect it deserves. Note: https://www.sportsgrid.com/as-seen-on-tv/media/its-time-to-give-rnba-the-respect-it-deserves Cited by: §7.0.3.
  • [7] A. Guttmann (2004) From ritual to record: the nature of modern sports. Columbia University Press. Cited by: §1.
  • [8] W. L. Hamilton, J. Zhang, C. Danescu-Niculescu-Mizil, D. Jurafsky, and J. Leskovec (2017) Loyalty in online communities. In ICWSM, Cited by: §1, §2, §5.4.2.
  • [9] Z. S. Harris (1954) Distributional structure. Word 10 (2-3), pp. 146–162. Cited by: §6.1.1.
  • [10] J. Hessel, C. Tan, and L. Lee (2016-03) Science, AskScience, and BadScience: on the coexistence of highly related communities. In ICWSM, (en). Cited by: §1.
  • [11] E. Kacewicz, J. W. Pennebaker, M. Davis, M. Jeon, and A. C. Graesser (2014) Pronoun use reflects standings in social hierarchies. Journal of Language and Social Psychology 33 (2), pp. 125–143. Cited by: §5.4.2, 2nd item.
  • [12] Á. Kiss and G. Simonovits (2014-09) Identifying the bandwagon effect in two-round elections. Public Choice 160 (3), pp. 327–344. Cited by: §2.
  • [13] J. Mahmud, J. Chen, and J. Nichols (2014) Why are you more engaged? predicting social engagement from word use. arXiv preprint arXiv:1402.6690. Cited by: §2.
  • [14] H. B. Mann and D. R. Whitney (1947) On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics, pp. 50–60. Cited by: §5.2.
  • [15] I. McAllister and D. T. Studlar (1991-08) Bandwagon, underdog, or projection? opinion polls and electoral choice in britain, 1979-1987. J. Polit. 53 (3), pp. 720–741. Cited by: §2.
  • [16] R. Nadeau, E. Cloutier, and J. Guay (1993) New evidence about the existence of a bandwagon effect in the opinion formation process. International Political Science Review 14 (2), pp. 203–213. Cited by: §2, §7.0.3.
  • [17] Nielsen.com (2016) The year in sports media report: 2015. Note: https://www.nielsen.com/us/en/insights/reports/2016/the-year-in-sports-media-report-2015.html Cited by: §1.
  • [18] J. W. Pennebaker, R. L. Boyd, K. Jordan, and K. Blackburn (2015) The development and psychometric properties of liwc2015. Technical report Cited by: §5.4.2.
  • [19] J. S. Pudipeddi, L. Akoglu, and H. Tong (2014-04) User churn in focused question answering sites: characterizations and prediction. In WWW, WWW ’14 Companion, New York, NY, USA, pp. 469–474. Cited by: §2.
  • [20] Y. Ren, F. M. Harper, S. Drenner, L. Terveen, S. Kiesler, J. Riedl, and R. E. Kraut (2012) Building member attachment in online communities: applying theories of group identity and interpersonal bonds. MIS Quarterly, pp. 841–864. Cited by: §7.0.2.
  • [21] M. Rowe (2013) Changing with time: modelling and detecting user lifecycle periods in online community platforms. In SocInfo, pp. 30–39. Cited by: §2.
  • [22] C. Shao, P. Hui, L. Wang, X. Jiang, A. Flammini, F. Menczer, and G. L. Ciampaglia (2018) Anatomy of an online misinformation network. PloS one 13 (4), pp. e0196087. Cited by: §7.0.3.
  • [23] S. S. Sundar, A. Oeldorf-Hirsch, and Q. Xu (2008) The bandwagon effect of collaborative filtering technology. In CHI’08 extended abstracts on Human factors in computing systems, pp. 3453–3458. Cited by: §2.
  • [24] C. Tan and L. Lee (2015) All who wander: on the prevalence and characteristics of multi-community engagement. In Proceedings of the 24th International Conference on World Wide Web, pp. 1056–1066. Cited by: §1, §2, §5.3.2.
  • [25] T. W. G. van der Meer, A. Hakhverdian, and L. Aaldering (2016-03) Off the fence, onto the bandwagon? a Large-Scale survey experiment on effect of Real-Life poll outcomes on subsequent vote intentions. Int J Public Opin Res 28 (1), pp. 46–72. Cited by: §2.
  • [26] K. C. Wang, C. Lai, T. Wang, and S. F. Wu (2015) Bandwagon effect in facebook discussion groups. In Proceedings of the ASE BigData & SocialInformatics 2015, pp. 17. Cited by: §2.
  • [27] J. S. Zhang, B. C. Keegan, Q. Lv, and C. Tan (2020) A tale of two communities: characterizing reddit response to covid-19 through /r/china_flu and /r/coronavirus. arXiv preprint arXiv:2006.04816. Cited by: §2.
  • [28] J. S. Zhang and Q. Lv (2017) Event organization 101: understanding latent factors of event popularity. In ICWSM, Cited by: §2.
  • [29] J. S. Zhang, C. Tan, and Q. Lv (2018) This is why we play: characterizing online fan communities of the nba teams. Proceedings of the ACM on Human-Computer Interaction 2 (CSCW), pp. 197. Cited by: §2, §3, §4.0.1.
  • [30] J. S. Zhang, C. Tan, and Q. Lv (2019) Intergroup contact in the wild: characterizing language differences between intergroup and single-group members in nba-related discussion forums. Proceedings of the ACM on Human-Computer Interaction 3 (CSCW), pp. 1–35. Cited by: §1.
  • [31] H. Zhu, B. Huberman, and Y. Luon (2012) To switch or not to switch: understanding social influence in online choices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2257–2266. Cited by: §2.

Appendix A

Fig. 3 shows the general bandwagon fan flow of seasons 2015-16 and 2016-17, which validates Observation 1 in Section 4. Fig. 5 shows the stage-by-stage bandwagon fan flow. As these flow graphs show, bandwagon users keep leaving their original teams for stronger ones. There are fewer bandwagon users in later stages than in earlier stages. In stages 3 and 4, only three teams remain as bandwagon target teams. Fig. 4 shows the results of the Mann-Whitney U test before and after user matching.
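To make the "before matching" comparison concrete, the following is a minimal sketch under stated assumptions rather than the exact procedure used in this work. The function `mann_whitney_u` is a hypothetical helper that computes the U statistic with midranks and a two-sided p-value from the normal approximation with tie correction (no continuity correction, so values may differ slightly from library implementations). The feature values are made-up toy numbers standing in for a per-user activity feature.

```python
import math
import random


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, p-value)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (midrank).
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Toy feature values (hypothetical, not taken from the dataset).
bandwagon = [12, 15, 9, 20, 11]
non_bandwagon_pool = [25, 30, 18, 22, 40, 35, 28, 19, 33, 27]

# "Before matching": draw as many non-bandwagon users as bandwagon users.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
sampled = rng.sample(non_bandwagon_pool, len(bandwagon))

u_stat, p_value = mann_whitney_u(bandwagon, sampled)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

For the "after matching" setting, the same function would be applied to the two matched user groups instead of a random subsample.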

Figure 3: General bandwagon fan flair change flow in different stages in seasons 2015-16 and 2016-17: from source team (team A, left nodes) to target team (team B, right nodes). Panels: (a) season 2015-16; (b) season 2016-17.
Figure 4: Mann-Whitney U test results before and after matching. Before matching: randomly select the same number of non-bandwagon users as the number of bandwagon users and run the test. After matching: run the test on the two matched user groups. Panels: (a) before, 2015-16; (b) before, 2016-17; (c) before, 2017-18; (d) after, 2015-16; (e) after, 2016-17; (f) after, 2017-18.
Figure 5: Bandwagon fan flow in different stages in season 2015-16: from source team (team A, left nodes) to target team (team B, right nodes). Panels: (a) stage 0; (b) stage 1; (c) stage 2; (d) stage 3; (e) stage 4.