Socialbots supporting human rights

10/31/2017 ∙ by E. Velázquez, et al.

Socialbots, or non-human/algorithmic social media users, have recently been documented as competing for information dissemination and disruption on online social networks. Here we investigate the influence of socialbots in Mexican Twitter with regard to the "Tanhuato" human rights abuse report. We analyze whether the BotOrNot API generalizes from English to Spanish tweets and propose adaptations for Spanish-speaking bot detection. We then use text and sentiment analysis to compare bot and human tweets. Our analysis shows that bots actually aided information proliferation among human users. This suggests that taxonomies classifying bots should include non-adversarial roles as well. Our study contributes to the understanding of the different behaviors and intentions of automated accounts observed in empirical online social network data. Since this type of analysis is seldom performed in languages other than English, the techniques we employ here are also useful for other non-English corpora.




Bot identification and data preparation

Figure 1. Bot versus human activity using #Tanhuato, from streamed tweets during collection period.

To detect bots we use BotOrNot, a general supervised learning system designed for detecting socialbot accounts on Twitter [7]. It utilizes over 1,000 features such as user meta-data, social contacts, diffusion networks, content, sentiment, and temporal signatures. Based on an evaluation on a large set of labeled accounts, BotOrNot is reported to distinguish bots from human accounts with high accuracy, reaching an Area Under the ROC Curve (AUC) of 94%.

When a Twitter account is evaluated in BotOrNot, the output is a JSON file with several scores. As we are examining a corpus of tweets in Spanish, we focus on the language-independent classifiers, which flag a large number of potential bot accounts. Surprisingly, combining the results of these language-independent classifiers is sufficient for detecting bots in Spanish. This suggests that simply discarding the language-dependent features of BotOrNot can enable non-English bot detection. Further research should be done to validate the transferability of BotOrNot outside of English Twitter.
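
As a rough illustration of combining the language-independent scores, the sketch below averages three per-classifier scores and applies a cutoff. The JSON layout, the key names (`categories`, `friend`, `network`, `temporal`), the use of a plain mean, and the 0.5 threshold are all assumptions made for this example; the text does not specify how the scores were combined.

```python
import json

# Hypothetical key names for the language-independent classifiers.
LANGUAGE_INDEPENDENT = ("friend", "network", "temporal")


def language_independent_score(response: dict) -> float:
    """Average the language-independent classifier scores (assumed to lie in [0, 1])."""
    categories = response["categories"]
    scores = [categories[name] for name in LANGUAGE_INDEPENDENT]
    return sum(scores) / len(scores)


def is_probable_bot(response: dict, threshold: float = 0.5) -> bool:
    """Flag an account when the averaged score reaches an assumed threshold."""
    return language_independent_score(response) >= threshold


# Toy response with made-up values; content and sentiment scores are ignored.
example = json.loads(
    '{"categories": {"friend": 0.8, "network": 0.7, "temporal": 0.9,'
    ' "content": 0.2, "sentiment": 0.1}}'
)
print(round(language_independent_score(example), 3), is_probable_bot(example))
```

Ignoring the content and sentiment entries is what makes the decision independent of the tweet language in this sketch.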

We streamed 20,854 tweets from Twitter’s API between 2016-08-19 15:06:17 and 2016-08-22 02:13:35. These tweets were generated by 9730 different users (see Figures 1 and 6 for the relation between humans and bots), and 12905 of them are retweets. A tweet generated by a user (human or bot) can in turn be retweeted by a bot or a human. Consequently, there are four possibilities: a tweet created by a human and retweeted by another human (H-H), created by a human and retweeted by a bot (H-B), created by a bot and retweeted by a human (B-H), or created by a bot and retweeted by a bot (B-B). In Figure 1 we show the evolution of #Tanhuato over the collection period. The percentages of accounts that are humans and those that are bots are shown in Figure 5.
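
The four retweet categories can be tallied with a few lines of code. This is a minimal sketch: the `(author_id, retweeter_id)` pair format, the toy account IDs, and the set of bot IDs are hypothetical and only illustrate the labeling convention (first letter for the author, second for the retweeter).

```python
from collections import Counter


def retweet_category(author_is_bot: bool, retweeter_is_bot: bool) -> str:
    """Return H-H, H-B, B-H or B-B: author type first, retweeter type second."""
    first = "B" if author_is_bot else "H"
    second = "B" if retweeter_is_bot else "H"
    return f"{first}-{second}"


def count_categories(retweets, bot_ids):
    """Count retweets per category from (author_id, retweeter_id) pairs."""
    counts = Counter()
    for author_id, retweeter_id in retweets:
        label = retweet_category(author_id in bot_ids, retweeter_id in bot_ids)
        counts[label] += 1
    return counts


# Toy data: accounts 3 and 4 are treated as bots.
toy_retweets = [(1, 2), (1, 3), (3, 2), (3, 4), (2, 1)]
toy_bots = {3, 4}
toy_counts = count_categories(toy_retweets, toy_bots)
print(dict(toy_counts))
```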

In Figure 3 we show the bivariate kernel density estimates for pairwise combinations of the Friend, Network, and Temporal classifiers from BotOrNot. The regions towards the upper right-hand corner correspond to areas where the bot scores are high. The bot accounts can be clearly seen to cluster naturally. The final visualization of this analysis is presented in Figure 4, where we compute the kernel density estimate that incorporates all three classifiers: Friend, Network, and Temporal. In this image the smaller cluster in the upper right corner is the region where the bots accumulate. This 3D image is formed by taking iso-surfaces of the 3D kernel density estimate. Again, as in the 2D images, we can separate the bot accounts in a natural way and isolate them for further analysis. Notice that these three classifiers are all non-language-specific, which is why we focus on them instead of on the overall bot score produced by BotOrNot. Having identified the bots present in our sample, we can now examine how they appeared over the collection period, as shown in Figure 5.
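
To make the density estimate concrete, here is a minimal pure-Python sketch of a 2D Gaussian kernel density estimate over pairs of classifier scores. The isotropic kernel, the fixed bandwidth of 0.1, and the toy score pairs are assumptions for illustration; the text does not state which kernel, bandwidth, or software was used, and in practice a library routine such as `scipy.stats.gaussian_kde` would be the usual choice.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a density function built from isotropic Gaussian kernels."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            squared_distance = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Toy (friend_score, network_score) pairs: a large low-score cluster and a
# small high-score cluster in the upper right, mimicking the bot region.
toy_scores = [
    (0.10, 0.20), (0.15, 0.10), (0.20, 0.25), (0.12, 0.18),
    (0.85, 0.90), (0.90, 0.80),
]
density = gaussian_kde_2d(toy_scores, bandwidth=0.1)

for x, y in [(0.15, 0.18), (0.50, 0.50), (0.87, 0.85)]:
    print((x, y), round(density(x, y), 3))
```

The density is high near both clusters and close to zero between them, which is the separation the figures rely on to isolate bot accounts.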

Figure 2. Kernel density estimate for Friend (left) and Network (right) from BotOrNot, for #Tanhuato, 19-21 August 2016, sample obtained through Twitter’s streaming API.
Figure 3. 2D kernel density estimate for Friend, Network, and Temporal classifiers from BotOrNot, for #Tanhuato, 19-21 August 2016, sample obtained through Twitter’s streaming API.
Figure 4. 3D kernel density estimate for Friend, Network, and Temporal classifiers from BotOrNot, for #Tanhuato, 19-21 August 2016, sample obtained through Twitter’s streaming API.
Figure 5. Percentages of Bot versus human activity using #Tanhuato, from streamed tweets during the collection period.
Figure 6. Left: Percentage of distinct human and bot accounts in the collected data. Right: Volume of registered retweets by user type, classified as follows: humans retweeting humans (H-H), bots retweeting humans.

Network Analysis

Account ID    Betweenness centrality    Type
54649261      0.0011778193              H
2903265492    0.0006699197              H
3060823412    0.0005209566              H
249005175     0.0004625333              H
1190644922    0.0004146598              H
163552910     0.0002160787              H
222959337     0.0001811033              H
78941875      0.000122997               H
Table 1. Accounts with the highest betweenness centrality in the full retweet network. Notice that all of these accounts belong to human users. The third column is labeled H for human and B for bot.
Account ID            Betweenness centrality    Type
3243658266            787                       B
54649261              754                       H
163552910             594                       H
84613584              471                       B
520653311             438                       H
435299501             368                       H
35977487              328                       H
318799346             212                       H
252160277             211                       H
1911952410            196                       H
44554692              191                       H
132346487             156                       H
244218738             154                       H
832309426182901760    141                       H
825966216             140                       H
200932969             131                       H
18430394              121                       H
296592711             119                       H
43115590              119                       H
190143362             114                       H
Table 2. Accounts with the highest betweenness centrality in the full retweet network. Notice that only two are marked as bots, @pictoline and @Pajaropolitico; both of these accounts belong to news organizations. The third column is labeled H for human and B for bot.
Account ID            Degree    Type
3243658266            768       B
84613584              444       B
52998787              81        B
33884545              71        B
91430932              27        B
358862898             26        B
357050985             20        B
28608099              16        B
104683173             15        B
22721695              11        B
2605229921            9         B
558251048             9         H
93797343              9         B
3122019163            8         B
343452977             8         B
319883780             7         B
3907628182            7         B
2372256601            6         B
266390655             5         B
755873792250023936    4         B
3406088807            4         H
85123108              4         B
4251942192            4         H
Table 3. Highest-degree nodes in the retweet network. The third column is labeled H for human and B for bot.
(a) Retweet network betweenness centrality: the two bots are news organizations.
(b) Highest degree nodes (raw retweet counts) in the retweet network.
Figure 7. Distribution of centrality of bot and human Twitter accounts. We only show the top Twitter accounts.

Now that we have performed our bot analysis, we can analyze the bot and human Twitter network. In Figure 7(a) we see that the nodes with the highest betweenness centrality in the full retweet network are all human, except for two accounts that are classified as bots. These bot accounts are in fact the official news organizations @pictoline and @Pajaropolitico. Thus, judging by betweenness centrality in the retweet network, human users lie on most of the shortest paths of dialogue. With the exception of the formal news bots, socialbots are not playing an active role in the retweet network.

Figure 7(b) shows the number of retweets by each user (a measure of degree in the retweet network) and that humans are again the more active retweeters. In Figure 6 (bottom) we find the relation between these quantities for our data. Furthermore, we observed in the data that the bots with the highest number of retweets among humans were mainly news organizations: @pictoline, @Pajaropolitico, @emeequis, @CNNEE, and @NewsweekEspanol.

In Figure 8 we show the entire retweet network for our collection. It can be seen that very few bot accounts are responsible for a large proportion of the retweets by humans. This last point is also clear in Figure 10, where only the retweets of bot tweets by humans are shown. Here the central nodes with high degree are the accounts that were retweeted most by humans. In contrast, Figure 9 shows that bots did not actually retweet each other much. In fact, most bot accounts lie in the outer circle, isolated and without edges.
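
For readers unfamiliar with the two measures used above, the sketch below computes degree and unnormalized betweenness centrality (Brandes' algorithm) on a toy retweet graph. Treating the retweet network as undirected and unweighted is an assumption made for brevity, and the node names are invented; a graph library such as NetworkX provides an equivalent `betweenness_centrality` routine.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (retweeter, author) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    centrality = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack = []
        predecessors = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from source
        sigma[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair is visited from both endpoints, so halve the totals.
    return {node: value / 2.0 for node, value in centrality.items()}


# Toy network: a hub retweeted by three accounts, one of which has a follower.
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "c")]
toy_graph = build_graph(toy_edges)
toy_degree = {node: len(neighbors) for node, neighbors in toy_graph.items()}
toy_betweenness = betweenness_centrality(toy_graph)
print(toy_degree)
print(toy_betweenness)
```

In the toy graph the hub has the highest degree and also sits on the most shortest paths, but the two rankings need not agree in general, which is why the section reports both.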

Figure 8. Retweet network for #Tanhuato, bots in red, humans in blue. Total of 6,528 nodes, and 10,011 edges.
Figure 9. Retweet network for #Tanhuato obtained from our sample, bots retweeting bots. All edges are shown, most of the nodes in the outer circle have no connecting edge. This network is composed of 92 nodes, 80 edges.
Figure 10. Retweet network for #Tanhuato obtained from our sample, for only humans retweeting bots. All edges are shown, all of the nodes in the outer circle have no connecting edge. This network is composed of 1550 nodes, 1596 edges.

Bots created 4153 tweets, which is 19.9146% of all tweets. In total, 12905 of all tweets are retweets: 11895 retweets were made by humans and 1010 by bots.
The number of tweets created by bots and retweeted by humans is 1450.
The number of tweets created by humans and retweeted by bots is 848.
The number of tweets created by bots and retweeted by bots is 76.
The number of tweets created by humans and retweeted by humans is 9896.
Retweets of human-authored tweets (10744) outnumber retweets of bot-authored tweets (1526). Together these fall 635 short of the total number of retweets: all retweets = humans retweeted + bots retweeted + 635.

The ‘missing’ 635 retweets belong to tweets created earlier (before the first tweet registered in our collection). Fortunately, retweets store the information of the original tweet. Searching for the string http in the text of each tweet, we found that 17474 tweets from humans include web pages, and 4736 tweets from bots include web pages.
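
The retweet accounting above can be checked with simple arithmetic. The figures in this sketch are the ones reported in the text; only the variable names are introduced here.

```python
# Counts reported in the text.
total_tweets = 20854
bot_tweets = 4153
total_retweets = 12905
retweets_by_humans = 11895
retweets_by_bots = 1010

# Author type first, retweeter type second.
pairs = {"H-H": 9896, "H-B": 848, "B-H": 1450, "B-B": 76}

humans_retweeted = pairs["H-H"] + pairs["H-B"]  # retweets of human-authored tweets
bots_retweeted = pairs["B-H"] + pairs["B-B"]    # retweets of bot-authored tweets
unmatched = total_retweets - (humans_retweeted + bots_retweeted)
bot_share = 100.0 * bot_tweets / total_tweets

print(humans_retweeted, bots_retweeted, unmatched)
print(round(bot_share, 4))
print(retweets_by_humans + retweets_by_bots == total_retweets)
```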

Text Analysis

We extract bag-of-words features represented as TF-IDF (term frequency–inverse document frequency) using [3]. We then used Singular Value Decomposition (SVD, also referred to as Latent Semantic Indexing in the context of information retrieval and text mining) to look at the distribution of tweets along the top singular vectors. While the top singular vectors capture the most variance in the bag-of-words feature set, for this corpus the difference between the bot and human tweets was not clear. We also repeated the analysis after removing Spanish stop words and still did not find any discrimination between bots and humans.

However, as seen in Figure 11, by computing the log-odds ratio of the counts of words between the human and bot cohorts (as was done in [18] for discriminating between two tweet corpora), we see several terms that are discriminating. Thus, although the bag-of-words features do not capture strong discrimination between bots and humans, the two cohorts are clearly different (specific word usages among bots can differ by orders of magnitude, since the horizontal axis in Figure 11 is on a log scale).
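
A word-level comparison of this kind can be sketched as follows. This is one simple smoothed variant, assuming add-one smoothing on the counts and a natural logarithm; the exact formula and smoothing used for Figure 11 are not given in the text, and the toy tokens are invented.

```python
import math
from collections import Counter


def log_odds_ratios(human_tokens, bot_tokens):
    """Log ratio of smoothed relative word frequencies (bot over human)."""
    human_counts = Counter(human_tokens)
    bot_counts = Counter(bot_tokens)
    vocabulary = set(human_counts) | set(bot_counts)
    human_total = sum(human_counts.values())
    bot_total = sum(bot_counts.values())
    ratios = {}
    for word in vocabulary:
        # Add-one smoothing keeps words absent from one cohort finite.
        human_share = (human_counts[word] + 1) / (human_total + 1)
        bot_share = (bot_counts[word] + 1) / (bot_total + 1)
        ratios[word] = math.log(bot_share / human_share)
    return ratios


# Toy cohorts: positive values lean towards bots, negative towards humans.
toy_human = ["masacre", "masacre", "policia", "informe"]
toy_bot = ["informe", "informe", "informe", "policia"]
toy_ratios = log_odds_ratios(toy_human, toy_bot)
for word, value in sorted(toy_ratios.items(), key=lambda item: item[1]):
    print(f"{word:10s} {value:+.3f}")
```

Sorting by the ratio puts the most human-leaning words at one end and the most bot-leaning words at the other, which is the ordering a plot like Figure 11 displays.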

To better understand the nature of the words bots and humans used, we apply basic sentiment analysis using LabMT [9]. As discussed in [9], the top 10,000 Spanish words were presented to Amazon Mechanical Turk, where 50 workers rated the happiness of each word on a scale of 1 to 9 (where 1 is least happy, 9 is most happy, and 5 is neutral). Using these scores for each word, we compute the average sentiment for the human and bot corpora using Equation 1 in [9]. As discussed in [9], however, a great many words may have neutral sentiment (and are essentially commonly used stop words), so the average sentiment score may be biased heavily towards the neutral score of 5.0. Therefore, the authors suggest removing words whose scores lie within a chosen threshold of 5.0 so that words with stronger sentiment remain. By selecting an appropriate threshold, we can systematically remove stop words that do not contribute to sentiment.

It is not clear what value to select for this threshold. While the authors in [9] suggest a specific value, here we compute the average sentiment score over a range of thresholds for a more complete understanding. Figure 12, left panel, shows how the average sentiment of the tweets changes as we filter out more neutral words. As the neutral words are filtered, we see that the average sentiment is pulled down significantly. This is to be expected, as most tweets use words related to violence. Interestingly, however, the bots seem to be less emotional than the humans, in that their average sentiment is consistently above that of humans regardless of the threshold we use.
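
The filtering procedure can be sketched as a frequency-weighted average that skips words too close to the neutral score. The word scores and counts below are invented for illustration (they are not LabMT values), the parameter name `neutral_band` is introduced here, and excluding words strictly inside the band is an assumption about how the boundary is handled.

```python
def average_sentiment(word_counts, word_scores, neutral_band, center=5.0):
    """Frequency-weighted mean score, skipping words near the neutral center."""
    weighted_sum = 0.0
    total_count = 0
    for word, count in word_counts.items():
        score = word_scores.get(word)
        if score is None:
            continue  # word has no rating
        if abs(score - center) < neutral_band:
            continue  # too close to neutral
        weighted_sum += score * count
        total_count += count
    return weighted_sum / total_count if total_count else None


# Hypothetical happiness scores on the 1 to 9 scale.
toy_scores = {"de": 5.1, "informe": 5.4, "muerte": 1.9, "paz": 7.8}
toy_counts = {"de": 10, "informe": 2, "muerte": 3, "paz": 1, "xyz": 4}

unfiltered = average_sentiment(toy_counts, toy_scores, neutral_band=0.0)
filtered = average_sentiment(toy_counts, toy_scores, neutral_band=1.0)
print(unfiltered, filtered)
```

Sweeping `neutral_band` over a range of values and plotting the result for each cohort gives curves of the kind shown in Figure 12.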

Figure 11. Most discriminating words between bots and humans as computed by likelihood ratios.

To investigate this hypothesis further, we removed all retweets and recomputed the average sentiments. Figure 12, panel on the right, shows again that removing the retweets does not change the fact that filtering neutral words yields more negative words. However, we see that the bot sentiment does not correlate strongly with the human tweets. In other words, as we filter more neutral words, the human tweets become more negative as before. But the bot tweets remain closer to being neutral. These findings all suggest that the bots were using less emotionally charged words than humans. In other words, it appears that the purpose of the bots in this case was to only distribute information in a non-sensational manner rather than purposefully stir up emotions.

Figure 12. Left: Sentiment of tweets using LabMT. As we filter out neutral words with the threshold, we see that the sentiment of humans is significantly lower than that of bots. Right: Sentiment of tweets with retweets removed, using LabMT. Again, as we filter out neutral words with the threshold, we see that the sentiment of humans is significantly lower than that of bots. However, the correlation between the human and bot sentiments is much lower when retweets are removed.

In addition to using LabMT, we also hand-coded a list of negative words, extracted from the corpus of collected tweets, and used it to compare the bot and human corpora according to how frequently words from this list appear. To make these words match a wider volume of tweets, we dropped some of the final letters where possible (that is, we applied “stemming”) so that they match different tenses (in the case of verbs) and different genders and numbers (in nouns and adjectives) while keeping the connotation. We refer to Table 4 for this list of truncated words.

To check for matches between the words in Table 4 and the text of the tweets, we remove URLs from the tweet text and replace non-ASCII characters (such as “ñ”, the stressed vowels á, é, í, ó, ú, and “¿”) with their ASCII equivalents (“n”, a, e, i, o, u, “?”). We also convert all capital letters to lowercase. The transformed text is then split into single words that are compared individually. To speed up the comparison, we group the words alphabetically and compare only against words starting with the same letter, skipping words that start with symbols or numbers. Finally, for each word in the split message, we check whether it starts with the same letters as any word in Table 4 that has the same initial letter.

To prevent a misplaced punctuation mark from blocking a match, a second analysis was performed in which the first letter of each word was removed before checking whether the shortened word matches Table 4. This analysis also reveals no difference. Our method of comparison fails when a negative sentiment word is misspelled, but one expects the sentiment of a tweet to remain congruent throughout its text. Thus, if the text is long, we are likely to find another negative word that is spelled correctly. Conversely, short texts are likely to contain fewer misspelled words.
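
The normalization and prefix-matching steps described above could look roughly like this. It is a sketch under stated assumptions: the URL pattern, the use of Unicode decomposition to strip accents, and the short `NEGATIVE_STEMS` sample (a handful of entries taken from Table 4) are choices made for this example rather than the original implementation.

```python
import re
import unicodedata

# A few stems from Table 4; the full list would be used in practice.
NEGATIVE_STEMS = ("asesin", "bala", "masacre", "muerte", "tortura", "violenta")


def normalize(text):
    """Remove URLs, map accented letters and the inverted question mark to ASCII, lowercase."""
    text = re.sub(r"https?://\S+", " ", text)
    text = text.replace("¿", "?")
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.lower()


def count_negative_matches(text, stems=NEGATIVE_STEMS):
    """Count words that start with any stem sharing the same initial letter."""
    by_initial = {}
    for stem in stems:
        by_initial.setdefault(stem[0], []).append(stem)
    matches = 0
    for word in normalize(text).split():
        if not word[0].isalpha():
            continue  # skip words starting with symbols or numbers
        candidates = by_initial.get(word[0], ())
        if any(word.startswith(stem) for stem in candidates):
            matches += 1
    return matches


print(normalize("Ejecución ñ"))
print(count_negative_matches("Masacre en Tanhuato: tortura y asesinatos http://t.co/abc"))
```

A word such as "¿Muerte?" is skipped here because it starts with a symbol, which is the case the second pass (dropping the first character of each word) is meant to recover.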

arma culpable jodid sanguinari
asesin delincuen levanton secuestro
asesinat dispara maltrat tortura
bala disparos masacre violacion
balazo ejecucion matanza violenta
brutal ejecut matar
cartel exterminio mentir
castigo fals muerte
corrupcion genocidio pistola
corrupt guerra represion
crimen incendia represiv
criminal jode sangriento
Table 4. List of negative-feeling words (a mark is placed where letters can be omitted without changing the connotation).

To determine what kind of information is shared most, we consider all tweets and assign a numerical value to each one. This value is initialized to 0 and increased by a constant according to the number of matches with Table 4. Assuming that a tweet carries a negative feeling when its value is different from zero, we show in Figure 13 that the largest volume of tweets comes from retweets with negative-feeling text. A closer reading of the entire tweet corpus revealed that most of the non-negative messages cannot be identified as positive or neutral: their texts share URLs and/or their sentiment cannot be determined by word inspection.

Figure 13. The total volume of tweet texts compared against the words in Table 4. Left: Tweet classification into negative and non-negative. Right: Percentage of negative-feeling texts by user type.


In this work we presented a case study of socialbots for a specific trending topic in Mexican Twitter. While numerous studies have suggested that socialbots act as disrupting agents of information, in our case study we found the opposite. The socialbots were in fact enabling the flow of information, helping the report about these atrocities reach the public so that the information was not stifled. Of course, from the point of view of the police authorities the socialbots may be seen as agents of disruption, and it is therefore a matter of perspective whether socialbots are enablers or not. Our case study suggests that the role and landscape of socialbots is far more complex than simple binary categorizations allow. Our work highlights the need for further research to understand the ethical implications of such automated social activity.


We thank IPAM at UCLA and the organizers of the Cultural Analytics program, CNetS and the BotOrNot team at IU, and also Twitter for allowing access to data through their APIs. PSS acknowledges support from UNAM-DGAPA-PAPIIT-IN102716 and UC-MEXUS CN-16-43.


  • [1] Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, and Christos Faloutsos. Copycatch: Stopping group attacks by spotting lockstep behavior in social networks. In Proceedings of the 22nd International Conference on World Wide Web, WWW ’13, pages 119–130, New York, NY, USA, 2013. ACM.
  • [2] Yazan Boshmaf, Ildar Muslukhov, Konstantin Beznosov, and Matei Ripeanu. Design and analysis of a social botnet. Comput. Netw., 57(2):556–578, February 2013.
  • [3] Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake VanderPlas, Arnaud Joly, Brian Holt, and Gaël Varoquaux. API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pages 108–122, 2013.
  • [4] Nikan Chavoshi, Hossein Hamooni, and Abdullah Mueen. Identifying correlated bots in twitter. In International Conference on Social Informatics, pages 14–21. Springer International Publishing, 2016.
  • [5] Zi Chu, Steven Gianvecchio, Haining Wang, and Sushil Jajodia. Who is tweeting on twitter: Human, bot, or cyborg? In Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC ’10, pages 21–30, New York, NY, USA, 2010. ACM.
  • [6] Eric M. Clark, Jake Ryland Williams, Chris A. Jones, Richard A. Galbraith, Christopher M. Danforth, and Peter Sheridan Dodds. Sifting robotic from organic text: A natural language approach for detecting automation on twitter. Journal of Computational Science, 16:1 – 7, 2016.
  • [7] Clayton Allen Davis, Onur Varol, Emilio Ferrara, Alessandro Flammini, and Filippo Menczer. Botornot: A system to evaluate social bots. In Proceedings of the 25th International Conference Companion on World Wide Web, WWW ’16 Companion, pages 273–274, Republic and Canton of Geneva, Switzerland, 2016. International World Wide Web Conferences Steering Committee.
  • [8] John P. Dickerson, Vadim Kagan, and V.S. Subrahmanian. Using sentiment to detect bots on twitter: Are humans more opinionated than bots? In 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pages 620–627, 2014.
  • [9] Peter Sheridan Dodds, Kameron Decker Harris, Isabel M Kloumann, Catherine A Bliss, and Christopher M Danforth. Temporal patterns of happiness and information in a global social network: Hedonometrics and twitter. PloS one, 6(12):e26752, 2011.
  • [10] Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, and Alessandro Flammini. The rise of social bots. Commun. ACM, 59(7):96–104, June 2016.
  • [11] Xia Hu, Jiliang Tang, Huiji Gao, and Huan Liu. Social spammer detection with sentiment information. In Proceedings of the 2014 IEEE International Conference on Data Mining, ICDM ’14, pages 180–189, Washington, DC, USA, 2014. IEEE Computer Society.
  • [12] Xia Hu, Jiliang Tang, Yanchao Zhang, and Huan Liu. Social spammer detection in microblogging. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI ’13, pages 2633–2639. AAAI Press, 2013.
  • [13] Gary King, Jennifer Pan, and Margaret E. Roberts. How censorship in china allows government criticism but silences collective expression. American Political Science Review, 107(2):1–18, 2013.
  • [14] Kyumin Lee, Brian David Eoff, and James Caverlee. Seven months with the devils: a long-term study of content polluters on twitter. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2011.
  • [15] Sangho Lee and Jong Kim. Early filtering of ephemeral malicious accounts on twitter. Comput. Commun., 54(C):48–57, December 2014.
  • [16] Jacob Ratkiewicz, Michael Conover, Mark Meiss, Bruno Gonçalves, Snehal Patil, Alessandro Flammini, and Filippo Menczer. Truthy: Mapping the spread of astroturf in microblog streams. In Proceedings of the 20th international conference companion on World wide web, 2011.
  • [17] Saiph Savage, Andres Monroy-Hernandez, and Tobias Höllerer. Botivist: Calling volunteers to action using online bots. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, CSCW ’16, pages 813–822, New York, NY, USA, 2016. ACM.
  • [18] Julia Silge and David Robinson. tidytext: Text mining and analysis using tidy data principles in r. JOSS, 1(3), 2016.
  • [19] Daniel Sparks. How many users does twitter have?, April 2017. [Online; accessed 06-June-2017].
  • [20] Kurt Thomas, Chris Grier, Dawn Song, and Vern Paxson. Suspended accounts in retrospect: An analysis of twitter spam. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, IMC ’11, pages 243–258, New York, NY, USA, 2011. ACM.
  • [21] Kurt Thomas, Damon McCoy, Chris Grier, Alek Kolcz, and Vern Paxson. Trafficking fraudulent accounts: The role of the underground market in twitter spam and abuse. In Proceedings of the 22nd USENIX Conference on Security, SEC’13, pages 195–210, Berkeley, CA, USA, 2013. USENIX Association.
  • [22] Alex Hai Wang. Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach, pages 335–342. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010.
  • [23] Samuel Woolley. Automating power: Social bot interference in global politics. First Monday, 21(4), 2016.
  • [24] Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, and Yafei Dai. Uncovering social network sybils in the wild. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, IMC ’11, pages 259–268, New York, NY, USA, 2011. ACM.
  • [25] Yin Zhu, Xiao Wang, Erheng Zhong, Nathan N. Liu, He Li, and Qiang Yang. Discovering spammers in social networks. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI’12, pages 171–177. AAAI Press, 2012.