A Trolling Hierarchy in Social Media and A Conditional Random Field For Trolling Detection

An ever-increasing number of social media websites, electronic newspapers and Internet forums allow visitors to leave comments for others to read and interact with. This exchange is not free from participants with malicious intentions, who do not contribute to the written conversation. Across communities, users adopt different strategies to handle such participants. In this paper we present a comprehensive categorization of the trolling phenomenon, inspired by politeness research, and propose a model that jointly predicts four crucial aspects of trolling: intention, interpretation, intention disclosure and response strategy. Finally, we present a new annotated dataset containing excerpts of conversations involving trolls and their interactions with other users, which we hope will be a useful resource for the research community.




1 Introduction

In contrast to traditional content distribution channels like television, radio and newspapers, the Internet opened the door for direct interaction between content creators and their audience. One form of this interaction is the comments section found on many websites. The comments section allows visitors, authenticated in some cases and unauthenticated in others, to leave a message for others to read. This is a type of multi-party asynchronous conversation that offers interesting insights: one can learn what the commenting community thinks about the topic being discussed, its sentiment and its recommendations, among many other things. In some comment sections, commentators are allowed to respond directly to others, creating a comment hierarchy. These written conversations are interesting because they shed light on the types of interaction that occur between participants under minimal supervision. This lack of supervision and, in some forums, anonymity give rise to interactions that are not necessarily related to the original topic being discussed, and, as in regular conversations, there are participants who do not have the best intentions. Such participants are called trolls in some communities.

Even though trolls have been studied in several research communities, they have received little attention from the NLP community. We aim to reduce this gap by presenting a comprehensive categorization of trolling and proposing two models to predict trolling aspects. First, we review some trolling definitions. [Bishop2013] states that “Trolling is the activity of posting messages via communication networks that are intended to be provocative, offensive or menacing”; this definition considers trolling from the most negative perspective, where a crime might be committed. In a different tone, [Hardaker2010] provides a working definition of a troll: “A troller is a user in a computer mediated communication who constructs the identity of sincerely wishing to be part of the group in question, including professing, or conveying pseudo-sincere intentions, but whose real intention(s) is/are to cause disruption and/or trigger or exacerbate conflict for the purpose of their own amusement”. These definitions inspire our trolling categorization, but first we define a trolling event: a comment in a conversation whose intention is to cause conflict or trouble; to be malicious; to purposely seek or disseminate false information or advice; to give a dishonest impression in order to deceive; or to offend, insult, or cause harm, humiliation or aggravation. A troll or troller is the individual who generates a trolling event, and trolling is the overall phenomenon that involves a troll and a trolling event and generates responses from others. Any participant in a forum conversation may become a troll at any given point: as we will see, the addressee of a trolling event may choose to reply with a trolling comment (counter-trolling), effectively becoming a troll as well.

We believe our work makes four contributions. First, unlike previous computational work on trolling, which focused primarily on analyzing a narrative written retrospectively by the victim (e.g., determining the trolling type and the role played by each participant), we study trolling by analyzing comments in a conversation, aiming instead to identify trollers, who, once identified, could be banned from posting. Second, while previous work has focused on analyzing trolling from the troll’s perspective, we additionally model trolling from the target’s perspective, with the goal of understanding the psychological impact of a trolling event on the target, which we believe is equally important from a practical standpoint. Third, we propose a comprehensive categorization of trolling that covers not only the troll’s intention but also the reactions of the victim and other commenters to the troll’s comment. We believe this categorization will provide a solid basis on which future computational approaches to trolling can be built. Finally, we make our annotated data set, consisting of 1000 annotated trolling events, publicly available. We believe that our data set will be a valuable resource to any researcher interested in the computational modeling of trolling.

1.1 Trolling Categorization

Based on the previous definitions, we identify four aspects that uniquely define a trolling event-response pair. Two aspects are defined on the comment under consideration. 1) Intention: the purpose of the comment’s author, which can be a) trolling: the comment is malicious in nature and aims to disrupt, annoy, offend, harm or purposely spread false information; b) playing: the comment is playful, joking or teasing others, without the malicious intentions of a); or c) none: the comment is neither malicious nor playful; it is a simple comment. 2) Intention Disclosure: this aspect indicates whether a trolling comment is trying to deceive its readers. Its possible values are a) the comment’s author is a troll and is trying to hide their real intentions, pretending to convey a different meaning, at least temporarily; b) the comment’s author is a troll but is clearly exposing their malicious intentions; and c) the comment’s author is not a troll, so there are no hidden or exposed malicious or playful intentions.

The other two aspects are defined on the comments that directly address the comment under consideration. 3) Intentions Interpretation: this aspect refers to the responder’s understanding of the intentions behind the parent comment. The possible interpretations are the same as for the intention aspect: trolling, playing or none. 4) Response Strategy: the strategy employed by a commentator directly replying to a comment, which may be a trolling event. The response strategy is directly influenced by the responder’s interpretation of the parent comment’s intention. We identify 14 possible response strategies, some of which are tied to particular combinations of the other three aspects. We briefly define each of them in the appendix.
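As an illustration only, the four aspects can be held in a small record type. The sketch below is not part of our system: the class names, the field names and the `is_consistent` check are hypothetical, and the check encodes just one constraint stated above (a comment with no trolling or playful intention has nothing to hide or expose).

```python
from dataclasses import dataclass
from enum import Enum


class Intention(Enum):
    # Value set shared by the Intention and Interpretation aspects.
    TROLLING = "trolling"
    PLAYING = "playing"
    NONE = "none"


class Disclosure(Enum):
    HIDDEN = "hidden"    # the troll hides the real intention
    EXPOSED = "exposed"  # the troll openly shows the intention
    NONE = "none"        # the author is not a troll


@dataclass(frozen=True)
class ResponseLabel:
    interpretation: Intention  # how the responder reads the parent comment
    strategy: str              # a response strategy name from the appendix


@dataclass(frozen=True)
class TrollingEventLabel:
    intention: Intention
    disclosure: Disclosure
    responses: tuple           # one ResponseLabel per direct response


def is_consistent(label):
    """Assumed constraint: intention NONE implies disclosure NONE."""
    if label.intention is Intention.NONE:
        return label.disclosure is Disclosure.NONE
    return True


# A label for a comment that openly trolls and whose target refuses to engage.
example = TrollingEventLabel(
    intention=Intention.TROLLING,
    disclosure=Disclosure.EXPOSED,
    responses=(ResponseLabel(Intention.TROLLING, "Frustrate"),),
)
```

Keeping one `ResponseLabel` per direct response mirrors the fact that interpretation and response strategy are defined per reply, while intention and disclosure are defined once per suspected comment.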

Figure 1 shows these categories as a hierarchy. Under this formulation, the suspected trolling event and its responses are correlated, and one cannot name the response strategy independently of the other three aspects. This is a challenging prediction problem that we address in this work.

Figure 1: Trolling categorization based on four aspects: Comment’s Intention and Intentions Disclosure, and Response’s Interpretation and Strategy

1.2 Conversations Excerpts Examples

To illustrate this hierarchy, we present some examples. These are excerpts from original conversations. In each excerpt, the first comment, by author C0, is given as a minimal piece of context; the second comment, by author C1, is the comment suspected to be a trolling event. The remaining comments are all direct responses to the suspected trolling comment. When the author of a response is the same as the author of the first comment, it indicates that the same individual also replied to the suspected troll.

Example 1.

  • C0: My friend who makes $20,000 a year leased a brand new Chevy Spark EV, for only $75 per month and he got a California rebate for driving an electric car. Much cheaper than buying older car which usually require heavy upkeep due to its mileage. At this point I think you’re just trolling.

    • C1: Your friend has a good credit score, which can’t be said about actual poor people. Did you grow up sheltered by any chance?

      • C0: Judging by your post history, you’re indeed a troll. Have a good one.

In this example, when C1 asks “Did you grow up sheltered by any chance?”, the intention is to denigrate or offend, and C1 is not hiding it; instead, C1 is clearly disclosing the trolling intention. In C0’s response, we see that C0 has come to the conclusion that C1 is trolling, and the response strategy is to frustrate the trolling event by ignoring the troll’s malicious intentions.

Example 2.

  • C0: What do you mean look up ?:( I don’t see anything lol

    • C1: Look up! Space is cool! :)

      • C0: why must you troll me :(

      • C2: Keep going, no matter how many times you say it, he will keep asking

In this example, we hypothesize that C0 is requesting some information and C1 is giving an answer that does not fit C0’s request. We do so based on C0’s last comment, in which C0 shows disappointment or grievance. We also deduce that C1 is trying to deceive C0; therefore, C1’s comment is a trolling event. This is a trolling event whose intention is to purposely convey false information while hiding that intention. As for the responses, in the last comment C0 has finally realized, or interpreted, that C1’s real intentions are deceptive, and since the comment includes a sad emoticon, the reply is emotional and aggravated, so we say that C0 got engaged. C2, on the other hand, acknowledges the malicious intention and plays along with the troll.

Given these examples, we address the task of predicting the four aspects of a trolling event using the methodology described in the following sections.

2 Corpus and Annotations

We collected all available comments on the stories posted on Reddit (https://www.reddit.com/) in August 2015. (Reddit user Stuck_In_the_Matrix downloaded all comments from Reddit’s API, from which we selected a subset.) Reddit is a popular website that allows registered users (without identity verification) to participate in forums specific to a post or topic. These forums are of the hierarchical type, that is, they allow nested conversations in which the children of a comment are its direct responses. To increase recall and make the annotation process feasible, we created an inverted index with Lucene (https://lucene.apache.org/) and queried for comments containing the word troll with an edit distance of 1, in order to include close variations of this word. We do so inspired by the method that [Xu et al.2012a] used to create a bullying dataset, and because we hypothesize that such comments will be related to or involved in a trolling event. As we observed in the dataset, people use the word troll in many different ways. Sometimes it is used to point out that some user is indeed trolling the commenter, or to accuse someone else of being a troll. Other times, people use the term to express their frustration with or dislike of a particular user, although there is no trolling event. Other times, people simply discuss trolling and trolls without actually participating in or observing a trolling event directly. Nonetheless, we found that this search produced a dataset in which 44.3% of the comments directly involved a trolling event. Moreover, as our trolling definition makes clear, commentators in a conversation may believe that they are witnessing a trolling event and respond accordingly even where there is none. Therefore, even for the comments that do not involve trolling, we are interested in learning what triggers users’ interpretation of trolling where it is not present and what kinds of response strategies are used.
We define a suspected trolling event in our dataset as a comment for which at least one of its children contains the word troll.
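The selection rule above can be sketched in a few lines. This is a simplified in-memory stand-in, not the Lucene index we queried: it uses plain Levenshtein distance (no transpositions), a naive lowercase tokenizer, and a hypothetical comment layout (a dict from comment id to a record with `parent` and `text` fields), all of which are assumptions made for the illustration.

```python
import re


def within_edit_distance_one(word, target="troll"):
    """True if word equals target or differs by exactly one insertion,
    deletion or substitution (plain Levenshtein, no transpositions)."""
    if word == target:
        return True
    lw, lt = len(word), len(target)
    if abs(lw - lt) > 1:
        return False
    # Skip the common prefix; i is the first position where they differ.
    i = 0
    while i < min(lw, lt) and word[i] == target[i]:
        i += 1
    if lw == lt:                       # one substitution
        return word[i + 1:] == target[i + 1:]
    if lw > lt:                        # one extra character in word
        return word[i + 1:] == target[i:]
    return word[i:] == target[i + 1:]  # one missing character in word


def mentions_troll(text):
    """True if any alphabetic token is within edit distance 1 of 'troll'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(within_edit_distance_one(t) for t in tokens)


def suspected_trolling_events(comments):
    """Return ids of comments with at least one child that mentions troll.

    comments: dict mapping id -> {"parent": id or None, "text": str}
    (hypothetical layout).
    """
    suspected = set()
    for comment in comments.values():
        parent_id = comment["parent"]
        if parent_id in comments and mentions_troll(comment["text"]):
            suspected.add(parent_id)
    return suspected
```

Note that a fuzzy match of this kind also accepts unrelated words such as "stroll" or "droll", which is one reason why the retrieved comments still need manual annotation.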

With the gathered comments, we reconstructed the original conversation trees, from the original post (the root) to the leaves, when they were available, and selected a subset to annotate. (We removed the comments whose text had been deleted or whose user is marked as [deleted].) For annotation purposes, we created snippets of conversations like the ones shown in Example 1 and Example 2, consisting of the parent of the suspected trolling event, the suspected trolling event itself, and all of the direct responses to the suspected trolling event. We added the extra constraint that the author of the parent of the suspected trolling event should also be among the authors of the direct responses: we hypothesize that if the suspected trolling event is indeed trolling, its parent should be the object of the trolling and would have a say about it. We recognize that this limited amount of information is not always sufficient to recover the original message conveyed by all of the participants in the snippet, and that additional context would be beneficial. However, the trade-off is that snippets like these allow us to use Amazon Mechanical Turk (AMT) to have the dataset annotated, because it is not a big burden for a “turker” to work on an individual snippet in exchange for a small payment, and distributing the work over dozens of people expedites the annotation process. Specifically, for each snippet, we requested three annotators to label the four aspects previously described. Before annotating, we set up a qualification test, along with borderline examples, to guide the annotators through the process and align them with our criteria. The qualification test turned out to be very selective, since only 5% of the turkers who attempted it passed. Our dataset consists of 1000 conversations with 5868 sentences and 71033 tokens. The distribution over the classes for each trolling aspect is shown in Table 1, in the column “Size”.
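The snippet construction described above can be sketched as follows. The comment layout (a dict from id to a record with `parent`, `author` and `text` fields), the function names and the literal `[deleted]` marker for removed text are assumptions for this illustration, not a description of our actual preprocessing code.

```python
def drop_deleted(comments):
    """Remove comments whose text or author is marked as deleted (assumed marker)."""
    return {
        cid: c
        for cid, c in comments.items()
        if c["text"] != "[deleted]" and c["author"] != "[deleted]"
    }


def build_snippet(comments, suspect_id):
    """Return (parent, suspect, responses) for one suspected trolling event.

    Returns None when the parent is unavailable or when the parent's author
    is not among the authors of the direct responses.
    """
    suspect = comments[suspect_id]
    parent = comments.get(suspect["parent"])
    if parent is None:
        return None
    responses = [c for c in comments.values() if c["parent"] == suspect_id]
    if not any(r["author"] == parent["author"] for r in responses):
        return None
    return parent, suspect, responses
```

A snippet is therefore at most two levels deep below the parent: deeper replies are left out, which keeps each annotation unit small at the cost of some context.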

Inter-Annotator Agreement. Due to the subjective nature of the task, we did not expect perfect agreement. However, we obtained moderate to substantial inter-annotator agreement, measured with the Fleiss’ kappa statistic [Fleiss and Cohen1973], for each of the trolling aspects: Intention: 0.578, Intention Disclosure: 0.556, Interpretation: 0.731 and Response: 0.632. After inspecting the dataset, we manually reconciled the aspects of the threads for which the turkers’ annotations produced no majority, and we verified and corrected the consistency of the four tasks on each thread.
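For reference, Fleiss’ kappa can be computed from a table of per-item category counts. The function below is a minimal sketch of the standard formula, assuming every item is rated by the same number of annotators (three in our setting); the input layout is a hypothetical choice for the example.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a list of items.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every row must sum to the same number of annotators (>= 2).
    Undefined (division by zero) when all ratings fall in a single category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: share of agreeing annotator pairs per item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1.0 - p_expected)
```

With three annotators and two categories, a row such as `[3, 0]` is a unanimous item and `[2, 1]` is a two-to-one split.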

Figure 2: Trolling tasks modeled as a (conditional) probabilistic graphical model (left). Factor graph showing all cliques, or direct interactions, in the model (right). The plate is repeated once per direct response to the suspected trolling comment.

3 Trolling Events Prediction

In this section we address the following problem: given a comment in a conversation that is suspected to be a trolling event, its parent comment and all its direct responses, we aim to predict the suspected comment’s intention (I) and intention disclosure (D) and, from the responses’ point of view, for each response comment, the interpretation (R) of the suspected troll comment’s intentions and its response strategy (B). This problem can be seen as multi-task prediction. To do so, we split the dataset into training and testing sets using a 5-fold cross-validation setup.

3.1 Feature Set

For prediction we define two sets of features, a basic and an enhanced feature set, extracted from each of the comments in the dataset. The features are described below.

3.1.1 Basic Feature Set

N-gram features. We encode each unigram and bigram collected from the training comments as a binary feature. In a similar manner, we include each unigram and bigram along with its POS tag, as in [Xu et al.2012a]. To extract these features we used the most current version of Stanford CoreNLP [Manning et al.2014]. We also encode each token’s lemma as a binary feature, as in [Xu et al.2012b].
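A minimal sketch of binary unigram and bigram features is given below. It assumes comments are already tokenized and omits the POS-tagged and lemma variants, which would require a tagger such as CoreNLP; the function names and the tuple-based feature keys are hypothetical.

```python
def ngram_features(tokens):
    """Set of unigram and bigram features present in one tokenized comment."""
    features = {("uni", t) for t in tokens}
    features.update(("bi", a, b) for a, b in zip(tokens, tokens[1:]))
    return features


def build_vocabulary(training_comments):
    """Map every feature seen in the training comments to a column index."""
    seen = set()
    for tokens in training_comments:
        seen.update(ngram_features(tokens))
    return {feature: i for i, feature in enumerate(sorted(seen))}


def vectorize(tokens, vocabulary):
    """Binary vector: 1 if the feature occurs in the comment, else 0.

    Features unseen during training are ignored.
    """
    vector = [0] * len(vocabulary)
    for feature in ngram_features(tokens):
        index = vocabulary.get(feature)
        if index is not None:
            vector[index] = 1
    return vector
```

Because the features are binary, a word repeated several times in a comment contributes the same value as a single occurrence.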

Harmful Vocabulary. In their research on bullying, [Nitta et al.2013] identified a small set of words that are highly offensive. We encode these words as binary features as well.

Emotions Synsets. As in [Xu et al.2012b], we extracted all lemmas associated with each of the synsets extracted from WordNet [Miller1995] for these emotions: anger, embarrassment, empathy, fear, pride, relief and sadness. We also extracted all the synonyms of these emotions from the dictionary.

3.1.2 Enhanced Feature Set

Emoticons. Reddit comments make extensive use of emoticons. We argue that some emoticons are used especially in trolling events and to express a variety of emotions, which we hypothesize would be useful for identifying a comment’s intention, interpretation and response. For that we use the emoticon dictionary of [Hogenboom et al.2015], and we set a binary feature for each emoticon that is found in the dictionary.

Sentiment Polarity. Using a similar idea, we hypothesize that the overall emotion of a comment would be useful for identifying the response and intention in a trolling event. We therefore apply the Vader Sentiment Polarity Analyzer [Hutto and Gilbert2014] and include four features, one for each measurement given by the analyzer: positive, neutral, negative and a composite metric, each as a real-valued number.

Subjectivity Lexicon. From the MPQA Subjectivity Lexicon [Wilson et al.2005] we include all tokens that are found in the lexicon as binary features. This lexicon was created from the news domain, so the words in it do not necessarily align with the informal vocabulary used on Reddit; however, there are serious Reddit users who use proper language and formal constructions. We believe that these features will allow us to keep formal comments from being labeled as trolling events, which tend to be vulgar.

Swearing Vocabulary. We manually collected 1061 swear words and short phrases from the internet, blogs, forums and smaller repositories. The informal nature of this dictionary resembles the type of language used by flaming trolls and agitated responses, so we encode a binary feature for each word or short phrase in a comment if it appears in the swearing dictionary.

FrameNet. Following the use of FrameNet by [Hasan and Ng2014], we apply the SEMAFOR parser [Das et al.2014] to each sentence in every comment in the training set and construct three different types of binary features: every frame name that is present in the sentence, the frame name together with the target word associated with it, and the argument name together with the token or lexical unit in the sentence associated with it. We argue that some frames are especially interesting from the trolling perspective. For example, the frame “Deception_success” precisely models one of the trolling models, and we argue that these features will be particularly useful for identifying trolling events in which semantic, and not just syntactic, information is necessary.

Politeness Cues. [Danescu-Niculescu-Mizil et al.2013] identified cues that signal polite and impolite interactions among groups of people collaborating online. Based on our observations of trolling examples, it is clear that flaming trolls and engaged or emotional responses tend to use impolite cues. In contrast, neutralizing and frustrating responses to trolls avoid confrontation, and their vocabulary tends to be more polite. We therefore use these cues as binary features when they appear in the comments under consideration.

3.2 Baseline System

The most naïve approach is to consider each of the four tasks as an independent classification problem. Such a system would be deprived of the information from the other tasks, which, as we have mentioned, is strictly necessary to correctly predict the response strategy. Instead, as our baseline we follow a pipeline approach, using the task order I, D, R and B, so that the feature set of each subsequent subtask is extended with one feature for each of the previously computed subtasks. We argue that this setup is a competitive baseline, as can be verified in the results in Table 1. For the classifier in the pipeline approach we choose a log-linear model, a logistic regression classifier. (We use the scikit-learn [Pedregosa et al.2011] implementation for all baseline experiments.) In addition to logistic regression, we tried its generative counterpart, naïve Bayes, and a max-margin classifier, a support vector machine, but their performance was not superior to that of logistic regression. It is worth mentioning that the feature set used for intention prediction combines the feature sets of the suspected troll comment and its parent. We do so in all of our experiments so that the learner can take advantage of the conversation context.
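The pipeline idea can be sketched independently of any particular learner. In the sketch below, every name is hypothetical; `MajorityClassifier` is a trivial stand-in with a `fit`/`predict` interface rather than the logistic regression we use, each training instance is assumed to be one (snippet, response) pair, and earlier predictions are appended as one-hot features, which is one possible reading of "one feature for each of the previously computed subtasks".

```python
class MajorityClassifier:
    """Stand-in learner for the demo: always predicts the most frequent label."""

    def fit(self, X, y):
        # sorted() makes tie-breaking deterministic.
        self.label = max(sorted(set(y)), key=y.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]


def one_hot(label, labels):
    return [1.0 if label == candidate else 0.0 for candidate in labels]


def train_pipeline(X, Y, task_order, make_classifier):
    """Train one classifier per task; later tasks see earlier predictions.

    X: list of feature vectors (lists of floats).
    Y: dict mapping task name -> list of gold labels aligned with X.
    """
    models = []
    current = [list(x) for x in X]
    for task in task_order:
        labels = sorted(set(Y[task]))
        classifier = make_classifier().fit(current, Y[task])
        models.append((task, labels, classifier))
        predictions = classifier.predict(current)
        current = [x + one_hot(p, labels) for x, p in zip(current, predictions)]
    return models


def predict_pipeline(models, X):
    """Apply the trained classifiers in order, extending features as we go."""
    current = [list(x) for x in X]
    output = {}
    for task, labels, classifier in models:
        predictions = list(classifier.predict(current))
        output[task] = predictions
        current = [x + one_hot(p, labels) for x, p in zip(current, predictions)]
    return output
```

Any object exposing `fit(X, y)` and `predict(X)` can be passed through `make_classifier`, so a logistic regression implementation with that interface could replace the stand-in without changing the pipeline code.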

3.3 Joint Models

The nature of this problem makes the use of a joint model a logical choice. Among the different options for joint inference, we choose a (conditional) probabilistic graphical model (henceforth PGM) [Koller and Friedman2009] because, in contrast to ILP formulations, it has the ability to learn parameters and not just impose hard constraints. Also, compared to Markov Logic Networks [Richardson and Domingos2006], a relatively recent formulation that combines logic and Markov Random Fields, PGMs have proved in practice to be more scalable, even though inference in general models is known to be intractable. Finally, we are also interested in choosing a PGM because it allows us to directly compare the strength of joint inference with the baseline, since our model is a collection of logistic regressors trained simultaneously.

A conditional random field factorizes the conditional probability distribution over all possible values of the query variables in the model, given a set of observations, as in equation 1. In our model, the query variables are the four tasks we wish to predict, and the observations are their combined feature sets. Each of the factors in this distribution is a log-linear model, as in equation 2, and represents the probability distribution of the clique of variables in it, given the observation set. This is identical to the independent logistic regression model described in the baseline, except that all variables, or tasks, are considered at the same time. To do so, we add factors that connect the task variables to each other, permitting the flow of information from one task to another.
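For reference, a standard way of writing a conditional random field with log-linear factors, consistent with the description above, is given below. The notation is assumed for this illustration: y denotes the joint assignment to the task variables, x the observations, C the set of cliques, y_c the variables in clique c, f_c a feature function, w_c its weight vector and Z(x) the normalization constant.

```latex
\begin{align}
P(\mathbf{y} \mid \mathbf{x})
  &= \frac{1}{Z(\mathbf{x})} \prod_{c \in \mathcal{C}} \phi_c(\mathbf{y}_c, \mathbf{x}),
  \qquad
  Z(\mathbf{x}) = \sum_{\mathbf{y}'} \prod_{c \in \mathcal{C}} \phi_c(\mathbf{y}'_c, \mathbf{x}) \\
\phi_c(\mathbf{y}_c, \mathbf{x})
  &= \exp\!\left( \mathbf{w}_c^{\top} \mathbf{f}_c(\mathbf{y}_c, \mathbf{x}) \right)
\end{align}
```

When a clique contains a single task variable, the corresponding factor has the same form as a multinomial logistic regression over that variable, which is what makes the comparison with the baseline direct.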

Specifically, our model represents each task with a random variable, shown as a circle in Figure 2 (left). The plate notation that surrounds the interpretation (R) and response strategy (B) variables indicates that there will be as many of these variables, and of the edges connecting them to the rest of the model, as there are responses in the problem snippet. The edges connecting the intention (I) and intention disclosure (D) variables with R attempt to model the influence of these two variables on the response, and how this information is passed along to the response strategy variable B. Figure 2 (right) explicitly represents the cliques in the underlying factor graph. There are four unary factors, one per task variable, that model the influence of the observation features on their associated variables, just as the logistic regression model does. Three pairwise factors model the interactions between connected pairs of task variables, each using a log-linear model over the possible values of the pair of variables in that particular clique.

Due to the small size of the model, we are able to perform exact inference at training and test time. For parameter learning we employ the limited-memory BFGS (L-BFGS) optimizer [Byrd et al.1995], to which we provide the cost function and gradient based on the equations described in [Sutton and McCallum2006].
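To make concrete why exact inference is feasible for a model of this size, the sketch below enumerates every joint assignment of a tiny factor graph, computes the normalization constant and returns the most probable assignment. It is an illustration under assumptions: the function names, the log-potential tables and the choice of which variable pairs share a factor are hypothetical and do not reproduce our trained model.

```python
import itertools
import math


def log_score(assignment, unary, pairwise):
    """Sum of log-potentials of all factors for one joint assignment."""
    total = sum(unary[var][val] for var, val in assignment.items())
    for (a, b), table in pairwise.items():
        # Missing pairs are treated as a log-potential of 0.
        total += table.get((assignment[a], assignment[b]), 0.0)
    return total


def exact_inference(domains, unary, pairwise):
    """Brute-force MAP assignment and its probability.

    domains:  dict variable -> list of possible values
    unary:    dict variable -> dict value -> log-potential
    pairwise: dict (var_a, var_b) -> dict (val_a, val_b) -> log-potential
    """
    variables = sorted(domains)
    scored = []
    for values in itertools.product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        scored.append((log_score(assignment, unary, pairwise), assignment))
    log_z = math.log(sum(math.exp(score) for score, _ in scored))
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, math.exp(best_score - log_z)


# Toy example with two binary variables and one coupling factor.
domains = {"I": ["troll", "none"], "R": ["troll", "none"]}
unary = {
    "I": {"troll": 0.0, "none": 0.0},
    "R": {"troll": 1.0, "none": 0.0},
}
pairwise = {("I", "R"): {("troll", "troll"): 2.0, ("none", "none"): 2.0}}
best, probability = exact_inference(domains, unary, pairwise)
```

In the toy example the unary evidence for I is uninformative, yet the coupling factor pulls I toward the value favored by R, which is the kind of information flow between tasks that the joint model is meant to capture. The number of assignments grows exponentially with the number of variables, so this enumeration is only practical for small snippets.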

2-pass Model. A hybrid model that we experiment with performs joint inference on three tasks: intention (I), intention disclosure (D) and the responders’ intention interpretation (R). The remaining task, response strategy (B), is performed in a second step that takes the output of the other three tasks as input. We do so because we observed in our experiments that the close coupling between the first three tasks allows them to perform better independently of the response strategy, as we elaborate in the results section.


4 Evaluation and Results

We perform 5-fold cross-validation on the dataset. We use the first fold to tune the parameters and the remaining four folds to report results. System performance is measured using precision, recall and F1, as shown in Table 1. The left side of the table reports results obtained using the basic feature set, while the right side reports results on the enhanced feature set. To maintain consistency, folds are created based on the threads or snippets; for the baseline system, all instances in a particular fold for the task under consideration are treated as independent of each other. In the table, rows show the per-class performance for each of the tasks, indicated by a header with the task name. For the response strategy we present results only for the class values that account for at least 5% of the total distribution, because the remaining classes have too few labeled instances compared to the majority classes.

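Basic Feature Set

| Experiment/Class | Baseline P | Baseline R | Baseline F1 | Full Joint P | Full Joint R | Full Joint F1 | Joint + 2nd Pass P | Joint + 2nd Pass R | Joint + 2nd Pass F1 | Size (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| I: Intention | | | | | | | | | | |
| No trolling | 64.0 | 72.8 | 68.2 | 91.4 | 29.6 | 44.6 | 91.6 | 89.0 | 90.2 | 55.7 |
| Trolling | 57.2 | 50.4 | 53.4 | 50.0 | 98.2 | 66.4 | 87.8 | 93.0 | 90.4 | 41.7 |
| Playing | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 80.0 | 54.0 | 63.6 | 2.6 |
| D: Disclosure | | | | | | | | | | |
| None | 67.0 | 78.0 | 71.8 | 80.0 | 91.4 | 85.2 | 89.6 | 90.6 | 90.0 | 55.4 |
| Hidden | 0.0 | 0.0 | 0.0 | 80.0 | 22.0 | 34.0 | 80.8 | 62.0 | 69.2 | 8.4 |
| Exposed | 56.6 | 55.4 | 55.4 | 80.0 | 77.4 | 78.8 | 84.2 | 88.4 | 86.0 | 36.2 |
| R: Interpretation | | | | | | | | | | |
| No trolling | 71.0 | 63.2 | 66.6 | 84.6 | 36.2 | 50.4 | 88.2 | 76.6 | 82.0 | 38.9 |
| Trolling | 77.2 | 84.0 | 80.6 | 69.6 | 96.8 | 81.0 | 85.8 | 94.6 | 90.0 | 60.0 |
| Playing | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.1 |
| B: Response | | | | | | | | | | |
| Engage | 30.6 | 31.6 | 30.4 | 0.0 | 0.0 | 0.0 | 34.2 | 31.8 | 31.8 | 12.4 |
| False Accusation | 20.4 | 24.8 | 22.0 | 10.0 | 0.6 | 1.2 | 24.2 | 30.8 | 26.8 | 14.4 |
| Neutralize | 26.0 | 39.2 | 31.0 | 18.8 | 98.6 | 31.6 | 28.6 | 40.6 | 33.4 | 15.7 |
| Normal | 41.4 | 58.6 | 48.4 | 46.6 | 41.0 | 43.6 | 41.8 | 60.8 | 49.4 | 18.4 |
| Frustrate | 15.0 | 14.8 | 14.4 | 0.0 | 0.0 | 0.0 | 13.0 | 12.0 | 11.8 | 9.9 |
| Imaginary Bite | 22.2 | 8.8 | 11.6 | 0.0 | 0.0 | 0.0 | 30.6 | 9.2 | 13.0 | 5.8 |
| Bite Attempt | 17.6 | 13.2 | 15.0 | 0.0 | 0.0 | 0.0 | 19.4 | 14.0 | 15.6 | 9.6 |

Enhanced Feature Set

| Experiment/Class | Baseline P | Baseline R | Baseline F1 | Full Joint P | Full Joint R | Full Joint F1 | Joint + 2nd Pass P | Joint + 2nd Pass R | Joint + 2nd Pass F1 | Size (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| I: Intention | | | | | | | | | | |
| No trolling | 64.2 | 72.2 | 67.8 | 92.4 | 33.8 | 49.2 | 89.2 | 90.8 | 90.0 | 55.7 |
| Trolling | 56.4 | 50.4 | 52.8 | 51.6 | 98.2 | 67.4 | 88.8 | 89.4 | 89.2 | 41.7 |
| Playing | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 80.0 | 50.0 | 60.8 | 2.6 |
| D: Disclosure | | | | | | | | | | |
| None | 67.6 | 78.0 | 72.2 | 81.6 | 80.2 | 80.8 | 89.4 | 91.0 | 90.2 | 55.4 |
| Hidden | 0.0 | 0.0 | 0.0 | 80.0 | 8.6 | 15.2 | 75.8 | 67.2 | 69.8 | 8.4 |
| Exposed | 57.2 | 56.8 | 56.4 | 68.2 | 84.0 | 75.2 | 86.0 | 86.2 | 86.2 | 36.2 |
| R: Interpretation | | | | | | | | | | |
| No trolling | 71.0 | 63.6 | 67.0 | 87.0 | 41.2 | 55.8 | 85.4 | 81.6 | 83.0 | 38.9 |
| Trolling | 77.4 | 83.8 | 80.4 | 71.6 | 96.8 | 82.4 | 87.8 | 92.2 | 90.0 | 60.0 |
| Playing | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.1 |
| B: Response | | | | | | | | | | |
| Engage | 29.4 | 32.0 | 29.4 | 0.0 | 0.0 | 0.0 | 33.6 | 30.4 | 30.0 | 12.4 |
| False Accusation | 21.8 | 27.0 | 23.8 | 20.0 | 0.6 | 1.2 | 23.4 | 28.0 | 25.2 | 14.4 |
| Neutralize | 26.4 | 37.6 | 30.8 | 19.4 | 98.0 | 32.0 | 26.2 | 38.0 | 30.6 | 15.7 |
| Normal | 40.2 | 58.6 | 48.0 | 45.8 | 44.6 | 45.2 | 40.2 | 60.0 | 47.8 | 18.4 |
| Frustrate | 15.2 | 14.2 | 14.2 | 20.0 | 1.0 | 2.0 | 13.2 | 12.0 | 12.2 | 9.9 |
| Imaginary Bite | 20.4 | 8.8 | 11.4 | 0.0 | 0.0 | 0.0 | 22.4 | 9.2 | 12.4 | 5.8 |
| Bite Attempt | 13.0 | 10.2 | 11.4 | 0.0 | 0.0 | 0.0 | 16.0 | 11.8 | 13.4 | 9.6 |

Table 1: Prediction results for the four aspects of trolling: Intention, Intention Disclosure, Interpretation and Response Strategy. Three models are evaluated: a logistic regression classifier (Baseline), a four-task CRF (Full Joint), and a two-step process (Joint + 2nd Pass) consisting of a three-task CRF followed by response strategy prediction given the outcome of the CRF.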
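The per-class metrics in Table 1 and the thread-based fold assignment can be sketched as follows. The function names are hypothetical, and the round-robin assignment of threads to folds is an assumption made for the illustration; the only property taken from our setup is that all instances of a thread fall in the same fold.

```python
def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F1 for one class; 0.0 when undefined."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def thread_folds(thread_ids, k=5):
    """Assign a fold index to every instance so that threads are never split.

    thread_ids: one thread identifier per instance.
    """
    unique_threads = sorted(set(thread_ids))
    fold_of_thread = {t: i % k for i, t in enumerate(unique_threads)}
    return [fold_of_thread[t] for t in thread_ids]
```

A class that a system never predicts gets precision, recall and F1 of 0.0 under this convention, which is how the all-zero rows for the “Playing” class should be read.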

4.1 Results Discussion

From the results in Table 1, we observe that the hybrid model substantially outperforms the baseline, by at least 18 F1 points on every intention and intention disclosure class. For the response strategy, it is clear that none of the systems offers satisfying results, which showcases the difficulty of predicting over such a large number of classes. Nonetheless, with the basic feature set the hybrid model outperforms the fully joint model and the baseline in all but one of the response strategy classes, although the differences are far less impressive than in the other tasks. It is surprising that the full joint model did not offer the best performance. One reason is that the intention, intention disclosure and interpretation tasks are hurt by the difficulty of learning parameters that maximize performance on the response strategy; this last task drags the other three down. Another reason is that the features for the response strategy are not informative enough to learn the correct concept, and, because of the joint inference process, all tasks take a hit. It is also counter-intuitive that the enhanced feature set did not improve performance on all tasks, but only on intention disclosure and interpretation, and then only by a small margin. One explanation for this unexpected behavior is that the majority of the enhanced features are already represented in the basic feature set by means of the unigrams and bigrams, and the FrameNet and sentiment features are uninformative or redundant. Lastly, we observe that for the interpretation category, none of the systems was able to predict the “playing” class. This is because of the small number of instances annotated with that value, about 1% of the entire dataset. We hypothesize that for those instances the annotators, a majority of whom agreed, incorrectly selected the playing category instead of the trolling class, and that, at the interpretation level, one can only expect to reliably differentiate between trolling and no trolling.

5 Related Work

In this section, we discuss related work in the areas of trolling, bullying and politeness, as they intersect in their scope and at least partially address the problem presented in this work.

[Mihaylov et al.2015] address the problem of identifying manipulation trolls in news community forums. The major difference from our work is that all their predictions are based on meta-information such as the number of votes, dates, the number of comments and so on. There is no NLP approach to the problem, and their task is limited to identifying trolls. [Bishop2013] and [Bishop2014] give a detailed description of trolls’ personality and motivations, their effects on the communities in which they interfere, and the criminal and psychological aspects of trolling. Their main focus is flaming trolls, but they offer no NLP insights and do not propose automated prediction tasks as we do in this work. In a network-related framework, [Kumar et al.2014] and [Guha et al.2004] present methodologies to identify malicious individuals in a network based solely on the network’s properties. Even though they present and evaluate a methodology, their focus is different from NLP. [Cambria et al.2010] propose a method that involves NLP components but do not provide an evaluation of their system. Finally, [Xu et al.2012a] and [Xu et al.2012b] address bullying traces, that is, self-reported events in which individuals describe being part of bullying events, but their focus differs from trolling events and the interactions with other participants.

6 Conclusion and Future Work

In this paper we address the under-attended problem of trolling in Internet forums. We presented a comprehensive categorization of trolling events and defined prediction tasks that consider trolling not only from the troll’s perspective but also from that of the responders to the troll’s comment. Also, we evaluated three different models and analyzed their successes and shortcomings. Finally, we provide an annotated dataset which we hope will be useful for the research community. We look forward to investigating trolling phenomena in larger conversations, formalizing the concept of changing roles among the participants in trolling events, and improving response strategy performance.

Appendix A Response Strategy Definitions

  1. Normal: unemotional and standard response to a comment that is not a trolling event.

  2. Bite Attempt: response comment with malicious intentions to a parent comment that is not a trolling event.

  3. Imaginary Bite: reply with aggravation from a responder who thinks that he/she is being trolled, but without the intention of trolling back.

  4. False Accusation: comment criticizing or denouncing malicious intentions in the parent comment when the parent comment has no malicious intentions.

  5. Frustrate: comment that acknowledges the parent’s malicious or playful intentions but refuses to engage or fall for the provocation, as it gives no importance to them.

  6. Neutralize: comment that acknowledges the malicious or playful intentions and tries to minimize or criticize them.

  7. Counter Trolling: comment that acknowledges the malicious or playful intentions and attempts to troll back with a trolling event.

  8. Praise: comment that acknowledges the malicious or playful intentions, but positively recognizes the troll’s ingenuity or ability.

  9. Engage: comment that acknowledges the malicious or playful intentions and falls for the provocation, giving a discomposed response.

  10. Aggravation: comment that acknowledges the non-malicious or playful intentions, but receives them emotionally.

  11. Confrontation: annoyed response based on the belief of malicious intentions when the original intentions are playful.

  12. Failed: comment response in which the responder does not realize the malicious or playful intentions, but does not fall for the provocation. We say that the troll failed.

  13. Bite: comment response that falls for the malicious intentions of a deceiving trolling event, or an upset reply to a flaming trolling event.

  14. Follow: comment response in which the responder understands the malicious or playful intentions and plays along with the troll.
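For annotation or evaluation tooling, the fourteen labels defined in this appendix could be encoded as an enumeration. The following is a minimal sketch only: the identifier names and the `parse_label` helper are our own illustrative assumptions and are not part of the released dataset or models.

```python
from enum import Enum


class ResponseStrategy(Enum):
    """The fourteen response strategy labels defined in this appendix."""
    NORMAL = "Normal"
    BITE_ATTEMPT = "Bite Attempt"
    IMAGINARY_BITE = "Imaginary Bite"
    FALSE_ACCUSATION = "False Accusation"
    FRUSTRATE = "Frustrate"
    NEUTRALIZE = "Neutralize"
    COUNTER_TROLLING = "Counter Trolling"
    PRAISE = "Praise"
    ENGAGE = "Engage"
    AGGRAVATION = "Aggravation"
    CONFRONTATION = "Confrontation"
    FAILED = "Failed"
    BITE = "Bite"
    FOLLOW = "Follow"


def parse_label(text):
    """Map a free-text label to a ResponseStrategy member.

    Matching ignores case and surrounding whitespace; unknown labels
    raise ValueError. Hypothetical helper, for illustration only.
    """
    wanted = text.strip().lower()
    for strategy in ResponseStrategy:
        if strategy.value.lower() == wanted:
            return strategy
    raise ValueError(f"unknown response strategy: {text!r}")


print(parse_label("  counter trolling "))
```

A closed enumeration makes it harder for misspelled labels to enter an annotation file unnoticed, which matters when some classes have very few instances.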


References

  • [Bishop2013] Jonathan Bishop. 2013. The effect of de-individuation of the internet troller on criminal procedure implementation: An interview with a hater. International Journal of Cyber Criminology, 7(1):28.
  • [Bishop2014] Jonathan Bishop. 2014. Representations of ‘trolls’ in mass media communication: a review of media-texts and moral panics relating to ‘internet trolling’. International Journal of Web Based Communities, 10(1):7–24.
  • [Byrd et al.1995] Richard H Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. 1995. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5):1190–1208.
  • [Cambria et al.2010] Erik Cambria, Praphul Chandra, Avinash Sharma, and Amir Hussain. 2010. Do not feel the trolls. ISWC, Shanghai.
  • [Danescu-Niculescu-Mizil et al.2013] Cristian Danescu-Niculescu-Mizil, Moritz Sudhof, Dan Jurafsky, Jure Leskovec, and Christopher Potts. 2013. A computational approach to politeness with application to social factors. arXiv preprint arXiv:1306.6078.
  • [Das et al.2014] Dipanjan Das, Desai Chen, André FT Martins, Nathan Schneider, and Noah A Smith. 2014. Frame-semantic parsing. Computational Linguistics, 40(1):9–56.
  • [Fleiss and Cohen1973] Joseph L Fleiss and Jacob Cohen. 1973. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and psychological measurement.
  • [Guha et al.2004] Ramanathan Guha, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. 2004. Propagation of trust and distrust. In Proceedings of the 13th international conference on World Wide Web, pages 403–412. ACM.
  • [Hardaker2010] Claire Hardaker. 2010. Trolling in asynchronous computer-mediated communication: From user discussions to academic definitions. Journal of Politeness Research, 6(2):215–242.
  • [Hasan and Ng2014] Kazi Saidul Hasan and Vincent Ng. 2014. Why are you taking this stance? identifying and classifying reasons in ideological debates. In EMNLP, pages 751–762.
  • [Hogenboom et al.2015] Alexander Hogenboom, Daniella Bal, Flavius Frasincar, Malissa Bal, Franciska De Jong, and Uzay Kaymak. 2015. Exploiting emoticons in polarity classification of text. J. Web Eng., 14(1&2):22–40.
  • [Hutto and Gilbert2014] Clayton J Hutto and Eric Gilbert. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth International AAAI Conference on Weblogs and Social Media.
  • [Koller and Friedman2009] Daphne Koller and Nir Friedman. 2009. Probabilistic graphical models: principles and techniques. MIT press.
  • [Kumar et al.2014] Sudhakar Kumar, Francesca Spezzano, and VS Subrahmanian. 2014. Accurately detecting trolls in slashdot zoo via decluttering. In Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on, pages 188–195. IEEE.
  • [Manning et al.2014] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, pages 55–60.
  • [Mihaylov et al.2015] Todor Mihaylov, Georgi D Georgiev, and Preslav Nakov. 2015. Finding opinion manipulation trolls in news community forums. In Proceedings of the Nineteenth Conference on Computational Natural Language Learning, CoNLL, volume 15, pages 310–314.
  • [Miller1995] George A Miller. 1995. Wordnet: a lexical database for english. Communications of the ACM, 38(11):39–41.
  • [Nitta et al.2013] Taisei Nitta, Fumito Masui, Michal Ptaszynski, Yasutomo Kimura, Rafal Rzepka, and Kenji Araki. 2013. Detecting cyberbullying entries on informal school websites based on category relevance maximization. In IJCNLP, pages 579–586.
  • [Pedregosa et al.2011] F. Pedregosa, G. Varoquaux, A. Gramfort, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
  • [Richardson and Domingos2006] Matthew Richardson and Pedro Domingos. 2006. Markov logic networks. Machine learning, 62(1-2):107–136.
  • [Sutton and McCallum2006] Charles Sutton and Andrew McCallum. 2006. An introduction to conditional random fields for relational learning. Introduction to statistical relational learning, pages 93–128.
  • [Wilson et al.2005] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing, pages 347–354. Association for Computational Linguistics.
  • [Xu et al.2012a] Jun-Ming Xu, Kwang-Sung Jun, Xiaojin Zhu, and Amy Bellmore. 2012a. Learning from bullying traces in social media. In Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies, pages 656–666. Association for Computational Linguistics.
  • [Xu et al.2012b] Jun-Ming Xu, Xiaojin Zhu, and Amy Bellmore. 2012b. Fast learning for sentiment analysis on bullying. In Proceedings of the First International Workshop on Issues of Sentiment Discovery and Opinion Mining, page 10. ACM.