Toxicity Detection can be Sensitive to the Conversational Context

11/19/2021
by   Alexandros Xenos, et al.

User posts whose perceived toxicity depends on the conversational context are rare in current toxicity detection datasets. Hence, toxicity detectors trained on existing datasets will also tend to disregard context, making the detection of context-sensitive toxicity harder when it does occur. We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels: (i) labels from annotators who considered each post with the previous one as context; and (ii) labels from annotators who had no additional context. Based on this dataset, we introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context (previous post) is also considered. We then evaluate machine learning systems on this task, showing that classifiers of practical quality can be developed, and we show that data augmentation with knowledge distillation can improve the performance further. Such systems could be used to enhance toxicity detection datasets with more context-dependent posts, or to suggest when moderators should consider the parent posts, a step that is often unnecessary and may otherwise introduce significant additional cost.
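
As a rough illustration of the task, the two kinds of labels can be compared per post to flag those whose perceived toxicity shifts when the parent post is shown. The sketch below is not the authors' method: the field names, the use of annotator fractions as scores, the difference-based score, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LabeledPost:
    """One post with both kinds of toxicity labels (hypothetical schema)."""
    text: str
    parent_text: str
    tox_with_context: float     # assumed: fraction of annotators who saw the parent and judged the post toxic
    tox_without_context: float  # assumed: fraction of annotators who saw only the post and judged it toxic


def context_sensitivity(post: LabeledPost) -> float:
    """Signed change in perceived toxicity when the parent post is shown (assumed definition)."""
    return post.tox_with_context - post.tox_without_context


def is_context_sensitive(post: LabeledPost, threshold: float = 0.5) -> bool:
    """Flag a post whose perceived toxicity changes by at least `threshold` (hypothetical cutoff)."""
    return abs(context_sensitivity(post)) >= threshold


posts = [
    LabeledPost("Sure, just like last time.", "You ruined the whole project.", 1.0, 0.0),
    LabeledPost("Thanks for the link!", "Here is the paper.", 0.25, 0.25),
]
flagged = [p.text for p in posts if is_context_sensitive(p)]
```

A positive score means the post looks more toxic once its parent is visible, and a negative score means it looks less toxic. A model for the task would predict this quantity from the text alone, without access to both sets of annotations.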
