Machine Learning Suites for Online Toxicity Detection

10/03/2018
by David Noever, et al.

To identify and classify toxic online commentary, the modern tools of data science transform raw text into key features from which either thresholding or learning algorithms can make predictions for monitoring offensive conversations. We systematically evaluate 62 classifiers representing 19 major algorithmic families against features extracted from the Jigsaw dataset of Wikipedia comments. We compare the classifiers based on statistically significant differences in accuracy and relative execution time. Among these classifiers for identifying toxic comments, tree-based algorithms provide the most transparently explainable rules and rank-order the predictive contribution of each feature. Among 28 features of syntax, sentiment, emotion and outlier word dictionaries, a simple bad word list proves most predictive of offensive commentary.
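As a rough illustration of the feature-then-threshold idea described in the abstract, the following minimal sketch turns a raw comment into a few numeric features and flags it when a bad-word count reaches a cutoff. The lexicon `BAD_WORDS`, the feature names, and the functions `extract_features` and `is_toxic` are hypothetical stand-ins for illustration; they are not the paper's actual word list, its 28 features, or any of its 62 classifiers.

```python
import re

# Hypothetical placeholder lexicon (assumption); the paper's bad-word list is not reproduced here.
BAD_WORDS = {"idiot", "stupid", "moron"}


def extract_features(comment: str) -> dict:
    """Turn a raw comment into a few simple numeric features (illustrative only)."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    letters = [c for c in comment if c.isalpha()]
    return {
        "n_tokens": len(tokens),
        # Number of tokens that appear in the bad-word lexicon.
        "bad_word_count": sum(t in BAD_WORDS for t in tokens),
        # Share of alphabetic characters that are upper case (a crude "shouting" signal).
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        "exclamations": comment.count("!"),
    }


def is_toxic(comment: str, threshold: int = 1) -> bool:
    """Threshold rule: flag a comment when its bad-word count reaches the threshold."""
    return extract_features(comment)["bad_word_count"] >= threshold


if __name__ == "__main__":
    for text in ["Thanks for fixing the citation.", "You are a STUPID idiot!!!"]:
        print(is_toxic(text), extract_features(text))
```

A learning algorithm would consume the same feature dictionary instead of a fixed cutoff; the single-feature threshold is shown only because the abstract reports the bad-word list as the most predictive feature.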


