A Survey of Toxic Comment Classification Methods

12/13/2021
by Kehan Wang, et al.

In real life most people restrain their behavior at least to some extent, but the same cannot be expected on the internet, where there are few checks or consequences for posting something toxic. For the people on the receiving end, however, toxic texts often have serious psychological consequences. Detecting such texts is challenging. In this paper, we attempt to build a toxicity detector using machine learning methods, including a convolutional neural network (CNN), a Naive Bayes model, and a long short-term memory network (LSTM). Although others have laid considerable groundwork, we aim to build models that are more accurate than their predecessors. We produced highly accurate models using LSTM and CNN and compared them to a go-to solution in language processing, the Naive Bayes model. We also apply a word embedding approach to improve the accuracy of our models.
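
To make the Naive Bayes baseline mentioned above concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a multinomial model over whitespace-separated tokens with Laplace smoothing, and the class name NaiveBayesTextClassifier, the tokenize helper, and the four toy comments are hypothetical illustrations rather than anything taken from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems clean text more carefully.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_doc_counts = Counter()        # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_doc_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # tokens never seen in training carry no evidence
                count = self.word_counts[label][tok]
                # Add the smoothed log likelihood of the token given the class.
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
TRAIN_TEXTS = [
    "you are an idiot",
    "what a stupid idiot",
    "thank you for the helpful answer",
    "have a nice day",
]
TRAIN_LABELS = ["toxic", "toxic", "non-toxic", "non-toxic"]

model = NaiveBayesTextClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
print(model.predict("stupid idiot"))
print(model.predict("have a nice day"))
```

A baseline of this kind treats each comment as an unordered bag of words, which is one reason sequence models such as LSTM and CNN, especially when paired with word embeddings, can be expected to outperform it.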
