Neural Character-based Composition Models for Abuse Detection

09/02/2018
by Pushkar Mishra, et al.

The rise of social media in recent years has fed into highly undesirable phenomena on the Internet, such as the proliferation of offensive language, hate speech, and sexist remarks. In light of this, there have been several efforts to automate the detection and moderation of such abusive content. However, users deliberately obfuscate words to evade detection, which poses a serious challenge to the effectiveness of these efforts. The current state-of-the-art approaches to abusive language detection, based on recurrent neural networks, do not explicitly address this problem and resort to a generic OOV (out-of-vocabulary) embedding for unseen words. By using a single embedding for all unseen words, however, we lose the ability to distinguish obfuscated words from non-obfuscated or rare ones. In this paper, we address this problem by designing a model that can compose embeddings for unseen words. We experimentally demonstrate that our approach significantly advances the current state of the art in abuse detection on datasets from two different domains, namely Twitter and Wikipedia talk pages.
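
To make the contrast in the abstract concrete, the following is a minimal sketch of the difference between a single generic OOV embedding and an embedding composed from a word's characters. It is not the authors' model: the abstract does not specify the composition architecture, so this sketch simply averages character vectors as a stand-in. All names here (`word_embeddings`, `char_embeddings`, `embed_baseline`, `embed_composed`), the toy vocabulary, and the random vectors are assumptions made for illustration only.

```python
import random

EMBED_DIM = 8
rng = random.Random(0)  # fixed seed so the toy vectors are reproducible


def random_vector(dim=EMBED_DIM):
    """Return a toy embedding; a real system would use trained vectors."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Hypothetical toy tables (assumptions, not taken from the paper).
word_embeddings = {w: random_vector() for w in ["you", "are", "an", "idiot"]}
char_embeddings = {
    c: random_vector() for c in "abcdefghijklmnopqrstuvwxyz0123456789@$!*"
}
OOV_VECTOR = [0.0] * EMBED_DIM  # the single shared vector for unseen words


def compose_from_chars(word):
    """Compose a word vector by averaging its character vectors.

    Averaging is a deliberately simple stand-in for a learned
    character-based composition model.
    """
    vectors = [char_embeddings[c] for c in word.lower() if c in char_embeddings]
    if not vectors:
        return list(OOV_VECTOR)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def embed_baseline(word):
    """Baseline: every unseen word maps to the same generic OOV vector."""
    return word_embeddings.get(word.lower(), OOV_VECTOR)


def embed_composed(word):
    """Known words use their own vector; unseen words get a composed one."""
    key = word.lower()
    if key in word_embeddings:
        return word_embeddings[key]
    return compose_from_chars(word)


if __name__ == "__main__":
    obfuscated, rare = "id1ot", "zebra"  # both absent from the toy vocabulary
    print(embed_baseline(obfuscated) == embed_baseline(rare))
    print(embed_composed(obfuscated) == embed_composed(rare))
```

Under the baseline, the obfuscated word `id1ot` and the unrelated unseen word `zebra` receive the same vector, so a downstream classifier cannot tell them apart. With composition, each unseen word gets its own vector derived from its characters, which is the ability the abstract says is lost when a single OOV embedding is used.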

research
02/14/2019

Author Profiling for Hate Speech Detection

The rapid growth of social media in recent years has fed into some highl...
research
10/08/2020

Detect All Abuse! Toward Universal Abusive Language Detection Models

Online abusive language detection (ALD) has become a societal issue of i...
research
01/13/2018

Detecting Offensive Language in Tweets Using Deep Learning

This paper addresses the important problem of discerning hateful content...
research
01/31/2016

WASSUP? LOL : Characterizing Out-of-Vocabulary Words in Twitter

Language in social media is mostly driven by new words and spellings tha...
research
06/10/2019

Embedding Imputation with Grounded Language Information

Due to the ubiquitous use of embeddings as input representations for a w...
research
04/17/2020

The FaceChannel: A Light-weight Deep Neural Network for Facial Expression Recognition

Current state-of-the-art models for automatic FER are based on very deep...
research
02/09/2018

URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection

Malicious URLs host unsolicited content and are used to perpetrate cyber...
