QutNocturnal@HASOC'19: CNN for Hate Speech and Offensive Content Identification in Hindi Language

08/28/2020
by Md Abul Bashar, et al.

We describe our top-ranked solution to Task 1 for Hindi in the HASOC contest organised by FIRE 2019. The task is to identify hate speech and offensive language in Hindi. More specifically, it is a binary classification problem in which a system must classify each tweet into one of two classes: (a) Hate and Offensive (HOF) or (b) Not Hate or Offensive (NOT). In contrast to the popular practice of pretraining word vectors (a.k.a. word embeddings) on a large general-domain corpus such as Wikipedia, we pretrained them on a relatively small collection of relevant tweets (i.e. random and sarcasm tweets in Hindi and Hinglish). We then trained a Convolutional Neural Network (CNN) on top of the pretrained word vectors. This approach ranked first among all participating teams for this task. It could easily be adapted to other applications where the goal is to predict the class of a text when the provided context is limited.
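To make the pipeline in the abstract concrete, the following is a minimal sketch of how a CNN can score a tweet on top of fixed word vectors: look up each token's vector, slide convolution filters over consecutive tokens, keep the strongest response per filter (max-over-time pooling), and squash a weighted sum into a HOF probability. The abstract does not state the architecture or hyperparameters, so everything here is an assumption for illustration: the function names, the filter widths (2 and 3), the 4-dimensional vectors, and the random weights standing in for both the pretrained embeddings and the trained CNN parameters.

```python
import math
import random


def conv1d_max_pool(embedded, filt, bias):
    """Slide one filter over the token vectors, apply ReLU, keep the maximum."""
    width = len(filt)  # number of consecutive tokens the filter spans
    best = 0.0         # ReLU output is never below zero
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        activation = bias + sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        best = max(best, activation)  # ReLU and max-over-time in one step
    return best


def predict_hof_probability(tokens, vectors, filters, biases, out_w, out_b):
    """Return P(HOF) for a tokenised tweet (hypothetical helper, not the authors' code)."""
    dim = len(next(iter(vectors.values())))
    unknown = [0.0] * dim  # assumption: out-of-vocabulary tokens map to zeros
    embedded = [vectors.get(tok, unknown) for tok in tokens]
    # Pad short tweets so the widest filter fits at least once.
    widest = max(len(f) for f in filters)
    while len(embedded) < widest:
        embedded.append(unknown)
    features = [conv1d_max_pool(embedded, f, b) for f, b in zip(filters, biases)]
    logit = out_b + sum(w * x for w, x in zip(out_w, features))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid for binary classification


# Illustrative stand-ins: random numbers replace pretrained vectors and trained weights.
rng = random.Random(0)
DIM = 4
vocab = ["yeh", "tweet", "bahut", "accha", "hai"]
vectors = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
filters = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(width)]
    for width in (2, 3)
]
biases = [0.0 for _ in filters]
out_w = [rng.uniform(-1, 1) for _ in filters]
out_b = 0.0

p = predict_hof_probability(
    "yeh tweet bahut accha hai".split(), vectors, filters, biases, out_w, out_b
)
label = "HOF" if p >= 0.5 else "NOT"
print(label, round(p, 3))
```

In a real system the `vectors` dictionary would come from embeddings pretrained on the in-domain tweet collection, and the filter and output weights would be learned from the labelled HASOC training data rather than sampled at random.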

