DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language

12/28/2020 ∙ by Md. Rezaul Karim, et al.

The exponential growth of social media and micro-blogging sites not only provides platforms for empowering freedom of expression and individual voices, but also enables people to express anti-social behavior such as online harassment, cyberbullying, and hate speech. Numerous works have proposed using these data for social and anti-social behavior analysis by predicting the contexts, mostly for highly resourced languages like English. However, some languages, such as Bengali, are under-resourced and lack computational resources for natural language processing (NLP). In this paper, we propose an explainable approach for hate speech detection in the under-resourced Bengali language, which we call DeepHateExplainer. In our approach, Bengali texts are first comprehensively preprocessed and then classified into political, personal, geopolitical, and religious hates by a neural ensemble of different transformer-based architectures (i.e., monolingual Bangla BERT-base, multilingual BERT-cased and uncased, and XLM-RoBERTa). Important terms are then identified with sensitivity analysis and layer-wise relevance propagation (LRP) to provide human-interpretable explanations. Evaluations against several machine learning (linear and tree-based models) and deep neural network (i.e., CNN, Bi-LSTM, and Conv-LSTM with word embeddings) baselines yield F1 scores of 84 geopolitical, and religious hates, respectively, during 3-fold cross-validation tests.
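As a rough illustration of the pipeline the abstract describes, the sketch below combines several classifiers by averaging their softmax probabilities and then ranks tokens by how much the predicted class probability drops when each token is removed. Everything here is an assumption made for illustration: the abstract does not state how the ensemble combines its members, the leave-one-out scoring is a simple stand-in for the paper's sensitivity analysis (LRP is not shown), and `make_toy_model`, the keyword weights, and the English placeholder tokens are hypothetical substitutes for fine-tuned transformer models and Bengali text.

```python
import math
from typing import Callable, Dict, List

# The four hate categories named in the abstract.
LABELS = ["political", "personal", "geopolitical", "religious"]

Model = Callable[[List[str]], List[float]]  # tokens -> one logit per label


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(models: List[Model], tokens: List[str]) -> List[float]:
    """Soft voting (an assumed scheme): average each model's class probabilities."""
    per_model = [softmax(model(tokens)) for model in models]
    return [sum(p[i] for p in per_model) / len(per_model) for i in range(len(LABELS))]


def token_sensitivity(models: List[Model], tokens: List[str], class_index: int) -> List[float]:
    """Leave-one-out relevance: drop in class probability when a token is removed."""
    base = ensemble_predict(models, tokens)[class_index]
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        scores.append(base - ensemble_predict(models, reduced)[class_index])
    return scores


def make_toy_model(weights: Dict[str, List[float]]) -> Model:
    """Hypothetical keyword scorer standing in for a fine-tuned transformer."""
    def logits(tokens: List[str]) -> List[float]:
        out = [0.0] * len(LABELS)
        for tok in tokens:
            for i, w in enumerate(weights.get(tok, [0.0] * len(LABELS))):
                out[i] += w
        return out
    return logits


# Two toy ensemble members with made-up keyword weights.
models = [
    make_toy_model({"vote": [2.0, 0.0, 0.0, 0.0]}),
    make_toy_model({"vote": [1.5, 0.0, 0.0, 0.0], "temple": [0.0, 0.0, 0.0, 1.0]}),
]
tokens = ["the", "vote", "was", "rigged"]

probs = ensemble_predict(models, tokens)
predicted = max(range(len(LABELS)), key=lambda i: probs[i])
scores = token_sensitivity(models, tokens, predicted)

print(LABELS[predicted])
for tok, score in zip(tokens, scores):
    print(f"{tok}\t{score:.3f}")
```

With real models, each ensemble member would be replaced by a function that tokenizes the text and returns the classifier's logits; the averaging and token-scoring logic would stay the same.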

Code Repositories

bangla-bert

Bangla-Bert is a pretrained BERT model for the Bengali language.


Bengali-Hate-Speech-Dataset

Dataset for identifying potential hate speech (e.g., political, religious, personal, gender abusive, geopolitical, etc.) in the under-resourced Bengali language.

