Divide and Conquer: An Ensemble Approach for Hostile Post Detection in Hindi

01/20/2021
by   Varad Bhatnagar, et al.

Recently, the NLP community has started showing interest in the challenging task of Hostile Post Detection. This paper presents our system for the Shared Task at Constraint2021 on "Hostile Post Detection in Hindi". The data for this shared task, collected from Twitter and Facebook, is provided in Hindi in Devanagari script. It is a multi-label, multi-class classification problem in which each data instance is annotated with one or more of five classes: fake, hate, offensive, defamation, and non-hostile. We propose a two-level architecture made up of BERT-based classifiers and statistical classifiers to solve this problem. Our team, 'Albatross', achieved a coarse-grained hostility F1 score of 0.9709 on the Hostile Post Detection in Hindi subtask and ranked 2nd out of 45 teams. Our submissions ranked 2nd and 3rd out of a total of 156 submissions, with coarse-grained hostility F1 scores of 0.9709 and 0.9703 respectively. Our fine-grained scores are also very encouraging and can be improved with further fine-tuning. The code is publicly available.
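The abstract does not spell out how the two levels interact, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes that level 1 makes the coarse hostile vs. non-hostile decision and that level 2 runs one independent binary classifier per hostile class, which is what allows a post to receive several labels. The names `predict_two_level` and `keyword_classifier` are hypothetical, and the keyword stubs merely stand in for the BERT-based and statistical classifiers mentioned above.

```python
from typing import Callable, Dict, List

# The four hostile classes named in the abstract.
HOSTILE_LABELS = ["fake", "hate", "offensive", "defamation"]

Classifier = Callable[[str], bool]


def predict_two_level(
    text: str,
    coarse: Classifier,
    fine: Dict[str, Classifier],
) -> List[str]:
    """Return the predicted label set for one post.

    Assumption: level 1 is a hostile/non-hostile gate and level 2
    holds one binary classifier per hostile class.
    """
    # Level 1: coarse-grained decision.
    if not coarse(text):
        return ["non-hostile"]
    # Level 2: independent binary decisions, so labels can co-occur.
    labels = [name for name in HOSTILE_LABELS if fine[name](text)]
    # Arbitrary fallback for this sketch: if no fine-grained
    # classifier fires, report the post as non-hostile.
    return labels or ["non-hostile"]


def keyword_classifier(keywords: List[str]) -> Classifier:
    """Toy stand-in for a trained model (hypothetical helper)."""
    def classify(text: str) -> bool:
        lowered = text.lower()
        return any(keyword in lowered for keyword in keywords)
    return classify


# Toy level-2 classifiers; real ones would be trained models.
fine_classifiers = {
    "fake": keyword_classifier(["rumour"]),
    "hate": keyword_classifier(["hate"]),
    "offensive": keyword_classifier(["idiot"]),
    "defamation": keyword_classifier(["liar"]),
}


def coarse_classifier(text: str) -> bool:
    """Toy level-1 gate: hostile if any toy keyword matches."""
    return any(clf(text) for clf in fine_classifiers.values())


if __name__ == "__main__":
    for post in ["good morning", "you idiot liar"]:
        print(post, "->", predict_two_level(post, coarse_classifier, fine_classifiers))
```

Keeping the gate separate from the per-class classifiers means each component can be trained and swapped independently, which is one way to read the "divide and conquer" framing in the title.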

research
01/09/2021

Combating Hostility: Covid-19 Fake News and Hostile Post Detection in Social Media

This paper illustrates a detail description of the system and its result...
research
11/12/2018

Classifying Patent Applications with Ensemble Methods

We present methods for the automatic classification of patent applicatio...
research
01/13/2021

LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT

In our paper, we present Deep Learning models with a layer differentiate...
research
12/31/2021

Hypers at ComMA@ICON: Modelling Aggressiveness, Gender Bias and Communal Bias Identification

Due to the exponentially increasing reach of social media, it is essenti...
research
11/11/2019

Understanding BERT performance in propaganda analysis

In this paper, we describe our system used in the shared task for fine-g...
research
10/15/2022

Large Language Models for Multi-label Propaganda Detection

The spread of propaganda through the internet has increased drastically ...
research
06/05/2020

Spoken dialect identification in Twitter using a multi-filter architecture

This paper presents our approach for SwissText KONVENS 2020 shared t...
