indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages

02/14/2021
by Kushal Kedia, et al.

This paper presents the submission of team indicnlp@kgp to the EACL 2021 shared task "Offensive Language Identification in Dravidian Languages." The task was to classify types of offensive content in three code-mixed Dravidian language datasets. Our approach builds on existing state-of-the-art text classification methods, incorporating additional data and applying transfer learning to pre-trained models. Our final submission is an ensemble of an AWD-LSTM-based model and two transformer architectures based on BERT and RoBERTa. We achieved weighted-average F1 scores of 0.97, 0.77, and 0.72 on the Malayalam-English, Tamil-English, and Kannada-English datasets, ranking 1st, 2nd, and 3rd on the respective tasks.
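
As a rough illustration of two ideas named in the abstract, the sketch below computes a weighted-average F1 score (per-class F1 weighted by class support) and combines several models' class probabilities by simple averaging. The abstract does not say how the ensemble was combined, so the averaging scheme is an assumption, and the function names `weighted_f1` and `soft_vote` and the toy inputs are hypothetical, not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 scores, weighted by each class's support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


def soft_vote(prob_lists):
    """Average class probabilities across models and pick the argmax per example.

    prob_lists: one entry per model, each a list of per-example probability lists.
    ASSUMPTION: unweighted averaging; the paper's actual ensembling may differ.
    """
    n_models = len(prob_lists)
    preds = []
    for per_example in zip(*prob_lists):
        n_classes = len(per_example[0])
        avg = [sum(m[c] for m in per_example) / n_models for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


# Toy probabilities standing in for three models' outputs on two examples.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.7, 0.3]]
model_c = [[0.2, 0.8], [0.3, 0.7]]
ensemble_preds = soft_vote([model_a, model_b, model_c])
toy_score = weighted_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Averaging probabilities lets a confident model outweigh two lukewarm ones, which hard majority voting would not; whether that is preferable depends on how well calibrated the individual models are.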
