indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages

by   Kushal Kedia, et al.

The paper presents the submission of the team indicnlp@kgp to the EACL 2021 shared task "Offensive Language Identification in Dravidian Languages." The task aimed to classify different offensive content types in 3 code-mixed Dravidian language datasets. The work leverages existing state of the art approaches in text classification by incorporating additional data and transfer learning on pre-trained models. Our final submission is an ensemble of an AWD-LSTM based model along with 2 different transformer model architectures based on BERT and RoBERTa. We achieved weighted-average F1 scores of 0.97, 0.77, and 0.72 in the Malayalam-English, Tamil-English, and Kannada-English datasets ranking 1st, 2nd, and 3rd on the respective tasks.


page 1

page 2

page 3

page 4


LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?

This paper presents the different models submitted by the LT@Helsinki te...

A Feature Extraction based Model for Hate Speech Identification

The detection of hate speech online has become an important task, as off...

Comparing Approaches to Dravidian Language Identification

This paper describes the submissions by team HWR to the Dravidian Langua...

Mind Your Language: Abuse and Offense Detection for Code-Switched Languages

In multilingual societies like the Indian subcontinent, use of code-swit...

AIR-JPMC@SMM4H'22: Classifying Self-Reported Intimate Partner Violence in Tweets with Multiple BERT-based Models

This paper presents our submission for the SMM4H 2022-Shared Task on the...

Towards Trustworthy Deception Detection: Benchmarking Model Robustness across Domains, Modalities, and Languages

Evaluating model robustness is critical when developing trustworthy mode...

Please sign up or login with your details

Forgot password? Click here to reset