IIITG-ADBU@HASOC-Dravidian-CodeMix-FIRE2020: Offensive Content Detection in Code-Mixed Dravidian Text

07/29/2021
by Arup Baruah, et al.

This paper presents the results obtained by our SVM- and XLM-RoBERTa-based classifiers in the Dravidian-CodeMix-HASOC 2020 shared task. The SVM classifier, trained on TF-IDF features of character and word n-grams, performed best on the code-mixed Malayalam text, obtaining weighted F1 scores of 0.95 (1st rank) on the YouTube dataset and 0.76 (3rd rank) on the Twitter dataset. The XLM-RoBERTa-based classifier performed best on the code-mixed Tamil text, obtaining a weighted F1 score of 0.87 (3rd rank) on the code-mixed Tamil Twitter dataset.
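The scores above are weighted F1 values. As a minimal sketch of how that metric is commonly computed (per-class F1 averaged with weights proportional to each class's share of the true labels), the following may help. The function name `weighted_f1` and the label strings `"OFF"` and `"NOT"` are illustrative assumptions, not taken from the paper, and the task's official scorer may differ in detail.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Hypothetical helper for illustration; not the shared task's scorer.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1

    return score


# Toy labels (assumed for illustration only)
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
print(round(weighted_f1(gold, pred), 4))
```

Because each class is weighted by its support, a classifier that does well on the majority class can post a high weighted F1 even when the minority class is handled poorly, which is worth keeping in mind when comparing scores across datasets with different class balance.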


