Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts

11/26/2022
by Atnafu Lambebo Tonja, et al.
Code-mixed data is currently receiving considerable attention in natural language processing (NLP) research. Language identification in code-mixed social media text has become an active research problem in recent years, driven by the growing influence of social media on communication. This paper describes the system submitted by the Instituto Politécnico Nacional, Centro de Investigación en Computación (CIC) team to the CoLI-Kanglish shared task at ICON2022. We propose a Transformer-based model for word-level language identification in code-mixed Kannada-English texts. On the CoLI-Kenglish dataset, the proposed model achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
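
The gap between the weighted and macro F1-scores reported above is typical of label sets with uneven class frequencies: macro F1 averages per-label F1 with equal weight, while weighted F1 weights each label by its number of gold tokens. The following is a minimal sketch of both averages, not the authors' evaluation code. The tag names (`kn`, `en`, `mixed`) and the toy token labels are assumptions made up for illustration and are not taken from the CoLI-Kenglish dataset.

```python
from collections import Counter


def f1_scores(gold, pred):
    """Return (macro_f1, weighted_f1) for two equal-length label lists."""
    labels = sorted(set(gold) | set(pred))
    support = Counter(gold)  # number of gold tokens per label
    per_label = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs
        per_label[label] = 2 * tp / denom if denom else 0.0
    # Macro: every label counts equally, however rare it is.
    macro = sum(per_label.values()) / len(labels)
    # Weighted: each label's F1 is scaled by its share of gold tokens.
    weighted = sum(per_label[l] * support[l] for l in labels) / len(gold)
    return macro, weighted


# Hypothetical word-level tags: six Kannada tokens, three English tokens,
# and one rare "mixed" token that the tagger mislabels as Kannada.
gold = ["kn"] * 6 + ["en"] * 3 + ["mixed"]
pred = ["kn"] * 6 + ["en"] * 3 + ["kn"]

macro, weighted = f1_scores(gold, pred)
print(f"macro F1 = {macro:.2f}, weighted F1 = {weighted:.2f}")
```

In this toy case a single missed rare label pulls the macro score well below the weighted score, because the rare label contributes one third of the macro average but only one tenth of the weighted one.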
