Two-stage Pipeline for Multilingual Dialect Detection

03/06/2023
by Ankit Vaidya, et al.

Dialect identification is a crucial task for localizing large language models. This paper outlines our approach to the VarDial 2023 shared task. The task requires identifying three or two dialects for each of three languages, which yields a 9-way classification for Track-1 and a 6-way classification for Track-2, respectively. Our proposed approach consists of a two-stage system and outperforms other participants' systems and previous works in this domain. We achieve a score of 58.54. Our code is publicly available (https://github.com/ankit-vaidya19/EACL_VarDial2023).
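
The abstract does not say what the two stages are. The following is a minimal sketch under the assumption that stage one identifies the language and stage two picks a dialect within that language. The label names, the `make_two_stage` helper, and the toy marker-based classifiers are hypothetical stand-ins for illustration, not the authors' implementation.

```python
from typing import Callable, Dict, List

Classifier = Callable[[str], str]

# Hypothetical label inventory: 3 languages x 3 dialect labels = 9 classes.
LABELS: Dict[str, List[str]] = {
    "lang1": ["lang1-A", "lang1-B", "lang1-C"],
    "lang2": ["lang2-A", "lang2-B", "lang2-C"],
    "lang3": ["lang3-A", "lang3-B", "lang3-C"],
}


def make_two_stage(language_id: Classifier,
                   dialect_models: Dict[str, Classifier]) -> Classifier:
    """Compose a language identifier with per-language dialect classifiers."""
    def predict(text: str) -> str:
        language = language_id(text)            # stage 1: pick the language
        return dialect_models[language](text)   # stage 2: pick the dialect within it
    return predict


# Toy stand-ins for trained models: they only look for made-up marker tokens.
def toy_language_id(text: str) -> str:
    for language in LABELS:
        if language in text:
            return language
    return "lang1"  # arbitrary fallback for the toy example


def make_toy_dialect_model(language: str) -> Classifier:
    def classify(text: str) -> str:
        for label in LABELS[language]:
            if label in text:
                return label
        return LABELS[language][0]  # arbitrary fallback for the toy example
    return classify


pipeline = make_two_stage(
    toy_language_id,
    {language: make_toy_dialect_model(language) for language in LABELS},
)

num_classes = sum(len(labels) for labels in LABELS.values())
print(num_classes)                              # 9
print(pipeline("some text lang2-B more text"))  # lang2-B
```

Routing through a language-level decision first means each second-stage model only has to separate the labels of one language, which is one plausible motivation for a two-stage design.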

