Challenges and Considerations with Code-Mixed NLP for Multilingual Societies

06/15/2021
by Vivek Srivastava, et al.

Multilingualism refers to a high degree of proficiency in two or more languages in both written and oral communication. It often results in language mixing, a.k.a. code-mixing, in which a multilingual speaker switches between languages within a single utterance of text or speech. This paper discusses the current state of NLP research, its limitations, and foreseeable pitfalls in addressing five real-world applications for social good in multilingual societies: crisis management, healthcare, political campaigning, fake news, and hate speech. We also propose futuristic datasets, models, and tools that can significantly advance current research in multilingual NLP applications for societal good. As a representative example, we consider English-Hindi code-mixing but draw similar inferences for other language pairs.

