Recurrent-Neural-Network for Language Detection on Twitter Code-Switching Corpus

12/14/2014
by Joseph Chee Chang, et al.
Mixed-language data is one of the more difficult yet less explored domains of natural language processing. Most research in fields such as machine translation or sentiment analysis assumes monolingual input. However, people who can use more than one language often communicate in several languages at the same time. Sociolinguists believe this "code-switching" phenomenon to be socially motivated, for example to express solidarity or to establish authority. Most past work depends on external tools or resources, such as part-of-speech taggers, dictionary look-up, or named-entity recognizers, to extract rich features for training machine learning models. In this paper, we train recurrent neural networks with only raw features, and use word embeddings to automatically learn meaningful representations. Using the same mixed-language Twitter corpus, our system outperforms the best SVM-based systems reported in the EMNLP'14 Code-Switching Workshop by 1% in accuracy, or by 17% in error rate reduction.
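To make the general idea concrete, the sketch below runs the forward pass of a small Elman-style recurrent network that looks up a word embedding for each token and emits a per-token distribution over language labels. It is an illustration only and not the authors' architecture: the label set, the layer sizes, the class name `TinyElmanTagger`, and the example sentence are all assumptions, and the weights are random and untrained, so the probabilities carry no meaning.

```python
import math
import random

# Hypothetical label set; the corpus's actual tag inventory is not given here.
LABELS = ["lang1", "lang2", "other"]


def make_matrix(rows, cols, rng):
    """Small random matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyElmanTagger:
    """Untrained Elman RNN: embedding lookup, tanh recurrence, softmax output."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.index = {word: i for i, word in enumerate(vocab)}
        self.unk = len(vocab)  # extra embedding row for unseen words
        self.hidden_dim = hidden_dim
        self.E = make_matrix(len(vocab) + 1, embed_dim, rng)   # word embeddings
        self.W_xh = make_matrix(hidden_dim, embed_dim, rng)    # input -> hidden
        self.W_hh = make_matrix(hidden_dim, hidden_dim, rng)   # hidden -> hidden
        self.W_hy = make_matrix(len(LABELS), hidden_dim, rng)  # hidden -> labels

    def forward(self, tokens):
        """Return one probability distribution over LABELS per token."""
        h = [0.0] * self.hidden_dim
        outputs = []
        for token in tokens:
            x = self.E[self.index.get(token.lower(), self.unk)]
            pre = [a + b for a, b in zip(matvec(self.W_xh, x), matvec(self.W_hh, h))]
            h = [math.tanh(p) for p in pre]  # hidden state carries left context
            outputs.append(softmax(matvec(self.W_hy, h)))
        return outputs


# Made-up mixed-language sentence, used only to exercise the code.
tokens = "I love tacos de pescado".split()
model = TinyElmanTagger(vocab=[t.lower() for t in tokens])
probs = model.forward(tokens)
```

In a trained system the embedding matrix and the three weight matrices would be learned jointly from labeled tweets, which is how the representation can be learned from raw input rather than from hand-built features.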

research
03/11/2018

Preparing Bengali-English Code-Mixed Corpus for Sentiment Analysis of Indian Languages

Analysis of informative contents and sentiments of social users has been...
research
04/26/2020

GLUECoS : An Evaluation Benchmark for Code-Switched NLP

Code-switching is the use of more than one language in the same conversa...
research
09/11/2019

From English to Code-Switching: Transfer Learning with Strong Morphological Clues

Code-switching is still an understudied phenomenon in natural language p...
research
09/07/2020

NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier

Code-switching is a phenomenon in which two or more languages are used i...
research
04/03/2019

Subword-Level Language Identification for Intra-Word Code-Switching

Language identification for code-switching (CS), the phenomenon of alter...
research
12/19/2022

The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges

Code-Switching, a common phenomenon in written text and conversation, ha...
research
11/29/2019

Sentiment Analysis of German Twitter

This thesis explores the ways by how people express their opinions on Ge...
