Language Identification of Bengali-English Code-Mixed data using Character & Phonetic based LSTM Models

03/10/2018
by Soumil Mandal, et al.

Language identification of social media text remains a challenging task due to properties such as code-mixing and inconsistent phonetic transliteration. In this paper, we present a supervised learning approach for word-level language identification of low-resource Bengali-English code-mixed data taken from social media. We employ two methods of word encoding, namely character-based and root-phone-based, to train our deep LSTM models. Using these two models, we created two ensemble models using stacking and threshold techniques, which gave 91.78 data.
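As a rough illustration of two ideas named in the abstract, the sketch below shows one possible character-based word encoding and one possible threshold rule for combining two word-level classifiers. Everything in it is an assumption made for illustration: the abstract does not specify the alphabet, the padding length, the threshold value, or the fallback rule, and the names `encode_word` and `threshold_ensemble` are hypothetical rather than taken from the paper. The probabilities stand in for the outputs of the character-based and root-phone-based LSTM models.

```python
from typing import List

# Assumed alphabet for romanized text; index 0 is shared by padding and unknown characters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_word(word: str, max_len: int = 12) -> List[int]:
    """Map a word to a fixed-length list of character ids (truncate, then right-pad with 0)."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in word.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def threshold_ensemble(p_char: float, p_phone: float, threshold: float = 0.8) -> str:
    """Combine two models' P(word is Bengali) into a label, 'bn' or 'en'.

    Hypothetical rule: if the character model is confident (its winning-class
    probability reaches the threshold), trust it alone; otherwise average the
    two models' probabilities.
    """
    char_confidence = max(p_char, 1.0 - p_char)
    if char_confidence >= threshold:
        p_bengali = p_char
    else:
        p_bengali = (p_char + p_phone) / 2.0
    return "bn" if p_bengali >= 0.5 else "en"


if __name__ == "__main__":
    print(encode_word("ami", max_len=5))
    print(threshold_ensemble(p_char=0.6, p_phone=0.2))
```

A stacking ensemble would instead feed both models' outputs into a second, trained classifier; the threshold variant shown here needs no extra training, only a tuned cutoff.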


research
11/23/2020

Evaluating Input Representation for Language Identification in Hindi-English Code Mixed Text

Natural language processing (NLP) techniques have become mainstream in t...
research
08/10/2016

Hierarchical Character-Word Models for Language Identification

Social media messages' brevity and unconventional spelling pose a challe...
research
10/16/2018

Strategies for Language Identification in Code-Mixed Low Resource Languages

In the recent years, substantial work has been done on language tagging ...
research
06/11/2018

Automatic Target Recovery for Hindi-English Code Mixed Puns

In order for our computer systems to be more human-like, with a higher e...
research
08/21/2018

Language Identification in Code-Mixed Data using Multichannel Neural Networks and Context Capture

An accurate language identification tool is an absolute necessity for bu...
research
06/12/2018

An Ensemble Model for Sentiment Analysis of Hindi-English Code-Mixed Data

In multilingual societies like India, code-mixed social media texts comp...
research
05/22/2018

Normalization of Transliterated Words in Code-Mixed Data Using Seq2Seq Model & Levenshtein Distance

Building tools for code-mixed data is rapidly gaining popularity in the ...
