A Multi-cascaded Deep Model for Bilingual SMS Classification

11/29/2019
by Muhammad Haroon Shakeel, et al.

Most studies on text classification focus on the English language. However, short texts such as SMS are influenced by regional languages. This makes automatic text classification challenging because the language in such texts is multilingual, informal, and noisy. In this work, we propose a novel multi-cascaded deep learning model called McM for bilingual SMS classification. McM exploits n-gram level information as well as long-term dependencies of text for learning. Our approach aims to learn a model without any code-switching indication, lexical normalization, language translation, or language transliteration. The model relies entirely upon the text, as no external knowledge base is utilized for learning. For this purpose, a 12-class bilingual text dataset is developed from citizens' SMS feedback on public services, containing mixed Roman Urdu and English. Our model achieves high classification accuracy on this dataset and outperforms the previous model for multilingual text classification, highlighting the language independence of McM.
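To make "n-gram level information" concrete, here is a minimal sketch of word n-gram extraction from a code-mixed message. It is not the McM architecture, which the abstract does not specify in detail. The function name `word_ngrams` and the sample message are hypothetical and chosen only for illustration. The text is used as-is, with no normalization, translation, or language tagging, in the spirit of the approach described above.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a whitespace-tokenized text."""
    tokens = text.lower().split()
    # A text with fewer than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical code-mixed message (Roman Urdu + English), assumed for illustration.
sms = "doctor ka behavior bohat acha tha thank you"

bigrams = word_ngrams(sms, 2)
trigram_counts = Counter(word_ngrams(sms, 3))

print(bigrams[:3])
print(len(trigram_counts))
```

N-grams like these capture only local word order. Modeling long-term dependencies needs a component that reads the whole sequence, such as a recurrent layer, which this sketch does not attempt.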


research
12/03/2021

Multilingual Text Classification for Dravidian Languages

As the fourth largest language family in the world, the Dravidian langua...
research
01/04/2020

Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text

Nowadays, an abundance of short text is being generated that uses nonsta...
research
10/26/2017

ALL-IN-1: Short Text Classification with One Model for All Languages

We present ALL-IN-1, a simple model for multilingual text classification...
research
05/15/2023

Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages

While natural language processing tools have been developed extensively ...
research
04/01/2020

An Improved Classification Model for Igbo Text Using N-Gram And K-Nearest Neighbour Approaches

This paper presents an improved classification model for Igbo text using...
research
03/14/2023

Optimizing Deep Learning Model Parameters with the Bees Algorithm for Improved Medical Text Classification

This paper introduces a novel mechanism to obtain the optimal parameters...
research
04/07/2020

From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture

A lot of effort is currently made to provide methods to analyze and unde...
