Deep Learning for Hindi Text Classification: A Comparison

01/19/2020
by   Ramchandra Joshi, et al.

Natural Language Processing (NLP), and especially natural language text analysis, has seen great advances in recent times. The use of deep learning has revolutionized text processing techniques and achieved remarkable results. Different deep learning architectures such as CNN, LSTM, and the very recent Transformer have been used to achieve state-of-the-art results on a variety of NLP tasks. In this work, we survey a host of deep learning architectures for text classification tasks. The work is specifically concerned with the classification of Hindi text. Research on the classification of Hindi, a morphologically rich, low-resource language written in the Devanagari script, has been limited by the absence of a large labeled corpus. In this work, we use translated versions of English datasets to evaluate models based on CNN, LSTM, and Attention. Multilingual pre-trained sentence embeddings based on BERT and LASER are also compared to evaluate their effectiveness for the Hindi language. The paper also serves as a tutorial on popular text classification techniques.

