Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages

11/04/2020
by Kushal Jain, et al.

Language models based on the Transformer architecture have achieved state-of-the-art performance on a wide range of NLP tasks such as text classification, question answering, and token classification. However, this performance is usually tested and reported on high-resource languages such as English, French, Spanish, and German. Indian languages, on the other hand, are underrepresented in such benchmarks. Although some Indian languages are included in the training of multilingual Transformer models, they have not been the primary focus of such work. To evaluate performance on Indian languages specifically, we analyze these language models through extensive experiments on multiple downstream tasks in Hindi, Bengali, and Telugu. We compare the efficacy of fine-tuning the parameters of pre-trained models against that of training a language model from scratch. Moreover, we argue empirically against a strict dependency between dataset size and model performance, and instead encourage task-specific model and method selection. We achieve state-of-the-art performance on the text classification task in Hindi and Bengali. Finally, we present effective strategies for modeling Indian languages and release our model checkpoints for the community: https://huggingface.co/neuralspace-reverie.
