idT5: Indonesian Version of Multilingual T5 Transformer

02/02/2023
by Mukhlish Fuadi, et al.

Indonesian is spoken by almost 200 million people and is the 10th most spoken language in the world, yet it is under-represented in NLP (Natural Language Processing) research; a sparsity of language resources has hampered previous work on Indonesian. The Transformer is an architecture that is rapidly becoming dominant in NLP, surpassing alternatives such as convolutional and recurrent neural networks. T5 (Text-to-Text Transfer Transformer) is a Transformer model that casts all text-based language problems into a text-to-text format for English. Its multilingual variant, mT5 (multilingual T5), has shown promising results on many NLP tasks across languages. However, the size of this multilingual model is a drawback for its use in real production applications, which sometimes require only one language. In this study, the mT5 model was adapted to a single language, Indonesian, resulting in a pre-trained T5 model specific to Indonesian with a smaller size. For performance comparison, we fine-tuned both this model and the mT5 model on the Sentiment Analysis (SA), Question Generation (QG), and Question Answering (QA) tasks with the same mechanism and dataset. The fine-tuned model based on our model achieved 77.18% accuracy on SA, higher than the mT5-based model, and obtained nearly the same scores as the mT5-based model on QG and QA. The results confirm that it is possible to produce a smaller pre-trained model that maintains comparable performance while reducing the model size by up to 58%, with faster loading and inference times.
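The abstract does not spell out how the mT5 model was reduced to an Indonesian-only model. One common way to shrink a multilingual pre-trained model is to keep only the vocabulary entries actually used by the target-language corpus and re-index the embedding matrix accordingly, since the shared multilingual vocabulary accounts for a large share of mT5's parameters. The sketch below is a minimal, hypothetical illustration of that idea (the function name and toy data are not from the paper):

```python
def trim_embeddings(embedding, keep_token_ids):
    """Build a smaller embedding table containing only the kept tokens.

    embedding      : list of embedding vectors, indexed by old token id
    keep_token_ids : ids of tokens observed in the target-language corpus

    Returns the trimmed table and a mapping old_id -> new_id, which would
    be used to re-index the tokenizer alongside the model weights.
    """
    keep = sorted(set(keep_token_ids))
    new_embedding = [embedding[i] for i in keep]   # rows for retained tokens only
    old_to_new = {old: new for new, old in enumerate(keep)}
    return new_embedding, old_to_new

# Toy illustration: a 10-token vocabulary, of which only 4 tokens
# appear in the (hypothetical) Indonesian corpus.
vocab = [[float(i)] * 4 for i in range(10)]
small, mapping = trim_embeddings(vocab, keep_token_ids=[0, 3, 7, 9])

print(len(small))    # 4  -- 60% fewer rows than the original table
print(mapping[7])    # 2  -- token 7 now sits at row 2
```

In a real adaptation the kept ids would come from tokenizing an Indonesian corpus with the mT5 tokenizer, and the trimmed table would replace the model's input and output embedding matrices; everything else in the network is unchanged, which is why downstream scores can stay close to the original model's.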

