IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding

by Bryan Wilie, et al.

Although Indonesian is known to be the fourth most frequently used language on the internet, research progress in natural language processing (NLP) for the language has been slow due to a lack of available resources. In response, we introduce the first large-scale resource for training, evaluating, and benchmarking Indonesian natural language understanding (IndoNLU) tasks. IndoNLU comprises twelve tasks of varying complexity, ranging from single-sentence classification to sentence-pair sequence labeling. The datasets span different domains and styles to ensure task diversity. We also provide a set of Indonesian pre-trained models (IndoBERT) trained on Indo4B, a large and clean Indonesian corpus collected from publicly available sources such as social media texts, blogs, news, and websites. We release baseline models for all twelve tasks, as well as a framework for benchmark evaluation, enabling everyone to benchmark their system performance.

