UzBERT: pretraining a BERT model for Uzbek

08/22/2021
by B. Mansurov, et al.

Pretrained language models based on the Transformer architecture have achieved state-of-the-art results in various natural language processing tasks such as part-of-speech tagging, named entity recognition, and question answering. However, no such monolingual model for the Uzbek language is publicly available. In this paper, we introduce UzBERT, a pretrained Uzbek language model based on the BERT architecture. Our model greatly outperforms multilingual BERT on masked language model accuracy. We make the model publicly available under the MIT open-source license.
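Although the abstract does not include code, a publicly released BERT checkpoint of this kind can be exercised directly as a masked language model. The minimal sketch below assumes the checkpoint is distributed on the Hugging Face Hub; the model identifier coppercitylabs/uzbert-base-uncased and the example Uzbek sentence are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: masked-token prediction with a pretrained Uzbek BERT.
# Assumption: the checkpoint is available on the Hugging Face Hub under
# an identifier such as "coppercitylabs/uzbert-base-uncased".
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="coppercitylabs/uzbert-base-uncased",
)

# Build an Uzbek sentence with the tokenizer's own mask token,
# e.g. "Toshkent O‘zbekistonning [MASK] hisoblanadi."
# ("Tashkent is the [MASK] of Uzbekistan.")
text = f"Toshkent O‘zbekistonning {fill_mask.tokenizer.mask_token} hisoblanadi."

# Print the top candidate tokens and their scores for the masked position.
for prediction in fill_mask(text):
    print(prediction["token_str"], round(prediction["score"], 3))
```

In the same way, multilingual BERT can be loaded through the identical pipeline call for a side-by-side comparison of masked-token predictions, which is the kind of masked language model accuracy comparison the paper reports.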


Related research

EstBERT: A Pretrained Language-Specific BERT for Estonian (11/09/2020)
This paper presents EstBERT, a large pretrained transformer-based langua...

Contextual Text Embeddings for Twi (03/29/2021)
Transformer-based language models have been changing the modern Natural ...

KoreALBERT: Pretraining a Lite BERT Model for Korean Language Understanding (01/27/2021)
A Lite BERT (ALBERT) has been introduced to scale up deep bidirectional ...

Breaking Character: Are Subwords Good Enough for MRLs After All? (04/10/2022)
Large pretrained language models (PLMs) typically tokenize the input str...

Table Search Using a Deep Contextualized Language Model (05/19/2020)
Pretrained contextualized language models such as BERT have achieved imp...

Pretrained Transformers for Simple Question Answering over Knowledge Graphs (01/31/2020)
Answering simple questions over knowledge graphs is a well-studied probl...

HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish (05/04/2021)
BERT-based models are currently used for solving nearly all Natural Lang...