A Family of Pretrained Transformer Language Models for Russian

09/19/2023
by Dmitry Zmitrovich, et al.

Transformer language models (LMs) are now a fundamental component of NLP research methodologies and applications. However, the development of such models specifically for the Russian language has received little attention. This paper presents a collection of 13 Russian Transformer LMs based on encoder (ruBERT, ruRoBERTa, ruELECTRA), decoder (ruGPT-3), and encoder-decoder (ruT5, FRED-T5) architectures, in multiple sizes. All models are readily available via the HuggingFace platform. We report on the model architecture design and pretraining, and present the results of evaluating the models' generalization abilities on Russian natural language understanding and generation datasets and benchmarks. By pretraining and releasing these specialized Transformer LMs, we hope to broaden the scope of NLP research directions and enable the development of industrial solutions for the Russian language.
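Since the models are distributed through HuggingFace, they can be loaded with the standard transformers API. The sketch below loads one of the encoder checkpoints and fills in a masked token in a Russian sentence; note that the checkpoint identifier "ai-forever/ruBert-base" is an assumption not stated in this abstract, so consult the authors' HuggingFace organization for the exact released names.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed checkpoint name; the abstract does not list exact HuggingFace IDs.
model_name = "ai-forever/ruBert-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# Fill in the masked token ("Moscow is the capital of [MASK].").
text = "Москва является столицей [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring token.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```

The same from_pretrained pattern applies to the decoder and encoder-decoder models, swapping in AutoModelForCausalLM or AutoModelForSeq2SeqLM as appropriate for the checkpoint.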


