GPT-SW3: An Autoregressive Language Model for the Nordic Languages

05/22/2023
by Ariel Ekgren, et al.

This paper details the process of developing the first native large generative language model for the Nordic languages, GPT-SW3. We cover all parts of the development process, from data collection and processing, through training configuration and instruction finetuning, to evaluation and considerations for release strategies. We hope that this paper can serve as a guide and reference for other researchers who undertake the development of large generative models for smaller languages.

research
03/30/2023

The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling

Pre-training Large Language Models (LLMs) require massive amounts of tex...
research
04/20/2023

Phoenix: Democratizing ChatGPT across Languages

This paper presents our efforts to democratize ChatGPT across language. ...
research
04/21/2021

Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?

Most work in NLP makes the assumption that it is desirable to develop so...
research
04/11/2022

Adapting BigScience Multilingual Model to Unseen Languages

We benchmark different strategies of adding new languages (German and Ko...
research
08/04/2018

Language Model Supervision for Handwriting Recognition Model Adaptation

Training state-of-the-art offline handwriting recognition (HWR) models r...
research
08/16/2023

FootGPT : A Large Language Model Development Experiment on a Minimal Setting

With recent empirical observations, it has been argued that the most sig...
research
09/04/2018

Random Language Model: a path to principled complexity

Many complex generative systems use languages to create structured objec...
