SweCTRL-Mini: a data-transparent Transformer-based large language model for controllable text generation in Swedish

04/27/2023
by Dmytro Kalpakchi et al.

We present SweCTRL-Mini, a large Swedish language model that can be used for inference and fine-tuning on a single consumer-grade GPU. The model is based on the CTRL architecture by Keskar, McCann, Varshney, Xiong, and Socher (2019), which means that users of the SweCTRL-Mini model can control the genre of the generated text by inserting special tokens in the generation prompts. SweCTRL-Mini is trained on a subset of the Swedish part of the mC4 corpus and a set of Swedish novels. In this article, we provide (1) a detailed account of the utilized training data and text pre-processing steps, to the extent that it is possible to check whether a specific phrase/source was a part of the training data, and (2) an evaluation of the model on both discriminative tasks, using automatic evaluation methods, and generative tasks, using human referees. We also compare the generative capabilities of the model with those of GPT-3. SweCTRL-Mini is fully open and available for download.
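The CTRL-style control mechanism described above works by prepending a special genre token to the generation prompt. The sketch below illustrates this prompt construction; the control code names (":nyheter:", ":wiki:") and the model identifier are hypothetical placeholders, not SweCTRL-Mini's actual inventory, so check the official release for the real tokens.

```python
# Minimal sketch of CTRL-style controllable generation: the genre is
# steered by inserting a control token at the start of the prompt.
# Control code names here are illustrative assumptions only.

def build_prompt(control_code: str, text: str) -> str:
    """Prepend a CTRL-style control token to the user's prompt text."""
    return f"{control_code} {text}"

# Example: ask for news-style ("nyheter") continuation of a Swedish prompt.
prompt = build_prompt(":nyheter:", "Stockholm är")

# The resulting prompt would then be passed to the model, e.g. via the
# Hugging Face transformers generate() API (model id assumed, verify
# against the official download page):
#
#   from transformers import AutoTokenizer, AutoModelForCausalLM
#   tok = AutoTokenizer.from_pretrained("<swectrl-mini-model-id>")
#   model = AutoModelForCausalLM.from_pretrained("<swectrl-mini-model-id>")
#   out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=50)
```

Because the control signal is just a token in the input sequence, no architectural change is needed at inference time; swapping ":nyheter:" for another code changes the genre of the continuation.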


Related research

09/11/2019 · CTRL: A Conditional Transformer Language Model for Controllable Generation
Large-scale language models show promising text generation capabilities, ...

03/05/2020 · RecipeGPT: Generative Pre-training Based Cooking Recipe Generation and Evaluation System
Interests in the automatic generation of cooking recipes have been growi...

06/18/2022 · Collocation2Text: Controllable Text Generation from Guide Phrases in Russian
Large pre-trained language models are capable of generating varied and f...

06/05/2023 · Information Flow Control in Machine Learning through Modular Model Architecture
In today's machine learning (ML) models, any part of the training data c...

06/17/2020 · Automatically Ranked Russian Paraphrase Corpus for Text Generation
The article is focused on automatic development and ranking of a large c...

05/24/2022 · PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation
Formal verse poetry imposes strict constraints on the meter and rhyme sc...

10/12/2020 · Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data
Neural text generation (data- or text-to-text) demonstrates remarkable p...
