Cedille: A large autoregressive French language model

02/07/2022
by Martin Müller, et al.

Scaling up the size and training of autoregressive language models has enabled novel ways of solving Natural Language Processing tasks using zero-shot and few-shot learning. While extreme-scale language models such as GPT-3 offer multilingual capabilities, zero-shot learning for languages other than English remains largely unexplored. Here, we introduce Cedille, a large open-source autoregressive language model trained specifically for the French language. Our results show that Cedille outperforms existing French language models and is competitive with GPT-3 on a range of French zero-shot benchmarks. Furthermore, we provide an in-depth comparison of the toxicity exhibited by these models, showing that Cedille marks an improvement in language model safety thanks to dataset filtering.

research
12/20/2021

Few-shot Learning with Multilingual Language Models

Large-scale autoregressive language models such as GPT-3 are few-shot le...
research
10/19/2022

TabLLM: Few-shot Classification of Tabular Data with Large Language Models

We study the application of large language models to zero-shot and few-s...
research
05/17/2023

Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM's Translation Capability

Large, multilingual language models exhibit surprisingly good zero- or f...
research
12/21/2022

SERENGETI: Massively Multilingual Language Models for Africa

Multilingual language models (MLMs) acquire valuable, generalizable ling...
research
12/13/2021

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Scaling language models with more data, compute and parameters has drive...
research
10/21/2022

Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination

Large-scale pretrained language models have made significant advances in...
research
01/28/2022

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

Pretrained general-purpose language models can achieve state-of-the-art ...
