DeepAI
Log In Sign Up

AraGPT2: Pre-Trained Transformer for Arabic Language Generation

12/31/2020
by   Wissam Antoun, et al.
0

Recently, pretrained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic is still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on large Arabic corpora of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. We evaluate different size variants of AraGPT2 using the perplexity measure, where AraGPT2-mega achieves a perplexity of 29.8 on held-out articles from Wikipedia. Pretrained variants of AraGPT2 (base, medium, large, mega) are publicly available on https://github.com/aub-mind/arabert/aragpt2 hoping to encourage new research directions and applications for Arabic NLP.

READ FULL TEXT

page 1

page 2

page 3

page 4

02/28/2020

AraBERT: Transformer-based Model for Arabic Language Understanding

The Arabic language is a morphologically rich and complex language with ...
03/11/2021

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models

In this paper, we explore the effects of language variants, data sizes, ...
03/21/2022

AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization

Like most natural language understanding and generation tasks, state-of-...
12/31/2020

AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding

Advances in English language representation enabled a more sample-effici...
12/21/2022

ORCA: A Challenging Benchmark for Arabic Language Understanding

Due to their crucial role in all NLP, several benchmarks have been propo...
11/07/2022

CELLS: A Parallel Corpus for Biomedical Lay Language Generation

Recent lay language generation systems have used Transformer models trai...
07/05/2021

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling

Rap generation, which aims to produce lyrics and corresponding singing b...