gaBERT – an Irish Language Model

07/27/2021
by James Barry, et al.

The BERT family of neural language models has become highly popular due to its ability to provide sequences of text with rich context-sensitive token encodings that generalise well to many Natural Language Processing tasks. Over 120 monolingual BERT models covering over 50 languages have been released, as well as a multilingual model trained on 104 languages. We introduce gaBERT, a monolingual BERT model for the Irish language. We compare gaBERT to multilingual BERT and show that gaBERT provides better representations for a downstream parsing task. We also show how different filtering criteria, vocabulary size and the choice of subword tokenisation model affect downstream performance. We release gaBERT and related code to the community.

