German's Next Language Model

10/21/2020
by Branden Chan, et al.

In this work we present the experiments that led to the creation of our BERT- and ELECTRA-based German language models, GBERT and GELECTRA. By varying the input training data, model size, and the presence of Whole Word Masking (WWM), we were able to attain state-of-the-art (SoTA) performance across a set of document classification and named entity recognition (NER) tasks for both models of base and large size. We adopt an evaluation-driven approach in training these models, and our results indicate that both adding more data and utilizing WWM improve model performance. By benchmarking against existing German models, we show that these models are the best German models to date. Our trained models will be made publicly available to the research community.
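
To make the WWM variable concrete, here is a minimal sketch; it is not the authors' training code. It assumes WordPiece-style tokens in which a `##` prefix marks a continuation piece. The function name `whole_word_mask`, its parameters, and the example tokenization are hypothetical. Each word is masked independently with a fixed probability, and BERT's usual 80/10/10 replacement scheme is left out for brevity.

```python
import random


def whole_word_mask(tokens, mask_prob=0.15, seed=0, mask_token="[MASK]"):
    """Mask whole words in a list of WordPiece tokens (illustrative only)."""
    rng = random.Random(seed)

    # Group token indices into words: a "##" piece continues the previous word.
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])

    # Decide per word, then mask every piece of a selected word together.
    masked = list(tokens)
    for word in words:
        if rng.random() < mask_prob:
            for i in word:
                masked[i] = mask_token
    return masked


# Hypothetical tokenization of a short German phrase.
tokens = ["Sprach", "##modell", "##e", "sind", "nütz", "##lich"]
print(whole_word_mask(tokens, mask_prob=0.5, seed=1))
```

Without WWM, each sub-word piece would be selected on its own, so a model could see `Sprach` and `##e` while only `##modell` is hidden. With WWM, all three pieces are hidden or shown together.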


