Multilingual is not enough: BERT for Finnish

12/15/2019
by Antti Virtanen, et al.

Deep learning-based language models pretrained on large unannotated text corpora have been demonstrated to allow efficient transfer learning for natural language processing, with recent approaches such as the transformer-based BERT model advancing the state of the art across a variety of tasks. While most work on these models has focused on high-resource languages, in particular English, a number of recent efforts have introduced multilingual models that can be fine-tuned to address tasks in a large number of different languages. However, we still lack a thorough understanding of the capabilities of these models, in particular for lower-resourced languages. In this paper, we focus on Finnish and thoroughly evaluate the multilingual BERT model on a range of tasks, comparing it with a new Finnish BERT model trained from scratch. The new language-specific model is shown to systematically and clearly outperform the multilingual model. While the multilingual model largely fails to reach the performance of previously proposed methods, the custom Finnish BERT model establishes new state-of-the-art results on all corpora for all reference tasks: part-of-speech tagging, named entity recognition, and dependency parsing. We release the model and all related resources created for this study with open licenses at https://turkunlp.org/finbert.
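
For readers who want to experiment with the released model, the sketch below shows one plausible way to load it for a token classification task such as part-of-speech tagging. It is an illustration only, not the authors' training or evaluation code: the Hugging Face transformers library, the TurkuNLP/bert-base-finnish-cased-v1 checkpoint name, and the 17-label UPOS tag count are assumptions about how the released resources are typically consumed.

```python
# Minimal sketch (assumption: not the authors' code): loading the released
# Finnish BERT for token classification, e.g. part-of-speech tagging.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Checkpoint name as published on the Hugging Face hub; num_labels=17
# (the Universal Dependencies UPOS tag count) is a stand-in for whatever
# tag set the downstream task actually uses.
model_name = "TurkuNLP/bert-base-finnish-cased-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=17)

inputs = tokenizer("Helsinki on Suomen pääkaupunki.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

# The classification head is randomly initialized here, so these predictions
# are meaningless until the model is fine-tuned on labeled data
# (e.g. a Universal Dependencies Finnish treebank).
pred_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
print(list(zip(tokens, pred_ids)))
```

Fine-tuning the randomly initialized classification head on annotated Finnish data is what yields the task-specific results reported in the paper; the pretrained encoder weights supply the transferred language knowledge.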

Related research

EstBERT: A Pretrained Language-Specific BERT for Estonian (11/09/2020)
This paper presents EstBERT, a large pretrained transformer-based langua...

Classifying multilingual party manifestos: Domain transfer across country, time, and genre (07/31/2023)
Annotating costs of large corpora are still one of the main bottlenecks ...

Contextual Text Embeddings for Twi (03/29/2021)
Transformer-based language models have been changing the modern Natural ...

Multilingual Language Processing From Bytes (12/01/2015)
We describe an LSTM-based model which we call Byte-to-Span (BTS) that re...

WikiBERT models: deep transfer learning for many languages (06/02/2020)
Deep neural language models such as BERT have enabled substantial recent...

Playing with Words at the National Library of Sweden – Making a Swedish BERT (07/03/2020)
This paper introduces the Swedish BERT ("KB-BERT") developed by the KBLa...

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers (12/23/2021)
Running large-scale pre-trained language models in computationally const...
