What's so special about BERT's layers? A closer look at the NLP pipeline in monolingual and multilingual models

04/14/2020
by Wietse de Vries, et al.

Experiments with transfer learning on pre-trained language models such as BERT have shown that the layers of these models resemble the classical NLP pipeline, with progressively more complex tasks concentrated in later layers of the network. We investigate to what extent these results also hold for a language other than English. To this end, we probe a Dutch BERT-based model and the multilingual BERT model on Dutch NLP tasks. In addition, by considering part-of-speech tagging in more detail, we show that, even within a given task, information is spread over different parts of the network, so the pipeline might not be as neat as it seems. Each layer has different specialisations, so for best results it is useful to combine information from different layers rather than to select a single layer based on the best overall performance.
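
One common way to combine information from several layers in a probing setup is a softmax-weighted sum of the per-layer token vectors, sometimes called a scalar mix. The sketch below assumes that technique purely for illustration; the abstract does not say how layers are combined, and the names `softmax` and `mix_layers` as well as the toy vectors are hypothetical. In practice the per-layer scores would be learned together with the probing classifier rather than set by hand.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn unnormalised per-layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_vectors: Sequence[Vector],
               layer_scores: Sequence[float]) -> Vector:
    """Softmax-weighted sum of one token's vectors across layers.

    layer_vectors[k] is the token's representation at layer k
    (assumption: all layers share the same dimensionality).
    """
    if len(layer_vectors) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, layer_vectors))
        for i in range(dim)
    ]


# Toy data: 3 hypothetical layers, 2-dimensional token vectors.
token_layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal scores give every layer the same weight (a plain average).
uniform = mix_layers(token_layers, [0.0, 0.0, 0.0])

# One dominant score approximates selecting a single layer.
first_only = mix_layers(token_layers, [100.0, 0.0, 0.0])
```

Inspecting the learned weights after training gives a rough picture of which layers a task draws on, while the mixed vector lets the classifier use more than one layer at once.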


