How does the pre-training objective affect what large language models learn about linguistic properties?

03/20/2022
by Ahmed Alajrami, et al.

Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g. BERT) with the aim of learning better language representations. However, to the best of our knowledge, no previous work has investigated how different pre-training objectives affect what BERT learns about linguistic properties. We hypothesize that linguistically motivated objectives such as MLM should help BERT acquire better linguistic knowledge than non-linguistically motivated objectives, for which it is not intuitive or is hard for humans to guess the association between the input and the label to be predicted. To this end, we pre-train BERT with two linguistically motivated objectives and three non-linguistically motivated ones. We then probe for linguistic characteristics encoded in the representations of the resulting models. We find strong evidence that there are only small differences in probing performance between the representations learned by the two types of objectives. These surprising results question the dominant narrative of linguistically informed pre-training.
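To make the probing step concrete, here is a minimal sketch of a standard probing setup of the kind the abstract describes: freeze a pre-trained BERT encoder, extract token representations, and fit a lightweight linear classifier to predict a linguistic property such as part-of-speech tags. This is not the authors' exact pipeline; the model name, tag set, and toy data below are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # representations stay frozen; only the probe is trained

# Toy probing data: (words, per-word POS tags) -- purely illustrative.
sentences = [("the cat sleeps".split(), ["DET", "NOUN", "VERB"]),
             ("dogs bark loudly".split(), ["NOUN", "VERB", "ADV"])]

features, labels = [], []
for words, tags in sentences:
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    word_ids = enc.word_ids()
    # Represent each word by its first sub-token's hidden state.
    seen = set()
    for idx, wid in enumerate(word_ids):
        if wid is not None and wid not in seen:
            seen.add(wid)
            features.append(hidden[idx].numpy())
            labels.append(tags[wid])

# The probe itself: a simple linear classifier over the frozen representations.
probe = LogisticRegression(max_iter=1000).fit(features, labels)
print("probe accuracy on the toy data:", probe.score(features, labels))
```

In such a setup, differences in probing accuracy across encoders pre-trained with different objectives are taken as evidence of differences in the linguistic information encoded in their representations; only the small linear probe is trained, so the encoder weights themselves are never updated.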


