A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English

11/02/2020
by Marius Mosbach, et al.

Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge through sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon that requires contextual information and antecedent identification to be resolved. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks that target fine-grained linguistic knowledge, however, reveals pronounced model-specific weaknesses, especially on semantic knowledge, which strongly impact the models' performance. Our results highlight the importance of (a) model comparison in evaluation tasks and (b) building claims about model performance and the linguistic knowledge models capture on more than purely probing-based evaluations.
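The masked prediction setup mentioned in the abstract can be illustrated with a minimal sketch: mask the relative pronoun in a sentence containing a relative clause and inspect whether the model's top fillers agree with the antecedent. The sketch below assumes the Hugging Face transformers fill-mask pipeline and an invented example sentence; it is not the paper's actual dataset, stimuli, or evaluation code.

```python
# Minimal sketch of a masked-prediction probe for relative clauses.
# Assumptions: the Hugging Face `transformers` library is installed and the
# example sentence is purely illustrative (not from the paper's dataset).
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Mask the relative pronoun; a model with the relevant grammatical and
# semantic knowledge should prefer "who" for the animate antecedent "lawyer".
sentence = "The lawyer [MASK] represented the company won the case."
for prediction in unmasker(sentence, top_k=5):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```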


