Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

09/17/2020
by Ieva Staliūnaitė, et al.

Many NLP tasks have benefited from transferring knowledge from contextualized word embeddings; however, the picture of what type of knowledge is transferred remains incomplete. This paper studies the types of linguistic phenomena accounted for by language models in the context of a Conversational Question Answering (CoQA) task. Through systematic error analysis, we identify the problematic areas for the fine-tuned RoBERTa, BERT and DistilBERT models: basic arithmetic (counting phrases), compositional semantics (negation and Semantic Role Labeling), and lexical semantics (surprisal and antonymy). When enhanced with the relevant linguistic knowledge through multitask learning, the models improve in performance. Ensembles of the enhanced models yield a boost of 2.2 to 2.7 points in overall F1 score, and up to 42.1 points in F1 on the hardest question classes. The results reveal differences in the ability of RoBERTa, BERT and DistilBERT to represent compositional and lexical information.
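
As a concrete illustration of the multitask setup the abstract describes, the sketch below pairs a shared RoBERTa encoder with a QA span head and an auxiliary token-level head, and also shows one simple way to ensemble several such models by averaging their span logits. This is a minimal sketch under stated assumptions, not the authors' released code: the model name, the choice of negation-scope tagging as the auxiliary task, the aux_weight loss coefficient, and the logit-averaging ensemble are all illustrative assumptions.

```python
# Minimal multitask sketch (illustrative, not the paper's exact code):
# a shared RoBERTa encoder feeds a QA span head and an auxiliary
# token-level head (e.g., hypothetical negation-scope tags).
import torch
import torch.nn as nn
from transformers import RobertaModel

class MultitaskQA(nn.Module):
    def __init__(self, model_name="roberta-base", num_aux_labels=2, aux_weight=0.5):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.qa_head = nn.Linear(hidden, 2)                # start/end logits
        self.aux_head = nn.Linear(hidden, num_aux_labels)  # per-token aux tags
        self.aux_weight = aux_weight                       # assumed mixing weight

    def forward(self, input_ids, attention_mask,
                start_positions=None, end_positions=None, aux_labels=None):
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.qa_head(states).split(1, dim=-1)
        start_logits = start_logits.squeeze(-1)
        end_logits = end_logits.squeeze(-1)
        aux_logits = self.aux_head(states)

        loss = None
        if start_positions is not None:
            ce = nn.CrossEntropyLoss()
            qa_loss = (ce(start_logits, start_positions)
                       + ce(end_logits, end_positions)) / 2
            aux_loss = ce(aux_logits.view(-1, aux_logits.size(-1)),
                          aux_labels.view(-1))
            loss = qa_loss + self.aux_weight * aux_loss
        return loss, start_logits, end_logits, aux_logits

# Ensembling sketch: average span logits across enhanced models before
# taking the argmax (one simple scheme; the paper's may differ).
def ensemble_spans(models, input_ids, attention_mask):
    with torch.no_grad():
        outs = [m(input_ids, attention_mask) for m in models]
    start = torch.stack([o[1] for o in outs]).mean(0)
    end = torch.stack([o[2] for o in outs]).mean(0)
    return start.argmax(-1), end.argmax(-1)
```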

Related research

10/23/2020
GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method
Large pre-trained language models such as BERT have been the driving for...

03/12/2021
Explaining and Improving BERT Performance on Lexical Semantic Change Detection
Type- and token-based embedding architectures are still competing in lex...

10/19/2021
Ensemble ALBERT on SQuAD 2.0
Machine question answering is an essential yet challenging task in natur...

04/29/2021
Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Pre-trained language models (LMs) encode rich information about linguist...

10/12/2020
Probing Pretrained Language Models for Lexical Semantics
The success of large pretrained language models (LMs) such as BERT and R...

03/02/2021
Decomposing lexical and compositional syntax and semantics with deep language models
The activations of language transformers like GPT2 have been shown to li...

12/23/2019
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
This study probes the phonetic and phonological knowledge of lexical ton...