Larger Probes Tell a Different Story: Extending Psycholinguistic Datasets Via In-Context Learning

03/29/2023
by Namrata Shivagunde, et al.

Language model probing is often used to test specific capabilities of these models. However, conclusions from such studies may be limited when the probing benchmarks are small and lack statistical power. In this work, we introduce new, larger datasets for negation (NEG-1500-SIMP) and role reversal (ROLE-1500) inspired by psycholinguistic studies. We dramatically extend the existing NEG-136 and ROLE-88 benchmarks using GPT3, increasing their size from 18 and 44 sentence pairs to 750 each. We also create another version of the extended negation dataset (NEG-1500-SIMP-TEMP) using template-based generation; it consists of 770 sentence pairs. We evaluate 22 models on the extended datasets and see model performance dip 20-57% compared to the original smaller benchmarks. We observe high levels of negation sensitivity in models like BERT and ALBERT, demonstrating that previous findings might have been skewed due to smaller test sets. Finally, we observe that while GPT3 generated all the examples in ROLE-1500, it is only able to solve 24.6% of them during probing.
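The abstract describes extending the ROLE benchmark with GPT3 via in-context learning, i.e., prompting the model with a handful of existing items and sampling new ones. Below is a minimal sketch of how such few-shot generation might look, assuming the legacy openai-python (v0.x) Completion API and the text-davinci-002 model; the prompt wording and the seed pair are illustrative, not the paper's actual prompt.

```python
# A minimal sketch of few-shot dataset extension, assuming the legacy
# openai-python (v0.x) Completion API and the text-davinci-002 GPT3 model.
import openai

openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder

# Two seed sentences in the style of a role-reversal pair (illustrative,
# not the paper's actual prompt or seed items).
seed_pair = (
    "the librarian noted which journalist the senators had praised\n"
    "the librarian noted which senators the journalist had praised\n"
)

prompt = (
    "Generate more sentence pairs in which the two noun roles are reversed:\n"
    + seed_pair
)

response = openai.Completion.create(
    model="text-davinci-002",
    prompt=prompt,
    max_tokens=256,
    temperature=0.8,  # some sampling diversity so new pairs are not duplicates
)
print(response.choices[0].text)
```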
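NEG-1500-SIMP-TEMP, by contrast, is built with template-based generation. A minimal sketch of that idea, assuming a simple "X is (not) a Y" template over hypothetical category/exemplar lists rather than the paper's actual vocabulary:

```python
# A minimal sketch of template-based pair generation, assuming a simple
# "X is (not) a Y" template; the category/exemplar lists are hypothetical,
# not the paper's actual vocabulary.
categories = {
    "bird": ["robin", "sparrow"],
    "tree": ["maple", "pine"],
}

pairs = []
for category, members in categories.items():
    for member in members:
        affirmative = f"A {member} is a {category}."
        negated = f"A {member} is not a {category}."
        pairs.append((affirmative, negated))

print(len(pairs), "sentence pairs generated")
print(pairs[0])  # ('A robin is a bird.', 'A robin is not a bird.')
```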
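Evaluating negation sensitivity on such pairs amounts to cloze-style probing: compare a model's completions for the affirmative and negated sentences. Below is a minimal sketch using the HuggingFace transformers fill-mask pipeline with bert-base-uncased; the example item is illustrative, not drawn from NEG-1500-SIMP itself.

```python
# A minimal sketch of cloze-style negation probing, using the HuggingFace
# transformers fill-mask pipeline with bert-base-uncased; the example item
# is illustrative, not drawn from NEG-1500-SIMP itself.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

pair = (
    "A robin is a [MASK].",      # affirmative: "bird" should rank high
    "A robin is not a [MASK].",  # negated: "bird" should now be dispreferred
)

for sentence in pair:
    predictions = fill(sentence, top_k=3)
    print(sentence, "->",
          [(p["token_str"], round(p["score"], 3)) for p in predictions])

# A model counts as negation-sensitive when its top completions change between
# the affirmative and negated sentences; near-identical predictions suggest
# the model is ignoring the "not".
```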

