Prompt-based methods may underestimate large language models' linguistic generalizations

05/22/2023
by Jennifer Hu, et al.

Prompting is now a dominant method for evaluating the linguistic knowledge of large language models (LLMs). While other methods directly read out models' probability distributions over strings, prompting requires models to access this internal information by processing linguistic input, thereby implicitly testing a new type of emergent ability: metalinguistic judgment. In this study, we compare metalinguistic prompting and direct probability measurements as ways of measuring models' knowledge of English. Broadly, we find that LLMs' metalinguistic judgments are inferior to quantities directly derived from representations. Furthermore, consistency between the two kinds of measurement gets worse as the prompt diverges from direct measurements of next-word probabilities. Our findings suggest that negative results relying on metalinguistic prompts cannot be taken as conclusive evidence that an LLM lacks a particular linguistic competence. Our results also highlight the value that is lost with the move to closed APIs, where access to probability distributions is limited.
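To make the contrast between the two measurement styles concrete, here is a minimal Python sketch. It is not the authors' code. The `NextLogProbs` interface, the helper names, the prompt wording, the example sentence pair, and the `toy_model` are all assumptions made for illustration. The toy model is deliberately rigged so that the two measurements disagree; it shows only that they are distinct quantities and says nothing about any real LLM.

```python
import math
from typing import Callable, Dict, List

# Hypothetical interface: context tokens -> log-probabilities of the next token.
NextLogProbs = Callable[[List[str]], Dict[str, float]]

VOCAB = ["The", "keys", "are", "is", "here", "1", "2"]


def toy_model(context: List[str]) -> Dict[str, float]:
    """Contrived stand-in for an LLM (illustration only, not a real model)."""
    weights = {tok: 1.0 for tok in VOCAB}
    if context and context[-1] == "keys":
        weights["are"] = 10.0  # plural subject: strongly prefers "are"
    if context[-2:] == ["Answer", ":"]:
        weights["2"] = 3.0  # rigged: the metalinguistic answer leans the other way
    total = sum(weights.values())
    return {tok: math.log(w / total) for tok, w in weights.items()}


def sentence_logprob(model: NextLogProbs, tokens: List[str]) -> float:
    """Direct measurement: sum of log P(token_i | tokens before i)."""
    return sum(model(tokens[:i])[tok] for i, tok in enumerate(tokens))


def direct_preference(model: NextLogProbs, good: List[str], bad: List[str]) -> bool:
    """True if the model assigns higher probability to the `good` sentence."""
    return sentence_logprob(model, good) > sentence_logprob(model, bad)


def metalinguistic_preference(model: NextLogProbs, good: List[str], bad: List[str]) -> bool:
    """Metalinguistic measurement: ask which sentence is better, then compare
    the log-probabilities of the answer tokens "1" and "2"."""
    prompt = (["Which", "sentence", "is", "better", "?", "1", ":"] + good
              + ["2", ":"] + bad + ["Answer", ":"])
    logprobs = model(prompt)
    return logprobs["1"] > logprobs["2"]


GOOD = ["The", "keys", "are", "here"]  # illustrative minimal pair
BAD = ["The", "keys", "is", "here"]

if __name__ == "__main__":
    print("direct:", direct_preference(toy_model, GOOD, BAD))
    print("metalinguistic:", metalinguistic_preference(toy_model, GOOD, BAD))
```

With a real model, `NextLogProbs` would wrap whatever mechanism exposes next-token log-probabilities. When an API hides those distributions, only the metalinguistic route remains available.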


