Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?

by William Merrill, et al.

Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever "understand" raw text without access to some form of grounding. We formally investigate the abilities of ungrounded systems to acquire meaning. Our analysis focuses on the role of "assertions": contexts within raw text that provide indirect clues about underlying semantics. We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence. We find that assertions enable semantic emulation if all expressions in the language are referentially transparent. However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem. Finally, we discuss differences between our formal model and natural language, exploring how our results generalize to a modal setting and other semantic relations. Together, our results suggest that assertions in code or language do not provide sufficient signal to fully emulate semantic representations. We formalize ways in which ungrounded language models appear to be fundamentally limited in their ability to "understand".


