Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?

by William Merrill et al.

Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever "understand" raw text without access to some form of grounding. We formally investigate the abilities of ungrounded systems to acquire meaning. Our analysis focuses on the role of "assertions": contexts within raw text that provide indirect clues about underlying semantics. We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence. We find that assertions enable semantic emulation if all expressions in the language are referentially transparent. However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem. Finally, we discuss differences between our formal model and natural language, exploring how our results generalize to a modal setting and other semantic relations. Together, our results suggest that assertions in code or language do not provide sufficient signal to fully emulate semantic representations. We formalize ways in which ungrounded language models appear to be fundamentally limited in their ability to "understand".
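The notion of an "assertion" described in the abstract can be made concrete with a small sketch (hypothetical illustration, not the paper's formalism): in raw code, statements like `assert e1 == e2` indirectly signal that two expressions denote the same value, and a system could attempt to emulate semantic equivalence purely from such contexts by grouping expressions that the assertions transitively link, without ever evaluating anything.

```python
# Toy "corpus of assertions": expression pairs observed asserted equal.
# (Hypothetical data; stands in for assert statements mined from raw text.)
observed_assertions = [
    ("1 + 1", "2"),
    ("2", "2 + 0"),
    ("3 * 1", "3"),
]

# Union-find over expression strings: merge expressions transitively
# linked by assertions, emulating the equivalence relation from form alone.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

for lhs, rhs in observed_assertions:
    union(lhs, rhs)

def emulated_equal(a, b):
    """True iff the observed assertions alone imply a and b are equivalent."""
    return find(a) == find(b)

print(emulated_equal("1 + 1", "2 + 0"))  # True: linked transitively via "2"
print(emulated_equal("1 + 1", "3"))      # False: no assertion connects these
```

This toy emulation only works because the expressions here are referentially transparent; per the abstract, once a language contains non-transparent constructs such as variable binding, recovering the semantic relation from assertions can become uncomputable.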



Related papers:

- Transparency Helps Reveal When Language Models Learn Meaning
- Text analysis and deep learning: A network approach
- Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis
- Predicting metrical patterns in Spanish poetry with language models
- SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics
- Is the Computation of Abstract Sameness Relations Human-Like in Neural Language Models?
- Text-based NP Enrichment
