Stochastic LLMs do not Understand Language: Towards Symbolic, Explainable and Ontologically Based LLMs

09/12/2023
by   Walid S. Saba, et al.

In our opinion, the exuberance surrounding the relative success of data-driven large language models (LLMs) is slightly misguided, for several reasons: (i) LLMs cannot be relied upon for factual information, since for LLMs all ingested text (factual or non-factual) was created equal; (ii) due to their subsymbolic nature, whatever 'knowledge' these models acquire about language will always be buried in billions of microfeatures (weights), none of which is meaningful on its own; and (iii) LLMs will often fail to make the correct inferences in several linguistic contexts (e.g., nominal compounds, copredication, quantifier scope ambiguities, intensional contexts). Since we believe the relative success of data-driven LLMs is not a reflection on the symbolic vs. subsymbolic debate but a reflection of applying the successful strategy of bottom-up reverse engineering of language at scale, we suggest in this paper applying the same effective bottom-up strategy in a symbolic setting, resulting in symbolic, explainable, and ontologically grounded language models.


Related research:

- Symbolic and Language Agnostic Large Language Models (08/27/2023): "We argue that the relative success of large language models (LLMs) is no..."
- Towards Explainable and Language-Agnostic LLMs: Symbolic Reverse Engineering of Language at Scale (05/30/2023): "Large language models (LLMs) have achieved a milestone that undeniably ..."
- MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies (05/26/2023): "Autoregressive language models are trained by minimizing the cross-entro..."
- Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation (06/26/2023): "In this paper, we introduce a data-driven approach for Formality-Sensiti..."
- Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts? (04/03/2023): "In this paper, we explain the inference logic of large language models (..."
- DataTales: Investigating the use of Large Language Models for Authoring Data-Driven Articles (08/08/2023): "Authoring data-driven articles is a complex process requiring authors to..."
- A Context-Dependent Gated Module for Incorporating Symbolic Semantics into Event Coreference Resolution (04/04/2021): "Event coreference resolution is an important research problem with many ..."
