
Guiding Symbolic Natural Language Grammar Induction via Transformer-Based Sequence Probabilities

by Ben Goertzel, et al.

A novel approach to automated learning of the syntactic rules governing natural languages is proposed, based on using the probabilities assigned to sentences (and potentially longer word sequences) by transformer neural network language models to guide symbolic learning processes such as clustering and rule induction. This method exploits the linguistic knowledge learned by transformers without any reference to their inner representations; hence, the technique adapts readily as ever more powerful language models appear. We show a proof-of-concept example of the proposed technique, using it to guide unsupervised symbolic link-grammar induction methods drawn from our prior research.
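The core idea above — using a language model's sentence probabilities, rather than its internals, to decide whether words are syntactically interchangeable (a building block for clustering words into grammatical categories) — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: a toy add-one-smoothed bigram model stands in for the transformer LM, and the corpus, helper names (`log_prob`, `interchangeable`), and the `margin` threshold are all assumptions chosen for the example.

```python
# Sketch: test whether two words share a syntactic category by swapping
# one for the other and checking that sentence log-probability barely
# changes. A toy bigram LM stands in for a transformer; in the paper's
# setting, log_prob would come from the transformer's sequence score.
from collections import Counter
import math

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat ran to the door",
    "a dog ran to the mat",
]

# Train bigram counts (with add-one smoothing applied at query time).
bigrams = Counter()
contexts = Counter()
vocab = set()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    vocab.update(toks)
    for a, b in zip(toks, toks[1:]):
        bigrams[(a, b)] += 1
        contexts[a] += 1

def log_prob(sentence):
    """Log-probability of a sentence under the smoothed bigram model."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + len(vocab)))
        for a, b in zip(toks, toks[1:])
    )

def interchangeable(w1, w2, sentences, margin=1.0):
    """Heuristic category test: w1 and w2 are interchangeable if
    substituting w2 for w1 shifts log-probability by < margin
    in every sentence containing w1. `margin` is an assumed knob."""
    for s in sentences:
        if w1 in s.split():
            swapped = " ".join(w2 if t == w1 else t for t in s.split())
            if abs(log_prob(s) - log_prob(swapped)) > margin:
                return False
    return True

print(interchangeable("cat", "dog", corpus))  # two nouns: True
print(interchangeable("cat", "sat", corpus))  # noun vs. verb: False
```

Pairs that pass this interchangeability test can then be merged into word categories, over which symbolic rule induction (e.g. link-grammar learning) operates; a transformer simply supplies sharper probability estimates for the same substitution test.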



Compound Probabilistic Context-Free Grammars for Grammar Induction

We study a formalization of the grammar induction problem that models se...

VLGrammar: Grounded Grammar Induction of Vision and Language

Cognitive grammar suggests that the acquisition of language grammar is g...

Physics of Language Models: Part 1, Context-Free Grammar

We design experiments to study how generative language models, like GPT,...

ImmunoLingo: Linguistics-based formalization of the antibody language

Apparent parallels between natural language and biological sequence have...

On Unsupervised Training of Link Grammar Based Language Models

In this short note we explore what is needed for the unsupervised traini...

Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning

We show how to predict the basic word-order facts of a novel language gi...

Functorial Language Models

We introduce functorial language models: a principled way to compute pro...

Code Repositories


Grammar induction leveraging transformers
