There Are Fewer Facts Than Words: Communication With A Growing Complexity

11/02/2022
by Łukasz Dębowski et al.

We present an impossibility result, called a theorem about facts and words, which pertains to a general communication system. The theorem states that the number of distinct words used in a finite text is roughly greater than the number of independent elementary persistent facts described in the same text. In particular, this theorem can be related to Zipf's law, power-law scaling of mutual information, and power-law-tailed learning curves. The assumptions of the theorem are: a finite alphabet, a linear sequence of symbols, a complexity that does not decrease in time, an entropy rate that can be estimated, and finiteness of the inverse complexity rate.
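The regularities mentioned above can be illustrated on a toy text. The following sketch (an illustration only, not the paper's method; the corpus is made up) tabulates the rank-frequency distribution, which Zipf's law predicts decays roughly as a power of the rank, and reports the vocabulary size, which grows sublinearly with text length:

```python
from collections import Counter

# Toy corpus; a real experiment would use a long natural-language sample.
text = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat and the dog saw the mat and the rug"
)
tokens = text.split()

# Rank-frequency table: Zipf's law predicts frequency roughly ~ C / rank.
counts = Counter(tokens)
for rank, (word, freq) in enumerate(counts.most_common(), start=1):
    print(f"{rank:>2}  {word:<4} {freq}")

# The number of distinct words is much smaller than the number of tokens,
# in the spirit of the bound between words and facts discussed above.
print("tokens:", len(tokens), "distinct words:", len(counts))
```

On this toy corpus the most frequent word ("the") dominates, and the 24-token text uses only 9 distinct words.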


Related research

- 06/14/2017: Is Natural Language a Perigraphic Process? The Theorem about Facts and Words Revisited
- 04/25/2018: Nyldon words
- 03/11/2021: Tensor networks and efficient descriptions of classical data
- 03/20/2023: Infinite Words and Morphic Languages Formalized in Isabelle/HOL
- 06/21/2016: Criticality in Formal Languages and Statistical Physics
- 01/07/2020: Heaps' law and Heaps functions in tagged texts: Evidences of their linguistic relevance
- 09/22/2018: Relating Zipf's law to textual information
