Taylor's law for Human Linguistic Sequences

04/21/2018
by Tatsuru Kobayashi, et al.

Taylor's law describes the fluctuation characteristics of a system in which the variance of an event within a time span grows as a power law with respect to the mean. Although Taylor's law has been applied to many natural and social systems, it has rarely been applied to language. This article describes a Taylor analysis of over 1100 texts across 14 languages. The Taylor exponents of natural language texts take almost the same value. The exponent was also compared for other language-related data, such as the CHILDES corpus, music, and programming languages. The results show how the Taylor exponent serves to quantify the fundamental structural complexity underlying linguistic time series. The article also shows the applicability of these findings in evaluating language models. Specifically, a text generated by an LSTM unit exhibited a Taylor exponent of 0.50, identical to that of an i.i.d. process, thus showing a limitation of that neural model.
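
To make the quantity in the abstract concrete, here is a minimal sketch of how a Taylor exponent could be estimated for a token sequence. It is not the authors' procedure: the function names `taylor_exponent` and `iid_zipf_sample`, the window length of 100 tokens, and the ordinary least-squares fit in log-log space are all assumptions made for illustration. The sketch fits the standard deviation (the square root of the variance) against the mean, since under that convention counts from an i.i.d. process give an exponent near 0.5, which matches the 0.50 value quoted above.

```python
import math
import random
from collections import Counter


def taylor_exponent(tokens, window=100):
    """Estimate alpha in sigma ~ mu**alpha with a log-log least-squares fit.

    Hypothetical helper: window size and fitting method are assumptions.
    """
    n_seg = len(tokens) // window
    if n_seg < 2:
        raise ValueError("need at least two full windows")
    # Count every token type inside each non-overlapping window.
    segments = [Counter(tokens[i * window:(i + 1) * window]) for i in range(n_seg)]
    vocab = set(tokens[:n_seg * window])

    log_mu, log_sigma = [], []
    for w in vocab:
        counts = [seg[w] for seg in segments]  # Counter yields 0 when absent
        mu = sum(counts) / n_seg
        var = sum((c - mu) ** 2 for c in counts) / n_seg
        if var > 0:  # skip types whose count never varies (log undefined)
            log_mu.append(math.log(mu))
            log_sigma.append(0.5 * math.log(var))  # log of standard deviation
    if len(log_mu) < 2:
        raise ValueError("need at least two token types with varying counts")

    # Ordinary least-squares slope of log(sigma) against log(mu).
    mx = sum(log_mu) / len(log_mu)
    my = sum(log_sigma) / len(log_sigma)
    sxx = sum((x - mx) ** 2 for x in log_mu)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_mu, log_sigma))
    return sxy / sxx


def iid_zipf_sample(n_tokens, vocab_size, seed=0):
    """Draw i.i.d. token ids with Zipf-like weights 1/rank (illustrative only)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(vocab_size)]
    return rng.choices(range(vocab_size), weights=weights, k=n_tokens)
```

Calling `taylor_exponent(iid_zipf_sample(50000, 200))` should give a value close to 0.5, because independently drawn tokens produce roughly Poisson-like counts whose variance is about equal to the mean. A real text would be tokenized into words and passed in the same way; an exponent above that baseline would indicate that word occurrences cluster more than chance predicts.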

research
06/22/2019

Evaluating Computational Language Models with Scaling Properties of Natural Language

In this article, we evaluate computational models of natural language wi...
research
12/29/2016

Verifying Heaps' law using Google Books Ngram data

This article is devoted to the verification of the empirical Heaps law i...
research
05/11/2023

Autocorrelations Decay in Texts and Applicability Limits of Language Models

We show that the laws of autocorrelations decay in texts are closely rel...
research
10/24/2018

Evolution of semantic networks in biomedical texts

Language is hierarchically organized: words are built into phrases, sent...
research
08/25/2023

On the Impact of Language Selection for Training and Evaluating Programming Language Models

The recent advancements in Transformer-based Language Models have demons...
research
04/09/2021

Heaps' Law and Vocabulary Richness in the History of Classical Music Harmony

Music is a fundamental human construct, and harmony provides the buildin...
research
01/29/2016

Zipf's law is a consequence of coherent language production

The task of text segmentation may be undertaken at many levels in text a...
