A Comparison of Two Fluctuation Analyses for Natural Language Clustering Phenomena: Taylor and Ebeling & Neiman Methods

09/14/2020
by Kumiko Tanaka-Ishii, et al.

This article considers the fluctuation analysis methods of Taylor and of Ebeling and Neiman. While both have been applied to various phenomena in the statistical mechanics domain, their similarities and differences have not been clarified. After considering their analytical aspects, this article presents a large-scale application of these methods to text. It is found that both methods can distinguish real text from independent and identically distributed (i.i.d.) sequences. Furthermore, it is found that the Taylor exponents acquired from words can roughly distinguish text categories; this is also the case for Ebeling and Neiman exponents, but to a lesser extent. Additionally, both methods show some possibility of capturing the kind of script in which a text is written.
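
As a rough illustration of what a Taylor exponent measures, the sketch below estimates the exponent alpha in the power-law relation sigma ∝ mu^alpha, where mu and sigma are the mean and standard deviation of a word's count across fixed-size windows of a word sequence. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `taylor_exponent`, the default window size, the use of non-overlapping windows, and the plain least-squares fit on log-log axes are all choices made here for illustration.

```python
import math
from collections import Counter


def taylor_exponent(words, window=100):
    """Estimate alpha in sigma ~ c * mu**alpha over all word types.

    Hypothetical helper for illustration; `window` is an assumed parameter.
    """
    n_windows = len(words) // window
    if n_windows < 2:
        raise ValueError("need at least two full windows")

    # Count every word type inside each non-overlapping window.
    counts = [
        Counter(words[i * window:(i + 1) * window]) for i in range(n_windows)
    ]
    vocab = set(words[:n_windows * window])

    log_mu, log_sigma = [], []
    for w in vocab:
        per_window = [c[w] for c in counts]  # Counter returns 0 if absent
        mu = sum(per_window) / n_windows
        var = sum((x - mu) ** 2 for x in per_window) / n_windows
        if var > 0:  # skip types with no fluctuation (log undefined)
            log_mu.append(math.log(mu))
            log_sigma.append(0.5 * math.log(var))

    if len(log_mu) < 2:
        raise ValueError("need at least two word types with nonzero variance")

    # Ordinary least-squares slope of log(sigma) against log(mu).
    mean_x = sum(log_mu) / len(log_mu)
    mean_y = sum(log_sigma) / len(log_sigma)
    sxx = sum((x - mean_x) ** 2 for x in log_mu)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_mu, log_sigma))
    if sxx == 0:
        raise ValueError("all word types have the same mean count")
    return sxy / sxx
```

For an i.i.d. sequence, each word's window count is binomial, so its variance is close to its mean and the fitted exponent should lie near 0.5. An exponent noticeably different from that baseline is the kind of signal that would separate a sequence from i.i.d. data.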


