Probability-turbulence divergence: A tunable allotaxonometric instrument for comparing heavy-tailed categorical distributions

08/30/2020 ∙ by P. S. Dodds, et al. ∙ 0

Real-world complex systems often comprise many distinct types of elements as well as many more types of networked interactions between elements. When the relative abundances of types can be measured well, we further observe heavy-tailed categorical distributions for type frequencies. For the comparison of type frequency distributions of two systems or a system with itself at different time points in time – a facet of allotaxonometry – a great range of probability divergences are available. Here, we introduce and explore `probability-turbulence divergence', a tunable, straightforward, and interpretable instrument for comparing normalizable categorical frequency distributions. We model probability-turbulence divergence (PTD) after rank-turbulence divergence (RTD). While probability-turbulence divergence is more limited in application than rank-turbulence divergence, it is more sensitive to changes in type frequency. We build allotaxonographs to display probability turbulence, incorporating a way to visually accommodate zero probabilities for `exclusive types' which are types that appear in only one system. We explore comparisons of example distributions taken from literature, social media, and ecology. We show how probability-turbulence divergence either explicitly or functionally generalizes many existing kinds of distances and measures, including, as special cases, L^(p) norms, the Sørensen-Dice coefficient (the F_1 statistic), and the Hellinger distance. We discuss similarities with the generalized entropies of Rényi and Tsallis, and the diversity indices (or Hill numbers) from ecology. We close with thoughts on open problems concerning the optimization of the tuning of rank- and probability-turbulence divergence.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 5

page 7

page 8

page 9

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.