Representational Rényi heterogeneity

12/10/2019
by   Abraham Nunes, et al.

A discrete system's heterogeneity is measured by the Rényi heterogeneity family of indices (also known as Hill numbers or Hannah-Kay indices), whose units are the numbers equivalent and whose scaling properties are consistent and intuitive. Unfortunately, numbers equivalent heterogeneity measures for non-categorical data require a priori (A) categorical partitioning and (B) pairwise distance measurement on the space of observable data. This precludes their application in disciplines where categories are ill-defined or where semantically relevant features must be learned as abstractions from data. We therefore introduce representational Rényi heterogeneity (RRH), which maps the observable domain onto a latent space on which the Rényi heterogeneity is both tractable and semantically relevant. The method requires neither a priori binning nor the definition of a distance function on the observable space. In a comparison with existing state-of-the-art indices on a beta-mixture distribution, we show that RRH more accurately detects the number of distinct mixture components. We also show that RRH can measure heterogeneity in natural images whose semantically relevant features must be abstracted using deep generative models, and that RRH can uniquely capture heterogeneity caused by distinct components in mixture distributions. Our novel approach will enable measurement of heterogeneity in disciplines where a priori categorical partitions of observable data are not possible, or where semantically relevant features must be inferred using latent variable models.
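For readers unfamiliar with the numbers equivalent, the following is a minimal sketch of the standard categorical Rényi heterogeneity (Hill number) of order q, computed as (Σ pᵢ^q)^(1/(1−q)), with the q → 1 limit given by the exponential of the Shannon entropy. It illustrates only the classical discrete index that the abstract takes as its starting point, not the RRH method itself. The function name `renyi_heterogeneity` and its arguments are assumptions made for this illustration.

```python
import math


def renyi_heterogeneity(p, q):
    """Numbers-equivalent heterogeneity (Hill number) of order q.

    p: probabilities over categories (hypothetical input; must sum to 1).
    q: finite, non-negative order of the index.
    """
    probs = [x for x in p if x > 0]  # zero-probability categories contribute nothing
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if math.isclose(q, 1.0):
        # Limit as q -> 1: exponential of the Shannon entropy
        return math.exp(-sum(x * math.log(x) for x in probs))
    return sum(x ** q for x in probs) ** (1.0 / (1.0 - q))


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
print(renyi_heterogeneity(uniform, 2))  # four equally likely categories
print(renyi_heterogeneity(skewed, 2))   # fewer "effective" categories
```

For a uniform distribution over n categories the index equals n at every order q, which is the sense in which its units are a "numbers equivalent". A skewed distribution over the same categories yields a smaller value.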


