DeepAI AI Chat
Log In Sign Up

Local Intrinsic Dimensional Entropy

by   Rohan Ghosh, et al.

Most entropy measures depend on the spread of the probability distribution over the sample space X, and the maximum entropy achievable scales proportionately with the sample space cardinality |X|. For a finite |X|, this yields robust entropy measures which satisfy many important properties, such as invariance to bijections, while the same is not true for continuous spaces (where |X|=infinity). Furthermore, since R and R^d (d in Z+) have the same cardinality (from Cantor's correspondence argument), cardinality-dependent entropy measures cannot encode the data dimensionality. In this work, we question the role of cardinality and distribution spread in defining entropy measures for continuous spaces, which can undergo multiple rounds of transformations and distortions, e.g., in neural networks. We find that the average value of the local intrinsic dimension of a distribution, denoted as ID-Entropy, can serve as a robust entropy measure for continuous spaces, while capturing the data dimensionality. We find that ID-Entropy satisfies many desirable properties and can be extended to conditional entropy, joint entropy and mutual-information variants. ID-Entropy also yields new information bottleneck principles and also links to causality. In the context of deep learning, for feedforward architectures, we show, theoretically and empirically, that the ID-Entropy of a hidden layer directly controls the generalization gap for both classifiers and auto-encoders, when the target function is Lipschitz continuous. Our work primarily shows that, for continuous spaces, taking a structural rather than a statistical approach yields entropy measures which preserve intrinsic data dimensionality, while being relevant for studying various architectures.


page 1

page 2

page 3

page 4


Properties of Minimizing Entropy

Compact data representations are one approach for improving generalizati...

On dynamical measures of quantum information

In this work, we use the theory of quantum states over time to define an...

Stochastic Deep Networks

Machine learning is increasingly targeting areas where input data cannot...

Generalized active information: extensions to unbounded domains

In the last three decades, several measures of complexity have been prop...

Local intrinsic dimensionality estimators based on concentration of measure

Intrinsic dimensionality (ID) is one of the most fundamental characteris...

Intrinsic dimension estimation for discrete metrics

Real world-datasets characterized by discrete features are ubiquitous: f...

Exact simulation of continuous max-id processes

We provide two algorithms for the exact simulation of exchangeable max-(...