High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction

05/08/2023 ∙ by Kristjan Greenewald, et al.
We study the problem of overcoming exponential sample complexity in differential entropy estimation under Gaussian convolutions. Specifically, we consider the estimation of the differential entropy h(X+Z) from n independent and identically distributed samples of X, where X and Z are independent D-dimensional random variables, X is sub-Gaussian with a bounded second moment, and Z∼𝒩(0,σ^2 I_D). Under the absolute-error loss, this problem has a parametric estimation rate of c^D/√(n), which grows exponentially with the data dimension D and is often prohibitive in applications. We overcome this exponential sample complexity by projecting X onto a low-dimensional space via principal component analysis (PCA) before the entropy estimation, and show that the asymptotic error overhead vanishes as the unexplained variance of the PCA vanishes. This implies near-optimal performance for inherently low-dimensional structures embedded in high-dimensional spaces, including hidden-layer outputs of deep neural networks (DNNs), which can be used to estimate mutual information (MI) in DNNs. We provide numerical results verifying the performance of our PCA approach on Gaussian and spiral data. We also apply our method to the analysis of information flow through neural network layers (cf. the information bottleneck), with results measuring mutual information in a noisy fully connected network and a noisy convolutional neural network (CNN) for MNIST classification.
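To make the pipeline described above concrete, here is a minimal sketch of the two-stage procedure (PCA projection of the samples of X, followed by entropy estimation of the noise-smoothed projection). The paper's own estimator is not reproduced here; this sketch substitutes a standard Kozachenko-Leonenko k-NN differential entropy estimator, and the function names, parameter choices, and helper structure (`knn_entropy`, `smoothed_entropy_via_pca`, `d_low`, `k`) are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.special import digamma, gammaln
from scipy.spatial import cKDTree
from sklearn.decomposition import PCA


def knn_entropy(samples, k=3):
    """Kozachenko-Leonenko k-NN differential entropy estimate, in nats.

    Illustrative stand-in for the paper's entropy estimator.
    """
    n, d = samples.shape
    tree = cKDTree(samples)
    # Query k+1 neighbors because each point is its own nearest neighbor.
    dists, _ = tree.query(samples, k=k + 1)
    rho = np.maximum(dists[:, -1], 1e-12)  # distance to the k-th neighbor
    # log-volume of the d-dimensional Euclidean unit ball
    log_unit_ball = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(rho))


def smoothed_entropy_via_pca(X, sigma, d_low, k=3, seed=0):
    """Estimate h(X + Z), Z ~ N(0, sigma^2 I), after projecting X onto its
    top d_low principal components (the dimensionality-reduction step)."""
    rng = np.random.default_rng(seed)
    pca = PCA(n_components=d_low).fit(X)
    X_low = pca.transform(X)                      # (n, d_low) projection of the samples
    Z = sigma * rng.standard_normal(X_low.shape)  # Gaussian smoothing noise
    return knn_entropy(X_low + Z, k=k)
```

As a usage note, the abstract's guarantee suggests this procedure is most useful when the variance of X concentrates in a few principal directions (e.g., the Gaussian and spiral examples, or DNN hidden-layer outputs), so that the unexplained PCA variance, and hence the error overhead, is small.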
