Gaussian Lower Bound for the Information Bottleneck Limit

11/07/2017
by Amichai Painsky, et al.

The Information Bottleneck (IB) is a conceptual method for extracting the most compact, yet informative, representation of a set of variables with respect to a target. It generalizes the notion of minimal sufficient statistics from classical parametric statistics to a broader information-theoretic sense. The IB curve defines the optimal trade-off between the complexity of a representation and its predictive power. Specifically, it is obtained by minimizing the mutual information (MI) between the representation and the original variables, subject to a minimal level of MI between the representation and the target. This problem is shown to be NP-hard in general. One important exception is the multivariate Gaussian case, for which the Gaussian IB (GIB) is known to admit an analytical closed-form solution, similar to Canonical Correlation Analysis (CCA). In this work we introduce a Gaussian lower bound to the IB curve: we find an embedding of the data that maximizes its "Gaussian part", to which we apply the GIB. This embedding provides an efficient (and practical) representation, in the IB sense, of any arbitrary data-set, and in addition retains the favorable properties of a Gaussian distribution. Importantly, we show that the optimal Gaussian embedding is bounded from above by non-linear CCA. This establishes a fundamental limit on our ability to Gaussianize arbitrary data-sets and solve complex problems by linear methods.
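
To make the link between the Gaussian case and CCA concrete, here is a minimal sketch. It is not the authors' algorithm. It relies on the standard fact that for jointly Gaussian X and Y the mutual information is determined by the canonical correlations ρ_i, as I(X;Y) = -1/2 Σ log(1 - ρ_i²), in nats. The function names, the regularization constant `reg`, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def canonical_correlations(x, y, reg=1e-9):
    """Sample canonical correlations between x (n, dx) and y (n, dy).

    `reg` is a small ridge term (an illustrative assumption) that keeps
    the Cholesky factorizations numerically stable.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    # Whiten both sides: with Cxx = Lx Lx^T and Cyy = Ly Ly^T, the singular
    # values of Lx^{-1} Cxy Ly^{-T} are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    m = np.linalg.solve(lx, cxy)      # Lx^{-1} Cxy
    m = np.linalg.solve(ly, m.T).T    # Lx^{-1} Cxy Ly^{-T}
    rho = np.linalg.svd(m, compute_uv=False)
    # Clip so that log(1 - rho^2) below stays finite.
    return np.clip(rho, 0.0, 1.0 - 1e-12)


def gaussian_mi(rho):
    """Mutual information (nats) of jointly Gaussian variables with
    canonical correlations `rho`."""
    return -0.5 * np.sum(np.log1p(-rho ** 2))


# Synthetic jointly Gaussian data (hypothetical example).
rng = np.random.default_rng(0)
x_demo = rng.standard_normal((5000, 2))
y_demo = x_demo @ np.array([[1.0, 0.5], [0.0, 1.0]]) + rng.standard_normal((5000, 2))
rho_demo = canonical_correlations(x_demo, y_demo)
print("canonical correlations:", rho_demo)
print("Gaussian MI estimate (nats):", gaussian_mi(rho_demo))
```

For non-Gaussian data this quantity only describes the linear, second-order dependence between the variables, which is why the abstract first looks for an embedding that maximizes the "Gaussian part" of the data before applying the GIB.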

