Optimal Latent Representations: Distilling Mutual Information into Principal Pairs

02/09/2019
by Max Tegmark, et al.

Principal component analysis (PCA) is generalized from one to two random vectors, decomposing the correlations between them into a set of "principal pairs" of correlated scalars. For the special case of Gaussian random vectors, PCA decomposes the information content in a single Gaussian random vector into a mutually exclusive and collectively exhaustive set of information chunks corresponding to statistically independent numbers whose individual entropies add up to the total entropy. The proposed Principal Pair Analysis (PPA) generalizes this, decomposing the total mutual information between two vectors into a sum of the mutual information between a set of independent pairs of numbers. This allows any two random vectors to be interpreted as the sum of a perfectly correlated ("signal") part and a perfectly uncorrelated ("noise") part. It is shown that when predicting the future of a system by mapping its state into a lower-dimensional latent space, it is optimal to use different mappings for present and future. As an example, it is shown that PPA outperforms PCA for predicting the time-evolution of coupled harmonic oscillators with dissipation and thermal noise. We conjecture that a single latent representation is optimal only for time-reversible processes, not for e.g. text, speech, music or out-of-equilibrium physical systems.
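
The additive decomposition of mutual information described above can be checked numerically for Gaussian vectors. The sketch below is not taken from the paper: it assumes that the pairs can be obtained by whitening each vector and taking a singular value decomposition of the whitened cross-covariance (the standard canonical-correlation construction), and the function name `principal_pairs` and the toy covariance are hypothetical. It compares the sum of per-pair mutual informations, -(1/2) log(1 - rho_i^2), with the total Gaussian mutual information computed from determinants.

```python
import numpy as np

def principal_pairs(cov, n_x):
    """Return correlations rho and weight matrices (a, b) for a joint covariance.

    Assumption: pairs are found by whitening x and y and taking an SVD of the
    whitened cross-covariance (canonical-correlation construction).
    """
    cxx, cyy, cxy = cov[:n_x, :n_x], cov[n_x:, n_x:], cov[:n_x, n_x:]
    lx = np.linalg.cholesky(cxx)              # cxx = lx @ lx.T
    ly = np.linalg.cholesky(cyy)              # cyy = ly @ ly.T
    m = np.linalg.solve(lx, cxy)              # lx^{-1} cxy
    m = np.linalg.solve(ly, m.T).T            # lx^{-1} cxy ly^{-T}
    u, rho, vt = np.linalg.svd(m)
    a = np.linalg.solve(lx.T, u)              # columns map x to unit-variance scalars
    b = np.linalg.solve(ly.T, vt.T)           # columns map y to unit-variance scalars
    return rho, a, b

# Hypothetical toy example: random positive-definite joint covariance of (x, y).
rng = np.random.default_rng(0)
n_x, n_y = 3, 2
g = rng.normal(size=(n_x + n_y, n_x + n_y))
cov = g @ g.T + 0.5 * np.eye(n_x + n_y)

rho, a, b = principal_pairs(cov, n_x)

# Mutual information (in nats) carried by each pair of correlated scalars.
pair_mi = -0.5 * np.log(1.0 - rho**2)

# Total Gaussian mutual information from log-determinants.
logdet = lambda c: np.linalg.slogdet(c)[1]
total_mi = 0.5 * (logdet(cov[:n_x, :n_x]) + logdet(cov[n_x:, n_x:]) - logdet(cov))

print("pair correlations:", rho)
print("sum of pair MI:", pair_mi.sum(), "total MI:", total_mi)
```

In this sketch the weight matrices `a` and `b` are generally different, which mirrors the abstract's point that the two vectors (for example present and future states) need not share a single mapping.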


research
03/14/2023

Informational Rescaling of PCA Maps with Application to Genetic Distance

We discuss the inadequacy of covariances/correlations and other measures...
research
05/08/2023

High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction

We study the problem of overcoming exponential sample complexity in diff...
research
06/24/2020

An ℓ_p theory of PCA and spectral clustering

Principal Component Analysis (PCA) is a powerful tool in statistics and ...
research
09/26/2018

Bayesian inference for PCA and MUSIC algorithms with unknown number of sources

Principal component analysis (PCA) is a popular method for projecting da...
research
07/11/2023

Latent Space Perspicacity and Interpretation Enhancement (LS-PIE) Framework

Linear latent variable models such as principal component analysis (PCA)...
research
05/30/2017

Decorrelation of Neutral Vector Variables: Theory and Applications

In this paper, we propose novel strategies for neutral vector variable d...
