A Probabilistic Representation of Deep Learning for Improving The Information Theoretic Interpretability

10/27/2020
by Xinjie Lan, et al.

In this paper, we propose a probabilistic representation of MultiLayer Perceptrons (MLPs) to improve their information-theoretic interpretability. Above all, we demonstrate that the assumption of i.i.d. activations does not hold for all the hidden layers of MLPs; hence, the existing mutual information estimators based on non-parametric inference methods, e.g., empirical distributions and Kernel Density Estimation (KDE), are invalid for measuring the information flow in MLPs. Moreover, we introduce explicit probabilistic explanations for MLPs: (i) we define the probability space (Omega_F, T, P_F), where T is a sigma-algebra, for a fully connected layer f and demonstrate the significant effect of the activation function on the probability measure P_F; (ii) we formulate the entire architecture of MLPs as a Gibbs distribution P; and (iii) we show that back-propagation aims to optimize the sample space Omega_F of all the fully connected layers of MLPs in order to learn an optimal Gibbs distribution P* that expresses the statistical connection between the input and the label. Based on these probabilistic explanations, we improve the information-theoretic interpretability of MLPs in three respects: (i) the random variable of f is discrete, so the corresponding entropy is finite; (ii) the information bottleneck theory cannot correctly explain the information flow in MLPs once back-propagation is taken into account; and (iii) we propose novel information-theoretic explanations for the generalization of MLPs. Finally, we validate the proposed probabilistic representation and information-theoretic explanations on a synthetic dataset and on benchmark datasets.
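To make the Gibbs-distribution reading of a fully connected layer concrete, here is a minimal sketch in which the layer's pre-activations are treated as negative energies and normalized into a probability measure. The function name and the use of a softmax normalization are illustrative assumptions for exposition, not the paper's exact construction:

```python
import numpy as np

def gibbs_from_fc_layer(x, W, b):
    """Read the pre-activations of a fully connected layer as negative
    energies of a Gibbs (Boltzmann) distribution:
        P_F(i | x) = exp(z_i) / sum_j exp(z_j),  with z = W @ x + b.
    This softmax normalization is one way to obtain a probability
    measure P_F from the layer; it is a sketch, not the paper's method."""
    z = W @ x + b                 # pre-activations (negative energies)
    z = z - z.max()               # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()            # a valid probability measure over neurons

# Toy usage: a layer with 4 neurons and a 3-dimensional input.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=4)
x = rng.normal(size=3)
p = gibbs_from_fc_layer(x, W, b)
print(p, p.sum())                 # the entries are nonnegative and sum to 1
```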

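The abstract's critique targets mutual information estimators built from empirical distributions of activations. For reference, the following is a minimal sketch of the standard plug-in (binning) estimator used in the information bottleneck literature; the function name and bin count are illustrative assumptions. Estimators of this kind treat the activation samples as i.i.d. draws, which is precisely the assumption the paper disputes:

```python
import numpy as np

def binned_mutual_information(x, t, n_bins=30):
    """Plug-in estimate of I(X; T) from the joint histogram of two 1-D
    samples: I = sum p(x,t) * log2( p(x,t) / (p(x) p(t)) )."""
    joint, _, _ = np.histogram2d(x, t, bins=n_bins)
    p_xt = joint / joint.sum()                 # empirical joint distribution
    p_x = p_xt.sum(axis=1, keepdims=True)      # marginal of X
    p_t = p_xt.sum(axis=0, keepdims=True)      # marginal of T
    mask = p_xt > 0                            # avoid log(0) terms
    return float((p_xt[mask] * np.log2(p_xt[mask] / (p_x @ p_t)[mask])).sum())

# Toy usage: T is a noisy copy of X, so the estimate should be clearly positive.
rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
t = x + 0.5 * rng.normal(size=10_000)
print(binned_mutual_information(x, t))
```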

