Where is the Information in a Deep Neural Network?

05/29/2019
by Alessandro Achille, et al.

Whatever information a Deep Neural Network has gleaned from past data is encoded in its weights. How this information affects the response of the network to future data is largely an open question. In fact, even how to define and measure information in a network is still not settled. We introduce the notion of Information in the Weights as the optimal trade-off between accuracy of the network and complexity of the weights, relative to a prior. Depending on the prior, the definition reduces to known information measures such as Shannon Mutual Information and Fisher Information, but affords added flexibility that enables us to relate it to generalization, via the PAC-Bayes bound, and to invariance. This relation hinges not only on the architecture of the model, but, surprisingly, also on how it is trained. We then introduce a notion of effective information in the activations, which are deterministic functions of future inputs, resolving inconsistencies in prior work. We relate this to the Information in the Weights, and use this result to show that models of low complexity not only generalize better, but are bound to learn invariant representations of future inputs.
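To make the stated trade-off concrete, here is a minimal sketch in standard notation; the symbols L_D, q, p, beta, n, and delta below are this sketch's conventions, not notation quoted from the paper. The Information in the Weights can be read as the complexity term of an objective of the form

\[ C_\beta(\mathcal{D}) \;=\; \min_{q}\; \mathbb{E}_{w \sim q}\!\left[ L_{\mathcal{D}}(w) \right] \;+\; \beta\, \mathrm{KL}\!\left( q(w) \,\Vert\, p(w) \right), \]

where L_D is the training loss, q is a (posterior) distribution over the weights, and p is the prior; as the abstract notes, different choices of prior make this complexity term reduce to Shannon Mutual Information or to Fisher Information. The same KL term is what one common form of the McAllester PAC-Bayes bound charges for generalization: with probability at least 1 - delta over the draw of an n-sample training set,

\[ \mathbb{E}_{w \sim q}\!\left[ L_{\mathrm{test}}(w) \right] \;\le\; \mathbb{E}_{w \sim q}\!\left[ L_{\mathcal{D}}(w) \right] \;+\; \sqrt{ \frac{ \mathrm{KL}\!\left( q \,\Vert\, p \right) + \log(n/\delta) }{ 2(n-1) } }. \]

Low Information in the Weights (a small KL divergence from the prior) therefore directly tightens the bound, which is the sense in which low-complexity models generalize better.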


Related research

06/05/2017 · Emergence of Invariance and Disentangling in Deep Representations
Using established principles from Information Theory and Statistics, we ...

09/29/2021 · PAC-Bayes Information Bottleneck
Information bottleneck (IB) depicts a trade-off between the accuracy and...

04/28/2023 · Recognizable Information Bottleneck
Information Bottlenecks (IBs) learn representations that generalize to u...

02/13/2021 · Fluctuation-response theorem for Kullback-Leibler divergences to quantify causation
We define a new measure of causation from a fluctuation-response theorem...

02/19/2020 · Improving Generalization by Controlling Label-Noise Information in Neural Network Weights
In the presence of noisy or incorrect labels, neural networks have the u...

04/16/2018 · A Direct Sum Result for the Information Complexity of Learning
How many bits of information are required to PAC learn a class of hypoth...

01/17/2021 · Estimating informativeness of samples with Smooth Unique Information
We define a notion of information that an individual sample provides to ...
