How (Not) To Train Your Neural Network Using the Information Bottleneck Principle

02/27/2018
by Rana Ali Amjad, et al.

In this theory paper, we investigate training deep neural networks (DNNs) for classification via minimizing the information bottleneck (IB) functional. We show that, even if the joint distribution between continuous feature variables and the discrete class variable is known, the resulting optimization problem suffers from two severe issues: First, for deterministic DNNs, the IB functional is infinite for almost all weight matrices, making the optimization problem ill-posed. Second, the invariance of the IB functional under bijections prevents it from capturing desirable properties for classification, such as robustness, architectural simplicity, and simplicity of the learned representation. We argue that these issues are partly resolved for stochastic DNNs, for DNNs that include a (hard or soft) decision rule, or by replacing the IB functional with related but better-behaved cost functions. We conclude that recent successes reported for training DNNs using the IB framework must be attributed to such solutions. As a side effect, our results imply limitations of the IB framework for the analysis of DNNs.
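To make the quantity under discussion concrete, the following sketch evaluates the IB Lagrangian L = I(X;T) − β·I(T;Y) on a small discrete toy problem (the distributions, encoders, and β value below are illustrative assumptions, not taken from the paper). A fully discrete example cannot reproduce the divergence of I(X;T) for continuous features, but it does show the trade-off the functional encodes: a stochastic encoder pays less in I(X;T) than a deterministic one, at some cost in I(T;Y).

```python
import numpy as np

def mutual_information(p_joint):
    """I(A;B) in nats from a joint distribution matrix p(a,b)."""
    pa = p_joint.sum(axis=1, keepdims=True)   # marginal p(a)
    pb = p_joint.sum(axis=0, keepdims=True)   # marginal p(b)
    mask = p_joint > 0
    return float((p_joint[mask] * np.log(p_joint[mask] / (pa @ pb)[mask])).sum())

# Toy classification problem (assumed for illustration):
# four feature values, binary class label.
p_x = np.full(4, 0.25)                        # uniform p(x)
p_y_given_x = np.array([[0.9, 0.1],           # p(y|x): first two x favour class 0,
                        [0.8, 0.2],           # last two favour class 1
                        [0.2, 0.8],
                        [0.1, 0.9]])

def ib_functional(p_t_given_x, beta):
    """IB Lagrangian L = I(X;T) - beta * I(T;Y) for an encoder p(t|x)."""
    p_xt = p_x[:, None] * p_t_given_x                     # joint p(x,t)
    p_ty = p_t_given_x.T @ (p_x[:, None] * p_y_given_x)   # joint p(t,y)
    return mutual_information(p_xt) - beta * mutual_information(p_ty)

# Deterministic encoder T = 1[x >= 2] vs. a stochastic variant
# that flips T with probability 0.1 (noise level is an assumption).
det = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
sto = 0.9 * det + 0.1 * (1 - det)

for name, enc in [("deterministic", det), ("stochastic", sto)]:
    print(name, round(ib_functional(enc, beta=1.0), 4))
```

Here the deterministic encoder yields I(X;T) = log 2 (T is a bijective function of the two feature clusters), while the noisy encoder achieves a strictly smaller Lagrangian at β = 1. The paper's first issue is the continuous analogue of this effect taken to the extreme: with continuous X and a deterministic map, I(X;T) is infinite, so only stochastic (or otherwise regularized) variants make the minimization well posed.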


