Redundancy of Hidden Layers in Deep Learning: An Information Perspective

09/19/2020
by   Chenguang Zhang, et al.

Although the deep structure guarantees the powerful expressivity of deep neural networks (DNNs), it also triggers a serious overfitting problem. To improve the generalization capacity of DNNs, many strategies have been developed to increase the diversity among hidden units. However, most of these strategies are empirical and heuristic, lacking either a theoretical derivation of the diversity measure or a clear connection between diversity and generalization capacity. In this paper, from an information-theoretic perspective, we introduce a new definition of redundancy to describe the diversity of hidden units under supervised learning settings, formalizing the effect of hidden layers on generalization capacity in terms of mutual information. We prove an inverse relationship between the defined redundancy and the generalization capacity: decreasing the redundancy generally improves generalization. Experiments show that DNNs using the redundancy as a regularizer can effectively reduce overfitting and decrease the generalization error, which supports the above claims.
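The paper's exact redundancy definition is not given in the abstract, but the idea of penalizing statistical dependence among hidden units can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not the authors' measure: it sums pairwise mutual-information estimates over the units of one hidden layer under a Gaussian assumption, where I(i, j) = -0.5 · log(1 - ρ²) for correlation ρ. A term like this could be added to the training loss as a regularizer.

```python
import numpy as np

def redundancy_penalty(h, eps=1e-6):
    """Illustrative redundancy estimate for a hidden-layer activation matrix.

    h : array of shape (batch, units), activations of one hidden layer.
    Assumes jointly Gaussian activations, under which the mutual information
    between units i and j is -0.5 * log(1 - rho_ij**2). Sums this over all
    unordered pairs of units; zero iff all units are uncorrelated.
    """
    rho = np.corrcoef(h, rowvar=False)           # (units, units) correlation matrix
    iu = np.triu_indices_from(rho, k=1)          # each unordered pair once
    r2 = np.clip(rho[iu] ** 2, 0.0, 1.0 - eps)   # clip to keep log(1 - r2) finite
    return float(-0.5 * np.log(1.0 - r2).sum())

# Highly correlated (redundant) units score much higher than independent ones.
rng = np.random.default_rng(0)
h_indep = rng.standard_normal((1000, 8))                     # independent units
base = rng.standard_normal((1000, 1))
h_redundant = np.repeat(base, 8, axis=1) \
    + 0.01 * rng.standard_normal((1000, 8))                  # near-duplicate units
print(redundancy_penalty(h_indep) < redundancy_penalty(h_redundant))
```

In practice such a penalty would be computed on mini-batch activations and scaled by a regularization coefficient before being added to the supervised loss; the names and the Gaussian MI approximation here are assumptions for illustration only.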


