Compression phase is not necessary for generalization in representation learning

02/15/2021
by Sungyeop Lee, et al.

The outstanding performance of deep learning across many fields raises a fundamental question that can potentially be examined through information theory, which interprets the learning process as the transmission and compression of information. Information-plane analyses of the mutual information between the input, hidden, and output layers have revealed two distinct learning phases: fitting and compression. It remains debated whether the compression phase is necessary for generalizing the input-output relations extracted from training data. In this study, we investigated this question through experiments with various types of autoencoders, evaluating their information-processing phases with an accurate kernel-based estimator of mutual information. Given sufficient training data, vanilla autoencoders exhibited the compression phase, and it was amplified when sparsity regularization was imposed on the hidden activities. However, we found that the compression phase is not universally observed across autoencoder variants, including variational autoencoders, which place special constraints on the network weights or on the manifold of the hidden space. These autoencoders generalized well to test data without requiring the compression phase. We therefore conclude that the compression phase is not necessary for generalization in representation learning.
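The paper's own estimator and code are not reproduced here; the sketch below is a minimal NumPy illustration of the kind of kernel (Gaussian-KDE) mutual information estimate commonly used in information-plane analyses of hidden activations, under the usual assumption that the hidden representation equals the encoder output plus Gaussian noise of width sigma. The function names, the toy linear encoder, and the noise assumption are illustrative, not the authors' method.

    import numpy as np
    from scipy.special import logsumexp

    def kde_entropy(acts, sigma=0.1):
        # Gaussian-KDE estimate of the differential entropy H(T) of hidden
        # activations `acts`, shape (n_samples, dim).
        n, d = acts.shape
        sq_dists = np.sum((acts[:, None, :] - acts[None, :, :]) ** 2, axis=-1)
        # log p(t_i) under a mixture of Gaussians centred on all samples
        log_density = (logsumexp(-sq_dists / (2 * sigma ** 2), axis=1)
                       - np.log(n) - 0.5 * d * np.log(2 * np.pi * sigma ** 2))
        return -np.mean(log_density)

    def mutual_information(acts, sigma=0.1):
        # I(X; T) = H(T) - H(T | X), assuming T = encoder(X) + Gaussian noise
        # of scale sigma, so that H(T | X) is just the entropy of that noise.
        d = acts.shape[1]
        h_noise = 0.5 * d * np.log(2 * np.pi * np.e * sigma ** 2)
        return kde_entropy(acts, sigma) - h_noise

    # Toy usage: estimate I(X; T) for a random nonlinear "encoder".
    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 20))               # input samples
    w = rng.normal(size=(20, 5)) / np.sqrt(20)   # hypothetical encoder weights
    hidden = np.tanh(x @ w)                      # hidden-layer activations T
    print("I(X; T) ~", mutual_information(hidden, sigma=0.1), "nats")

In an information-plane study, such a quantity would be computed for each layer over successive training epochs; a compression phase corresponds to I(X; T) decreasing after its initial rise during fitting.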

