Compressing Heavy-Tailed Weight Matrices for Non-Vacuous Generalization Bounds

05/23/2021
by John Y. Shin, et al.

Heavy-tailed distributions have been studied in statistics, random matrix theory, physics, and econometrics as models of correlated systems, among other domains. Moreover, several works (e.g. arXiv:1901.08276) have empirically shown that heavy-tail distributed eigenvalues of the covariance matrices of neural network weight matrices correlate with test set accuracy, but a formal relationship between heavy-tail distributed parameters and generalization bounds had not yet been demonstrated. In this work, the compression framework of arXiv:1802.05296 is used to show that matrices with heavy-tail distributed elements can be compressed, yielding networks with sparse weight matrices. Since the effective parameter count is reduced to the number of non-zero elements of these sparse matrices, the compression framework allows us to bound the generalization gap of the resulting compressed network with a non-vacuous generalization bound. Further, the action of these matrices on a vector is discussed, and its relation to compression and resilient classification is analyzed.
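The intuition behind the compression claim can be illustrated with a small numerical sketch (this is not the paper's exact algorithm; the thresholding rule, the Student-t tail index, and the matrix size below are assumptions for illustration). A matrix with heavy-tail distributed entries concentrates much of its norm in a few large entries, so hard-thresholding away the small ones leaves a sparse matrix that still approximates the original well, in contrast to a Gaussian matrix of the same size:

```python
import numpy as np

def sparsify(W, keep_frac=0.1):
    """Zero out all but the largest-magnitude entries of W.

    Keeps roughly keep_frac * W.size entries; the rest are set to zero,
    reducing the effective parameter count to the number of non-zeros.
    """
    k = max(1, int(keep_frac * W.size))
    # Magnitude threshold separating the top-k entries from the rest.
    thresh = np.sort(np.abs(W), axis=None)[-k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

rng = np.random.default_rng(0)
n = 200
# Heavy-tailed entries (Student-t with 2.5 degrees of freedom)
# versus light-tailed Gaussian entries, for comparison.
W_heavy = rng.standard_t(df=2.5, size=(n, n))
W_gauss = rng.standard_normal((n, n))

for name, W in [("heavy-tailed", W_heavy), ("gaussian", W_gauss)]:
    W_s = sparsify(W, keep_frac=0.1)
    rel_err = np.linalg.norm(W - W_s, "fro") / np.linalg.norm(W, "fro")
    print(f"{name}: 10% of entries kept, relative Frobenius error = {rel_err:.3f}")
```

Under this setup the heavy-tailed matrix incurs a noticeably smaller relative error than the Gaussian one at the same sparsity level, which is the property the compression framework exploits to obtain a bound in terms of the non-zero count.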
