Partial local entropy and anisotropy in deep weight spaces

07/17/2020
by   Daniele Musso, et al.
0

We refine a recently-proposed class of local entropic loss functions by restricting the smoothening regularization to only a subset of weights. The new loss functions are referred to as partial local entropies. They can adapt to the weight-space anisotropy, thus outperforming their isotropic counterparts. We support the theoretical analysis with experiments on image classification tasks performed with multi-layer, fully-connected neural networks. The present study suggests how to better exploit the anisotropic nature of deep landscapes and provides direct probes of the shape of the wide flat minima encountered by stochastic gradient descent algorithms. As a by-product, we observe an asymptotic dynamical regime at late training times where the temperature of all the layers obeys a common scaling rule.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro