Fine-grained Optimization of Deep Neural Networks

05/22/2019
by M. Ozay, et al.

Recent studies have theoretically derived several asymptotic upper bounds on the generalization error of deep neural networks (DNNs). These bounds are functions of norms of the DNN weights, such as the Frobenius and spectral norms, computed for weights grouped by either the input or the output channels of the DNNs. In this work, we conjecture that if we impose multiple constraints that upper bound these norms of the weights and train DNNs with the constrained weights, then we can attain empirical generalization errors closer to the derived theoretical bounds and improve the accuracy of the DNNs. To this end, we pose two problems. First, we aim to obtain weights whose different norms are all upper bounded by a constant, e.g. 1.0. To achieve these bounds, we propose a two-stage renormalization procedure: (i) normalization of the weights according to the different norms used in the bounds, and (ii) reparameterization of the normalized weights to fix a constant, finite upper bound on their norms. Second, we consider training DNNs with these renormalized weights. For this, we first propose a strategy to construct joint spaces (manifolds) of weights that satisfy the different constraints. We then propose a fine-grained SGD algorithm (FG-SGD) for optimization on these weight manifolds, which trains DNNs with guaranteed convergence to minima. Experimental results show that the image classification accuracy of baseline DNNs can be boosted using FG-SGD on collections of manifolds identified by multiple constraints.
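The abstract leaves the implementation unspecified, so the following is a minimal NumPy sketch, under stated assumptions, of the two ingredients it describes. The function names, the choice of per-output-channel Frobenius groups, and the spectral-norm cap are illustrative assumptions rather than the paper's actual procedure, and `sphere_sgd_step` uses a standard Riemannian SGD update on the unit sphere as a stand-in for the FG-SGD update, whose exact form is not given here.

```python
import numpy as np

def two_stage_renormalize(W, eps=1e-8):
    # Hypothetical sketch of the two-stage renormalization; the paper's
    # exact procedure is not specified in the abstract.
    # W: (out_channels, fan_in) weight matrix.
    # Stage (i): normalize each output-channel row by its Frobenius norm,
    # so every per-channel weight group has norm 1.
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W = W / np.maximum(row_norms, eps)
    # Stage (ii): reparameterize by the spectral norm (largest singular
    # value) so that it is also upper bounded by the constant 1.0.
    sigma = np.linalg.norm(W, ord=2)
    return W / max(sigma, 1.0)

def sphere_sgd_step(w, grad, lr):
    # One Riemannian SGD step on the unit sphere {w : ||w|| = 1}, shown as
    # a stand-in for FG-SGD: project the Euclidean gradient onto the
    # tangent space at w, take a step, then retract back onto the sphere
    # by renormalization.
    tangent = grad - np.dot(grad, w) * w
    w = w - lr * tangent
    return w / np.linalg.norm(w)
```

In this sketch, stage (ii) can only shrink the row norms set in stage (i), so after both stages each output-channel group norm and the spectral norm are simultaneously bounded by 1.0, mirroring the idea of a joint space of weights satisfying multiple norm constraints at once.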


Related research

08/04/2020 · Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network Based Vector-to-Vector Regression
In this paper, we show that, in vector-to-vector regression utilizing de...

02/06/2021 · The Implicit Biases of Stochastic Gradient Descent on Deep Neural Networks with Batch Normalization
Deep neural networks with batch normalization (BN-DNNs) are invariant to...

01/12/2022 · On generalization bounds for deep networks based on loss surface implicit regularization
The classical statistical learning theory says that fitting too many par...

06/30/2020 · Approximation Rates for Neural Networks with Encodable Weights in Smoothness Spaces
We examine the necessary and sufficient complexity of neural networks to...

08/07/2020 · Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations
Using weight decay to penalize the L2 norms of weights in neural network...

06/26/2017 · Spectrally-normalized margin bounds for neural networks
This paper presents a margin-based multiclass generalization bound for n...

01/22/2017 · Optimization on Product Submanifolds of Convolution Kernels
Recent advances in optimization methods used for training convolutional ...
