Optimizing Neural Networks in the Equivalent Class Space

02/11/2018
by   Qi Meng, et al.

It has been widely observed that many activation functions and pooling methods used in neural network models have a (positive) rescaling-invariance property, including ReLU, PReLU, max-pooling, and average pooling, which makes fully-connected neural networks (FNNs) and convolutional neural networks (CNNs) invariant to (positive) rescaling operations across layers. This can cause non-negligible problems in their optimization: (1) different NN models can be equivalent, yet their gradients can differ greatly from each other; (2) it can be proven that the loss functions may have many spurious critical points in the redundant weight space. To tackle these problems, in this paper we first characterize the rescaling-invariance properties of NN models using equivalence classes and prove that the dimension of the equivalence class space is significantly smaller than the dimension of the original weight space. We then represent the loss function in the compact equivalence class space and develop novel algorithms that optimize NN models directly in that space. We call these algorithms Equivalent Class Optimization (abbreviated as EC-Opt) algorithms. Moreover, we design efficient tricks to compute the gradients in the equivalence class, which incur almost no extra computational cost compared with standard back-propagation (BP). We conducted an experimental study to demonstrate the effectiveness of the proposed optimization algorithms. In particular, we show that by using the idea of EC-Opt, we can significantly improve the accuracy of the learned model (for both FNNs and CNNs) compared with conventional stochastic gradient descent algorithms.
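As a quick illustration of the rescaling-invariance discussed above, here is a minimal NumPy sketch (not taken from the paper) showing that scaling a ReLU layer's incoming weights by a factor c > 0 and its outgoing weights by 1/c leaves the network's output unchanged, while the gradient at the two equivalent weight configurations differs by the same factor. All variable names are illustrative.

```python
# Minimal sketch (assumption-based, not the paper's code): positive
# rescaling-invariance of a 2-layer ReLU network, and how gradients
# at two equivalent points in weight space differ.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))          # input vector
W1 = rng.normal(size=(5, 4))       # first-layer weights
W2 = rng.normal(size=(3, 5))       # second-layer weights
c = 7.0                            # positive rescaling factor

def forward(W1, W2, x):
    h = np.maximum(W1 @ x, 0.0)    # ReLU hidden layer
    return W2 @ h, h

def grad_W1(W1, W2, x):
    # Gradient of L = 0.5 * ||y||^2 w.r.t. W1, by the chain rule.
    y, h = forward(W1, W2, x)
    dh = (W2.T @ y) * (h > 0)      # back-prop through W2 and ReLU
    return np.outer(dh, x)

# Equivalent model: scale incoming weights by c, outgoing weights by 1/c.
W1c, W2c = c * W1, W2 / c

y, _ = forward(W1, W2, x)
yc, _ = forward(W1c, W2c, x)
print(np.allclose(y, yc))                        # True: same function
print(np.allclose(grad_W1(W1, W2, x),
                  grad_W1(W1c, W2c, x)))         # False: different gradients
print(np.allclose(c * grad_W1(W1c, W2c, x),
                  grad_W1(W1, W2, x)))           # True: they differ by 1/c
```

The two weight configurations represent exactly the same function, yet standard gradient descent treats them as different points and takes different steps from each of them; this is the redundancy that optimizing in the equivalence class space is meant to remove.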


