Orthogonal and Idempotent Transformations for Learning Deep Neural Networks

07/19/2017
by   Jingdong Wang, et al.
0

Identity transformations, used as skip-connections in residual networks, directly connect convolutional layers close to the input and those close to the output in deep neural networks, improving information flow and thus easing the training. In this paper, we introduce two alternative linear transforms, orthogonal transformation and idempotent transformation. According to the definition and property of orthogonal and idempotent matrices, the product of multiple orthogonal (same idempotent) matrices, used to form linear transformations, is equal to a single orthogonal (idempotent) matrix, resulting in that information flow is improved and the training is eased. One interesting point is that the success essentially stems from feature reuse and gradient reuse in forward and backward propagation for maintaining the information during flow and eliminating the gradient vanishing problem because of the express way through skip-connections. We empirically demonstrate the effectiveness of the proposed two transformations: similar performance in single-branch networks and even superior in multi-branch networks in comparison to identity transformations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/14/2023

Convolutional unitary or orthogonal recurrent neural networks

Recurrent neural networks are extremely powerful yet hard to train. One ...
research
05/18/2018

Norm-Preservation: Why Residual Networks Can Become Extremely Deep?

Augmenting deep neural networks with skip connections, as introduced in ...
research
11/28/2018

Shared Representational Geometry Across Neural Networks

Different neural networks trained on the same dataset often learn simila...
research
08/12/2021

Existence, Stability And Scalability Of Orthogonal Convolutional Neural Networks

Imposing orthogonal transformations between layers of a neural network h...
research
06/02/2022

Entangled Residual Mappings

Residual mappings have been shown to perform representation learning in ...
research
10/27/2022

On the biological plausibility of orthogonal initialisation for solving gradient instability in deep neural networks

Initialising the synaptic weights of artificial neural networks (ANNs) w...
research
10/30/2021

Neural Network based on Automatic Differentiation Transformation of Numeric Iterate-to-Fixedpoint

This work proposes a Neural Network model that can control its depth usi...

Please sign up or login with your details

Forgot password? Click here to reset