Is the Skip Connection Provable to Reform the Neural Network Loss Landscape?

06/10/2020
by   Lifu Wang, et al.
9

The residual network is now one of the most effective structures in deep learning, which utilizes the skip connections to “guarantee" the performance will not get worse. However, the non-convexity of the neural network makes it unclear whether the skip connections do provably improve the learning ability since the nonlinearity may create many local minima. In some previous works <cit.>, it is shown that despite the non-convexity, the loss landscape of the two-layer ReLU network has good properties when the number m of hidden nodes is very large. In this paper, we follow this line to study the topology (sub-level sets) of the loss landscape of deep ReLU neural networks with a skip connection and theoretically prove that the skip connection network inherits the good properties of the two-layer network and skip connections can help to control the connectedness of the sub-level sets, such that any local minima worse than the global minima of some two-layer ReLU network will be very “shallow". The “depth" of these local minima are at most O(m^(η-1)/n), where n is the input dimension, η<1. This provides a theoretical explanation for the effectiveness of the skip connection in deep learning.

READ FULL TEXT
research
05/22/2018

Adding One Neuron Can Eliminate All Bad Local Minima

One of the main difficulties in analyzing neural networks is the non-con...
research
10/21/2019

Universal flow approximation with deep residual networks

Residual networks (ResNets) are a deep learning architecture with the re...
research
01/31/2017

Skip Connections Eliminate Singularities

Skip connections made the training of very deep networks possible and ha...
research
05/15/2021

Rethinking Skip Connection with Layer Normalization in Transformers and ResNets

Skip connection, is a widely-used technique to improve the performance a...
research
05/06/2021

The layer-wise L1 Loss Landscape of Neural Nets is more complex around local minima

For fixed training data and network parameters in the other layers the L...
research
10/09/2022

SML:Enhance the Network Smoothness with Skip Meta Logit for CTR Prediction

In light of the smoothness property brought by skip connections in ResNe...
research
10/10/2022

A global analysis of global optimisation

Theoretical understanding of the training of deep neural networks has ma...

Please sign up or login with your details

Forgot password? Click here to reset