On Residual Networks Learning a Perturbation from Identity

02/11/2019
by   Michael Hauser, et al.

The purpose of this work is to test the hypothesis that residual networks learn a perturbation from identity. Residual networks are enormously important deep learning models, and many theories attempt to explain how they function; learning a perturbation from identity is one such theory. To test it, the magnitudes of the perturbations are measured both in an absolute sense and in a scaled sense, each form having its relative benefits and drawbacks. Additionally, a stopping rule is developed that decides the depth of the residual network based on the average perturbation magnitude falling below a given epsilon. This analysis yields a better understanding of how residual networks process and transform data from input to output. Parallel experiments are conducted on MNIST and CIFAR10 for residual networks of various sizes, with between 6 and 300 residual blocks. It is found that, in this setting, the average scaled perturbation magnitude is roughly inversely proportional to the number of residual blocks, from which it follows that sufficiently large residual networks are indeed learning a perturbation from identity.
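The two measurements described in the abstract can be sketched numerically. The code below is a minimal, hypothetical illustration, not the paper's implementation: each residual update is y = x + F(x), where F is stood in for by a toy linear-ReLU branch; the absolute magnitude is ||F(x)|| and the scaled magnitude is ||F(x)|| / ||x||. The stopping rule is one plausible reading of the abstract (stop at the first depth whose average scaled perturbation drops below epsilon); the exact rule and the choice of epsilon are assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_branch(x, W):
    # Toy stand-in F(x) for the paper's convolutional residual branch.
    # The 0.1 factor just keeps the perturbation small relative to x.
    return 0.1 * np.maximum(W @ x, 0.0)

def perturbation_magnitudes(x, weights):
    """Per-block absolute ||F(x)|| and scaled ||F(x)|| / ||x||
    along the forward pass x -> x + F(x)."""
    abs_mags, scaled_mags = [], []
    for W in weights:
        f = residual_branch(x, W)
        abs_mags.append(np.linalg.norm(f))
        scaled_mags.append(np.linalg.norm(f) / np.linalg.norm(x))
        x = x + f  # residual update: identity plus perturbation
    return abs_mags, scaled_mags

def stopping_depth(scaled_mags, eps):
    """Hypothetical stopping rule: first depth d at which the average
    scaled perturbation over blocks 1..d is below eps (None if never)."""
    for d in range(1, len(scaled_mags) + 1):
        if np.mean(scaled_mags[:d]) < eps:
            return d
    return None

# One random input through 20 toy residual blocks.
x = rng.normal(size=16)
weights = [rng.normal(scale=0.5, size=(16, 16)) for _ in range(20)]
abs_m, scaled_m = perturbation_magnitudes(x, weights)
depth = stopping_depth(scaled_m, eps=0.05)
```

If the abstract's finding holds, widening the network in depth would shrink each block's scaled perturbation, so `stopping_depth` would trigger earlier relative to the total number of blocks as that number grows.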

Related research

- Multi-Residual Networks: Improving the Speed and Accuracy of Residual Networks (09/19/2016)
  In this article, we take one step toward understanding the learning beha...

- Identity Matters in Deep Learning (11/14/2016)
  An emerging design principle in deep learning is that each layer of a de...

- Tandem Blocks in Deep Convolutional Neural Networks (06/01/2018)
  Due to the success of residual networks (resnets) and related architectu...

- Competitive Inner-Imaging Squeeze and Excitation for Residual Network (07/24/2018)
  Residual Networks make very deep convolutional architectures work wel...

- Deep Convolutional Neural Networks with Merge-and-Run Mappings (11/23/2016)
  A deep residual network, built by stacking a sequence of residual blocks...

- Learning Identity Mappings with Residual Gates (11/04/2016)
  We propose a new layer design by adding a linear gating mechanism to sho...
