
Doubly infinite residual networks: a diffusion process approach

by Stefano Peluchetti, et al.

When the parameters of a neural network are initialized i.i.d., the network exhibits undesirable forward and backward properties as the number of layers increases, e.g., vanishing dependency on the input and perfectly correlated outputs for any two inputs. To overcome these drawbacks, Peluchetti and Favaro (2020) considered fully connected residual networks (ResNets) with parameter distributions that shrink as the number of layers increases. In particular, they established an interplay between infinitely deep ResNets and solutions to stochastic differential equations, i.e. diffusion processes, showing that infinitely deep ResNets do not suffer from undesirable forward properties. In this paper, we review the forward-propagation results of Peluchetti and Favaro (2020), extending them to the setting of convolutional ResNets. Then, we study analogous backward-propagation results, which directly relate to the problem of training deep ResNets. Finally, we extend our study to the doubly infinite regime where both the network's width and depth grow unboundedly. Within this novel regime, the dynamics of quantities of interest converge, at initialization, to deterministic limits. This allows us to provide analytical expressions for inference, both in the case of weakly trained and fully trained networks. These results point to a limited expressive power of doubly infinite ResNets when the unscaled parameters are i.i.d. and the residual blocks are shallow.
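To make the idea of depth-dependent shrinkage concrete, the following is a minimal sketch of a fully connected ResNet at initialization, not the construction used in the paper. The residual block form `h + tanh(W h + b)`, the function names, the default scales `sigma_w` and `sigma_b`, and the `1/sqrt(depth)` shrink factor are all assumptions chosen for illustration. With the shrink factor, each layer adds a small random increment, so the forward pass loosely resembles an Euler-type discretization of a diffusion.

```python
import numpy as np


def resnet_forward(x, depth, sigma_w=1.0, sigma_b=0.1, shrink=True, seed=0):
    """Propagate x through `depth` random residual blocks (illustrative only).

    Hypothetical block: h <- h + tanh(W h + b), with Gaussian W and b.
    If `shrink` is True, the parameter standard deviations are multiplied
    by 1/sqrt(depth), so the parameter distributions contract as the
    network gets deeper.
    """
    rng = np.random.default_rng(seed)
    width = x.shape[0]
    scale = 1.0 / np.sqrt(depth) if shrink else 1.0
    h = np.asarray(x, dtype=float).copy()
    for _ in range(depth):
        # Fresh i.i.d. Gaussian parameters for every layer.
        W = rng.normal(0.0, sigma_w * scale / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b * scale, size=width)
        h = h + np.tanh(W @ h + b)
    return h


def output_correlation(x1, x2, depth, shrink, seed=0):
    """Correlation between the outputs for two inputs under shared parameters."""
    y1 = resnet_forward(x1, depth, shrink=shrink, seed=seed)
    y2 = resnet_forward(x2, depth, shrink=shrink, seed=seed)
    return np.corrcoef(y1, y2)[0, 1]
```

Calling `output_correlation` on two distinct inputs with `shrink=True` and with `shrink=False`, for increasing `depth`, gives a rough way to explore how the scaling choice affects the dependence of the output on the input. The sketch covers the forward pass only.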


Neural Stochastic Differential Equations

Deep neural networks whose parameters are distributed according to typic...

Asymptotic Analysis of Deep Residual Networks

We investigate the asymptotic properties of deep Residual networks (ResN...

On the infinite-depth limit of finite-width neural networks

In this paper, we study the infinite-depth limit of finite-width residua...

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime

We provide quantitative bounds measuring the L^2 difference in function ...

Robust learning with implicit residual networks

In this effort we propose a new deep architecture utilizing residual blo...

Infinite-channel deep stable convolutional neural networks

The interplay between infinite-width neural networks (NNs) and classes o...

Mean Field Residual Networks: On the Edge of Chaos

We study randomly initialized residual networks using mean field theory ...