Deep Stable neural networks: large-width asymptotics and convergence rates

08/02/2021
by Stefano Favaro, et al.

In modern deep learning, there is a recent and growing literature on the interplay between large-width asymptotics for deep Gaussian neural networks (NNs), i.e. deep NNs with Gaussian-distributed weights, and classes of Gaussian stochastic processes (SPs). This interplay has proved critical in several contexts of practical interest, e.g. Bayesian inference under Gaussian SP priors, kernel regression for infinitely wide deep NNs trained via gradient descent, and information propagation within infinitely wide NNs. Motivated by empirical analyses showing the potential of replacing Gaussian distributions with Stable distributions for the NN's weights, in this paper we investigate large-width asymptotics for (fully connected) feed-forward deep Stable NNs, i.e. deep NNs with Stable-distributed weights. First, we show that as the width goes to infinity jointly over the NN's layers, a suitably rescaled deep Stable NN converges weakly to a Stable SP whose distribution is characterized recursively through the NN's layers. Because of the non-triangular structure of the NN, this is a non-standard asymptotic problem, for which we propose a novel and self-contained inductive approach that may be of independent interest. Then, we establish sup-norm convergence rates of a deep Stable NN to a Stable SP, quantifying the critical difference between the settings of "joint growth" and "sequential growth" of the width over the NN's layers. Our work extends recent results on infinitely wide limits for deep Gaussian NNs to the more general deep Stable NNs, and provides the first result on convergence rates for infinitely wide deep NNs.
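To make the object of study concrete, here is a minimal sketch of a fully connected feed-forward deep Stable NN: i.i.d. symmetric α-Stable weights in every layer, with each layer's pre-activation sum rescaled by n^(1/α) (the Stable analogue of the Gaussian n^(1/2) scaling). The function name, the choice α = 1.8, the tanh activation, and the specific widths are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.stats import levy_stable


def deep_stable_nn(x, widths, alpha=1.8, seed=0):
    """Forward pass of a feed-forward NN with i.i.d. symmetric
    alpha-Stable weights and biases (beta=0 gives the symmetric case).

    Each layer's sum over the n_in incoming units is rescaled by
    n_in ** (1/alpha), the Stable analogue of the 1/sqrt(n) scaling
    used for Gaussian weights. Illustrative sketch, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    h = np.atleast_1d(np.asarray(x, dtype=float))
    for n_out in widths:
        n_in = h.shape[0]
        # i.i.d. symmetric alpha-Stable weights and biases
        W = levy_stable.rvs(alpha, 0.0, size=(n_out, n_in), random_state=rng)
        b = levy_stable.rvs(alpha, 0.0, size=n_out, random_state=rng)
        # rescale by n_in ** (1/alpha) so the layer has a Stable limit
        h = np.tanh(W @ h / n_in ** (1.0 / alpha) + b)
    return h


out = deep_stable_nn([1.0, -0.5], widths=[256, 256, 1], alpha=1.8)
```

As the hidden widths grow jointly, the distribution of such a network's output converges to that of a Stable SP, characterized recursively layer by layer.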

Related research

- 04/08/2023 — Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions
- 04/08/2023 — Non-asymptotic approximations of Gaussian neural networks via second-order Poincaré inequalities
- 06/16/2022 — Neural tangent kernel analysis of shallow α-Stable ReLU neural networks
- 02/07/2021 — Infinite-channel deep stable convolutional neural networks
- 11/11/2021 — On the Equivalence between Neural Network and Support Vector Machine
- 08/27/2019 — Finite size corrections for neural network Gaussian processes
- 07/10/2020 — Characteristics of Monte Carlo Dropout in Wide Neural Networks
