Neural tangent kernels, transportation mappings, and universal approximation

by Ziwei Ji, et al.

This paper establishes rates of universal approximation for the shallow neural tangent kernel (NTK): network weights are only allowed microscopic changes from random initialization, which entails that activations are mostly unchanged, and the network is nearly equivalent to its linearization. Concretely, the paper has two main contributions: a generic scheme to approximate functions with the NTK by sampling from transport mappings between the initial weights and their desired values, and the construction of transport mappings via Fourier transforms. Regarding the first contribution, the proof scheme provides another perspective on how the NTK regime arises from rescaling: redundancy in the weights due to resampling allows individual weights to be scaled down. Regarding the second contribution, the most notable transport mapping asserts that roughly 1 / δ^{10d} nodes are sufficient to approximate continuous functions, where δ depends on the continuity properties of the target function. By contrast, nearly the same proof yields a bound of 1 / δ^{2d} for shallow ReLU networks; this gap suggests a tantalizing direction for future work, separating shallow ReLU networks and their linearization.
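
The claim that microscopic weight changes leave a shallow network close to its linearization can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a 1/sqrt(m) output scaling, Gaussian initial weights, fixed random ±1 outer weights, and a per-node perturbation of norm radius/sqrt(m). The function names (shallow_relu, linearized, linearization_gap) are hypothetical.

```python
import numpy as np

def shallow_relu(W, a, x):
    """Shallow ReLU network f(x; W) = (1/sqrt(m)) * sum_j a_j * relu(<w_j, x>).
    The 1/sqrt(m) scaling is an assumption made for this sketch."""
    m = W.shape[0]
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(m)

def linearized(W0, a, V, x):
    """First-order Taylor expansion of shallow_relu in W around W0,
    evaluated at W0 + V: f(x; W0) + <grad_W f(x; W0), V>."""
    m = W0.shape[0]
    pre = W0 @ x
    active = (pre > 0).astype(float)  # activation pattern frozen at initialization
    return a @ (active * (pre + V @ x)) / np.sqrt(m)

def linearization_gap(m, d=5, radius=1.0, seed=0):
    """|network - linearization| after moving each node by radius/sqrt(m)."""
    rng = np.random.default_rng(seed)
    W0 = rng.standard_normal((m, d))       # random initialization (assumed Gaussian)
    a = rng.choice([-1.0, 1.0], size=m)    # fixed outer weights (assumed random signs)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)                 # unit-norm input
    V = rng.standard_normal((m, d))
    # Rescale every row of V to have norm exactly radius / sqrt(m).
    V *= radius / (np.sqrt(m) * np.linalg.norm(V, axis=1, keepdims=True))
    return abs(shallow_relu(W0 + V, a, x) - linearized(W0, a, V, x))

if __name__ == "__main__":
    for m in (64, 1024, 16384):
        print(m, linearization_gap(m))
```

Only nodes whose activation sign flips under the perturbation contribute to the gap, so under these assumptions one would expect the printed gap to shrink as the width m grows.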




Optimal approximation of continuous functions by very deep ReLU networks

We prove that deep ReLU neural networks with conventional fully-connecte...

Random ReLU Features: Universality, Approximation, and Composition

We propose random ReLU features models in this work. Its motivation is r...

Deep Equals Shallow for ReLU Networks in Kernel Regimes

Deep networks are often considered to be more expressive than shallow on...

Efficient Approximation of Solutions of Parametric Linear Transport Equations by ReLU DNNs

We demonstrate that deep neural networks with the ReLU activation functi...

Universality and approximation bounds for echo state networks with random weights

We study the uniform approximation of echo state networks with randomly ...

Gradient Dynamics of Shallow Univariate ReLU Networks

We present a theoretical and empirical study of the gradient dynamics of...

Stochastic Feedforward Neural Networks: Universal Approximation

In this chapter we take a look at the universal approximation question f...