Neural tangent kernels, transportation mappings, and universal approximation

10/15/2019
by Ziwei Ji, et al.

This paper establishes rates of universal approximation for the shallow neural tangent kernel (NTK): network weights are only allowed microscopic changes from random initialization, which entails that activations are mostly unchanged and the network is nearly equivalent to its linearization. Concretely, the paper has two main contributions: a generic scheme for approximating functions with the NTK by sampling from transport mappings between the initial weights and their desired values, and the construction of such transport mappings via Fourier transforms. Regarding the first contribution, the proof scheme offers another perspective on how the NTK regime arises from rescaling: redundancy in the weights due to resampling allows individual weights to be scaled down. Regarding the second contribution, the most notable transport mapping shows that roughly 1/δ^(10d) nodes suffice to approximate continuous functions, where δ depends on the continuity properties of the target function. By contrast, nearly the same proof yields a bound of 1/δ^(2d) for shallow ReLU networks; this gap suggests a tantalizing direction for future work: separating shallow ReLU networks from their linearization.
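To make the linearization claim concrete, below is a minimal NumPy sketch (not from the paper; the 1/sqrt(m) scaling, the fixed ±1 outer weights, and the size of the perturbation are illustrative assumptions). It checks that a shallow ReLU network whose hidden weights move only microscopically from random initialization agrees increasingly well with its first-order linearization as the width m grows, since almost no ReLU gate flips sign under such a small update.

import numpy as np

rng = np.random.default_rng(0)

def f(x, W, a):
    """Shallow ReLU network: f(x) = a . relu(W x) / sqrt(m)."""
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(len(a))

def f_lin(x, W0, a, dW):
    """Linearization of f around W0: the ReLU gates are frozen at their
    initial on/off pattern, so the perturbation dW enters linearly."""
    gates = (W0 @ x > 0).astype(float)
    return f(x, W0, a) + a @ (gates * (dW @ x)) / np.sqrt(len(a))

d = 5
x = rng.standard_normal(d)
for m in (100, 1_000, 10_000, 100_000):
    W0 = rng.standard_normal((m, d))               # random initialization
    a = rng.choice([-1.0, 1.0], size=m)            # fixed outer weights
    dW = rng.standard_normal((m, d)) / np.sqrt(m)  # "microscopic" update
    err = abs(f(x, W0 + dW, a) - f_lin(x, W0, a, dW))
    print(f"m={m:>7}  |f - f_lin| = {err:.2e}")

The gap shrinks as m grows because the fraction of gates whose sign flips under a perturbation of this size vanishes, which is the mechanism behind "activations are mostly unchanged" in the abstract.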


Related research

Any Deep ReLU Network is Shallow (06/20/2023)
We constructively prove that every deep ReLU network can be rewritten as...

Optimal approximation of continuous functions by very deep ReLU networks (02/10/2018)
We prove that deep ReLU neural networks with conventional fully-connecte...

Random ReLU Features: Universality, Approximation, and Composition (10/10/2018)
We propose random ReLU features models in this work. Its motivation is r...

Deep Equals Shallow for ReLU Networks in Kernel Regimes (09/30/2020)
Deep networks are often considered to be more expressive than shallow on...

Neural networks with superexpressive activations and integer weights (05/20/2021)
An example of an activation function σ is given such that networks with ...

Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions (05/20/2023)
In recent years, deep learning-based sequence modelings, such as languag...

Universality and approximation bounds for echo state networks with random weights (06/12/2022)
We study the uniform approximation of echo state networks with randomly ...
