Neural Network Architecture Beyond Width and Depth

05/19/2022
by Zuowei Shen, et al.

This paper proposes a new neural network architecture by introducing an additional dimension, called height, beyond width and depth. Neural network architectures with height, width, and depth as hyperparameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than those with two-dimensional architectures (those with only width and depth as hyperparameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence a network with the new architecture is called a nested network (NestNet). A NestNet of height s is built with each hidden neuron activated by a NestNet of height ≤ s-1. When s = 1, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-s ReLU NestNets with 𝒪(n) parameters can approximate Lipschitz continuous functions on [0,1]^d with an error 𝒪(n^{-(s+1)/d}), while the optimal approximation error of standard ReLU networks with 𝒪(n) parameters is 𝒪(n^{-2/d}). Furthermore, this result is extended to generic continuous functions on [0,1]^d, with the approximation error characterized by the modulus of continuity. Finally, a numerical example is provided to explore the advantage of the super approximation power of ReLU NestNets.
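To make the recursive construction concrete, below is a minimal sketch of the NestNet idea in PyTorch. The abstract does not specify an implementation, so the layer widths, the single-hidden-layer base case, and the use of one shared scalar sub-network as the activation (rather than a separate NestNet per neuron) are illustrative assumptions, not the authors' method.

    # Hypothetical NestNet sketch (not the paper's reference code).
    import torch
    import torch.nn as nn

    class NestNet(nn.Module):
        """Height-s nested network: for s = 1 this is a plain ReLU MLP;
        for s > 1, each hidden neuron's pre-activation is passed through
        a scalar NestNet of height s-1 instead of ReLU."""

        def __init__(self, height: int, in_dim: int, width: int, out_dim: int):
            super().__init__()
            self.fc1 = nn.Linear(in_dim, width)
            self.fc2 = nn.Linear(width, out_dim)
            if height <= 1:
                # base case: a standard two-dimensional architecture
                self.act = nn.ReLU()
            else:
                # recursive case: a 1-to-1 NestNet of height s-1 serves as
                # the activation, applied elementwise to each hidden neuron
                # (a shared sub-network here is a simplifying assumption)
                self.act = NestNet(height - 1, 1, width, 1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            h = self.fc1(x)
            if isinstance(self.act, nn.ReLU):
                h = self.act(h)
            else:
                # flatten hidden units to scalars, apply the sub-network,
                # then restore the original shape
                h = self.act(h.reshape(-1, 1)).reshape(h.shape)
            return self.fc2(h)

    # usage: a height-2 NestNet on [0,1]^d with d = 3
    net = NestNet(height=2, in_dim=3, width=16, out_dim=1)
    y = net(torch.rand(8, 3))  # output shape (8, 1)

Note how the recursion bottoms out at height 1, matching the abstract's statement that a height-1 NestNet degenerates to a standard network with only width and depth as hyperparameters.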

