Deep Network Approximation with Discrepancy Being Reciprocal of Width to Power of Depth
A new network with super approximation power is introduced. This network is built with Floor (⌊x⌋) and ReLU (max{0,x}) activation functions, and hence we call such networks Floor-ReLU networks. It is shown by construction that Floor-ReLU networks with width max{d, 5N+13} and depth 64dL+3 can pointwise approximate a Lipschitz continuous function f on [0,1]^d with an exponential approximation rate 3μ√d · N^(-√L), where μ is the Lipschitz constant of f. More generally, for an arbitrary continuous function f on [0,1]^d with a modulus of continuity ω_f(·), the constructive approximation rate is ω_f(√d · N^(-√L)) + 2ω_f(√d) · N^(-√L). As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power, since the approximation order is essentially √d times a function of N and L that is independent of d.
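To make the claimed rate concrete, the short Python sketch below evaluates the Lipschitz bound 3μ√d · N^(-√L) together with the corresponding width max{d, 5N+13} and depth 64dL+3 as L grows; the helper name floor_relu_error_bound and the sample values of d, μ, N, and L are chosen here purely for illustration and are not taken from the paper.

import math

def floor_relu_error_bound(d, N, L, mu=1.0):
    # Illustrative evaluation of the abstract's Lipschitz bound: 3*mu*sqrt(d)*N^(-sqrt(L)).
    return 3.0 * mu * math.sqrt(d) * N ** (-math.sqrt(L))

d, mu, N = 10, 1.0, 2   # assumed sample values, not from the paper
for L in (1, 4, 16, 64):
    width = max(d, 5 * N + 13)   # width max{d, 5N+13} from the construction
    depth = 64 * d * L + 3       # depth 64dL+3 from the construction
    print(f"L={L:3d}  width={width}  depth={depth}  "
          f"bound={floor_relu_error_bound(d, N, L, mu):.3e}")

With the width parameter N held fixed, the printed bound decays like N^(-√L), illustrating the exponential improvement obtained by increasing the depth parameter L.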