Optimal approximation of continuous functions by very deep ReLU networks
We prove that deep ReLU neural networks with conventional fully-connected architectures with W weights can approximate continuous ν-variate functions f with uniform error not exceeding a_ν ω_f(c_ν W^(-2/ν)), where ω_f is the modulus of continuity of f and a_ν, c_ν are some ν-dependent constants. This bound is tight. Our construction is inherently deep and nonlinear: the obtained approximation rate cannot be achieved by networks with fewer than Ω(W/ln W) layers or by networks with weights continuously depending on f.
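For readability, here is a minimal LaTeX sketch of the bound stated above, assuming f is a continuous function on the cube [0,1]^ν and writing \tilde{f}_W for the network approximant with W weights (the domain and the symbol \tilde{f}_W are illustrative conventions, not taken verbatim from the paper):

```latex
% Sketch of the approximation bound from the abstract (statement only, not a proof).
% Assumptions: f : [0,1]^\nu \to \mathbb{R} continuous; \tilde{f}_W is a fully-connected
% ReLU network with W weights; a_\nu, c_\nu are the \nu-dependent constants above.
\[
  \sup_{x \in [0,1]^\nu} \bigl| f(x) - \tilde{f}_W(x) \bigr|
    \;\le\; a_\nu \, \omega_f\!\bigl( c_\nu W^{-2/\nu} \bigr),
  \qquad
  \omega_f(t) \;=\; \sup_{\|x - y\| \le t} |f(x) - f(y)|.
\]
% Special case: if f is L-Lipschitz, then \omega_f(t) \le L t, so the uniform
% error is at most a_\nu c_\nu L \, W^{-2/\nu}.
```

In the Lipschitz special case the bound implies that achieving uniform error ε requires on the order of ε^(-ν/2) weights, which is the sense in which the rate W^(-2/ν) is read off from the abstract's formula.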