Best k-layer neural network approximations

07/02/2019
by Lek-Heng Lim et al.

We investigate the geometry of the empirical risk minimization problem for k-layer neural networks. We will provide examples showing that for the classical activation functions σ(x) = 1/(1 + exp(-x)) and σ(x) = tanh(x), there exists a positive-measured subset of target functions that do not have best approximations by a fixed number of layers of neural networks. In addition, we study in detail the properties of shallow networks, classifying cases when a best k-layer neural network approximation always exists or does not exist for the ReLU activation σ(x) = max(0, x). We also determine the dimensions of shallow ReLU-activated networks.
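For readers who want a concrete picture of the objects in the abstract, the following is a minimal sketch (not from the paper, assuming NumPy and hypothetical helper names) of a k-layer feedforward network and the empirical risk it is fitted to minimize, using the three activation functions mentioned above.

```python
import numpy as np

# The activation functions discussed in the abstract.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

# tanh is available directly as np.tanh.

def k_layer_network(x, weights, biases, activation=relu):
    """Evaluate a k-layer feedforward network at input x.

    weights: list of k matrices W_1, ..., W_k
    biases:  list of k vectors  b_1, ..., b_k
    The activation is applied after every layer except the last.
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = activation(W @ h + b)
    return weights[-1] @ h + biases[-1]

def empirical_risk(weights, biases, xs, ys, activation=relu):
    """Mean squared error of the network on sample pairs (x_i, y_i)."""
    preds = np.array([k_layer_network(x, weights, biases, activation) for x in xs])
    return np.mean((preds - np.array(ys)) ** 2)

# Example: a shallow (k = 2) ReLU network from R^2 to R.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 2)), rng.normal(size=(1, 4))]
biases = [rng.normal(size=4), rng.normal(size=1)]
xs = [rng.normal(size=2) for _ in range(10)]
ys = [np.sum(x) for x in xs]
print(empirical_risk(weights, biases, xs, ys))
```

The paper's question is whether, over all choices of such weights and biases for a fixed k, the infimum of this risk (with respect to a target function rather than a finite sample) is actually attained.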
