Piecewise Strong Convexity of Neural Networks

10/30/2018
by Tristan Milne, et al.

We study the loss surface of a fully connected neural network with ReLU non-linearities, regularized with weight decay. We begin by expressing the output of the network as a matrix determinant, which allows us to establish that the loss function is piecewise strongly convex on a bounded set where the training set error is below a threshold that we can estimate. We use this to prove that local minima of the loss function in this open set are isolated, and that every critical point below this error threshold is a local minimum, partially addressing an open problem posed at the Conference on Learning Theory (COLT) 2015. Our results also give a quantitative understanding of the improved performance observed when dropout is used, as well as quantitative evidence that deeper networks are harder to train.
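
To make the abstract's claim concrete, the following is a minimal LaTeX sketch restating it in symbols. The notation (the per-example loss $\ell$, the weight-decay coefficient $\lambda$, and the strong-convexity modulus $\mu$) is ours, chosen for illustration, and is not taken from the paper; the determinant-based argument itself is in the full text.

    % Minimal sketch of the setup; notation assumed for illustration.
    \documentclass{article}
    \usepackage{amsmath,amssymb}
    \begin{document}
    For a ReLU network $f(x;\theta)$ trained on pairs $(x_i, y_i)$,
    the weight-decay-regularized empirical loss is
    \[
      L(\theta) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i;\theta),\,y_i\bigr)
                  + \lambda\,\|\theta\|_2^2 .
    \]
    ReLU activations partition parameter space into regions on which the
    activation pattern, and hence the piecewise form of $f$, is fixed.
    The claim is that on a bounded set where the training error is below
    an estimable threshold, $L$ is strongly convex on each such region,
    \[
      \nabla^2 L(\theta) \succeq \mu I \quad \text{for some } \mu > 0,
    \]
    so every critical point in that set is an isolated local minimum.
    \end{document}

Note that the weight-decay term alone contributes a constant $2\lambda I$ to the Hessian on every piece, which is the standard source of strong convexity in regularized objectives; the substance of the result lies in controlling the curvature of the data term on each piece.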
