Global optimality conditions for deep neural networks
We study the error landscape of deep linear and nonlinear neural networks with square error loss. Building on recent results in the literature, we present necessary and sufficient conditions for a critical point of the empirical risk function to be a global minimum in the deep linear network case. Our simple conditions can also be used to determine whether a given critical point is a global minimum or a saddle point. We further extend these results to deep nonlinear neural networks and prove similar sufficient conditions for global optimality in the function space.
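As a concrete illustration of the deep linear setting, the sketch below checks a candidate critical point of the squared-error empirical risk for a depth-2 linear network, then compares its loss against the unconstrained least-squares optimum over the end-to-end map. This is a minimal numerical sketch, not the paper's optimality conditions: the names (`product`, `grads`), the all-zero candidate point, and the saddle/global-minimum classification are illustrative assumptions. The classification step relies on the known fact that, under no-bottleneck width assumptions, critical points of deep linear networks that are not global minima are saddle points.

```python
import numpy as np

def product(Ws):
    """End-to-end linear map A = W_L @ ... @ W_1 (Ws listed input-to-output)."""
    A = Ws[0]
    for W in Ws[1:]:
        A = W @ A
    return A

def loss(Ws, X, Y):
    """Squared-error empirical risk (1/2)||W_L...W_1 X - Y||_F^2."""
    R = product(Ws) @ X - Y
    return 0.5 * np.sum(R ** 2)

def grads(Ws, X, Y):
    """Gradient w.r.t. each layer: dL/dW_i = P^T (AX - Y) X^T Q^T,
    where P = W_L...W_{i+1} and Q = W_{i-1}...W_1."""
    L = len(Ws)
    R = product(Ws) @ X - Y
    G = []
    for i in range(L):
        P = product(Ws[i + 1:]) if i + 1 < L else np.eye(Ws[-1].shape[0])
        Q = product(Ws[:i]) if i > 0 else np.eye(Ws[0].shape[1])
        G.append(P.T @ (R @ X.T) @ Q.T)
    return G

rng = np.random.default_rng(0)
d, n = 3, 20
X = rng.standard_normal((d, n))
Y = rng.standard_normal((d, n))

# Candidate critical point: all-zero weights (a critical point for depth >= 2,
# since every layer gradient contains a vanishing product of the other layers).
Ws = [np.zeros((d, d)), np.zeros((d, d))]
grad_norm = max(np.abs(g).max() for g in grads(Ws, X, Y))
print("max |grad| at candidate:", grad_norm)  # ~0 => critical point

# Benchmark: unconstrained least-squares optimum over A = W_2 W_1,
# A* = Y X^T (X X^T)^{-1}, valid when no layer is narrower than in/out dims.
A_star = np.linalg.solve(X @ X.T, X @ Y.T).T
opt = 0.5 * np.sum((A_star @ X - Y) ** 2)
here = loss(Ws, X, Y)
print("loss at candidate:", here, "  global optimum:", opt)
# A critical point with loss above the optimum is a saddle, not a global minimum.
```

Running this, the all-zero point has zero gradient but loss 0.5*||Y||_F^2, strictly above the least-squares optimum, so it is a saddle rather than a global minimum.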