The Loss Surface of XOR Artificial Neural Networks

04/06/2018
by Dhagash Mehta, et al.

Training an artificial neural network involves an optimization process over the landscape defined by the cost (loss) as a function of the network parameters. We explore these landscapes using optimisation tools developed for potential energy landscapes in molecular science. The number of local minima and transition states (saddle points of index one), as well as the ratio of transition states to minima, grow rapidly with the number of nodes in the network. There is also a strong dependence on the regularisation parameter, with the landscape becoming more convex (fewer minima) as the regularisation term increases. We demonstrate that in our formulation, stationary points for networks with N_h hidden nodes, including the minimal network required to fit the XOR data, are also stationary points for networks with N_h +1 hidden nodes when all the weights involving the additional nodes are zero. Hence, smaller networks optimized to train the XOR data are embedded in the landscapes of larger networks. Our results clarify certain aspects of the classification and sensitivity (to perturbations in the input data) of minima and saddle points for this system, and may provide insight into dropout and network compression.
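
The embedding claim in the abstract can be checked numerically on a toy model. The following is a minimal sketch, not the authors' formulation: it assumes tanh hidden units with biases, a single sigmoid output, a cross-entropy loss, and an L2 regularisation term with an illustrative strength `lam`. The helper names (`unpack`, `loss`, `num_grad`, `embed`) are hypothetical. Under these assumptions, appending a hidden node whose weights are all zero leaves the loss unchanged, gives zero gradient along the new directions, and reproduces the smaller network's gradient along the shared ones, so a stationary point of the smaller network remains stationary in the larger one.

```python
import numpy as np

# XOR inputs and binary targets.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([0., 1., 1., 0.])

def unpack(theta, n_h):
    """Split a flat vector into (W, b, v, c) for n_h hidden nodes."""
    W = theta[:2 * n_h].reshape(n_h, 2)   # input-to-hidden weights
    b = theta[2 * n_h:3 * n_h]            # hidden biases
    v = theta[3 * n_h:4 * n_h]            # hidden-to-output weights
    c = theta[4 * n_h]                    # output bias
    return W, b, v, c

def loss(theta, n_h, lam=1e-3):
    """Mean cross-entropy plus L2 penalty (assumed form, see lead-in)."""
    W, b, v, c = unpack(theta, n_h)
    H = np.tanh(X @ W.T + b)                      # hidden activations, (4, n_h)
    p = 1.0 / (1.0 + np.exp(-(H @ v + c)))        # sigmoid output
    ce = -np.mean(T * np.log(p) + (1 - T) * np.log(1 - p))
    return ce + lam * np.sum(theta ** 2)

def num_grad(theta, n_h, eps=1e-6):
    """Central finite-difference gradient of the loss."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (loss(theta + e, n_h) - loss(theta - e, n_h)) / (2 * eps)
    return g

def embed(theta, n_h):
    """Map n_h-node parameters into the (n_h + 1)-node network,
    setting every weight that involves the new node to zero."""
    W, b, v, c = unpack(theta, n_h)
    W2 = np.vstack([W, np.zeros((1, 2))])
    b2 = np.append(b, 0.0)
    v2 = np.append(v, 0.0)
    return np.concatenate([W2.ravel(), b2, v2, [c]])

rng = np.random.default_rng(0)
n_h = 2
theta = rng.normal(size=4 * n_h + 1)      # arbitrary point in the small network
theta_big = embed(theta, n_h)

g_small = num_grad(theta, n_h)
gW, gb, gv, gc = unpack(num_grad(theta_big, n_h + 1), n_h + 1)

# Gradient components along the new node's weights, and along shared weights.
new_node_grad = np.concatenate([gW[-1], [gb[-1], gv[-1]]])
shared_grad = np.concatenate([gW[:-1].ravel(), gb[:-1], gv[:-1], [gc]])

print("loss difference:", abs(loss(theta_big, n_h + 1) - loss(theta, n_h)))
print("max |grad| on new node:", np.max(np.abs(new_node_grad)))
print("max shared-grad mismatch:", np.max(np.abs(shared_grad - g_small)))
```

The check is run at an arbitrary point rather than at a converged minimum, because the gradient identity holds everywhere on the embedded subspace in this sketch. It relies on tanh(0) = 0; other activations or output layers may need a separate argument.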

