Deep Learning via Neural Energy Descent
This paper proposes Neural Energy Descent (NED), a method based on neural network evolution equations for a wide class of deep learning problems. We show that deep learning can be reformulated as the evolution of network parameters under an evolution equation, and that the steady-state solution of this partial differential equation (PDE) provides a solution to the deep learning problem. The equation corresponds to the gradient descent flow of a variational problem, so the proposed time-dependent PDE solves an energy minimization problem to obtain a global minimizer of deep learning. This gives a novel interpretation of, and solution to, deep learning optimization. The computational cost of the proposed energy descent method can be reduced by randomly sampling the spatial domain of the PDE, leading to an efficient NED. Numerical examples demonstrate the numerical advantage of NED over stochastic gradient descent (SGD).
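As a rough illustration of the gradient-flow viewpoint described above (not the paper's actual NED algorithm), the sketch below discretizes a gradient flow d&theta;/dt = &minus;&nabla;E(&theta;) with forward Euler and approximates the energy integral by random samples of the spatial domain. The toy fitting problem, the network architecture, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target to fit on the spatial domain [0, 1] (an assumption,
# standing in for a generic deep learning problem).
def target(x):
    return np.sin(2 * np.pi * x)

# A tiny one-hidden-layer network; theta packs (W1, b1, w2).
def model(theta, x):
    W1, b1, w2 = theta[:8], theta[8:16], theta[16:]
    h = np.tanh(np.outer(x, W1) + b1)  # (n, 8) hidden activations
    return h @ w2                      # (n,) network output

# Energy functional E(theta) = ∫ |u_theta(x) - f(x)|^2 dx,
# approximated by Monte Carlo samples of the spatial domain.
def energy(theta, x):
    r = model(theta, x) - target(x)
    return np.mean(r ** 2)

# Central finite-difference gradient of the sampled energy
# (kept simple here; automatic differentiation would be used in practice).
def grad(theta, x, eps=1e-6):
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (energy(theta + e, x) - energy(theta - e, x)) / (2 * eps)
    return g

# Forward-Euler discretization of the gradient flow
# d(theta)/dt = -grad E(theta); each step resamples the domain,
# which is where the random spatial sampling enters.
theta = rng.normal(scale=0.5, size=24)
theta0 = theta.copy()  # kept for comparison
dt = 0.1
for step in range(1000):
    x = rng.uniform(0.0, 1.0, size=64)  # random spatial samples
    theta = theta - dt * grad(theta, x)
```

The evolution stops, in principle, when the flow reaches a steady state, i.e. when the sampled gradient of the energy vanishes.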