Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with ℓ_1 and ℓ_2 Regularization

11/19/2017
by Zhifeng Kong, et al.

In this paper, we extend the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output. We consider two popular regularization terms, the ℓ_1 and ℓ_2 norms of the parameter vector w, and add each to the squared loss function with coefficient λ/2. We prove that when λ is small, the weight vector w converges to the optimal solution ŵ (with respect to the new loss function) with probability ≥ (1-ε)(1-A_d)/2 under random initialization in a sphere centered at the origin, where ε is a small value and A_d is a constant. Numerical experiments, including phase diagrams and repeated simulations, verify our theory.
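To make the setting concrete, below is a minimal simulation sketch (not the authors' code) of the dynamics described above: gradient descent on the squared loss of a single-ReLU student network f(x; w) = ReLU(w·x) with an ℓ_2 penalty of coefficient λ/2, starting from a random initialization in a ball centered at the origin. The teacher vector, Gaussian inputs, step size, and sample count are illustrative assumptions; the paper's analysis concerns the idealized dynamics, so this finite-sample loop is only a rough analogue.

```python
import numpy as np

# Minimal simulation sketch (not the authors' code) of the setting above:
# a bias-free student f(x; w) = ReLU(w . x) fit to a teacher ReLU(w* . x)
# under squared loss plus a (lambda/2) * ||w||_2 penalty. Dimensions,
# step size, sample count, and Gaussian inputs are illustrative assumptions.

rng = np.random.default_rng(0)

d = 10            # input dimension (assumed)
lam = 1e-3        # regularization coefficient lambda (assumed small)
eta = 0.05        # gradient-descent step size (assumed)
n = 2000          # samples used to approximate the expected squared loss

relu = lambda z: np.maximum(z, 0.0)

w_star = rng.standard_normal(d)          # teacher weights (assumed setup)
X = rng.standard_normal((n, d))          # inputs x ~ N(0, I) (assumed)
y = relu(X @ w_star)

# Random initialization drawn uniformly from a ball centered at the origin.
w = rng.standard_normal(d)
w *= rng.uniform() ** (1.0 / d) / np.linalg.norm(w)

for _ in range(500):
    z = X @ w
    residual = (relu(z) - y) * (z > 0.0)       # ReLU'(z) is the indicator z > 0
    grad = X.T @ residual / n                  # gradient of the squared loss
    grad += 0.5 * lam * w / np.linalg.norm(w)  # gradient of (lambda/2) * ||w||_2
    # For the l1 penalty, use the subgradient 0.5 * lam * np.sign(w) instead.
    w -= eta * grad

# For small lambda the regularized optimum stays close to the teacher,
# so the distance below should be small on successful runs.
print("distance to teacher:", np.linalg.norm(w - w_star))
```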
