A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training

03/24/2018
by   Tim Tsz-Kit Lau, et al.
0

Training deep neural networks (DNNs) efficiently is a challenge due to the associated highly nonconvex optimization. The backpropagation (backprop) algorithm has long been the most widely used algorithm for gradient computation of parameters of DNNs and is used along with gradient descent-type algorithms for this optimization task. Recent work have shown the efficiency of block coordinate descent (BCD) type methods empirically for training DNNs. In view of this, we propose a novel algorithm based on the BCD method for training DNNs and provide its global convergence results built upon the powerful framework of the Kurdyka-Lojasiewicz (KL) property. Numerical experiments on standard datasets demonstrate its competitive efficiency against standard optimizers with backprop.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/01/2018

Block Coordinate Descent for Deep Learning: Unified Convergence Guarantees

Training deep neural networks (DNNs) efficiently is a challenge due to t...
research
11/20/2017

Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

By lifting the ReLU function into a higher dimensional space, we develop...
research
08/30/2022

Convergence Rates of Training Deep Neural Networks via Alternating Minimization Methods

Training deep neural networks (DNNs) is an important and challenging opt...
research
05/12/2022

Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs

The optimization with orthogonality has been shown useful in training de...
research
03/18/2020

Block Layer Decomposition schemes for training Deep Neural Networks

Deep Feedforward Neural Networks' (DFNNs) weights estimation relies on t...
research
10/05/2016

A Novel Representation of Neural Networks

Deep Neural Networks (DNNs) have become very popular for prediction in m...
research
05/16/2016

Alternating optimization method based on nonnegative matrix factorizations for deep neural networks

The backpropagation algorithm for calculating gradients has been widely ...

Please sign up or login with your details

Forgot password? Click here to reset