Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks

11/20/2017
by Kensuke Nakamura, et al.

We present a stochastic first-order optimization algorithm, named BCSC, that adds a cyclic constraint to stochastic block-coordinate descent. It uses different subsets of the data to update different subsets of the parameters, thus limiting the detrimental effect of outliers in the training set. Empirical tests on benchmark datasets show that our algorithm outperforms state-of-the-art optimization methods in both accuracy and convergence speed. The improvements are consistent across different architectures, and the method can be combined with other training techniques and regularization methods.
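The core idea can be illustrated with a minimal sketch: partition the parameters and the data into the same number of blocks, and at each epoch pair each parameter block with a different data subset, rotating the pairing cyclically so that every block eventually sees all the data. The example below applies this scheme to a simple least-squares problem; the block count `B`, learning rate, and pairing schedule are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression problem: y = X @ w_true + noise.
n, d = 200, 8
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

B = 4  # number of parameter/data blocks (assumed, for illustration)
param_blocks = np.array_split(np.arange(d), B)  # disjoint parameter blocks
data_blocks = np.array_split(np.arange(n), B)   # disjoint data subsets
w = np.zeros(d)
lr = 0.1

for epoch in range(400):
    shift = epoch % B  # cyclic rotation of the block/data pairing
    for b, p_idx in enumerate(param_blocks):
        # Parameter block b is updated using data subset (b + shift) mod B,
        # so different parameter blocks see different data at each epoch.
        s_idx = data_blocks[(b + shift) % B]
        Xs, ys = X[s_idx], y[s_idx]
        grad = Xs.T @ (Xs @ w - ys) / len(s_idx)  # gradient on this subset
        w[p_idx] -= lr * grad[p_idx]              # update only this block

print(np.max(np.abs(w - w_true)))
```

Because the pairing rotates each epoch, any single outlier-contaminated subset influences each parameter block only a fraction of the time, which is the intuition behind the robustness claim above.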
