BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning

11/19/2017
by Ziming Zhang, et al.

Understanding global optimality in deep learning (DL) has attracted increasing attention recently. Conventional DL solvers, however, have not been designed to seek such global optimality. In this paper we propose a novel approximation algorithm, BPGrad, for optimizing deep models globally via branch and pruning. BPGrad is built on the assumption of Lipschitz continuity in DL, which allows it to adaptively determine the step size for the current gradient given the history of previous updates; theoretically, no smaller step can achieve the global optimality. We prove that by repeating this branch-and-pruning procedure we can locate the global optimum within finitely many iterations. Empirically, we also develop an efficient BPGrad-based solver for DL, which outperforms conventional solvers such as Adagrad, Adadelta, RMSProp, and Adam on object recognition, detection, and segmentation tasks.
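To make the core idea concrete, here is a minimal sketch of a Lipschitz-guided gradient update in the spirit the abstract describes. It is an illustrative assumption, not the paper's actual BPGrad algorithm: the function name `bpgrad_sketch` and the parameters `f_lower` (an estimated lower bound on the objective) and `rho` (a safety factor) are hypothetical. The intuition is that if f is L-Lipschitz, then f cannot drop to the lower bound within a distance smaller than (f(x) - f_lower)/L of the current point, so a step of roughly that length cannot skip past the global optimum, and the region it excludes can be pruned from the search.

```python
import numpy as np

def bpgrad_sketch(f, grad, x0, L, f_lower=0.0, rho=0.9, iters=100):
    """Illustrative Lipschitz-guided gradient method (hypothetical names;
    the real BPGrad algorithm differs in details).

    By L-Lipschitz continuity, f(x) >= f(x_t) - L * ||x - x_t||, so no
    point closer than (f(x_t) - f_lower) / L to x_t can attain the lower
    bound. The step length is chosen from that gap ("branch"); regions
    the bound rules out need not be searched ("pruning").
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = np.asarray(grad(x), dtype=float)
        gnorm = np.linalg.norm(g)
        if gnorm == 0.0:
            break  # stationary point reached
        # step length proportional to the gap to the estimated lower bound
        eta = rho * (f(x) - f_lower) / (L * gnorm)
        x = x - eta * g
    return x
```

For example, minimizing f(x) = |x| (Lipschitz constant L = 1, true minimum 0) from x0 = 4.0 contracts the iterate geometrically toward the global optimum, since each step covers a fixed fraction of the remaining gap to the lower bound.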


