Faster Convergence & Generalization in DNNs

07/30/2018
by Gaurav Singh, et al.

Deep neural networks have gained tremendous popularity in the last few years and have been applied to classification tasks in almost every domain. Despite this success, deep networks can be incredibly slow to train, even for moderately sized models on sufficiently large datasets. Additionally, these networks require large amounts of data to generalize well. The importance of speeding up convergence and improving generalization in deep networks cannot be overstated. In this work, we develop an optimization algorithm based on generalized-optimal updates derived from minibatches, which lead to faster convergence. Finally, we demonstrate on two benchmark datasets that the proposed method achieves a two-orders-of-magnitude speedup over traditional back-propagation and is more robust to noise and over-fitting.
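The abstract does not spell out how the generalized-optimal updates are computed, so the sketch below shows only the conventional minibatch gradient step that such a method would replace, as a point of reference. It is a minimal, self-contained NumPy baseline; the function minibatch_sgd, the logistic-regression model, and all hyperparameter values are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

# Minimal minibatch SGD baseline (logistic regression).
# The paper's "generalized-optimal updates" would replace the plain
# gradient step marked below; the abstract gives no formula, so this
# only illustrates the conventional update it is compared against.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def minibatch_sgd(X, y, lr=0.1, batch_size=32, epochs=10, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the logistic loss on this minibatch.
            grad = Xb.T @ (sigmoid(Xb @ w) - yb) / len(idx)
            w -= lr * grad  # <- the step a minibatch-derived update rule would generalize
    return w

# Toy usage on synthetic linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
true_w = rng.normal(size=5)
y = (X @ true_w > 0).astype(float)
w = minibatch_sgd(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"train accuracy: {acc:.3f}")
```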


Related research

03/19/2022
Deep Learning Generalization, Extrapolation, and Over-parameterization
We study the generalization of over-parameterized deep networks (for ima...

02/20/2020
An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks
We present an approach to obtain convergence guarantees of optimization ...

06/16/2017
A Closer Look at Memorization in Deep Networks
We examine the role of memorization in deep learning, drawing connection...

08/22/2019
Automated Architecture Design for Deep Neural Networks
Machine learning has made tremendous progress in recent years and receiv...
06/03/2018
Minnorm training: an algorithm for training over-parameterized deep neural networks
In this work, we propose a new training method for finding minimum weigh...

01/09/2019
Generalized Deduplication: Bounds, Convergence, and Asymptotic Properties
We study a generalization of deduplication, which enables lossless dedup...
