Cockpit: A Practical Debugging Tool for Training Deep Neural Networks

02/12/2021
by Frank Schneider, et al.

When engineers train deep learning models, they are very much "flying blind". Commonly used approaches for real-time training diagnostics, such as monitoring the train/test loss, are limited. Assessing a network's training process solely through these performance indicators is akin to debugging software without access to internal states through a debugger. To address this, we present Cockpit, a collection of instruments that enable a closer look into the inner workings of a learning machine, and a more informative and meaningful status report for practitioners. It facilitates the identification of learning phases and failure modes, such as ill-chosen hyperparameters. These instruments leverage novel higher-order information about the gradient distribution and curvature, which has only recently become efficiently accessible. We believe that such a debugging tool, which we open-source for PyTorch, is an important step toward better troubleshooting of the training process, revealing new insights, and helping develop novel methods and heuristics.

research
06/16/2020

Gradient Amplification: An efficient way to train deep neural networks

Improving performance of deep learning models and reducing their trainin...
research
09/23/2020

ANNdotNET – deep learning tool on .NET Platform

ANNdotNET is an open source project for deep learning written in C# with...
research
11/05/2020

Teaching with Commentaries

Effective training of deep neural networks can be challenging, and there...
research
07/08/2020

Distributed Training of Deep Learning Models: A Taxonomic Perspective

Distributed deep learning systems (DDLS) train deep neural network model...
research
03/04/2021

Clusterability in Neural Networks

The learned weights of a neural network have often been considered devoi...
research
09/03/2019

LCA: Loss Change Allocation for Neural Network Training

Neural networks enjoy widespread use, but many aspects of their training...
