Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints

05/12/2019
by Mengtian Li, et al.

In most practical settings and theoretical analyses, one assumes that a model can be trained until convergence. However, the growing complexity of machine learning datasets and models may violate such assumptions. Moreover, current approaches for hyper-parameter tuning and neural architecture search tend to be limited by practical resource constraints. We therefore introduce a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: "given a dataset, algorithm, and resource budget, what is the best achievable performance?" We focus on the number of optimization iterations as the representative resource. Under such a setting, we show that it is critical to adjust the learning rate schedule according to the given budget. Among budget-aware learning rate schedules, we find simple linear decay to be both robust and high-performing. We support our claim through extensive experiments with state-of-the-art models on ImageNet (image classification), Cityscapes (semantic segmentation), MS COCO (object detection and instance segmentation), and Kinetics (video classification). Analyzing our results, we find that the key to a good schedule is budgeted convergence, a phenomenon whereby the gradient vanishes at the end of each allowed budget. Finally, we revisit existing approaches for fast convergence and show that budget-aware learning rate schedules readily outperform them in this practical but under-explored budgeted setting.
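The sketch below illustrates the kind of budget-aware linear decay schedule the abstract describes: the learning rate is annealed linearly to zero exactly at an assumed iteration budget T. It uses PyTorch's LambdaLR with a placeholder model, optimizer, and budget, and is not the authors' released implementation.

```python
import torch

# A minimal sketch (assumed setup, not the paper's official code):
# budget-aware linear decay over a fixed iteration budget T.
model = torch.nn.Linear(10, 2)                       # stand-in model for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
T = 1000                                             # assumed total iteration budget

# lr(t) = lr_0 * (1 - t / T): the learning rate vanishes at the end of the budget.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda t: 1.0 - t / T
)

for t in range(T):
    x = torch.randn(32, 10)                          # dummy batch
    loss = model(x).pow(2).mean()                    # dummy objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                 # advance the linear schedule each iteration
```

Because the schedule is parameterized by the budget T itself, changing the budget rescales the entire decay rather than truncating a fixed schedule, which is the distinction the paper draws between budget-aware and conventional schedules.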


Related research

REX: Revisiting Budgeted Training with an Improved Schedule (07/09/2021)
Deep learning practitioners often operate on a computational and monetar...

Resource-Aware Pareto-Optimal Automated Machine Learning Platform (10/30/2020)
In this study, we introduce a novel platform Resource-Aware AutoML (RA-A...

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure (04/29/2019)
There is a stark disparity between the step size schedules used in pract...

Automated Learning Rate Scheduler for Large-batch Training (07/13/2021)
Large-batch training has been essential in leveraging large-scale datase...

Taming VAEs (10/01/2018)
In spite of remarkable progress in deep latent variable generative model...

Exploiting Invariance in Training Deep Neural Networks (03/30/2021)
Inspired by two basic mechanisms in animal visual systems, we introduce ...

Inverse Classification with Limited Budget and Maximum Number of Perturbed Samples (09/29/2020)
Most recent machine learning research focuses on developing new classifi...
