AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

03/15/2022
by Krishnateja Killamsetty, et al.

Deep neural networks have seen great success in recent years; however, training a deep model is often challenging because its performance depends heavily on the hyper-parameters used. Moreover, finding the optimal hyper-parameter configuration, even with state-of-the-art (SOTA) hyper-parameter optimization (HPO) algorithms, can be time-consuming, since each candidate configuration requires a training run over the entire dataset. Our central insight is that using an informative subset of the dataset for the model training runs involved in hyper-parameter optimization allows us to find the optimal configuration significantly faster. In this work, we propose AUTOMATA, a gradient-based subset selection framework for hyper-parameter tuning. We empirically evaluate the effectiveness of AUTOMATA through several experiments on real-world datasets in the text, vision, and tabular domains. Our experiments show that tuning on gradient-based data subsets yields significantly faster turnaround times, with speedups of 3×-30×, while the resulting hyper-parameters perform comparably to those found using the entire dataset.
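To make the idea concrete, the sketch below shows gradient-based subset selection used inside a hyper-parameter search. It is a minimal illustration, not the authors' AUTOMATA implementation: the toy logistic-regression task, the greedy gradient-matching heuristic, and the helper names (`greedy_gradient_subset`, `run_trial`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data standing in for a real training set.
X = rng.normal(size=(2000, 20))
w_true = rng.normal(size=20)
y = (X @ w_true + 0.5 * rng.normal(size=2000) > 0).astype(float)


def per_example_grads(w, X, y):
    """Per-example logistic-loss gradients, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
    return (p - y)[:, None] * X


def greedy_gradient_subset(X, y, w, k):
    """Greedily pick k points whose mean gradient tracks the full-data mean gradient.

    A crude stand-in for the gradient-matching subset selection the paper builds on.
    """
    g = per_example_grads(w, X, y)
    target = g.mean(axis=0)
    chosen, acc = [], np.zeros_like(target)
    for _ in range(k):
        # Score each point by how well it moves the subset mean toward the target.
        resid = target - acc / max(len(chosen), 1)
        scores = g @ resid
        scores[chosen] = -np.inf  # never pick the same point twice
        i = int(np.argmax(scores))
        chosen.append(i)
        acc += g[i]
    return np.array(chosen)


def run_trial(X, y, lr, epochs=50):
    """Train logistic regression with full-batch gradient descent; return accuracy."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * per_example_grads(w, X, y).mean(axis=0)
    return float(((X @ w > 0) == y).mean())


# Select a ~5% subset once (warm-started at w = 0 for simplicity) ...
idx = greedy_gradient_subset(X, y, np.zeros(X.shape[1]), k=100)
Xs, ys = X[idx], y[idx]

# ... then evaluate every candidate hyper-parameter on the subset only.
candidates = [1e-3, 1e-2, 1e-1, 1.0]
best_lr = max(candidates, key=lambda lr: run_trial(Xs, ys, lr))
print("best learning rate found on the subset:", best_lr)

# Only the final model is trained on the full dataset.
print("full-data accuracy with that setting:", run_trial(X, y, best_lr, epochs=200))
```

The point of the sketch is the cost structure: every hyper-parameter trial trains only on the small, gradient-selected subset, and a single full-data training run happens at the end with the chosen configuration, which is where the reported speedups come from.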


Related research

03/04/2020 - On Hyper-parameter Tuning for Stochastic Optimization Algorithms
  This paper proposes the first-ever algorithmic framework for tuning hype...
01/30/2023 - MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning
  Training deep networks and tuning hyperparameters on large datasets is c...
08/04/2023 - On stable wrapper-based parameter selection method for efficient ANN-based data-driven modeling of turbulent flows
  To model complex turbulent flow and heat transfer phenomena, this study ...
09/10/2022 - Simple and Effective Gradient-Based Tuning of Sequence-to-Sequence Models
  Recent trends towards training ever-larger language models have substant...
06/24/2012 - Practical recommendations for gradient-based training of deep architectures
  Learning algorithms related to artificial neural networks and in particu...
03/13/2014 - The Potential Benefits of Filtering Versus Hyper-Parameter Optimization
  The quality of an induced model by a learning algorithm is dependent on ...
12/27/2021 - Automatic Configuration for Optimal Communication Scheduling in DNN Training
  ByteScheduler partitions and rearranges tensor transmissions to improve ...
