DeepAI AI Chat
Log In Sign Up

MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning

01/30/2023
by   Kirshnateja Killamsetty, et al.
ibm
The University of Texas at Dallas
0

Training deep networks and tuning hyperparameters on large datasets is computationally intensive. One of the primary research directions for efficient training is to reduce training costs by selecting well-generalizable subsets of training data. Compared to simple adaptive random subset selection baselines, existing intelligent subset selection approaches are not competitive due to the time-consuming subset selection step, which involves computing model-dependent gradients and feature embeddings and applies greedy maximization of submodular objectives. Our key insight is that removing the reliance on downstream model parameters enables subset selection as a pre-processing step and enables one to train multiple models at no additional cost. In this work, we propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training while enabling superior model convergence and performance by using an easy-to-hard curriculum. Our empirical results indicate that MILO can train models 3× - 10 × faster and tune hyperparameters 20× - 75 × faster than full-dataset training or tuning without compromising performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

03/15/2022

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

Deep neural networks have seen great success in recent years; however, t...
05/28/2023

Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning

Methods for carefully selecting or generating a small set of training da...
06/20/2023

GIO: Gradient Information Optimization for Training Dataset Selection

It is often advantageous to train models on a subset of the available tr...
02/22/2022

Submodlib: A Submodular Optimization Library

Submodular functions are a special class of set functions which naturall...
12/10/2019

What is the best predictor that you can compute in five minutes using a given Bayesian hierarchical model?

The goal of this paper is to provide a way for statisticians to answer t...
04/28/2021

Finding High-Value Training Data Subset through Differentiable Convex Programming

Finding valuable training data points for deep neural networks has been ...
04/26/2021

Balancing Constraints and Submodularity in Data Subset Selection

Deep learning has yielded extraordinary results in vision and natural la...