Accelerating Deep Learning with Dynamic Data Pruning

11/24/2021
by Ravi S Raju, et al.

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing systems to train state-of-the-art networks. A large body of research has been devoted to addressing the cost per iteration of training through various model compression techniques like pruning and quantization. Less effort has been spent targeting the number of iterations. Previous work, such as forget scores and GraNd/EL2N scores, addresses this problem by identifying important samples within a full dataset and pruning the remaining samples, thereby reducing the iterations per epoch. Although these methods decrease the training time, they rely on expensive static scoring algorithms that run prior to training; once the scoring cost is accounted for, the total run time often increases. In this work, we address this shortcoming with dynamic data pruning algorithms. Surprisingly, we find that uniform random dynamic pruning can outperform the prior work at aggressive pruning rates. We attribute this to the existence of "sometimes" samples: points that are important to the learned decision boundary during only part of training. To better exploit the subtlety of sometimes samples, we propose two algorithms, based on reinforcement learning techniques, that dynamically prune samples and achieve even higher accuracy than the random dynamic method. We test all of our methods against a full-dataset baseline and the prior work on CIFAR-10 and CIFAR-100, and we can reduce the training time by up to 2x without significant performance loss. Our results suggest that data pruning should be understood as a dynamic process closely tied to a model's training trajectory, rather than as a static step based on the dataset alone.
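To make the idea of dynamic pruning concrete, here is a minimal sketch of the uniform random variant described in the abstract: a fresh random subset of the training set is drawn at every epoch, and only that subset is iterated over. This is an illustration of the idea, not the authors' implementation; the stand-in model, optimizer, pruning rate, and epoch budget are placeholders.

```python
# Minimal sketch of uniform random dynamic data pruning: re-draw the kept
# subset every epoch and train only on it. Not the authors' implementation;
# the model, optimizer, pruning rate, and epoch count are placeholders.
import numpy as np
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10))  # stand-in model
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

keep_fraction = 0.5                       # train on 50% of the data each epoch
num_keep = int(keep_fraction * len(train_set))
num_epochs = 30                           # placeholder budget

for epoch in range(num_epochs):
    # A fresh uniform-random subset per epoch lets "sometimes" samples
    # rotate in and out of the training set.
    kept = np.random.choice(len(train_set), size=num_keep, replace=False)
    loader = DataLoader(Subset(train_set, kept.tolist()),
                        batch_size=128, shuffle=True)

    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```

The abstract does not detail the two reinforcement-learning-based algorithms. Purely as a hypothetical illustration of how a score-driven dynamic policy could exploit "sometimes" samples, the sketch below keeps mostly high-scoring samples while reserving an epsilon fraction of the budget for random exploration; the scoring rule (a running estimate of each sample's loss) is an assumed choice, not necessarily the authors' criterion.

```python
# Hypothetical sketch of an epsilon-greedy dynamic pruning policy: exploit
# samples with high running loss, but explore a random fraction so
# low-scoring ("sometimes") samples can re-enter training.
import numpy as np

class EpsilonGreedySelector:
    def __init__(self, num_samples, epsilon=0.1, momentum=0.9):
        self.scores = np.zeros(num_samples)   # running per-sample loss estimates
        self.epsilon = epsilon
        self.momentum = momentum

    def update(self, indices, losses):
        # Exponential moving average of the loss for each sample seen this batch.
        self.scores[indices] = (self.momentum * self.scores[indices]
                                + (1 - self.momentum) * losses)

    def select(self, num_keep):
        # Split the per-epoch budget into an exploit part and an explore part.
        num_explore = int(self.epsilon * num_keep)
        exploit = np.argsort(self.scores)[-(num_keep - num_explore):]
        remaining = np.setdiff1d(np.arange(len(self.scores)), exploit)
        explore = np.random.choice(remaining, size=num_explore, replace=False)
        return np.concatenate([exploit, explore])
```

In use, `select(num_keep)` would replace the `np.random.choice` call in the loop above, and `update` would be called after each batch with that batch's sample indices and per-sample losses, which requires a dataset wrapper that also returns indices.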
