DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size

06/14/2016
by Maya Kabkab, et al.

Large-scale supervised classification algorithms, especially those based on deep convolutional neural networks (DCNNs), require vast amounts of training data to achieve state-of-the-art performance. Decreasing this data requirement would significantly speed up the training process and possibly improve generalization. Motivated by this objective, we consider the task of adaptively finding concise training subsets which will be iteratively presented to the learner. We use convex optimization methods, based on an objective criterion and feedback from the current performance of the classifier, to efficiently identify informative samples to train on. We propose an algorithm to decompose the optimization problem into smaller per-class problems, which can be solved in parallel. We test our approach on standard classification tasks and demonstrate its effectiveness in decreasing the training set size without compromising performance. We also show that our approach can make the classifier more robust in the presence of label noise and class imbalance.
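To make the per-class decomposition concrete, here is a minimal sketch of selecting a small training subset independently for each class, using feedback from the current classifier. It is not the authors' method: the abstract does not state the objective criterion, so the convex program is replaced here by a plain heuristic (keep the `k` highest-loss samples per class). The function name `select_per_class` and the parameters `losses`, `labels`, and `k` are hypothetical names introduced for illustration.

```python
import numpy as np


def select_per_class(losses, labels, k):
    """Return sorted indices of the k highest-loss samples in each class.

    ASSUMPTION: top-k by loss is a stand-in heuristic, not the convex
    objective described in the abstract (which is not given there).
    """
    losses = np.asarray(losses, dtype=float)
    labels = np.asarray(labels)
    chosen = []
    # Each class is handled independently, so these iterations could be
    # run in parallel, mirroring the per-class decomposition idea.
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Stable sort on negated loss: highest loss first, ties keep
        # their original order.
        order = np.argsort(-losses[idx], kind="stable")
        chosen.extend(idx[order[:k]].tolist())
    return sorted(chosen)


# Hypothetical per-sample losses reported by the current classifier.
losses = [0.1, 0.9, 0.5, 0.2, 0.8, 0.3]
labels = [0, 0, 0, 1, 1, 1]
subset = select_per_class(losses, labels, k=2)
print(subset)  # [1, 2, 4, 5]
```

In an iterative setting, one would presumably retrain on the selected subset, recompute the losses, and select again; the loop over classes is the part that can be parallelized.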

research
12/13/2018

Training Set Camouflage

We introduce a form of steganography in the domain of machine learning w...
research
06/01/2019

Robust Learning Under Label Noise With Iterative Noise-Filtering

We consider the problem of training a model under the presence of label ...
research
06/19/2023

AdaSelection: Accelerating Deep Learning Training through Data Subsampling

In this paper, we introduce AdaSelection, an adaptive sub-sampling metho...
research
02/17/2020

Subset Sampling For Progressive Neural Network Learning

Progressive Neural Network Learning is a class of algorithms that increm...
research
03/28/2023

Data Efficient Contrastive Learning in Histopathology using Active Sampling

Deep Learning based diagnostics systems can provide accurate and robust ...
research
11/18/2018

DeepConsensus: using the consensus of features from multiple layers to attain robust image classification

We consider a classifier whose test set is exposed to various perturbati...
research
01/24/2018

Training Set Debugging Using Trusted Items

Training set bugs are flaws in the data that adversely affect machine le...
