DeepAI AI Chat
Log In Sign Up

When does loss-based prioritization fail?

by   Niel Teng Hu, et al.

Not all examples are created equal, but standard deep neural network training protocols treat each training point uniformly. Each example is propagated forward and backward through the network the same amount of times, independent of how much the example contributes to the learning protocol. Recent work has proposed ways to accelerate training by deviating from this uniform treatment. Popular methods entail up-weighting examples that contribute more to the loss with the intuition that examples with low loss have already been learned by the model, so their marginal value to the training procedure should be lower. This view assumes that updating the model with high loss examples will be beneficial to the model. However, this may not hold for noisy, real world data. In this paper, we theorize and then empirically demonstrate that loss-based acceleration methods degrade in scenarios with noisy and corrupted data. Our work suggests measures of example difficulty need to correctly separate out noise from other types of challenging examples.


Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond

Many learning tasks in machine learning can be viewed as taking a gradie...

BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining

Neural network robustness has become a central topic in machine learning...

Regularly Truncated M-estimators for Learning with Noisy Labels

The sample selection approach is very popular in learning with noisy lab...

Learning with Instance-Dependent Label Noise: A Sample Sieve Approach

Human-annotated labels are often prone to noise, and the presence of suc...

Graph Learning with Loss-Guided Training

Classically, ML models trained with stochastic gradient descent (SGD) ar...

Accelerating Deep Learning by Focusing on the Biggest Losers

This paper introduces Selective-Backprop, a technique that accelerates t...