Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance

02/23/2020
by Matthew Almeida, et al.

The study of model bias and variance with respect to decision boundaries is critically important in supervised classification. There is generally a tradeoff between the two: fine-tuning a classification model's decision boundary to accommodate more boundary training samples (i.e., higher model complexity) may improve training accuracy (i.e., lower bias) but hurt generalization to unseen data (i.e., higher variance). When attention is restricted to boundary fine-tuning and model complexity, it is difficult to reduce both bias and variance. To overcome this dilemma, we take a different perspective and investigate a new approach that handles inaccuracy and uncertainty in the training data labels, which are inevitable in many applications where labels are conceptual and labeling is performed by human annotators. Label uncertainty can undermine the classification process: extending a boundary to accommodate an inaccurately labeled point increases both bias and variance. Our novel method reduces both by estimating the pointwise label uncertainty of the training set and adjusting the training sample weights accordingly, so that samples with high uncertainty are weighted down and those with low uncertainty are weighted up. In this way, uncertain samples contribute less to the objective function of the model's learning algorithm and exert less pull on the decision boundary. In a case study on real-world physical activity recognition data, which presents many labeling challenges, we show that this new approach improves model performance and reduces model variance.
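A minimal sketch of this reweighting idea follows. The abstract does not specify the paper's uncertainty estimator, so the preliminary classifier, the entropy-based uncertainty proxy, and the linear weight mapping below are illustrative assumptions, not the authors' method; the weights are then passed to a second, weighted training pass:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def entropy_uncertainty(probs, eps=1e-12):
        """Per-sample predictive entropy, normalized to [0, 1]."""
        h = -np.sum(probs * np.log(probs + eps), axis=1)
        return h / np.log(probs.shape[1])

    # Synthetic stand-in for training features and (possibly noisy) labels.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

    # 1. Fit a preliminary model and use its predictive entropy as a
    #    stand-in for pointwise label uncertainty (an assumption here).
    prelim = LogisticRegression().fit(X, y)
    u = entropy_uncertainty(prelim.predict_proba(X))

    # 2. Weight down high-uncertainty samples, weight up low-uncertainty ones.
    w = 1.0 - u
    w *= len(w) / w.sum()  # keep the total weight equal to the sample count

    # 3. Retrain with sample weights so uncertain points contribute less to
    #    the objective and exert less pull on the decision boundary.
    model = LogisticRegression().fit(X, y, sample_weight=w)

Normalizing the weights so they sum to the sample count keeps the effective training set size, and hence the effective regularization strength, comparable to unweighted training.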

