AutoBalance: Optimized Loss Functions for Imbalanced Data

01/04/2022
by Mingchen Li, et al.

Imbalanced datasets are commonplace in modern machine learning problems. The presence of under-represented classes or groups with sensitive attributes results in concerns about generalization and fairness. Such concerns are further exacerbated by the fact that large-capacity deep nets can perfectly fit the training data and appear to achieve perfect accuracy and fairness during training, but perform poorly at test time. To address these challenges, we propose AutoBalance, a bi-level optimization framework that automatically designs a training loss function to optimize a blend of accuracy and fairness-seeking objectives. Specifically, a lower-level problem trains the model weights, and an upper-level problem tunes the loss function by monitoring and optimizing the desired objective over the validation data. Our loss design enables personalized treatment for classes/groups by employing a parametric cross-entropy loss and individualized data augmentation schemes. We evaluate the benefits and performance of our approach for the application scenarios of imbalanced and group-sensitive classification. Extensive empirical evaluations demonstrate the benefits of AutoBalance over state-of-the-art approaches. Our experimental findings are complemented with theoretical insights on loss function design and the benefits of the train-validation split. All code is available open-source.
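
To make the abstract's two main ingredients concrete, here is a minimal PyTorch sketch, not the authors' released code: a parametric cross-entropy loss with learnable per-class multiplicative and additive logit adjustments, and a one-step differentiable approximation of the bi-level update in which the loss parameters are tuned against a validation objective evaluated after a simulated training step. The names ParametricCE, log_mult, add_bias, upper_level_step, and inner_lr, as well as the use of plain cross-entropy on the validation split as the upper-level objective, are illustrative assumptions rather than the paper's exact parameterization or hypergradient scheme.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0


class ParametricCE(torch.nn.Module):
    """Cross-entropy with learnable per-class logit adjustments:
    multiplicative via exp(log_mult), additive via add_bias."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.log_mult = torch.nn.Parameter(torch.zeros(num_classes))
        self.add_bias = torch.nn.Parameter(torch.zeros(num_classes))

    def forward(self, logits, targets):
        adjusted = logits * torch.exp(self.log_mult) + self.add_bias
        return F.cross_entropy(adjusted, targets)


def upper_level_step(model, loss_fn, opt_loss, train_batch, val_batch, inner_lr=0.1):
    """Tune the loss parameters: simulate one SGD step of the lower-level
    (weight-training) problem, then backprop the validation objective
    through that simulated step into the loss parameters."""
    x_tr, y_tr = train_batch
    x_va, y_va = val_batch

    # Differentiable lower-level step: w' = w - inner_lr * grad_w L_train(w; loss params).
    params = dict(model.named_parameters())
    train_loss = loss_fn(functional_call(model, params, (x_tr,)), y_tr)
    grads = torch.autograd.grad(train_loss, list(params.values()), create_graph=True)
    updated = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}

    # Upper-level objective on held-out validation data, evaluated at the
    # simulated weights; plain CE stands in for a balanced-error/fairness proxy.
    val_loss = F.cross_entropy(functional_call(model, updated, (x_va,)), y_va)
    opt_loss.zero_grad()
    val_loss.backward()  # also leaves gradients on the model's weights; a full
                         # implementation would clear them before the next weight update
    opt_loss.step()
```

In an alternating schedule, the model weights would be trained on the training split with loss_fn in the usual way (the lower-level problem), and upper_level_step would be called periodically on validation batches to adjust the per-class loss parameters, mirroring the train-validation split the abstract emphasizes.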


