Supercharging Imbalanced Data Learning With Causal Representation Transfer

11/25/2020
by   Junya Chen, et al.

Dealing with severe class imbalance poses a major challenge for real-world applications, especially when accurate classification and generalization of minority classes are of primary interest. In computer vision, learning from long-tailed datasets is a recurring theme, especially for natural image datasets. Existing solutions mostly appeal to sampling or weighting adjustments to alleviate the pathological imbalance, or impose inductive biases to prioritize non-spurious associations. We instead take a novel perspective, promoting sample efficiency and model generalization based on the invariance principles of causality. Our proposal posits a meta-distributional scenario in which the data-generating mechanism is invariant across the label-conditional feature distributions. Such a causal assumption enables efficient knowledge transfer from the dominant classes to their under-represented counterparts, even if the respective feature distributions show apparent disparities. This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes. Our development is orthogonal to existing extreme classification techniques and can thus be seamlessly integrated with them. The utility of our proposal is validated on an extensive set of synthetic and real-world computer vision tasks against state-of-the-art (SOTA) solutions.
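
For context on the "weighting adjustments" that the abstract contrasts itself with, the following is a minimal sketch of inverse-frequency class weighting, a common baseline for imbalanced data. It is not the causal method proposed in the paper. The function name `inverse_frequency_weights` and the toy labels are assumptions made up for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Hypothetical helper: weight each class by n_samples / (n_classes * count).

    Rare classes get larger weights, and the weights average to 1 over
    the samples, so the overall loss scale is preserved.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


# Toy long-tailed label set (illustrative only): 90 majority, 10 minority samples.
labels = ["cat"] * 90 + ["lynx"] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

In this toy setting the minority class receives a weight nine times larger than the majority class. Such weights would typically be passed to a per-sample or per-class loss term during training.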

