General Greedy De-bias Learning

by Xinzhe Han, et al.

Neural networks often make predictions by relying on spurious correlations in the datasets rather than on the intrinsic properties of the task of interest, and so degrade sharply on out-of-distribution (OOD) test data. Existing de-bias learning frameworks that capture specific dataset biases through bias annotations fail to handle complicated OOD scenarios. Others implicitly identify the dataset bias through a specially designed low-capability biased model or loss, but they degrade when the training and testing data are from the same distribution. In this paper, we propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, like gradient descent in functional space. It encourages the base model to focus on examples that are hard to solve with biased models, so that it remains robust against spurious correlations at test time. GGD substantially improves models' OOD generalization ability on various tasks, but it sometimes over-estimates the bias level and degrades on the in-distribution test. We further re-analyze the ensemble process of GGD and, inspired by curriculum learning, introduce Curriculum Regularization into GGD, which achieves a good trade-off between in-distribution and out-of-distribution performance. Extensive experiments on image classification, adversarial question answering, and visual question answering demonstrate the effectiveness of our method. GGD can learn a more robust base model both with task-specific biased models that use prior knowledge and with a self-ensemble biased model that uses no prior knowledge.
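
The idea of fitting the base model "like gradient descent in functional space" can be illustrated with a small sketch. It is not the authors' implementation: it assumes a softmax cross-entropy loss, for which the negative gradient with respect to the biased model's logits is the one-hot label minus the predicted probabilities. The function name `negative_functional_gradient` and the toy logits are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def negative_functional_gradient(biased_logits, labels, num_classes):
    # Assumption: cross-entropy loss on softmax outputs.
    # -dL/dlogits = one_hot(label) - softmax(logits)
    one_hot = np.eye(num_classes)[labels]
    return one_hot - softmax(biased_logits)

# Two toy examples, both with true class 0 (hypothetical values).
biased_logits = np.array([
    [4.0, 0.0, 0.0],  # biased model is confident and correct
    [0.0, 4.0, 0.0],  # biased model is confident and wrong
])
labels = np.array([0, 0])

residual = negative_functional_gradient(biased_logits, labels, num_classes=3)

# L1 size of the residual: small when the biased model already solves the
# example, large when it does not.
hardness = np.abs(residual).sum(axis=1)
```

Under these assumptions, a base model trained to follow `residual` receives almost no signal from the first example and a strong signal from the second, which matches the stated goal of focusing on examples that biased models find hard.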



Greedy Gradient Ensemble for Robust Visual Question Answering

Language bias is a critical issue in Visual Question Answering (VQA), wh...

Generative Bias for Visual Question Answering

The task of Visual Question Answering (VQA) is known to be plagued by th...

Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?

Can models with particular structure avoid being biased towards spurious...

Don't Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases

State-of-the-art models often make use of superficial patterns in the da...

Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles

Many datasets have been shown to contain incidental correlations created...

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

Many recent works indicate that the deep neural networks tend to take da...

Learning Robust Representation for Joint Grading of Ophthalmic Diseases via Adaptive Curriculum and Feature Disentanglement

Diabetic retinopathy (DR) and diabetic macular edema (DME) are leading c...

Code Repositories


Code release for General Greedy De-bias Learning
