General Greedy De-bias Learning

by Xinzhe Han, et al.

Neural networks often make predictions that rely on spurious correlations in the datasets rather than on the intrinsic properties of the task of interest, and so degrade sharply on out-of-distribution (OOD) test data. Some existing de-bias learning frameworks try to capture a specific dataset bias through bias annotations, but they fail to handle complicated OOD scenarios. Others identify the dataset bias implicitly, through a specially designed low-capacity biased model or loss, but they degrade when the training and testing data come from the same distribution. In this paper, we propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, in a manner analogous to gradient descent in functional space. It encourages the base model to focus on examples that are hard to solve with biased models, so that it remains robust against spurious correlations at test time. GGD largely improves models' OOD generalization ability on various tasks, but it sometimes over-estimates the bias level and degrades on the in-distribution test. We further re-analyze the ensemble process of GGD and, inspired by curriculum learning, introduce Curriculum Regularization into GGD, which achieves a good trade-off between in-distribution and out-of-distribution performance. Extensive experiments on image classification, adversarial question answering, and visual question answering demonstrate the effectiveness of our method. GGD can learn a more robust base model both with task-specific biased models built from prior knowledge and with a self-ensemble biased model that requires no prior knowledge.
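
The idea of focusing the base model on examples that biased models find hard can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a softmax cross-entropy loss (the abstract does not specify the loss), and the function names `negative_gradient` and `hardness` are hypothetical. For this loss, the negative gradient with respect to the biased model's logits is the one-hot label minus the predicted probabilities, so its true-class entry is small when the biased model already solves the example and large when it does not.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_gradient(biased_logits, label):
    """Negative gradient of cross-entropy w.r.t. the biased logits.

    Equals onehot(label) - softmax(biased_logits). In a functional
    gradient-descent view, this is the direction the next model is fit to
    (an assumption of this sketch, not a detail stated in the abstract).
    """
    probs = softmax(biased_logits)
    return [(1.0 if k == label else 0.0) - p for k, p in enumerate(probs)]


def hardness(biased_logits, label):
    """True-class residual, 1 - p_biased(label): near 0 if the biased
    model solves the example, near 1 if it fails."""
    return negative_gradient(biased_logits, label)[label]


# The biased model strongly predicts class 0 for both examples.
easy = hardness([4.0, 0.0, 0.0], label=0)  # bias agrees with the label
hard = hardness([4.0, 0.0, 0.0], label=1)  # bias contradicts the label
print(f"easy example residual: {easy:.3f}")
print(f"hard example residual: {hard:.3f}")
```

In this sketch, the example on which the biased model is confidently wrong yields a much larger residual than the one it already predicts correctly, which is the sense in which a base model fit to such residuals is pushed toward bias-conflicting examples.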




Greedy Gradient Ensemble for Robust Visual Question Answering

Language bias is a critical issue in Visual Question Answering (VQA), wh...

Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering

Extractive question answering (QA) models tend to exploit spurious corre...

Generative Bias for Visual Question Answering

The task of Visual Question Answering (VQA) is known to be plagued by th...

Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?

Can models with particular structure avoid being biased towards spurious...

Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles

Many datasets have been shown to contain incidental correlations created...

Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding

Dataset bias has attracted increasing attention recently for its detrime...

Learning Robust Representation for Joint Grading of Ophthalmic Diseases via Adaptive Curriculum and Feature Disentanglement

Diabetic retinopathy (DR) and diabetic macular edema (DME) are leading c...
