Not to Overfit or Underfit? A Study of Domain Generalization in Question Answering

05/15/2022
by Md Arafat Sultan, et al.

Machine learning models are prone to overfitting their source (training) distributions, which is commonly believed to be why they falter in novel target domains. Here we examine the contrasting view that multi-source domain generalization (DG) is in fact a problem of mitigating source domain underfitting: models not adequately learning the signal in their multi-domain training data. Experiments on a reading comprehension DG benchmark show that as a model gradually learns its source domains better – using known methods such as knowledge distillation from a larger model – its zero-shot out-of-domain accuracy improves at an even faster rate. Improved source domain learning also demonstrates superior generalization over three popular domain-invariant learning methods that aim to counter overfitting.

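The abstract points to knowledge distillation from a larger model as one concrete way to learn the source domains better. As a minimal sketch only, not the paper's actual training setup, the PyTorch snippet below shows the standard distillation objective of Hinton et al. (2015): a temperature-softened KL-divergence term against the teacher's predictions combined with ordinary cross-entropy on the gold labels. The function name and the `temperature` and `alpha` hyperparameters are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of (a) KL divergence between temperature-softened
    teacher and student distributions and (b) cross-entropy on gold labels.
    `temperature` and `alpha` are illustrative, not from the paper."""
    # Soften both distributions with the same temperature so the student
    # can learn from the teacher's full output distribution.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale the KL term by T^2 to keep gradient magnitudes comparable
    # across temperatures (Hinton et al., 2015).
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Hard-label term: the usual supervised loss on the source-domain data.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```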