Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization

06/15/2022
by Jivat Neet Kaur, et al.

Real-world data collected from multiple domains can exhibit several distinct distribution shifts over multiple attributes. However, state-of-the-art domain generalization (DG) algorithms focus only on specific shifts over a single attribute. We introduce datasets with multi-attribute distribution shifts and find that existing DG algorithms fail to generalize. To explain this, we use causal graphs to characterize the different types of shifts based on the relationship between spurious attributes and the classification label. Each multi-attribute causal graph entails different constraints over observed variables, and therefore any algorithm based on a single, fixed independence constraint cannot work well across all shifts. We present Causally Adaptive Constraint Minimization (CACM), a new algorithm for identifying the correct independence constraints for regularization. Results on fully synthetic, MNIST, and small NORB datasets, covering binary and multi-valued attributes and labels, confirm our theoretical claim: correct independence constraints lead to the highest accuracy on unseen domains, whereas incorrect constraints fail to do so. Our results demonstrate the importance of modeling the causal relationships inherent in the data-generating process: in many cases, it is impossible to know the correct regularization constraints without this information.
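The claim that one fixed independence constraint cannot suit every shift can be illustrated with a small numerical sketch. The code below is not the CACM algorithm and is not taken from the paper; it is a toy setup under stated assumptions. The helper `mean_gap_penalty`, the 90% label-attribute agreement rate, and the noise scale are all hypothetical choices made for illustration. The penalty is a simple gap between group means, used here as a stand-in for an independence regularizer. The example assumes a graph in which the label causes the attribute, so a feature that depends only on the label is independent of the attribute given the label, but not marginally.

```python
import numpy as np

def mean_gap_penalty(features, attr, cond=None):
    """Hypothetical penalty: sum of squared gaps between the feature means
    of each pair of attribute groups, computed within each stratum of `cond`.
    With cond=None it targets the unconditional constraint (features vs. attr);
    with cond=labels it targets the conditional one (features vs. attr given label).
    """
    features = np.asarray(features, dtype=float)
    attr = np.asarray(attr)
    if cond is None:
        cond = np.zeros(len(attr), dtype=int)  # treat all samples as one stratum
    cond = np.asarray(cond)
    total = 0.0
    for c in np.unique(cond):
        in_stratum = cond == c
        # Mean feature vector for every attribute value present in this stratum.
        means = [features[in_stratum & (attr == a)].mean(axis=0)
                 for a in np.unique(attr[in_stratum])]
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                total += float(np.sum((means[i] - means[j]) ** 2))
    return total

# Toy data (assumed setup): the label y causes the attribute a.
rng = np.random.default_rng(0)
n = 20000
y = rng.integers(0, 2, size=n)
flip = rng.random(n) < 0.1           # attribute disagrees with the label 10% of the time
a = np.where(flip, 1 - y, y)
# A feature that depends only on the label, so it is a "good" feature here.
x = (y + 0.1 * rng.standard_normal(n)).reshape(-1, 1)

unconditional = mean_gap_penalty(x, a)          # large: a is correlated with y
conditional = mean_gap_penalty(x, a, cond=y)    # near zero: x is independent of a given y
print(f"unconditional penalty: {unconditional:.4f}")
print(f"conditional penalty:   {conditional:.6f}")
```

In this toy setup the unconditional penalty is large even though the feature uses only label information, so a regularizer built on it would push the model away from a useful feature. The conditional penalty is close to zero for the same feature. Under a different causal graph the appropriate constraint could differ, which is the point the abstract makes about needing knowledge of the data-generating process.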


