Generalization and Invariances in the Presence of Unobserved Confounding
The ability to extrapolate, or generalize, from observed to new related environments is central to any form of reliable machine learning, yet most methods fail when moving beyond i.i.d. data. In some cases, the reason lies in a misappreciation of the causal structure that governs the observed data; in others, it is unobserved data, such as hidden confounders, that drive changes in observed distributions and distort observed correlations. In this paper, we argue that generalization must be defined with respect to a broader class of distribution shifts, irrespective of their origin (whether arising from changes in observed, unobserved, or target variables). We propose a new learning principle from which we may expect an explicit notion of generalization to certain new environments, even in the presence of hidden confounding. This principle leads us to formulate a general objective that may be paired with any gradient-based learning algorithm, yielding algorithms that have a causal interpretation in some cases and enjoy notions of predictive stability in others. We demonstrate the empirical performance of our approach on healthcare data from different modalities, including image and speech data.
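The abstract states that the proposed objective can be paired with any gradient-based learning algorithm to encourage stability across environments. The sketch below is an illustrative assumption only, not the paper's objective: it shows one generic way such a pairing can look, minimizing the worst-case squared error across two hypothetical training environments for a linear predictor with plain gradient descent.

```python
# Minimal sketch (assumed setup, not the authors' method): pair a
# robustness-over-environments objective with gradient-based learning.
import numpy as np

rng = np.random.default_rng(0)

def make_env(shift, n=500):
    # Hypothetical environment with a shifted input distribution.
    X = rng.normal(loc=shift, scale=1.0, size=(n, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + rng.normal(scale=0.1, size=n)
    return X, y

envs = [make_env(0.0), make_env(2.0)]

def env_loss(w, X, y):
    r = X @ w - y
    return np.mean(r ** 2)

def env_grad(w, X, y):
    r = X @ w - y
    return 2.0 * X.T @ r / len(y)

w = np.zeros(3)
lr = 0.05
for step in range(2000):
    # Simple minimax heuristic: take a gradient step on the environment
    # with the highest current loss.
    losses = [env_loss(w, X, y) for X, y in envs]
    X, y = envs[int(np.argmax(losses))]
    w -= lr * env_grad(w, X, y)

print("learned weights:", w)
print("per-environment losses:", [round(env_loss(w, X, y), 4) for X, y in envs])
```

Any other differentiable model and optimizer could be substituted in this template; the point is only that the environment-level objective sits on top of an ordinary gradient-based training loop.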