Causal Discovery with Unobserved Variables: A Proxy Variable Approach
Discovering causal relations from observational data is important, but unobserved variables, such as latent confounders or mediators, can mislead causal identification. To address this issue, proximal causal discovery methods have been proposed that adjust for the resulting bias using a proxy of the unobserved variable. However, these methods assume the data are discrete, which limits their real-world applicability. In this paper, we propose a proximal causal discovery method that handles continuous variables. Our key observation is that discretizing continuous variables can introduce serious errors and compromise the power of the proxy. Therefore, to use proxy variables in the continuous case, the critical step is to control the discretization error. To this end, we identify mild regularity conditions on the conditional distributions under which the discretization error can be made arbitrarily small, provided the proxy is discretized into a finite number of sufficiently fine bins. Based on this, we design a proxy-based hypothesis test for identifying causal relationships in the presence of unobserved variables. Our test is consistent, meaning its power approaches one as the sample size grows. We demonstrate the effectiveness of our method on synthetic and real-world data.
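The role of discretization error can be illustrated with a small simulation. The sketch below is not the proposed hypothesis test. It is a toy setup in which every modeling choice is an assumption made for illustration: a linear-Gaussian model, a noise scale of 0.1 on the proxy, quantile bins, and the hypothetical helper names `simulate` and `residual_correlation`. Here X and Y share an unobserved cause U but do not affect each other, and W is a continuous proxy of U. After W is binned and bin means are removed, the correlation left between X and Y is large for coarse bins and shrinks as the bins get finer.

```python
import numpy as np


def residual_correlation(x, y, w, n_bins):
    """Pooled within-bin correlation of x and y after quantile-binning w."""
    # Interior quantile edges give bins with roughly equal sample counts.
    edges = np.quantile(w, np.linspace(0, 1, n_bins + 1)[1:-1])
    labels = np.digitize(w, edges)  # integer labels in 0 .. n_bins - 1
    xr = np.asarray(x, dtype=float).copy()
    yr = np.asarray(y, dtype=float).copy()
    for b in range(n_bins):
        mask = labels == b
        if mask.any():
            # Remove the bin mean, i.e. condition on the discretized proxy.
            xr[mask] -= xr[mask].mean()
            yr[mask] -= yr[mask].mean()
    return float(np.corrcoef(xr, yr)[0, 1])


def simulate(n=20000, seed=0):
    """Toy model (assumed for illustration): U -> X, U -> Y, U -> W, no X -> Y edge."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)            # unobserved common cause
    x = u + rng.normal(size=n)
    y = u + rng.normal(size=n)        # y does not depend on x
    w = u + 0.1 * rng.normal(size=n)  # continuous proxy of u
    return x, y, w


x, y, w = simulate()
results = {k: residual_correlation(x, y, w, k) for k in (2, 5, 20, 50)}
for k, r in results.items():
    print(f"{k:>3} bins: residual corr = {r:.3f}")
```

In this toy model the residual correlation does not vanish even with very fine bins, because W is only a noisy measurement of U. Simply conditioning on a binned proxy is therefore not a substitute for a test that accounts for the proxy structure.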