Distributed Design for Causal Inferences on Big Observational Data

03/15/2022
by Yumin Zhang, et al.

A fundamental issue in causal inference for Big Observational Data is confounding due to covariate imbalances between treatment groups. This can be addressed by designing the data prior to analysis. Existing design methods, developed for traditional observational studies with a single designer, can yield unsatisfactory designs with suboptimal covariate balance for Big Observational Data because they cannot accommodate the massive dimensionality, heterogeneity, and volume of such data. We propose a new framework for the distributed design of Big Observational Data among collaborative designers. Our framework first assigns subsets of the high-dimensional and heterogeneous covariates to multiple designers. The designers then summarize their covariates into lower-dimensional quantities, share their summaries with the other designers, and design the study in parallel based on their assigned covariates and the summaries they receive. The final design is selected by comparing balance measures for all covariates across the candidate designs. We perform simulation studies and analyze datasets from the 2016 Atlantic Causal Inference Conference Data Challenge to demonstrate the flexibility and power of our framework for constructing designs with good covariate balance from Big Observational Data.
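The workflow described in the abstract (assign covariate blocks, summarize, share, design in parallel, select by balance) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the summary is the score on each block's first principal component, each designer's design is 1:1 nearest-neighbour matching with replacement, the balance measure is the maximum absolute standardized mean difference over all covariates, and the data are simulated. The function names are hypothetical.

```python
import numpy as np

def summarize(block):
    """Assumed summary: score on the block's first principal component."""
    centered = block - block.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

def match_controls(features, z):
    """Assumed design step: 1:1 nearest-neighbour matching with replacement.
    Returns one matched control index per treated unit."""
    f = (features - features.mean(axis=0)) / features.std(axis=0)
    treated, controls = np.flatnonzero(z == 1), np.flatnonzero(z == 0)
    # Squared Euclidean distance between every treated and control unit.
    d = ((f[treated, None, :] - f[None, controls, :]) ** 2).sum(axis=2)
    return controls[d.argmin(axis=1)]

def max_abs_smd(X, z, matched_controls):
    """Assumed balance measure: largest absolute standardized mean difference
    over ALL covariates, scaled by the pre-matching pooled standard deviation."""
    treated = X[z == 1]
    sd = np.sqrt((treated.var(axis=0, ddof=1) + X[z == 0].var(axis=0, ddof=1)) / 2)
    diff = treated.mean(axis=0) - X[matched_controls].mean(axis=0)
    return np.abs(diff / sd).max()

def distributed_design(X, z, blocks):
    """Each designer k uses its own covariate block plus the summaries it
    receives from the others; the best-balanced candidate is returned."""
    summaries = [summarize(X[:, b]) for b in blocks]
    candidates = []
    for k, b in enumerate(blocks):
        received = [s for j, s in enumerate(summaries) if j != k]
        features = np.column_stack([X[:, b]] + received)
        matched = match_controls(features, z)
        candidates.append((max_abs_smd(X, z, matched), k, matched))
    return min(candidates, key=lambda c: c[0])

# Simulated data (illustrative only): treatment depends on covariates 0 and 3.
rng = np.random.default_rng(0)
n, p = 400, 6
X = rng.normal(size=(n, p))
z = (rng.random(n) < 1 / (1 + np.exp(-X[:, 0] - 0.5 * X[:, 3]))).astype(int)
blocks = [[0, 1, 2], [3, 4, 5]]  # two hypothetical designers

best_balance, best_designer, matched = distributed_design(X, z, blocks)
print(f"designer {best_designer} selected, max |SMD| = {best_balance:.3f}")
```

In this sketch the loop over designers runs sequentially; because each iteration depends only on the shared summaries, the candidate designs could be computed in parallel, which is the point of distributing the work.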


