A Joint Bayesian Framework for Causal Inference and Bipartite Matching for Record Linkage

02/21/2020
by Sharmistha Guha, et al.

The recent proliferation of digital health data has opened possibilities for gathering information on a common set of entities from various government and non-government sources and for making causal inferences about important health outcomes. In such scenarios, the response may be obtained from a source different from the one that supplies the treatment assignment and covariates. In the absence of error-free direct identifiers (e.g., SSN), straightforward merging of separate files based on these identifiers is not feasible, giving rise to the need for matching on imperfect linking variables (e.g., names, birth years). Causal inference in such situations generally follows a two-stage procedure: the first stage links the two files using a probabilistic linkage technique with imperfect linking variables common to both files, and the second stage performs causal inference on the linked dataset. Rather than performing record linkage and causal inference sequentially, this article proposes a novel framework for simultaneous Bayesian inference on probabilistic linkage and the causal effect. In contrast with the two-stage approach, our proposed methodology facilitates borrowing of information between the models employed for causal inference and record linkage, thus improving accuracy of inference in both models. Importantly, the joint modeling framework offers characterization of uncertainty in both causal inference and record linkage. An efficient computational template using Markov chain Monte Carlo (MCMC) is developed for the joint model. Simulation studies and real data analysis provide evidence of both improved accuracy in estimates of treatment effects and more accurate linking of the two files in the joint modeling framework, relative to the two-stage modeling option. The conclusion is further supported by theoretical insights presented in this article.
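
To make the two-stage baseline described above concrete, here is a minimal Python sketch. It is not the authors' joint Bayesian model. The toy records, the `agreement` score, the brute-force one-to-one matching, and the difference-in-means estimator are all assumptions chosen for illustration. File A carries the treatment and linking variables, and file B carries the response.

```python
from difflib import SequenceMatcher
from itertools import permutations

# Toy data, entirely made up for illustration.
file_a = [  # treatment assignment plus imperfect linking variables
    {"name": "ana lopez", "birth_year": 1980, "treated": 1},
    {"name": "ben smith", "birth_year": 1975, "treated": 0},
    {"name": "cara jones", "birth_year": 1990, "treated": 1},
    {"name": "dev patel", "birth_year": 1962, "treated": 0},
]
file_b = [  # response plus the same linking variables, recorded with errors
    {"name": "ben smyth", "birth_year": 1975, "response": 2.0},
    {"name": "dev patel", "birth_year": 1962, "response": 1.0},
    {"name": "anna lopez", "birth_year": 1980, "response": 5.0},
    {"name": "cara jones", "birth_year": 1991, "response": 4.0},
]


def agreement(a, b):
    """Hypothetical similarity score: name similarity plus exact birth-year match."""
    name_score = SequenceMatcher(None, a["name"], b["name"]).ratio()
    year_score = 1.0 if a["birth_year"] == b["birth_year"] else 0.0
    return name_score + year_score


def link_files(a_records, b_records):
    """Stage 1: one-to-one (bipartite) matching that maximizes total agreement.

    Brute force over permutations is only viable for tiny files.
    """
    n = len(a_records)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(
            agreement(a_records[i], b_records[perm[i]]) for i in range(n)
        ),
    )
    return list(best)


def difference_in_means(a_records, b_records, links):
    """Stage 2: naive treatment-effect estimate on the linked data."""
    treated = [b_records[j]["response"]
               for i, j in enumerate(links) if a_records[i]["treated"] == 1]
    control = [b_records[j]["response"]
               for i, j in enumerate(links) if a_records[i]["treated"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


links = link_files(file_a, file_b)   # links[i] = index in file_b matched to file_a[i]
effect = difference_in_means(file_a, file_b, links)
print(links, effect)
```

The second stage treats the stage-one links as if they were known with certainty, so any linkage error passes silently into the effect estimate. That limitation is what the joint framework in the abstract is meant to address, by inferring the links and the causal effect together.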

research
08/10/2023

Bayesian Record Linkage with Variables in One File

In many healthcare and social science applications, information about un...
research
08/04/2017

Exploiting Redundancy, Recurrence and Parallelism: How to Link Millions of Addresses with Ten Lines of Code in Ten Minutes

Accurate and efficient record linkage is an open challenge of particular...
research
01/15/2019

Assessing the accuracy of record linkages with Markov chain based Monte Carlo simulation approach

Record linkage is the process of finding matches and linking records fro...
research
03/09/2020

Fast Bayesian Record Linkage With Record-Specific Disagreement Parameters

Applied researchers are often interested in linking individuals between ...
research
03/12/2020

Improved assessment of the accuracy of record linkage via an extended MaCSim approach

Record linkage is the process of bringing together the same entity from ...
research
09/24/2020

Latent Causal Socioeconomic Health Index

This research develops a model-based LAtent Causal Socioeconomic Health ...
research
03/12/2020

Extending the MaCSim approach using similarity weight matrix to assess the accuracy of record linkage

Record linkage is the process of bringing together the same entity from ...
