Assessing the accuracy of record linkages with Markov chain based Monte Carlo simulation approach

01/15/2019
by Shovanur Haque, et al.

Record linkage is the process of finding matches and linking records from different data sources so that the linked records belong to the same entity. Statistical, health, government and business organisations increasingly use record linkage to combine administrative, survey, population census and other files into a fuller set of information for more comprehensive analysis. Despite this growth, there has been little work on developing tools to assess the quality of linked files. Ensuring that the matched records in the combined file actually correspond to the same individual or entity is crucial for the validity of any analyses and inferences based on the combined data. This paper proposes a Markov chain based Monte Carlo simulation method for assessing the accuracy of a linked file and illustrates the utility of the approach using ABS (Australian Bureau of Statistics) synthetic data in realistic data settings. In the linking process, different blocking strategies are considered to distinguish matches from non-matches with different levels of accuracy. To assess the average accuracy of linking, the proportion of correct links is investigated for each record. Test results show strong performance of the proposed method for assessing linkage accuracy.
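To make the idea of a per-record "correctly linked proportion" concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes the true links are known (as with synthetic data) and replaces the Markov chain based simulation with a plain independent Monte Carlo draw. The function names, the `error_rate` parameter and the toy record identifiers are all hypothetical and introduced only for illustration.

```python
import random


def simulate_linkage(true_links, error_rate, rng):
    """Draw one simulated linkage (hypothetical error model).

    Each record in file A keeps its true partner in file B with
    probability 1 - error_rate; otherwise it is linked to a randomly
    chosen wrong record. This is an illustrative assumption, not the
    paper's Markov chain.
    """
    b_ids = list(true_links.values())
    linked = {}
    for a_id, b_true in true_links.items():
        wrong = [b for b in b_ids if b != b_true]
        if wrong and rng.random() < error_rate:
            linked[a_id] = rng.choice(wrong)
        else:
            linked[a_id] = b_true
    return linked


def correctly_linked_proportions(true_links, n_sims, error_rate, seed=0):
    """For each file-A record, return the share of simulated linkages
    in which it was linked to its true partner."""
    rng = random.Random(seed)
    hits = {a_id: 0 for a_id in true_links}
    for _ in range(n_sims):
        linked = simulate_linkage(true_links, error_rate, rng)
        for a_id, b_id in linked.items():
            if b_id == true_links[a_id]:
                hits[a_id] += 1
    return {a_id: hits[a_id] / n_sims for a_id in true_links}


if __name__ == "__main__":
    # Toy data: three file-A records with known true partners in file B.
    truth = {"a1": "b1", "a2": "b2", "a3": "b3"}
    props = correctly_linked_proportions(truth, n_sims=1000, error_rate=0.1)
    for a_id, p in props.items():
        print(a_id, round(p, 3))
    print("average accuracy:", round(sum(props.values()) / len(props), 3))
```

Averaging the per-record proportions gives a single summary of linkage accuracy, while the individual values show which records are linked less reliably.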

