Linking Administrative Data: An Evolutionary Schema

12/20/2017
by Jack Lothian, et al.
Statistics New Zealand (Stats NZ) has committed unreservedly to an administrative-data-first policy. All new methods used at Stats NZ are therefore to be viewed within this context, and discussing strategies for using administrative data is an integral part of every working day. As statistical methodologists, the three authors were drawn into these discussions. Like most methodologists, the authors see surveys and the publication of their results as a process in which estimation is the key tool for achieving the final goal of an accurate statistical output. Randomness and sampling exist to support this goal, and early on it was clear to us that the incoming it-is-what-it-is data sources were not randomly selected. These sources were obviously biased and thus would produce biased estimates, so we set out to design a strategy to deal with this issue. This led us to the concept of representativeness, which is closely related to statistical bias but has a wider context, invoking both randomness and judgement. The representativeness issue was the principal question that we set out to answer. The necessary components that we gathered for our solution are summarized in the paper.

Keywords: Representativeness, Timeline Databases, Statistical Registers, Estimation

