Revisiting Identifying Assumptions for Population Size Estimation

01/22/2021
by Serge Aleshin-Guendel, et al.

The problem of estimating the size of a population based on a subset of individuals observed across multiple data sources is often referred to as capture-recapture or multiple-systems estimation. This is fundamentally a missing data problem, where the number of unobserved individuals represents the missing data. As with any missing data problem, multiple-systems estimation requires users to make an untestable identifying assumption in order to estimate the population size from the observed data. Approaches to multiple-systems estimation often do not emphasize the role of the identifying assumption during model specification, which makes it difficult to decouple the specification of the model for the observed data from the identifying assumption. We present a re-framing of the multiple-systems estimation problem that decouples the specification of the observed-data model from the identifying assumptions, and discuss how log-linear models and the associated no-highest-order interaction assumption fit into this framing. We present an approach to computation in the Bayesian setting which takes advantage of existing software and facilitates various sensitivity analyses. We demonstrate our approach in a case study of estimating the number of civilian casualties in the Kosovo war. Code used to produce this manuscript is available at https://github.com/aleshing/revisiting-identifying-assumptions.
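
To make the role of an identifying assumption concrete, the following is a minimal sketch of the classical plug-in calculation implied by the no-highest-order interaction assumption in a saturated log-linear model. It is not the Bayesian approach presented in the manuscript. The function name `nhoi_population_estimate` and the counts in the usage line are hypothetical, chosen only for illustration. Under this assumption, the count for the unobserved cell equals the product of the observed counts for capture histories with an odd number of captures, divided by the product of the counts for histories with an even number of captures. With two lists this reduces to the familiar n00 = n10 * n01 / n11.

```python
from itertools import product
from math import prod


def nhoi_population_estimate(counts):
    """Plug-in estimate under the no-highest-order interaction assumption.

    counts: dict mapping each observed capture history (a tuple of 0/1,
    one entry per list) to its observed count. The all-zero history is
    the unobserved cell and must not be included.
    Returns (estimated unobserved count, estimated population size).
    """
    n_lists = len(next(iter(counts)))
    odd_cells, even_cells = [], []
    observed_total = 0
    for history in product((0, 1), repeat=n_lists):
        if not any(history):
            continue  # skip the unobserved all-zero cell
        n = counts.get(history, 0)
        observed_total += n
        # Histories with an odd number of captures go in the numerator,
        # those with an even number of captures go in the denominator.
        if sum(history) % 2 == 1:
            odd_cells.append(n)
        else:
            even_cells.append(n)
    if 0 in even_cells:
        raise ValueError("estimate undefined: an even-order cell count is zero")
    missing = prod(odd_cells) / prod(even_cells)
    return missing, observed_total + missing


# Hypothetical two-list counts: 30 * 40 / 20 = 60 unobserved, 150 in total.
missing, total = nhoi_population_estimate({(1, 1): 20, (1, 0): 30, (0, 1): 40})
```

The estimate depends entirely on the assumption: the observed counts alone cannot confirm or refute it, which is why a different identifying assumption applied to the same data can give a different population size.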
