Constructing synthetic populations in the age of big data

by M. A. Nicolaie et al.

Developing public health intervention models with microsimulation requires extensive personal information about inhabitants, such as socio-demographic, economic and health data. Such data must remain confidential, yet any data used for simulation should still support realistic scenarios. Data of this kind can be collected only in secured environments and are not directly available to external microsimulation models. The aim of this paper is to illustrate a method for constructing synthetic data by predicting individual features through models based on confidential data on the health and socio-economic determinants of the entire Dutch population.


