Simulation Framework for Realistic Large-scale Individual-level Health Data Generation

08/31/2020
by Santtu Tikka, et al.

We propose a general framework for realistic data generation and simulation of complex systems in the health domain. The main use cases of the framework are predicting the development of risk factors and disease occurrence, evaluating the impact of interventions and policy decisions, and developing statistical methods. We present the fundamentals of the framework using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R uses efficient data structures, parallel computing, and fast random number generation, which together ensure reproducibility and scalability. With the framework, it is possible to run daily-level simulations for populations of millions of individuals over decades of simulated time. An example on the occurrence of stroke, type 2 diabetes, and mortality illustrates the use of the framework in the Finnish context. In the example, we demonstrate the data-collection functionality by studying the impact of non-participation on the estimated risk models.
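To make the data-collection idea concrete, the following is a minimal sketch of a seeded daily-level simulation followed by a survey with non-participation. It is written in Python for illustration and is not the API of the R implementation; the function names, the single binary risk factor, and all probabilities are assumptions chosen for the example, not values from the paper.

```python
import random


def simulate_population(n_people, n_days, seed):
    """Simulate a daily-level binary event per individual (hypothetical model)."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    people = []
    for i in range(n_people):
        smoker = rng.random() < 0.3  # assumed risk-factor prevalence
        daily_risk = 0.0004 if smoker else 0.0001  # assumed daily event probabilities
        event_day = None
        for day in range(n_days):
            if rng.random() < daily_risk:
                event_day = day  # record the first day the event occurs
                break
        people.append({"id": i, "smoker": smoker, "event_day": event_day})
    return people


def collect_sample(people, seed):
    """Mimic a survey in which smokers participate less often (assumed rates)."""
    rng = random.Random(seed)
    return [p for p in people if rng.random() < (0.4 if p["smoker"] else 0.8)]


def event_proportion(people):
    """Share of individuals who experienced the event during follow-up."""
    return sum(p["event_day"] is not None for p in people) / len(people)


population = simulate_population(n_people=2000, n_days=365, seed=1)
sample = collect_sample(population, seed=2)
print(event_proportion(population), event_proportion(sample))
```

Because participation depends on the same risk factor that drives the event, the proportion estimated from the sample will generally differ from the proportion in the full simulated population. This is the kind of non-participation effect that the example in the abstract studies.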

