Recently, we have shown that the age-specific prevalence of a health state or disease can be related to the transition rates in the illness-death model (IDM) via a partial differential equation (PDE) [2, 3]. In case of a chronic disease, this relation can be used to estimate the incidence from a sequence of cross-sectional studies if information about mortality is available [4, 1].
In this paper, we demonstrate that it is also possible to estimate excess mortality from prevalence and incidence of a chronic disease, which can be useful for the analysis of data from disease registers or health insurance claims. For this, we examine the relations of the illness-death model and associated PDEs. In this context, we derive a new PDE which generalises the PDE of Brunet and Struchiner [6]. In a simulation study, the new PDE is used to demonstrate how the excess mortality can be estimated directly. Furthermore, we present an estimation method in the framework of Bayesian statistics. In an application of the Bayesian approach, we estimate the excess mortality of diabetes from claims data comprising 70 million Germans.
Illness-death model and associated partial differential equations
We consider the illness-death model for chronic (i.e., irreversible) diseases shown in Figure 1. The considered population is split into the relevant disease states Healthy ($S$) and Ill ($C$). From either state, people can transit into the state Dead ($D$). The transition rates between the three states are the incidence rate ($i$), the mortality rate of the healthy ($m_0$) and the mortality rate of the diseased ($m_1$). These rates depend on the calendar time $t$ and on the age $a$. Additionally, the mortality rate $m_1$ depends on the duration $d$ of the disease.
Let $S(t, a)$ and $C(t, a, d)$ denote the numbers of people in the respective states. To be more specific, $S(t, a)$ is the number of healthy people aged $a$ at time $t$, and $C(t, a, d)$ is the number of diseased people aged $a$ at time $t$ who have been diseased for the duration $d$. We assume that the considered population is sufficiently large that $S$ and $C$ can be considered as smooth functions. The total number of subjects aged $a$ at time $t$ who have the chronic disease is
$$C^*(t, a) = \int_0^a C(t, a, \delta)\,\mathrm{d}\delta.$$
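Let us furthermore assume that the considered population is closed, i.e., there is no migration, and that the disease is contracted after birth. The latter assumption implies $C(t, a, d) = 0$ for all $d \ge a$. Then, we can formulate the following equations for the rates of change of $S$ and $C$:
$$(\partial_t + \partial_a)\,S = -(m_0 + i)\,S \tag{1}$$
$$(\partial_t + \partial_a + \partial_d)\,C = -m_1\,C \tag{2}$$
where $\partial_x$ means the partial derivative with respect to $x$, i.e., $\partial_x = \partial / \partial x$ for $x \in \{t, a, d\}$. The system is complemented by the initial conditions $S(t, 0) = S_0(t)$ and $C(t, a, 0) = i(t, a)\,S(t, a)$.
The first initial condition represents the number of (disease-free) newborns $S_0(t)$, and the second initial condition describes the number of newly diseased persons, the incident cases.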
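In [2, 3], we have shown that the age-specific prevalence
$$p = \frac{C^*}{S + C^*}$$
is the solution of a scalar PDE that can be derived from the two-dimensional system (1)–(2). Here, we choose a different approach. Instead of considering the age-specific prevalence $p$, we follow the idea of Brunet and Struchiner [6] and examine the prevalence-odds
$$\pi = \frac{p}{1 - p} = \frac{C^*}{S}.$$
Using the notation $\partial = \partial_t + \partial_a$, we obtain
$$\partial \pi = \frac{S\,\partial C^* - C^*\,\partial S}{S^2} = i - m_1^*\,\pi + (m_0 + i)\,\pi.$$
For the second equality we used Eq. (1) and $\partial C^* = -m_1^*\,C^* + i\,S$, which has been proven in the Appendix of . The rate $m_1^*$ is defined as
$$m_1^*(t, a) = \frac{\int_0^a m_1(t, a, \delta)\,C(t, a, \delta)\,\mathrm{d}\delta}{C^*(t, a)}. \tag{3}$$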
The rate $m_1^*$ is the mortality rate of the diseased averaged over the distribution of the disease duration. It may be accessible in epidemiological surveys by choosing a sample population with a representative distribution of disease duration. However, in most practical cases, it is unknown because the duration distribution in Eq. (3) is not known.
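To illustrate the duration average in Eq. (3), the following minimal Python sketch computes a case-weighted mean of the duration-specific mortality of the diseased for one fixed age and calendar time. The function names, the trapezoidal rule and the toy numbers are assumptions chosen for illustration and are not taken from the analyses in this paper.

```python
def trapezoid(ys, xs):
    """Composite trapezoidal rule for samples ys on the grid xs."""
    return sum(0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))


def duration_averaged_mortality(durations, m1_by_duration, cases_by_duration):
    """Mean of m1 over disease duration, weighted by the number of cases.

    Approximates  int m1(d) C(d) dd / int C(d) dd  for a fixed age and time.
    """
    total_cases = trapezoid(cases_by_duration, durations)
    if total_cases <= 0.0:
        raise ValueError("no diseased persons in this age group")
    weighted = [m * c for m, c in zip(m1_by_duration, cases_by_duration)]
    return trapezoid(weighted, durations) / total_cases


if __name__ == "__main__":
    # Hypothetical toy data: mortality rises with duration, while most
    # cases have a short disease duration.
    durations = [0.0, 5.0, 10.0, 15.0, 20.0]
    m1 = [0.02, 0.03, 0.04, 0.05, 0.06]
    cases = [400.0, 300.0, 200.0, 80.0, 20.0]
    print(duration_averaged_mortality(durations, m1, cases))
```

The sketch makes explicit that two populations with the same duration-specific mortality can have different averaged rates if their duration distributions differ.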
Thus, we obtain the following linear scalar PDE
$$(\partial_t + \partial_a)\,\pi = i - \pi\,(m_1^* - m_0 - i), \tag{4}$$
which shows how the temporal change of the prevalence-odds is governed by the rates in the illness-death model in Figure 1 and the value of the prevalence-odds itself.
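With the excess mortality $\Delta m = m_1^* - m_0$, Eq. (4) is equivalent to
$$(\partial_t + \partial_a)\,\pi = i\,(1 + \pi) - \pi\,\Delta m.$$
For our purpose of estimating the excess mortality $\Delta m$, Eq. (4) is very useful, because it holds
$$\Delta m = \frac{i\,(1 + \pi) - (\partial_t + \partial_a)\,\pi}{\pi}. \tag{7}$$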
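An advantage of the approach of Brunet and Struchiner [6] lies in an explicit representation of the prevalence-odds $\pi$ in case the rates $i$, $m_0$ and $m_1$ are given. Starting from Eq. (1)–(2) combined with the initial conditions above, and integrating along the characteristics, we obtain the following equation by using calculus:
$$\pi(t, a) = \int_0^a i(t - \delta, a - \delta)\,\exp\!\left(-\int_0^{\delta} \big[m_1(t - \delta + \tau, a - \delta + \tau, \tau) - m_0(t - \delta + \tau, a - \delta + \tau) - i(t - \delta + \tau, a - \delta + \tau)\big]\,\mathrm{d}\tau\right)\mathrm{d}\delta. \tag{8}$$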
With $m_1(t, a, d) = m_1(t, a)$, i.e., if the mortality of the diseased does not depend on the disease duration, we see that Eq. (8) is a generalisation of Eq. (1) in [6]. One advantage of the explicit representation of $\pi$ in (8) is the possibility to (numerically) calculate $\pi$ with a prescribed accuracy, e.g. by Romberg integration, which we will use in the examples below. Numerical solutions of differential equations usually do not allow prescribed levels of accuracy.
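As an illustration of how such an explicit representation can be evaluated with Romberg integration, the following Python sketch may help. It assumes that the prevalence-odds are given by the nested integral $\pi(t,a) = \int_0^a i(t-\delta, a-\delta)\,\exp\big(-\int_0^\delta [m_1 - m_0 - i]\,\mathrm{d}\tau\big)\,\mathrm{d}\delta$, with the rates in the inner integral evaluated along the path of a person who became ill $\delta$ years ago. The hand-written `romberg` helper, the function names, the tolerances and the minimum number of refinement levels are assumptions for illustration only.

```python
import math


def romberg(f, lo, hi, tol=1e-10, max_levels=20, min_levels=4):
    """Romberg integration: trapezoidal sums plus Richardson extrapolation."""
    if hi == lo:
        return 0.0
    h = hi - lo
    prev = [0.5 * h * (f(lo) + f(hi))]
    for k in range(1, max_levels):
        h *= 0.5
        # Refine the trapezoidal sum using only the new midpoints.
        mid = sum(f(lo + (2 * j + 1) * h) for j in range(2 ** (k - 1)))
        row = [0.5 * prev[0] + h * mid]
        for j in range(1, k + 1):
            row.append(row[j - 1] + (row[j - 1] - prev[j - 1]) / (4 ** j - 1))
        # Stop once two successive extrapolated values agree within tol.
        if k >= min_levels and abs(row[-1] - prev[-1]) <= tol * max(1.0, abs(row[-1])):
            return row[-1]
        prev = row
    return prev[-1]


def prevalence_odds(t, a, incidence, m0, m1, tol=1e-9):
    """Prevalence-odds at time t and age a from the nested integral.

    incidence(t, a), m0(t, a) and m1(t, a, d) are user-supplied rate functions.
    """
    def outer(delta):
        # Person who contracted the disease delta years ago.
        t0, a0 = t - delta, a - delta

        def net_rate(tau):
            return (m1(t0 + tau, a0 + tau, tau)
                    - m0(t0 + tau, a0 + tau)
                    - incidence(t0 + tau, a0 + tau))

        return incidence(t0, a0) * math.exp(-romberg(net_rate, 0.0, delta, tol))

    return romberg(outer, 0.0, a, tol)


if __name__ == "__main__":
    # Hypothetical constant rates, for which the integral has a closed form.
    odds = prevalence_odds(2000.0, 50.0,
                           incidence=lambda t, a: 0.01,
                           m0=lambda t, a: 0.02,
                           m1=lambda t, a, d: 0.05)
    print(odds, 0.01 * (1.0 - math.exp(-1.0)) / 0.02)
```

For constant rates the result can be checked against the closed form $i\,(1 - e^{-ka})/k$ with $k = m_1 - m_0 - i$, which is what the small demonstration at the end does. Rates with kinks may need a larger `min_levels` to avoid premature termination.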
Examples and demonstration
Direct estimation of excess mortality
The first example is about a hypothetical chronic disease with all time-scales and playing a role. The incidence of the chronic disease is assumed to be , which implies that the disease affects only people aged 30 and older. The age-specific mortality rate of the non-diseased is chosen to be In addition, we assume that the mortality of the diseased can be written as a product of and a factor that depends only on the duration :
Except for the time trend in this example is the same as Simulation 2 in .
For the example, we mimic the situation of having three cross-sectional studies in three different years. We calculate the prevalence-odds $\pi$ for these years via Eq. (8). Figure 2 shows the prevalence-odds for the three years. Up to an age of about 70 years, the three prevalence-odds are virtually the same. For our example, we additionally assume that the age-specific incidence rate $i$ is available for one of these years. The aim is to estimate the excess mortality $\Delta m$ in that year.
The proposed method to estimate the excess mortality in that year is a direct application of Eq. (7). As stated above, the incidence for that year is assumed to be given. The partial derivative $(\partial_t + \partial_a)\,\pi$ is approximated by the following finite difference:
Then, the excess mortality $\Delta m$ can be estimated by plugging these numbers into Eq. (7). In case the mortality rate $m_0$ of the non-diseased is known, $\Delta m$ is often expressed in terms of the hazard ratio $\mathrm{HR} = m_1^* / m_0$, which can be obtained from
$$\mathrm{HR} = 1 + \frac{\Delta m}{m_0}.$$
The age-specific HR expresses the mortality rate of the diseased people relative to the non-diseased at the same age. For the hypothetical chronic disease we find the age-specific HR as in Figure 3. The age-specific HR is peaking between age and and falling with increasing age.
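The direct estimation can be sketched in a few lines of Python. The sketch assumes the relation $\Delta m = \big(i\,(1+\pi) - (\partial_t + \partial_a)\,\pi\big)/\pi$ and approximates the derivative by a central difference along the characteristic, i.e., from the prevalence-odds of the same birth cohort observed `step` years before and after the year of interest. This particular difference scheme, the function names and the constant-rate test case are assumptions for illustration and need not coincide with the finite difference used for the figures.

```python
import math


def excess_mortality(odds_before, odds_now, odds_after, incidence_now, step):
    """Estimate the excess mortality at time t and age a.

    odds_before   = prevalence-odds at (t - step, a - step)
    odds_now      = prevalence-odds at (t, a)
    odds_after    = prevalence-odds at (t + step, a + step)
    incidence_now = incidence rate at (t, a)
    """
    # Central difference for (d/dt + d/da) pi along the cohort's life line.
    d_odds = (odds_after - odds_before) / (2.0 * step)
    return (incidence_now * (1.0 + odds_now) - d_odds) / odds_now


def hazard_ratio(delta_m, m0):
    """Hazard ratio of the diseased versus the non-diseased."""
    return 1.0 + delta_m / m0


if __name__ == "__main__":
    # Hypothetical constant rates: i = 0.01, m0 = 0.02, m1 = 0.05.
    i, m0, m1 = 0.01, 0.02, 0.05
    k = m1 - m0 - i

    def odds(age):
        # Closed-form prevalence-odds for constant rates.
        return i * (1.0 - math.exp(-k * age)) / k

    dm = excess_mortality(odds(59.0), odds(60.0), odds(61.0), i, 1.0)
    print(dm, hazard_ratio(dm, m0))  # close to 0.03 and 2.5
```

In this constant-rate case the true excess mortality is $0.03$ and the true HR is $2.5$, so the output shows the size of the discretisation error of the central difference.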
If the mortality rate $m_0$ of the non-diseased population is not known, the mortality of the diseased can also be compared to the mortality rate $m$ of the general population. It holds $m = (1 - p)\,m_0 + p\,m_1^*$, and hence $m_1^* - m = (1 - p)\,\Delta m$. Usually, the mortality of the general population is accessible from vital statistics of the federal statistical offices.
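The link to the general population can be made concrete with a short Python sketch. It relies only on the fact that the general mortality is the prevalence-weighted mean of the mortality of the non-diseased and the diseased, $m = (1-p)\,m_0 + p\,m_1^*$, so that $m_0 = m - p\,\Delta m$. The function names and the numbers in the example are hypothetical.

```python
def healthy_mortality_from_general(m_general, prevalence, delta_m):
    """Mortality of the non-diseased from m = (1 - p) * m0 + p * (m0 + delta_m)."""
    return m_general - prevalence * delta_m


def hazard_ratio_from_general(m_general, prevalence, delta_m):
    """Hazard ratio diseased vs. non-diseased when only general mortality is known."""
    m0 = healthy_mortality_from_general(m_general, prevalence, delta_m)
    if m0 <= 0.0:
        raise ValueError("inputs imply a non-positive mortality of the non-diseased")
    return 1.0 + delta_m / m0


def excess_over_general(prevalence, delta_m):
    """Excess mortality of the diseased relative to the general population."""
    return (1.0 - prevalence) * delta_m


if __name__ == "__main__":
    # Hypothetical values: m0 = 0.01, m1* = 0.03, prevalence 20 %.
    p, m0, m1_star = 0.2, 0.01, 0.03
    m = (1.0 - p) * m0 + p * m1_star
    print(hazard_ratio_from_general(m, p, m1_star - m0))  # close to 3
    print(excess_over_general(p, m1_star - m0))           # close to m1* - m
```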
Bayes estimation of excess mortality
The second example is about claims data from Germany during the years 2009 to 2015. Goffrier and colleagues [9] reported the age-specific prevalence of diabetes in German men for two years of this period, as shown in Figure 4.
Based on the incidence rate ($i$) in 2012 (reported in Table 5 of ), our aim is to estimate the age-specific hazard ratio (HR) for the same year. For this, we use a Bayes approach. Motivated by empirical findings from the Danish diabetes register, we assume that the logarithm of the age-specific HR is approximately a straight line in the age range from 50 to 90 years (see Figure 5 in [7]). Thus, we model $\log \mathrm{HR}(a)$ as a linear function of the age $a$.
For the intercept and the slope of this line, we use weakly informative uniform prior distributions. In Bayes terminology, our aim is to estimate the a-posteriori distributions of these two parameters.
To this end, the PDE is transformed into an ordinary differential equation (ODE), which is then solved by the fourth-order Runge-Kutta method. The calculated prevalence in 2015 is then compared with the observed prevalence in 2015, shown as the blue line in Figure 4. Instead of the joint a-posteriori distribution, the log-likelihood of the deviation between observed and calculated prevalence is computed. The results are shown in Figure 5. The black cross indicates the maximum a-posteriori (MAP) estimate for these data.
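The general idea of this procedure can be sketched in Python as follows. The sketch is not the implementation used for Figure 5. It assumes the prevalence form of the PDE, $(\partial_t + \partial_a)\,p = (1-p)\,(i - p\,\Delta m)$, followed along the life line of a birth cohort, a log-linear hazard ratio $\log \mathrm{HR}(a) = \alpha + \beta a$, time-constant rates, a Gaussian likelihood with an arbitrary standard deviation, and a grid search with flat priors. All names, the grid and the synthetic rates are hypothetical.

```python
import math


def rk4_step(rhs, s, p, h):
    """One classical fourth-order Runge-Kutta step for dp/ds = rhs(s, p)."""
    k1 = rhs(s, p)
    k2 = rhs(s + 0.5 * h, p + 0.5 * h * k1)
    k3 = rhs(s + 0.5 * h, p + 0.5 * h * k2)
    k4 = rhs(s + h, p + h * k3)
    return p + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def project_prevalence(p_start, age_start, years, alpha, beta,
                       incidence, mortality, n_steps=12):
    """Follow one birth cohort for `years` years, starting at age_start."""
    def rhs(s, p):
        age = age_start + s
        hr = math.exp(alpha + beta * age)
        # General mortality m = (1 - p) m0 + p m1* gives m0 = m / (1 + p (HR - 1)).
        delta_m = mortality(age) * (hr - 1.0) / (1.0 + p * (hr - 1.0))
        return (1.0 - p) * (incidence(age) - p * delta_m)

    h = years / n_steps
    p = p_start
    for k in range(n_steps):
        p = rk4_step(rhs, k * h, p, h)
    return p


def log_likelihood(alpha, beta, cohorts, years, incidence, mortality, sigma=0.005):
    """Gaussian log-likelihood (up to a constant) of the observed end prevalences.

    cohorts is a list of (age_start, p_start, p_observed_end) tuples.
    """
    total = 0.0
    for age_start, p_start, p_observed in cohorts:
        p_model = project_prevalence(p_start, age_start, years, alpha, beta,
                                     incidence, mortality)
        total -= 0.5 * ((p_observed - p_model) / sigma) ** 2
    return total


def map_estimate(cohorts, years, alphas, betas, incidence, mortality):
    """Grid search; with flat priors on the grid the MAP maximises the likelihood."""
    return max(((a, b) for a in alphas for b in betas),
               key=lambda ab: log_likelihood(ab[0], ab[1], cohorts, years,
                                             incidence, mortality))


def demo_incidence(age):
    """Hypothetical age-specific incidence rate (per person-year)."""
    return 0.010 + 0.0002 * (age - 50.0)


def demo_mortality(age):
    """Hypothetical Gompertz-type general mortality rate."""
    return math.exp(-10.0 + 0.1 * age)


if __name__ == "__main__":
    alphas = [1.0 + 0.1 * k for k in range(15)]
    betas = [-0.030 + 0.002 * k for k in range(15)]
    true_alpha, true_beta = alphas[7], betas[7]
    # Synthetic "observations": prevalence after 6 years under the true parameters.
    cohorts = []
    for age in range(50, 85, 5):
        p0 = 0.08 + 0.004 * (age - 50)
        p_end = project_prevalence(p0, age, 6.0, true_alpha, true_beta,
                                   demo_incidence, demo_mortality)
        cohorts.append((age, p0, p_end))
    print(map_estimate(cohorts, 6.0, alphas, betas, demo_incidence, demo_mortality))
    print((true_alpha, true_beta))
```

Because the synthetic observations are generated without noise from a grid point, the grid search recovers the generating parameters. With real data, the log-likelihood surface over the grid plays the role of the surface shown in Figure 5.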
In this work, we have described how the illness-death model can be used to obtain information about mortality in case prevalence and incidence are given. This allows insights into the excess mortality of people with chronic diseases compared to the people without the disease or the general population.
[1] Ralph Brinks, Annika Hoyer, and Sandra Landwehr. Surveillance of the incidence of non-communicable diseases (NCDs) with sparse resources: a simulation study using data from a National Diabetes Registry, Denmark, 1995–2004. PLoS One, 11(3):e0152046, 2016.
[2] Ralph Brinks and Sandra Landwehr. Age- and time-dependent model of the prevalence of non-communicable diseases and application to dementia in Germany. Theoretical Population Biology, 92:62–68, 2014.
[3] Ralph Brinks and Sandra Landwehr. Change rates and prevalence of a dichotomous variable: simulations and applications. PLoS One, 10(3):e0118955, 2015.
[4] Ralph Brinks and Sandra Landwehr. A new relation between prevalence and incidence of a chronic disease. Mathematical Medicine and Biology, 32(4):425–435, 2015.
[5] Ralph Brinks, Sandra Landwehr, Rebecca Fischer-Betz, Matthias Schneider, and Guido Giani. Lexis diagram and illness-death model: simulating populations in chronic disease epidemiology. PLoS One, 9(9):1–8, 2014.
[6] Robert C. Brunet and Claudio J. Struchiner. A non-parametric method for the reconstruction of age- and time-dependent incidence from the prevalence data of irreversible diseases with differential mortality. Theoretical Population Biology, 56(1):76–90, 1999.
[7] Bendix Carstensen, J. K. Kristensen, P. Ottosen, and K. Borch-Johnsen. The Danish National Diabetes Register: trends in incidence, prevalence and mortality. Diabetologia, 51(12):2187–2196, 2008.
[8] Germund Dahlquist and Åke Björck. Numerical Methods. Prentice-Hall, Englewood Cliffs, NJ, 1974.
[9] Benjamin Goffrier, Mandy Schulz, and Jörg Bätzing-Feigenbaum. Administrative Prävalenzen und Inzidenzen des Diabetes mellitus von 2009 bis 2015. Versorgungsatlas.
[10] SEARCH Study Group. SEARCH for Diabetes in Youth: a multicenter study of the prevalence, incidence and classification of diabetes mellitus in youth. Controlled Clinical Trials, 25(5):458–471, 2004.
[11] Andrei D. Polyanin, Valentin F. Zaitsev, and Alain Moussiaux. Handbook of First-Order Partial Differential Equations. CRC Press, 2001.