## 1 Introduction

Mortality modeling and the prediction of its future trends can provide fundamental answers to several key questions related to longevity ranking and demographic sustainability, among others. However, in most cases this is done by focusing on summary measures such as life expectancy at birth $e_0$ or life disparity $e^\dagger$. For example, [Amin and Steinmetz, 2019] link life expectancy with cardiovascular disease and cancer in US states, finding spatial clusters with higher values of $e_0$. In 1992, [Lee and Carter, 1992] developed a model to forecast mortality based on a matrix of the logged death rates by age and time, decomposed into a single time index and an age pattern of mortality changes. This is the same model that led [Tuljapurkar et al., 2000] to suggest the existence of a universal pattern of mortality decline. Life expectancy at birth is also applied to evaluate the precision of mortality forecasts, even though [Bohk-Ewald et al., 2017] have suggested that lifespan disparity could also be used. Lifespan disparity has also been advocated as a useful indicator to analyse the mortality evolution of countries ([Vaupel et al., 2011]). In other cases, scholars focus on specific components of mortality, disregarding the global pattern. For instance, [Medford et al., 2019] analyse lifespan after age 100 in Sweden and Denmark to show that Danish centenarians' lifespans have been lengthening, but not those of Swedish centenarians. As another example, [Zanotto et al., 2020] focus their analysis on premature mortality. It therefore seems that analysing or predicting the mortality evolution of one or more countries means choosing between focusing on global mortality and focusing on a specific component. In this work, we suggest applying a Functional Data Analysis (FDA) approach to mortality data. This approach ([Ramsay and Silverman, 2002]) is increasingly gaining ground among scholars interested in analysing curves rather than scalar data, and mortality profiles (e.g. in terms of age-specific rates) can be seen as curves over age that can be observed for every country and every year.

More specifically, we propose a functional clustering of the mortality profiles of several countries. [Hatzopoulos and Haberman, 2013] already tried a clustering solution, using a fuzzy c-means cluster analysis based on the main time trends, which are estimated by means of a GLM model, supporting the idea of a single pattern of mortality decline across subpopulations. However, other authors contest this hypothesis. For instance, [McMichael et al., 2004] show that there is increased heterogeneity across countries, even though it should be noted that both developed and poor countries are considered in their analysis. We suggest that taking a functional perspective can be a more informative approach, as it allows countries to be clustered on the basis of global mortality profiles without losing sight of the role played by single components. Indeed, combining functional clustering with a principal component analysis makes it possible to identify the components that determine the exclusion or inclusion of a country in a specific cluster. In this way, we can see whether countries' mortality is evolving in the same way (i.e. following the same sequence of clusters) or whether different patterns are found.

The remainder of this paper is organised as follows: in the next section FDA-based clustering techniques are presented in detail, while in section 3 we explain our choice of data. Results of the analyses are reported in section 4, while section 5 concludes.

## 2 Functional Clustering methods

Functional data analysis (FDA) deals with the analysis of data that are in the form of functions and extends the classical multivariate methods. The monographs on functional data by [Ramsay and Silverman, 2002, Ramsay and Silverman, 2005], developing methodology and applications, and the book of [Ferraty and Vieu, 2006] on nonparametric models contain a review of the most recent contributions on this topic. Our work proposes to study mortality data using FDA, more specifically functional cluster analysis. Because of the nature of the data, which belong to an infinite-dimensional space, clustering functional data is generally a difficult task and several approaches have been proposed over the years. A review of clustering methods can be found in [Jacques and Preda, 2014]. In this section we present how to obtain a functional representation of data, the theory of functional principal component analysis and the major functional approaches for data clustering.

### 2.1 Functional data

The functional approach to the analysis of data considers a collection of discrete observations at a finite set of instants as coming from a continuous underlying function defined on $\mathcal{T}$. Functional data then consist of a set of $n$ curves defined on a common interval $\mathcal{T}$, denoted, in the case of observations at the same instants, as

$$
x_i(t), \quad i = 1, \dots, n; \; t \in \mathcal{T} \tag{1}
$$

The curves are assumed to be independent realizations drawn from the same continuous stochastic process $X = \{X(t)\}_{t \in \mathcal{T}}$ belonging to the $L^2(\mathcal{T})$ space.

Because functional observations are supposed to belong to an infinite-dimensional space, the first step in an FDA is often the reconstruction of the functional form of the data. As a procedure of functional representation we approximate the function by using a basis expansion of cubic B-spline functions. Let us consider $p$ known basis functions $\psi_1, \dots, \psi_p$; the basis expansion for $x_i(t)$ is

$$
x_i(t) \approx \sum_{j=1}^{p} c_{ij}\, \psi_j(t) \tag{2}
$$

where $c_{ij}$ are the basis function coefficients to be estimated. The resulting spline functions are piecewise polynomials defined on subintervals, with boundaries at points called breaks. Given an order $m$ (the degree plus one, so $m = 4$ for cubic splines) and $L$ internal knots, there are $m + L$ basis functions.

The curves are observed with error, therefore the B-spline basis coefficients can be estimated by ordinary least squares, minimizing the sum of squared residuals. There exist many possible approaches to control the irregularity of the curve and obtain a better approximation. The one we use is a roughness penalty method, which modifies the estimation criterion by adding a penalty term, so that the penalised sum of squared errors (PSSE) is:

$$
\mathrm{PSSE}_{\lambda}(x_i) = \sum_{l=1}^{T} \left[ y_{il} - x_i(t_l) \right]^2 + \lambda \int_{\mathcal{T}} \left[ D^2 x_i(t) \right]^2 dt \tag{3}
$$

where $x_i(t)$ is the basis expansion of each curve and $y_{il}$, with $l = 1, \dots, T$, are the discrete observations for the $i$-th curve. Here the roughness of the curve is measured by its integrated squared second derivative, and the smoothing parameter $\lambda$ controls the trade-off between closeness of fit to the data and smoothness of the curve. In practice, it is common to choose the smoothing parameter subjectively or to select it through the generalized cross-validation criterion.
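As a rough illustration of the penalised criterion, the sketch below uses a discrete analogue rather than the B-spline machinery: the curve is represented by its values on an equally spaced grid, and the integrated squared second derivative is replaced by the sum of squared second differences (a Whittaker-type smoother). The function names, the toy signal and the grid of candidate smoothing parameters are assumptions made for this example only. The generalized cross-validation score is computed in its usual form, $\mathrm{GCV}(\lambda) = \frac{n}{n - df} \cdot \frac{\mathrm{SSE}}{n - df}$, with $df$ the trace of the smoothing matrix.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix D such that (D @ x)[i] = x[i] - 2*x[i+1] + x[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def penalised_smooth(y, lam):
    """Minimise sum (y - x)^2 + lam * sum (second differences of x)^2.

    Returns the fitted values and the smoothing matrix H, with x_hat = H @ y.
    """
    n = len(y)
    D = second_difference_matrix(n)
    H = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    return H @ y, H

def gcv(y, lam):
    """GCV(lam) = (n / (n - df)) * (SSE / (n - df)), with df = trace(H)."""
    x_hat, H = penalised_smooth(y, lam)
    n = len(y)
    df = np.trace(H)
    sse = np.sum((y - x_hat) ** 2)
    return (n / (n - df)) * (sse / (n - df))

# Toy data: a smooth signal observed with noise (not mortality data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
signal = np.sin(2 * np.pi * t)
y = signal + rng.normal(scale=0.2, size=t.size)

lambdas = 10.0 ** np.arange(-2, 5)           # candidate smoothing parameters
scores = [gcv(y, lam) for lam in lambdas]
best_lam = lambdas[int(np.argmin(scores))]   # value minimising GCV
x_hat, _ = penalised_smooth(y, best_lam)
```

With `lam = 0` the fit interpolates the data, while a straight line is left unchanged for any `lam`, because its second differences vanish; this mirrors the role of the second-derivative penalty.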

### 2.2 Functional Principal Component Analysis

Functional Principal Component Analysis (FPCA) is the extension of the more classical multivariate PCA to functional data and represents a useful tool for displaying curves in a space of reduced dimension. FPCA is one of the main tools considered when clustering functional data, but it can also be applied for data projection and interpretation.

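Given functional observations $x_i(t)$, $i = 1, \dots, n$, let $\bar{x}(t) = \frac{1}{n} \sum_{i=1}^{n} x_i(t)$ be the estimate of the mean function. The estimated covariance function, in analogy with the covariance matrix of the multivariate case, is defined as:

$$
\hat{v}(s, t) = \frac{1}{n - 1} \sum_{i=1}^{n} \left[ x_i(s) - \bar{x}(s) \right] \left[ x_i(t) - \bar{x}(t) \right], \quad s, t \in \mathcal{T} \tag{4}
$$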

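As in the multivariate case, under mild assumptions Mercer's theorem ([Mercer, 1909]) leads to a spectral decomposition of the estimated covariance function $\hat{v}(s, t)$, providing a countable set of positive eigenvalues $\lambda_1 \geq \lambda_2 \geq \dots$ associated to a basis expansion of orthonormal basis functions $\xi_j(t)$, with $j = 1, 2, \dots$, such that

$$
\hat{v}(s, t) = \sum_{j=1}^{\infty} \lambda_j\, \xi_j(s)\, \xi_j(t) \tag{5}
$$

In standard terminology, the basis functions $\xi_j$ are the eigenfunctions or harmonics. The eigenvalues $\lambda_j$ measure the variability in the directions corresponding to the eigenfunctions.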

Principal components are zero-mean uncorrelated random variables, defined on the same interval as the functional data, with variance $\lambda_j$. The principal component scores of unit $i$ in the dataset are defined as

$$
\nu_{ij} = \int_{\mathcal{T}} \left[ x_i(t) - \bar{x}(t) \right] \xi_j(t)\, dt \tag{6}
$$
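The estimation of the covariance function, its spectral decomposition and the scores can be sketched on discretised curves. The example below assumes curves observed on a common, equally spaced grid and approximates every integral by a Riemann sum with step `h`; the function name `fpca_on_grid` and the toy curves are assumptions of this example, not part of the analysis in this paper.

```python
import numpy as np

def fpca_on_grid(X, t):
    """Discretised FPCA for curves stored row-wise in X on an equally spaced grid t.

    Returns the mean curve, the eigenvalues (in decreasing order), the
    eigenfunctions (one per row, with unit L2 norm under a Riemann-sum
    quadrature) and the principal component scores.
    """
    n = X.shape[0]
    h = t[1] - t[0]                        # quadrature weight
    mean = X.mean(axis=0)
    Xc = X - mean                          # centred curves
    V = Xc.T @ Xc / (n - 1)                # covariance function on the grid
    evals, evecs = np.linalg.eigh(V * h)   # discretised covariance operator
    order = np.argsort(evals)[::-1]
    evals = evals[order]
    eigfuns = (evecs[:, order] / np.sqrt(h)).T
    scores = Xc @ eigfuns.T * h            # Riemann sum of (x_i - mean) * eigenfunction
    return mean, evals, eigfuns, scores

# Toy curves with two modes of variation (not mortality data).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)
a = rng.normal(scale=2.0, size=30)
b = rng.normal(scale=0.5, size=30)
X = 1.0 + a[:, None] * np.sin(np.pi * t) + b[:, None] * np.cos(np.pi * t)

mean, evals, eigfuns, scores = fpca_on_grid(X, t)
explained = evals[:2].sum() / evals.sum()  # share of variability of the first two components
```

In this construction the sample variance of the scores on each component equals the corresponding eigenvalue, which is the discrete counterpart of the statement that the principal components have variance given by the eigenvalues.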

With these definitions, the fundamental result of the Karhunen-Loève expansion ([Karhunen, 1946], [Loève, 1946]) holds and yields the following expansion for a generic curve $x_i(t)$:

$$
x_i(t) = \bar{x}(t) + \sum_{j=1}^{\infty} \nu_{ij}\, \xi_j(t) \tag{7}
$$

If one considers only the first $q$ terms of the decomposition, the expression gives an approximation of the curve and leads to a possible dimension reduction. The information on the curve $x_i$ is then synthesized by the $q$-dimensional vector $(\nu_{i1}, \dots, \nu_{iq})$.

### 2.3 Functional cluster analysis

The infinite dimensionality of functional data is a problem common to all clustering methods and leads to some additional difficulties, such as the lack of a definition for the probability density of a functional random variable, the definition of distances, or estimation from noisy data. To overcome these problems several methods have been developed, which can be mainly grouped into three approaches: two-stage clustering, distance-based (or nonparametric) clustering, and model-based clustering.

#### 2.3.1 Two-stage approach

The two-stage approach first reduces the dimension of the data by approximating the curves with a finite number of parameters (filtering step) and then uses clustering algorithms for finite-dimensional data (clustering step). The filtering step can be performed either through the coefficients of the curves in a basis of functions or through their first principal component scores, and classical clustering algorithms can then be applied to them. From a computational point of view, the reduction by functional principal component scores also needs a basis expansion of the curves. The first contribution to two-stage methods is due to [Abraham et al., 2003], where k-means clustering is based on B-spline coefficients.
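The two steps can be sketched as follows. This is a minimal illustration under stated assumptions: a cubic polynomial basis stands in for the B-spline basis, k-means is implemented as plain Lloyd iterations with user-supplied starting centres, and the toy curves, function names and parameter values are hypothetical.

```python
import numpy as np

def basis_coefficients(Y, t, degree=3):
    """Filtering step: least-squares coefficients of each row of Y on the
    polynomial basis 1, t, ..., t^degree (a stand-in for B-splines)."""
    Phi = np.vander(t, degree + 1, increasing=True)   # (T, degree + 1) design matrix
    coef, *_ = np.linalg.lstsq(Phi, Y.T, rcond=None)
    return coef.T                                     # one coefficient vector per curve

def kmeans(C, centers, n_iter=50):
    """Clustering step: Lloyd iterations on the coefficient vectors."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(C[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for k in range(len(centers)):
            if np.any(labels == k):                   # keep the old centre if a cluster empties
                centers[k] = C[labels == k].mean(axis=0)
    return labels, centers

# Toy data: ten increasing and ten decreasing noisy curves.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 50)
group_a = t + rng.normal(scale=0.01, size=(10, t.size))
group_b = 1.0 - t + rng.normal(scale=0.01, size=(10, t.size))
Y = np.vstack([group_a, group_b])

C = basis_coefficients(Y, t)                          # filtering
labels, centers = kmeans(C, centers=[C[0], C[-1]])    # clustering
```

The key design point is that the clustering algorithm never sees the curves themselves, only the finite-dimensional coefficient vectors produced by the filtering step.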

#### 2.3.2 Distance-based approach

Distance-based clustering methods generally consist in defining specific distances or dissimilarities for functional data and then applying clustering algorithms, with a hierarchical or a k-means method. Indeed, because of the large (infinite) number of variables in the functional context, the use of classical distances is affected by the curse of dimensionality. Moreover, distances can be too restrictive, whereas the use of a semimetric instead of a distance leads to a reduction of the functional space and allows functional objects that are actually different to be considered as equal.

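A semimetric on a functional space $\mathcal{F}$ is defined as a map $d$ on $\mathcal{F} \times \mathcal{F}$ that takes values in $\mathbb{R}^+$ such that autosimilarity, symmetry and the triangle inequality are fulfilled, but not the identity property:

$$
d(x_i, x_{i'}) = 0 \;\not\Rightarrow\; x_i = x_{i'} \tag{8}
$$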

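The most widely used families of semimetrics are based on derivatives and on principal components ([Ferraty and Vieu, 2006]). Principal components can be used for computing proximities between two curves $x_i$ and $x_{i'}$ in a space of reduced dimension, considering a truncated version of their basis expansion. In the case of discrete observations $x_i(t_l)$ and $x_{i'}(t_l)$, $l = 1, \dots, T$, the empirical version of the semimetric is

$$
d_q(x_i, x_{i'}) = \sqrt{ \sum_{j=1}^{q} \left( \sum_{l=1}^{T} w_l \left[ x_i(t_l) - x_{i'}(t_l) \right] \xi_j(t_l) \right)^2 } \tag{9}
$$

with $q$ the number of principal components and $w_l$ the quadrature weights approximating the integral. The semimetric corresponds to the Euclidean distance between the $q$-dimensional vectors of principal component scores of $x_i$ and $x_{i'}$.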
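The failure of the identity property can be checked numerically. The sketch below assumes an equally spaced grid with a constant quadrature weight `h` and, instead of estimated eigenfunctions, uses three hypothetical orthonormal sine functions of which only the first two are retained; all names and values are illustrative.

```python
import numpy as np

def pca_semimetric(x1, x2, eigfuns, h):
    """Empirical PCA-based semimetric between two curves on a common grid.

    eigfuns holds the q retained eigenfunctions row-wise; h is the grid step,
    used as a constant quadrature weight.
    """
    proj = eigfuns @ (x1 - x2) * h      # q approximate inner products
    return float(np.sqrt(np.sum(proj ** 2)))

t = np.linspace(0.0, 1.0, 201)
h = t[1] - t[0]
# Hypothetical orthonormal functions playing the role of eigenfunctions.
xi = np.array([np.sqrt(2.0) * np.sin(k * np.pi * t) for k in (1, 2, 3)])
eigfuns = xi[:2]                        # keep q = 2 components

x1 = 1.0 * xi[0]
x2 = x1 + 0.5 * xi[2]                   # differs from x1 only along the discarded direction
x3 = x1 + 0.3 * xi[1]                   # differs from x1 along a retained direction

d12 = pca_semimetric(x1, x2, eigfuns, h)   # numerically zero although x1 != x2
d13 = pca_semimetric(x1, x3, eigfuns, h)   # equals the score difference, 0.3
```

Curves `x1` and `x2` are clearly different functions, yet their semimetric is zero because they share the same scores on the retained components.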

#### 2.3.3 Model-based approach

The model-based approach constructs homogeneous clusters by means of a density mixture model and makes it possible to predict the membership of each observation in one of the clusters. Conditionally on cluster membership, the observations are supposed to come from a common distribution with cluster-specific parameters. In the finite-dimensional setting, the main tool to estimate the model is the multivariate probability density. In the case of functional data the probability density is not defined, so we assume a probability density on the parameters describing the curves. The first model-based clustering method for functional data was developed by [James and Sugar, 2003]. Here we describe the clustering method proposed by [Bouveyron and Jacques, 2011].
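The way a mixture model predicts membership can be illustrated with Bayes' rule. The sketch below is a deliberately simplified, univariate version (one scalar summary per curve instead of a coefficient vector) with hypothetical mixture parameters; it is not the estimation algorithm used later in the paper.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_membership(x, priors, means, sds):
    """Posterior probability that x belongs to each mixture component (Bayes' rule)."""
    weighted = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, means, sds)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical two-component mixture for a single scalar summary of a curve.
priors = [0.5, 0.5]
means = [0.0, 4.0]
sds = [1.0, 1.0]

post = posterior_membership(1.0, priors, means, sds)
label = post.index(max(post))   # maximum a posteriori assignment
```

An observation equidistant from the two means, such as `2.0` here, receives posterior probability one half for each group, which shows that the assignment is probabilistic rather than based on a hard distance rule.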

Let $Z_i = (Z_{i1}, \dots, Z_{iK})$ be an unobserved random variable indicating the group membership of $x_i$: $Z_{ik}$ is equal to 1 if $x_i$ belongs to the $k$th group and 0 otherwise. The clustering task therefore aims to predict the value of $Z_i$ for each observed curve $x_i$. Each curve $x_i$ can be summarized by its basis expansion coefficient vector $c_i = (c_{i1}, \dots, c_{ip})^\top$, whose distribution is assumed to be a mixture of $K$ Gaussians with density

$$
p(c) = \sum_{k=1}^{K} \pi_k\, \phi(c; \mu_k, \Sigma_k) \tag{10}
$$

where $\phi(\cdot\,; \mu_k, \Sigma_k)$ is the Gaussian density function with mean $\mu_k$ and covariance matrix $\Sigma_k$, and $\pi_k$ is the prior probability of group $k$. This model is referred to as the Functional Latent Mixture (FLM) model ([Bouveyron and Jacques, 2011]), since it can be reparametrized to represent the curves through their group-specific eigenspace projection. The spectral decomposition of the group covariance matrix $\Sigma_k$ makes it possible to model and interpret the variance of the data of the $k$th group through the parameters $a_{k1}, \dots, a_{kd_k}$ and the variance of the noise through the parameters $b_k$, where $d_k$ can be considered as the intrinsic dimension of the latent subspace of the $k$th group ($d_k < p$, with $p$ the number of basis functions). Unlike two-stage methods, in which the estimation of these parameters is done prior to clustering, in this approach the two tasks are performed simultaneously. The funHDDC algorithm ([Bouveyron and Jacques, 2014]) models and clusters the curves through their projections in the group-specific subspaces, obtained by performing functional principal component analysis conditionally on the posterior probabilities of belonging to group $k$.

## 3 Data

We chose data from the Human Mortality Database ([Human Mortality Database, 2019]), which ensures high quality and quantity of data on the mortality profiles of many European and some non-European countries for several years. From the 40 countries available we excluded those with too short a time series (Chile, Croatia, Greece, Israel, Slovenia, Korea, and Taiwan) and those with too small a population (Luxembourg and Iceland). As for the time period, we chose to consider data from 1960 (after the Second World War and related economic crises) to 2010. Considering that we need to split Germany into East and West in order to have data back to 1960, we end up with data for 32 countries and 50 years (see Table 1).

Area | Countries |
---|---|
North EU | Denmark, Finland, Norway, Sweden |
West EU | Austria, Belgium, Switzerland, Germany, France, Ireland, The Netherlands, United Kingdom |
South EU | Italy, Portugal, Spain |
Center EU | Bulgaria, Czech Republic, Hungary, Poland, Slovakia |
East EU | Belarus, Estonia, Latvia, Lithuania, Russia, Ukraine |
Extra-EU | Australia, Canada, Japan, New Zealand, United States of America |

This means that for each combination of country and year we have a curve of the mortality age pattern. Age-specific rates are usually used for mortality analysis; however, we chose to use the age distribution of deaths ($d_x$). We do so because one of the most acknowledged transformations of mortality age patterns in developed countries in the past decades is the shifting of the modal age at death (see, for instance, [Canudas-Romo, 2008]) and the compression of deaths above the mode ([Thatcher et al., 2010]). More recently, [Zanotto et al., 2020] have shown that premature mortality has also evolved in recent years, with different patterns for several countries. All these transformations are much more visible from the age distribution of deaths ($d_x$) than from the age-specific rates ($m_x$), and this explains why new models fitting the $d_x$ have recently been emerging ([Mazzuco et al., 2018, Basellini and Camarda, 2019]). As [Basellini and Camarda, 2019] note, mortality rates ($m_x$), survival probabilities ($l_x$) and the age distribution of deaths ($d_x$) are complementary functions, and each one can be derived from each of the others. This means they convey the same information, so choosing one or the other does not affect the results of the cluster analysis, but, as said above, using $d_x$ makes it easier to visualise the transformations of the mortality profiles of the selected countries.
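The complementarity of the three functions can be made concrete with a small life-table computation. The sketch below assumes one-year age intervals, a radix of 1 and a constant hazard within each interval (so that the probability of dying is $1 - e^{-m_x}$); the Gompertz-type rates are hypothetical and serve only to exercise the functions, whose names are also assumptions of this example.

```python
import math

def deaths_from_rates(m):
    """Survival l_x (radix 1) and death distribution d_x from age-specific rates m_x,
    assuming a constant hazard within each one-year age interval."""
    l, d = [1.0], []
    for mx in m:
        qx = 1.0 - math.exp(-mx)      # probability of dying within the interval
        d.append(l[-1] * qx)
        l.append(l[-1] * (1.0 - qx))
    return l, d

def rates_from_deaths(d, l_last):
    """Inverse mapping: recover m_x from d_x and the share surviving past the last age."""
    m = []
    surv = l_last + sum(d)            # survivors at the first age
    for dx in d:
        m.append(-math.log(1.0 - dx / surv))
        surv -= dx                    # survivors at the next age
    return m

# Hypothetical Gompertz-type rates for ages 0 to 89.
m = [0.0001 * math.exp(0.1 * x) for x in range(90)]
l, d = deaths_from_rates(m)
m_back = rates_from_deaths(d, l[-1])
modal_age = d.index(max(d))           # age with the largest share of deaths
```

The death distribution makes the modal age at death directly readable as the position of the peak, whereas the same information is only implicit in the monotone rate schedule.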

## 4 Results

### 4.1 From raw data to smooth curves

The analyses of this section focus on the study of mortality curves to understand the patterns or trajectories of the principal developed countries through the selected time period. Although in functional analysis there is no general requirement for the data to be smooth, in some cases the data are particularly noisy, which makes smoothing necessary. This problem mostly affects the curves of Eastern countries at the beginning of the time period, and is attributable to data quality.

We used the R package fda of [Ramsay et al., 2011] to obtain a functional representation using a basis expansion of natural cubic splines. In order to maintain the data structure, two sequences of knots over the age range [0,110] have been evaluated: a sequence of 111 equally spaced knots (i.e. one for every age) and a sequence of 31 knots, one every 3 months over the age interval [0,2] and one every 5 years over the age interval [2,110]. The latter has been preferred to the former, not only because it is more parsimonious, but also because it is preferable in terms of goodness of fit. As an example, in Figure 1 both knot sequences are applied to the curve of Russia in 1960.
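The second knot sequence can be written out explicitly. The sketch below rests on an assumed reading of the description, namely knots at ages 0, 0.25, ..., 2 followed by 5, 10, ..., 110, which is one layout that gives exactly 31 knots; the number of basis functions then follows from the rule of Section 2.1 (order plus number of internal knots) for a cubic B-spline basis.

```python
# Assumed layout of the 31 knots: every 3 months up to age 2, then every 5 years.
infant_knots = [0.25 * i for i in range(9)]           # 0, 0.25, ..., 2
adult_knots = [float(a) for a in range(5, 111, 5)]    # 5, 10, ..., 110
knots = infant_knots + adult_knots

order = 4                                 # cubic splines
n_internal = len(knots) - 2               # the two boundary knots are excluded
n_basis = order + n_internal              # number of B-spline basis functions

# Same count for the alternative sequence with one knot per age, 0 to 110.
n_basis_dense = order + (111 - 2)
```

Under these assumptions the unequally spaced sequence uses 33 basis functions against 113 for the one-knot-per-age sequence, which is the sense in which it is more parsimonious.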

The comparison reveals that 31 unequally spaced knots make it possible to account for the steep decrease of infant mortality in the first two years and for the uniqueness of the mode of the distribution. In this example, the smoothing parameters have been selected through the Generalised Cross-Validation (GCV) criterion, separately for the 111-knot and the 31-knot sequence. GCV is a measure based on the mean squared error, twice discounted by a term taking into account the number of parameters and the magnitude of the smoothing parameter.

In the following analyses, considering the curves for all the countries and years, two alternatives for the smoothing parameter have been applied: a common smoothing parameter for all curves and a different smoothing parameter for each curve. As the results for the two alternatives do not show any relevant differences, we present only the ones obtained with a curve-specific $\lambda$.

### 4.2 Exploratory analysis through FPCA

The FPCA has been performed in order to summarise the variability of the curves. Many works on mortality evolution treat data for males and females separately, because the two sexes experienced different mortality trends in the past. In the same spirit, we represented functional data with curve-specific smoothing parameters for both sexes and conducted two FPCAs. It emerged that most of the variability is explained by the first two principal components both for men and for women (95% for both). A classical way to interpret the principal component functions is to plot the group mean function as well as the functions obtained by adding to and subtracting from the mean function twice the square root of the principal component variance ($\pm 2\sqrt{\lambda_j}$, with $\lambda_j$ the eigenvalue of the $j$th component). Refer to [Ramsay and Silverman, 2002, Ramsay and Silverman, 2005] for more details on this usual representation.

In Figure 2 for each of the first two components three curves are plotted: the dashed curve is the overall smoothed mean, which is the same by sex, whereas the other two curves show the effect of adding and subtracting a suitable multiple of the principal component weight function.

Looking at men (a), one can see that the first component corresponds to a shift of the curves with respect to the overall mean of deaths over the entire age range. Relative to the mean curve, the (+) curve shows higher mortality than the (-) curve before 80 years and lower mortality afterwards. In addition, an increase in the number of deaths is observed around the modal age at death. A high scorer on this component would show an above-average shift. As for the second component, the variability is concentrated between 20 and 60 years and around the modal age at death. This variability opposes the (-) and (+) curves, which cross at approximately 65 years. The (-) curve lies above the (+) curve between 20 and 60 years and below the (+) curve around the modal age at death. A high scorer on this component expresses low premature mortality between 20 and 60 years. Therefore, we can summarize that the first component is representative of the shift and compression of death distributions observed in recent decades, while the second component is related to premature mortality.

This is already an interesting result, as it confirms that shift and compression of mortality schedules are intertwined ([Bergeron-Boucher et al., 2015]) and it reveals that the premature mortality component is independent of shift and compression and that 15% of the variability in men's mortality schedules is attributable to it. Concerning women (b), the first component reflects a shift and compression from age 40 throughout adulthood and senescence, weaker than for men. The second component (6% of variability) shows an increase in the number of deaths around the modal age at death that is not attributable to a clear change in premature mortality.

The principal subspace makes it possible to give each individual a score in terms of the attributes expressed by the principal components. Figure 3 shows the scores of seven representative countries (Denmark, Sweden, Japan, France, Czech Republic, United States, Russia) on the first two components, selected every 10 years over the time period for ease of interpretation (see Appendix A for the plot of all considered countries).

The first principal subspace for men (a) shows similar trajectories on the first component for Denmark, Sweden, Japan, France, Czech Republic, and United States. The first axis discriminates these countries throughout the whole period, from the first quadrant to the second quadrant; the decrease of the scores reflects the shift of mortality curves towards older ages with respect to the mean curve. Even though the direction is similar, the pace of evolution differs among countries: Sweden starts ahead of the other countries, a delay of two decades can be noticed for the Czech Republic, and Denmark stagnates between the 70s and 90s. The second axis indicates, for Denmark, Sweden, Japan, France, Czech Republic, and United States, different levels of premature mortality already from the starting point in the first quadrant. The United States starts lower than other countries, reflecting higher premature mortality, while the curves of Japan and Sweden are the ones with a higher number of deaths around the modal age at death. Again, we can observe the stagnation of Denmark on the second component between the 70s and 90s. Only Russia experiences a completely different trajectory, remaining in the fourth quadrant for the whole period; the units with the highest premature mortality are those of years 1990–2010. The female first principal subspace (b) reports the trajectories in terms of 'shift' and 'compression', characterized by a shift of mortality curves for Denmark, Sweden, Japan, France, Czech Republic, and United States and by curves that remain at lower ages for Russia.

The decomposition property of the Karhunen-Loève expansion turns out to be useful for evaluating the number of principal components needed to obtain an accurate approximation of the curve. Figure 4 shows the reconstruction of two smoothed curves (France and Japan in 2010) obtained from the mean curve by adding the principal components one at a time. We can see for men that considering the first two components is not enough and the first six components are needed. As a consequence, one also has to be careful in interpreting the French increase in male premature mortality in the last years of the time period, as suggested by negative scores on the second component (Figure 3 (a)). In this case, the effect of the second component is over-estimated and is reduced by the following four components.
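The reconstruction of a curve by adding components one at a time can be sketched as below. The example assumes curves on a common grid and, for brevity, obtains eigenfunctions and scores from a singular value decomposition with the plain Euclidean inner product on the grid; the toy curves, the tolerance and the function names are hypothetical.

```python
import numpy as np

def reconstruct(mean, scores_i, eigfuns, q):
    """Truncated Karhunen-Loeve reconstruction of one curve from its first q scores."""
    return mean + scores_i[:q] @ eigfuns[:q]

def components_needed(x, mean, scores_i, eigfuns, tol):
    """Smallest q whose reconstruction is within tol of x (maximum absolute error)."""
    for q in range(len(scores_i) + 1):
        if np.max(np.abs(x - reconstruct(mean, scores_i, eigfuns, q))) <= tol:
            return q
    return None

# Toy curves with three modes of variation of decreasing amplitude.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 101)
amps = rng.normal(size=(40, 3)) * np.array([1.0, 0.3, 0.05])
modes = np.array([np.sin(k * np.pi * t) for k in (1, 2, 3)])
X = amps @ modes

mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
eigfuns = Vt             # rows are orthonormal in the Euclidean sense on the grid
scores = U * S           # scores[i, j] = inner product of (x_i - mean) with eigfuns[j]

x = X[0]
l2_errors = [np.linalg.norm(x - reconstruct(mean, scores[0], eigfuns, q))
             for q in range(4)]
q_needed = components_needed(x, mean, scores[0], eigfuns, tol=1e-8)
```

Even when the first components explain most of the overall variability, an individual curve may still need further components before its reconstruction error falls below a given tolerance, which is the situation described for France above.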

The difficulty of reconstructing the curve with the first two components could be linked to the change of shape of death distribution. In this respect, [Mazzuco et al., 2019] hypothesize that premature mortality is not increasing because of a specific cause of death, but rather the shift and compression in a reduced range at older ages could have isolated and emphasized it.

The reconstruction of the curve of Japan in 2010 (Figure 4 (b)) suggests another pattern of evolution. Both principal components express the compression of the curve, and the second reduces the effect of the first one. Therefore the high value of the second component does not only indicate a decrease in the number of deaths around the modal age at death compared to the mean curve, but rather a faster pace for shift than for compression. The phenomenon has been described by [Canudas-Romo, 2008] as a “shifting mortality scenario, where bulk of deaths around the modal age at death move toward older ages and the compression of mortality has stopped. This may be a realistic description of the current situation in low mortality countries.”

### 4.3 Analysis of mortality evolutions

In this section we present the results of the classification of mortality curves for the 32 countries selected from 1960 to 2010. Three methods of cluster analysis have been carried out, reflecting the three main approaches for functional clustering: the two-stage approach on the coefficients of the basis expansion of the curves, the model-based approach with the FLM model and the distance-based approach through a semimetric based on FPCA. We chose to show the distance-based method for women and the model-based method for men. This choice stems from the different patterns shown by the cluster solutions for men's and women's mortality and will be explained in the following. Mortality curves and the corresponding mean curves within the clusters will be analysed, as well as the composition of clusters in terms of countries and years. The model-based method has been performed with the package funHDDC ([Bouveyron and Jacques, 2014]), whereas for the distance-based approach through a semimetric we used the package fda.usc ([de la Fuente and Febrero-Bande, 2011]), which extends the functionalities of the fda package (the R code used is available on GitHub, so results are fully replicable).

#### 4.3.1 Cluster analysis for men

A cluster analysis has been conducted for men following the model-based approach. Since the data were obtained through a common acquisition process, a natural assumption is that the behaviour of the error components outside the class-specific subspaces is common. Thus we modelled the noise outside the latent subspace of each group by a single parameter $b$ and chose the reduced model $[a_{kj}\, b\, Q_k\, d_k]$. The number of clusters has been selected according to the BIC criterion, defined with a positive log-likelihood. As the BIC neither follows a monotone trend nor stabilizes, but shows two local maxima, we present the partition in seven clusters, which allows for more flexibility and better interpretation. Table 2 reports the information on the model within the clusters. The dimensionality $d_k$ varies between 1 and 5 and the complexity of the model, controlled by $K$ and $d_k$, is equal to 843 parameters. The number of parameters ($a_{kj}$, $b$) corresponds to the eigenvalues selected for every $k$. The stability of cluster dimensions has been verified by initializing the classes of the funHDDC algorithm with the k-means function and setting different seeds.

The partition in 7 clusters (Figure 6) identifies the curves with high infant mortality and the shift towards older ages; furthermore, it groups less compressed curves and those with higher premature mortality. Cluster 1 contains the curves with high infant mortality (4% on average) and cluster 3 the ones with a similar shape but lower infant mortality (2% on average). Cluster 2 expresses the increase in premature mortality and a stronger decrease in the number of deaths around the modal age at death. The curves in cluster 4 are more compressed and the number of deaths is lower around the modal age at death. Clusters 5, 6, and 7 show a continuous shift and compression of mortality curves.

[Table 2: information on the model within clusters 1 to 7]

We summarize the features of the evolution that emerged from the analyses of the six areas, highlighting the specificities of some countries. Northern, Western, Southern, and extra-European countries experience a shift of the curves and an increase in the number of deaths around the modal age at death over the whole period (clusters 4, 5, 6, 7). Norway, Sweden, and the Netherlands, known for their high longevity, are ahead from the beginning of the time period (starting already from cluster 5). Among Western countries, Switzerland and France anticipate the evolution process (being the first to pass to cluster 7), while Eastern Germany and Ireland evolve more slowly than other countries. The analyses also identify the delay of Finland in the first twenty years and the stagnation of Denmark between the 70s and 90s, which lags behind all countries at the end of the time period (being the last to pass to cluster 7). Southern countries show the highest infant mortality in the first twenty years (cluster 1) and then follow the shifting and compression process of mortality curves already described (clusters 5, 6, 7). Among extra-European countries, the United States is characterized, for the first two decades, by less compressed curves than other countries (cluster 3). The rapid shift of Japan from the second half of the period is consistent with the strongest increase recorded in longevity (already in cluster 7 in 1985).

For Central countries one can observe high infant mortality in the first decade of the period (cluster 1), with the exception of the Czech Republic. Then, their evolutions differ considerably. The Czech Republic evolves similarly to Western countries but accumulates a delay of about twenty years with respect to them (reaching cluster 6 in 2005). In Hungary premature mortality starts to increase from the 80s and continues until 2010 (cluster 4), with no sign of reversing. Bulgaria, Poland, and Slovakia stop compression in the 80s and 90s (cluster 3) and show a slight shift in the last decade (cluster 5). The greatest differences between curves concern Eastern Europe. Countries of this area are characterized until the end of the 80s by less compressed curves than the other areas (cluster 3) and, after the dissolution of the USSR, by an increase in premature mortality (cluster 2).

The other two methods, two-stage and distance-based, identify the evolutions based on the same components of mortality with a lower number of clusters (see Appendix A). The advantage of the model-based method lies in the possibility of analyzing the functional subspaces in which data are modeled and classified. As usual, we can plot for each cluster and each selected component the group mean function and the effect of the component's variance ($\pm 2\sqrt{\lambda_j}$, with $\lambda_j$ the eigenvalue of the $j$th component). This representation is shown for clusters 2, 3, 5, 7 (Figure 6), since they highlight interesting features.

Cluster 2, made up of Eastern countries after 1990 with high premature mortality, is still characterized by high local variability due to large differences in the departure of curves from the mean. Cluster 3 appears to capture both the higher premature mortality of Eastern countries in the first half of the period and the mortality compression of Central European countries in the second half, implying a higher dimensionality than the other clusters. As for cluster 5, the variability is explained by the shift towards older ages and the compression around the modal age at death, consistently with the trend already described for Northern, Western, Southern, and extra-European countries. Cluster 7 expresses the phenomena highlighted for France by the principal component subspaces and the reconstruction of the curves: the shift without compression, leading to a stop in the increase of the number of deaths around the modal age (first component), and the change of shape on the right side of the curve between ages 40 and 60 (second component). To conclude, the analysis of variability within clusters reveals that there is no common dimension expressing a common evolution. However, an increasing cumulative explained variability can be noticed from cluster 5 (77%) to 6 (87%) and 7 (93%), reflecting a certain convergence of evolutions for Northern, Western, Southern, and extra-European countries.

#### 4.3.2 Cluster analysis for women

A hierarchical cluster analysis has been performed for women according to the distance-based approach, with the semimetric based on the first four FPCs (Figure 7). Our decision to keep four components is due to the need for an approximation of the curves that accounts for infant and premature mortality. The number of clusters has been chosen by identifying the changes in the curves regarding the different components of mortality. In particular, the partition into five clusters makes it possible to distinguish the decrease in infant mortality, the shift of the curves to the right, and the increase in the number of deaths around the modal age at death.

As we can see from Figure 7, cluster 1 contains the curves with high infant mortality (4% on average), while in cluster 2 the curves have the same shape but lower infant mortality (2% on average). Clusters 3, 4, and 5 identify the curves characterized by a shift to the right and a compression around the modal age at death.
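As an illustration of the distance-based approach, the sketch below projects discretized curves on their first `q` principal components and feeds the resulting Euclidean distances, which approximate an FPCA-based semimetric on an equally spaced grid, to a hierarchical clustering routine. It is a hedged sketch rather than the procedure actually used for Figure 7: the linkage criterion (`"ward"`), the helper names `fpca_semimetric_scores` and `cluster_curves`, and the synthetic curves are assumptions introduced here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def fpca_semimetric_scores(curves, q=4, step=1.0):
    """Project discretized curves on the first q principal components.

    Euclidean distances between the returned score vectors approximate the
    FPCA-based semimetric on an equally spaced grid with spacing `step`.
    """
    curves = np.asarray(curves, dtype=float)
    centered = curves - curves.mean(axis=0)
    # Right singular vectors of the centered data are the eigenvectors of the
    # sample covariance matrix, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return np.sqrt(step) * centered @ vt[:q].T

def cluster_curves(curves, n_clusters=5, q=4, method="ward"):
    """Hierarchical clustering of curves; the linkage method is an assumption."""
    scores = fpca_semimetric_scores(curves, q=q)
    distances = pdist(scores)  # condensed matrix of pairwise semimetric distances
    tree = linkage(distances, method=method)
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Synthetic example (not HMD data): two groups of curves with different modal ages.
rng = np.random.default_rng(1)
ages = np.arange(0, 111)
modes = np.concatenate([70 + 0.5 * rng.standard_normal(10),
                        85 + 0.5 * rng.standard_normal(10)])
curves = np.exp(-0.5 * ((ages[None, :] - modes[:, None]) / 10.0) ** 2)

labels = cluster_curves(curves, n_clusters=2)
```

Keeping only the first few components in the semimetric discards high-order variation, which is why the number of retained components matters for what the clusters can distinguish.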

The curves with high infant mortality (cluster 1) correspond to the first decade of Southern countries and some Central countries (Bulgaria, Hungary, and Poland). Portugal, Hungary, and Poland maintain high infant mortality for a longer period than other countries (until the mid-1970s). The curves of Northern, Western, Southern, and extra-European countries experience a shift towards older ages over the period (clusters 2, 3, 4, 5). The countries leading the shifting process are Norway, Sweden, and the Netherlands at the beginning of the 1970s, and Sweden, Switzerland, France, Spain, and Japan at the beginning of the 1990s. Denmark lags far behind over the second part of the period and is the last to pass to cluster 5, in 2004. The countries of Central and Eastern Europe experience a long stationary period (cluster 2) and a shift of the curves to the right during the last decade. Although the shift is similar to that of the previous areas, it occurs at a slower pace, with a delay of about twenty years. The Czech Republic, Poland, and the Baltic countries seem to be slightly ahead (passing to clusters 3 and 4).
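Statements such as "the last to pass to cluster 5" can be read directly off the cluster labels once each curve is indexed by country and year. The sketch below shows one way to do this in Python; the dictionary layout, the function name `first_year_in_cluster`, and the toy labels for two hypothetical countries are assumptions made for illustration and do not reproduce the paper's results.

```python
def first_year_in_cluster(labels, target):
    """Return {country: earliest year in which its curve has label `target`}.

    labels: dict mapping (country, year) -> cluster label.
    Countries that never reach the target cluster are omitted.
    """
    first = {}
    for (country, year), cluster in labels.items():
        if cluster == target and (country not in first or year < first[country]):
            first[country] = year
    return first

# Toy labels for two hypothetical countries "A" and "B" (not taken from the paper).
toy_labels = {
    ("A", 1990): 3, ("A", 1991): 4, ("A", 1992): 5, ("A", 1993): 5,
    ("B", 1990): 3, ("B", 1991): 3, ("B", 1992): 4, ("B", 1993): 5,
}

entry = first_year_in_cluster(toy_labels, target=5)
# Country whose first entry into the target cluster is the latest.
last_to_arrive = max(entry, key=entry.get)
```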

In the case of women, within-cluster variability does not convey any additional information, as premature mortality is much less important, so the model-based analysis does not give any additional insight with respect to the distance-based one. The other two methods, two-stage and model-based, identify the evolutions based on the same components of mortality (see Appendix A), although the shift to older ages is emphasized in the first half of the period by the former and in the second half by the latter.

To sum up, the analyses for men and women show similar evolutions for Northern, Western, Southern, and extra-European countries, characterized by the shift of curves to older ages and by the concentration of adult mortality around the modal age at death. For these four areas we can therefore conclude that a common pattern of evolution exists. In the case of men, all the countries belong to the same group at the end of the period, supporting the hypothesis of increasing homogeneity. The situation is more heterogeneous for Central and Eastern countries, since they do not experience the same evolution and do not reach the same cluster at the end of the period.

The comparison of the analyses for men and women reveals two different scenarios for Eastern countries: the increase in premature mortality after 1990 is entirely a male phenomenon. In addition, the shift of the curves towards older ages for the Baltic countries is detected only for women. Other differences between the two sexes can be noticed by looking at single countries. The stagnation of Denmark is more pronounced for women, in line with [Lindahl-Jacobsen et al., 2016], who attribute the stagnation of life expectancy between 1977 and 1995 to a worsening of the health conditions of the cohorts of women born in the interwar period, linked to smoking behaviour.

## 5 Concluding Remarks

In this paper, we have inspected the evolution of mortality schedules in HMD countries by means of functional clustering methods. These allow us to treat mortality patterns as functions, instead of analysing only one component of mortality (e.g. infant mortality or old-age mortality) or a summary measure like life expectancy, which mixes all mortality components without clearly distinguishing their contributions to progress in longevity.

Three different methods of functional clustering have been considered: a method based on principal components (FPCA), a distance-based method, and a model-based one. The latter has provided the best results in terms of fit with real data, but FPCA has also been useful in determining the most relevant components driving the transformations observed over the last sixty years in HMD countries. It turned out that two components account for 94% of the variability in the mortality schedules considered: 84% for a component that can be explained in terms of shifting and compression of mortality, and 10% for a second component accounting for premature mortality. This indicates that the shift and compression processes are mutually dependent, while premature mortality is an additional independent component, which accounts for a much lower (10%) but not irrelevant share of variability.
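The shares of variability quoted above are, in an FPCA, the ratios of each eigenvalue to the sum of all eigenvalues, and they cumulate across components (here 84% + 10% = 94%). The following minimal Python sketch shows how such shares can be computed from discretized curves; the function name `explained_variance_shares` and the synthetic curves are assumptions for illustration, so the numbers it produces are unrelated to those reported in the paper.

```python
import numpy as np

def explained_variance_shares(curves):
    """Share of total variability carried by each principal component.

    curves: array of shape (n_curves, n_ages) on a common age grid.
    """
    curves = np.asarray(curves, dtype=float)
    centered = curves - curves.mean(axis=0)
    # Squared singular values of the centered data are proportional to the
    # eigenvalues of the sample covariance, already in decreasing order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    return eigenvalues / eigenvalues.sum()

# Synthetic curves (illustrative only): modal age and spread both vary.
rng = np.random.default_rng(2)
ages = np.arange(0, 111)
modes = 80 + 3 * rng.standard_normal(40)
widths = 10 + rng.standard_normal(40)
curves = np.exp(-0.5 * ((ages[None, :] - modes[:, None]) / widths[:, None]) ** 2)

shares = explained_variance_shares(curves)
cumulative = np.cumsum(shares)  # cumulative[1] is the share of the first two components
```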

The results from clustering also provide us with many insights, although none of them comes as a surprise. First, the results confirm that a homogenization is taking place among most of the countries considered, as many of them follow the same evolution through the clusters. However, the patterns for men and women are quite different. For men, most of the countries are included in the same cluster in the latest years, while countries from Eastern Europe not only lag behind with respect to cluster 7, but also show no signs of recovery. The situation for women is somewhat different: the homogeneity of Northern, Western, and Southern European countries and extra-European ones is less pronounced (Denmark, the United Kingdom, the United States, and East Germany did not reach the highest longevity cluster), but Central and Eastern European countries look much closer to the forerunners.

The difference between men and women is also characterised by the higher importance that premature mortality has for the former, so if the longevity of men in Eastern European countries is still stagnating, this is partly attributable to premature mortality, which is notably high in that area. The results also clearly show the stagnation that Denmark and the United States underwent in different periods, much more visible for women. Such stagnation prevented these two countries from joining the highest longevity group. Considering the latest evolution of United States longevity ([Woolf and Schoomaker, 2019]), this lag is going to persist (or even increase) for that country, while Denmark seems to be catching up, as can also be seen from Figure 3.

This work, however, was also meant to show the potential of Functional Data Analysis in demographic studies, where the leading forces of population growth (fertility, mortality, and migration) are often measured in terms of age-specific rates or probabilities, which reveal several components. Similar analyses can therefore be implemented on fertility and migration age patterns. Moreover, FDA allows other kinds of analyses: regression (on both scalar and functional covariates) and hypothesis testing. We therefore advocate a wider adoption of this approach in population studies.

## Acknowledgments

Stefano Mazzuco acknowledges the support from MIUR–PRIN 2017 project number 20177BR-JXS.

## References

- [Abraham et al., 2003] Abraham, C., Cornillon, P.-A., Matzner-Løber, E., and Molinari, N. (2003). Unsupervised curve clustering using B-splines. Scandinavian Journal of Statistics, 30(3):581–595.
- [Amin and Steinmetz, 2019] Amin, R. W. and Steinmetz, J. (2019). Spatial clusters of life expectancy and association with cardiovascular disease mortality and cancer mortality in the contiguous United States: 1980-2014. Geospatial Health, 14(1).
- [Basellini and Camarda, 2019] Basellini, U. and Camarda, C. G. (2019). Modelling and forecasting adult age-at-death distributions. Population Studies, 73(1):119–138. PMID: 30693848.
- [Bergeron-Boucher et al., 2015] Bergeron-Boucher, M.-P., Ebeling, M., and Canudas-Romo, V. (2015). Decomposing changes in life expectancy: Compression versus shifting mortality. Demographic Research, 33(14):391–424.
- [Bohk-Ewald et al., 2017] Bohk-Ewald, C., Ebeling, M., and Rau, R. (2017). Lifespan disparity as an additional indicator for evaluating mortality forecasts. Demography, 54(4):1559–1577.
- [Bouveyron and Jacques, 2011] Bouveyron, C. and Jacques, J. (2011). Model-based clustering of time series in group-specific functional subspaces. Advances in Data Analysis and Classification, 5(4):281–300.
- [Bouveyron and Jacques, 2014] Bouveyron, C. and Jacques, J. (2014). funhddc: model-based clustering in group-specific functional subspaces. R package version, 1.
- [Canudas-Romo, 2008] Canudas-Romo, V. (2008). The modal age at death and the shifting mortality hypothesis. Demographic Research, 19:1179–1204.
- [de la Fuente and Febrero-Bande, 2011] de la Fuente, M. and Febrero-Bande, M. (2011). Utilities for statistical computing in functional data analysis: The package fda.usc.
- [Ferraty and Vieu, 2006] Ferraty, F. and Vieu, P. (2006). Nonparametric functional data analysis: theory and practice. Springer Series in Statistics, Springer-Verlag.
- [Hatzopoulos and Haberman, 2013] Hatzopoulos, P. and Haberman, S. (2013). Common mortality modeling and coherent forecasts. An empirical analysis of worldwide mortality data. Insurance: Mathematics and Economics, 52(2):320–337.
- [Human Mortality Database, 2019] Human Mortality Database (2019). University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Available at www.mortality.org, Last accessed on 2019-04-30.
- [Jacques and Preda, 2014] Jacques, J. and Preda, C. (2014). Functional data clustering: a survey. Advances in Data Analysis and Classification, 8(3):231–255.
- [James and Sugar, 2003] James, G. M. and Sugar, C. A. (2003). Clustering for sparsely sampled functional data. Journal of the American Statistical Association, 98(462):397–408.
- [Karhunen, 1946] Karhunen, K. (1946). Zur Spektraltheorie stochastischer Prozesse. Ann. Acad. Sci. Fennicae, AI, 34.
- [Lee and Carter, 1992] Lee, R. D. and Carter, L. R. (1992). Modeling and forecasting US mortality. Journal of the American Statistical Association, 87(419):659–671.
- [Lindahl-Jacobsen et al., 2016] Lindahl-Jacobsen, R., Oeppen, J., Rizzi, S., Möller, S., Zarulli, V., Christensen, K., and Vaupel, J. (2016). Why did Danish women’s life expectancy stagnate? The influence of interwar generations’ smoking behaviour. European Journal of Epidemiology, 31(12):1207–1211.
- [Loève, 1946] Loève, M. (1946). Fonctions aléatoires à décomposition orthogonale exponentielle. La Revue Scientifique, 84:159–162.
- [Mazzuco et al., 2018] Mazzuco, S., Scarpa, B., and Zanotto, L. (2018). A mortality model based on a mixture distribution function. Population Studies, 72(2):191–200.
- [Mazzuco et al., 2019] Mazzuco, S., Zanotto, L., and Suhrcke, M. (2019). What is premature mortality? Trying to reconcile two views.
- [McMichael et al., 2004] McMichael, A. J., McKee, M., Shkolnikov, V., and Valkonen, T. (2004). Mortality trends and setbacks: global convergence or divergence? The Lancet, 363(9415):1155–1159.
- [Medford et al., 2019] Medford, A., Christensen, K., Skytthe, A., and Vaupel, J. W. (2019). A cohort comparison of lifespan after age 100 in Denmark and Sweden: Are only the oldest getting older? Demography, 56(2):665–677.
- [Mercer, 1909] Mercer, J. (1909). Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London. Series A, 209:415–446.
- [Ramsay and Silverman, 2005] Ramsay, J. and Silverman, B. (2005). Functional data analysis. Springer Series in Statistics, Springer.
- [Ramsay et al., 2011] Ramsay, J., Wickham, H., Graves, S., and Hooker, G. (2011). fda: Functional Data Analysis. R package.
- [Ramsay and Silverman, 2002] Ramsay, J. O. and Silverman, B. W. (2002). Applied functional data analysis: methods and case studies. Springer Series in Statistics, Springer-Verlag.
- [Thatcher et al., 2010] Thatcher, A. R., Cheung, S. L. K., Horiuchi, S., and Robine, J.-M. (2010). The compression of deaths above the mode. Demographic Research, 22(17):505–538.
- [Tuljapurkar et al., 2000] Tuljapurkar, S., Li, N., and Boe, C. (2000). A universal pattern of mortality decline in the G7 countries. Nature, 405:789–792.
- [Vaupel et al., 2011] Vaupel, J. W., Zhang, Z., and van Raalte, A. A. (2011). Life expectancy and disparity: an international comparison of life table data. BMJ Open, 1(1).
- [Woolf and Schoomaker, 2019] Woolf, S. H. and Schoomaker, H. (2019). Life expectancy and mortality rates in the United States, 1959-2017. JAMA, 322(20):1996–2016.
- [Zanotto et al., 2020] Zanotto, L., Canudas-Romo, V., and Mazzuco, S. (2020). A mixture-function mortality model: illustration of the evolution of premature mortality. European Journal of Population. (to appear).
