1 Introduction
Despite the lack of nonanalytical expression for the probability density function (pdf), the class of
stable distributions are becoming increasingly popular in such fields as economics, finance, and insurance. Details for the applications of stable distributions in these fields can be found in [11], [15], [18], [20], [22], [23], and [24]. The only exceptions for which the pdf has analytical expression are Gaussian, Levy, and Cauchy distributions. This can be regarded as a major obstacle in the way of using this class in practice. This is while the characteristic function (chf) of
stable distributions has closedform expression and takes different forms, see [16] and [26]. In what follows we review briefly two important forms which are known in the literature as andparameterizations. If random variable
follows stable distribution, then the chf of , i.e., , in parameterization is given by(1) 
where =1 and is the wellknown sign function. The family of stable distributions has four parameters: tail thickness , skewness , scale , and location . If =0, it would be the class of symmetric stable (SS) distributions. If =1 and , we have the class of the positive stable distributions. In this case, varies over the positive semiaxis of real line. Also, the chf of in parameterization is given by
(2) 
where is the location parameter. The pdf and chf in parameterization are continuous functions of all four parameters, while the pdf and chf in parameterization both has discontinuity at . As it is seen, the parameterizations (2) and (1) differs only for the location parameter. The location parameters in and parameterizations, respectively shown by and , are related as
(3) 
Clearly, when both and parameterizations are equal.
Hereafter, we write and to denote the class of stable distributions in and
parameterizations, respectively. For the class of normal distributions with mean
and variance
, i.e., , the pdf at point is shown by . The generic symbolaccounts for the family of exponential distributions with rate parameter
, and denotes a Weibull distribution with pdf ; for , shape parameter , and scale parameter . Also , , and are pdfs of distributions , , and , respectively.This paper is organized as follows. In what follow firstly the EM algorithm and its extensions are reviewd briefly, then some properties of stable distributions are given. In Section 2, the proposed EM algorithm is introduced. Model validation using simulations and real data analysis is carried out in Section 3. Some conclusions are made in Section 4.
1.1 EM algorithm and its extensions
The EM algorithm is the most popular approach for estimating the parameters of a statistical model when we encounter missing (or latent) observations, see [5]. Assume that
denotes the vector of complete data with pdf
in which consists of observed and missing values, respectively. Also, denotes the likelihood function of complete data in which is the parameter vector. The EM algorithm finds the parameter vector that maximizes the conditional expectation of the loglikelihood of complete data given the observed data and a current guess , of the parameter vector. Usually, the EM algorithm works as follows.
Estep: given and , it computes .

Mstep: it finds such that is maximized.
Notice that refers to the logarithm of . Both steps of the EM algorithm are repeated until convergence occurs, see [13]. The convergence of the EM algorithm is guaranteed by [8]. In what follows we describe two extensions of the EM algorithm.
1.1.1 ECM Algorithm
When implementing the Mstep of the EM algorithm is mathematically impossible, an additional step is considered. Besides the E and Mstep, a sequence of the conditional maximization is carried out. The new step is known as the CMstep and the algorithm is known as the ECM algorithm. In the ECM algorithm, the CMstep finds the maximum of through maximizing the constrained marginal loglikelihood function, see [9] and [14].
1.1.2 Stochastic EM Algorithm
For a complete data of size , assuming we are currently at th iteration, each stochastic EM (SEM) algorithm amounts to a fourstep sequence given by the following.

Given and , a sequence of latent (or missing) variables, i.e., is simulated from posterior pdf ; for .

The simulated latent realizations are replaced into the loglikelihood function of complete data.

The EM algorithm is applied to the set of complete data, in which ; for , is the vector of observed and latent realizations.

The vector of parameters are updated as and then it is used to simulate from posterior pdf ; for .
Under some mild regularity conditions, by repeating above fourstep process for a sufficiently large number of cycles, the distribution of constitutes a Markov chain that converges to a stationary distribution, see [1], [3], and [6]. Contrary to the EM algorithm, for the SEM algorithm convergence occurs in distribution and in practice the number of cycles is determined through a graphical display. Suppose to be the length of burnin period and is a sufficiently large number, the SEM algorithm estimates as:
1.2 Some properties of stable distribution
Estimating the parameters of a stable distribution through the EM algorithm needs a hierarchy or stochastic representation. Here we give three useful results which play a large role to follow the next chapter.
Proposition 1.1
Suppose , , and . We have
(4) 
where denotes the equality in distribution, , , , and . All random variables , , and are mutually independent.
Proposition 1.2
Let be independent of . Then,
where , , and .
Proposition 1.3
Let be independent of . Then,
where and .
2 Proposed EM algorithm
Assume that constitute a sequence of identically and independent realizations of stable distribution in parameterization. The vector of complete data associated with (4) is shown by in which , are vectors of realizations of latent variables and , respectively. It turns out that representation (4) admits the following hierarchy.
(5) 
where , , and are defined after Proposition 1.1. Using hierarchy (2), the loglikelihood function of complete data is
where C is a constant independent of the parameters vector . The conditional expectation of , i.e., is
To complete the Estep, we need to compute the quantities ; for . Now, it is straightforward to show that
(6) 
where , , and . Details for evaluating in (2) is described in Appendix 1. Assuming we are currently at th iteration, all steps of the proposed EM algorithm including the E, M, CMstep, and maximizing the profile loglikelihood function are given by the following.

Estep: Given current guess of , i.e., , the quantity ; for is computed.

Mstep: Given , the parameter vector is updated as by maximizing with respect to . In the Mstep, the location parameter is updated as
and the updated scale parameter , is the root of equation in which ,
and
The updated skewness parameter is obtained as where
(7) 
CMstep: The tail thickness parameter is updated in the CMstep by maximizing the marginal loglikelihood function with respect to as
For this, at th iteration of the EM algorithm, we apply the SEM algorithm by three following steps to obtain .

Considering as the latent variable, we simulate from posterior distribution given and ; for , using the method described in Appendix 3.

Substitute in righthand side of (9) and maximize it with respect to to obtain . Obtaining , we go back to the step one and repeat the CMstep for cycles. This yields a sequence of updated tail thickness parameter as: . Now, the tail thickness parameter is updated as
where in the length of burnin period of the SEM and is updated tail thickness parameter at th cycle of the CMstep while we are at th iteration of the EM algorithm. Once we obtain , we go back and perform the EM algorithm from Estep for a sufficiently large number of iterations, say . After a burnin period of length , the EM algorithm converges to the true distribution.
Updating the skewness parameter by maximizing in (2) yields result that goes to zero. So, we update using optimization tools such as developed in package by maximizing profile loglikelihood function. For this, after obtaining EMbased estimations of , , and , namely, , , and , we maximize with respect to to obtain . Fortunately, this approach leads to satisfactory results.
3 Model validation using simulated and real data
Here, firstly, we perform a simulation study to compare the performance of the proposed EM algorithm with the ML approach for estimating the parameters of the class . To do this, data are generated by the method of simulating stable random variable (see [4]) and then the maximum likelihood (ML) estimations are evaluated using software, see [17]. Secondly, we apply the proposed EM and ML approaches to the five sets of real data.
3.1 Simulation study
We apply the proposed EM and ML approaches to the 200 sets of samples of size 300. For the sake of simplicity, we restrict ourselves to the case of . In each iteration, settings for the skewness and tail thickness parameters are: and . The results of simulations are shown in Figure 1 and Figure 2 for and , respectively. During simulations, we set , , , and . This means that the proposed EM algorithm is run for 140 iterations of which the first 100 iterations are removed as burnin period. Also, in each iteration, the CMstep is repeated for 40 times of which the average of the last 20 runs is considered as the updated tail thickness parameter.
Form Figure 1 and Figure 2, in the sense of square root of the meansquared error (RMSE) criterion, the following observations can be made.

The EMbased estimator of the scale parameter outperforms the corresponding MLbased estimator when (scale parameter is small).

The EMbased estimator of the tail thickness parameter, skewness, and scale parameters outperform the corresponding MLbased estimators when (scale parameter is large).
It should be noted that the proposed EM algorithm shows better performance than the sample quantile (SQ, see
[12]) and empirical characteristic function (CF, see [7]) approaches, and so were eliminated by competitions. Also, as it is known, the EM algorithm cannot outperform the ML method. Sometimes, as noted above, the EM algorithm shows better performance than the ML method. This is because, computes the evaluated ML estimators not the exact ML ones.3.2 Model validation via real data
The stable distribution is the most common used candidate for modelling the prices of speculative assets such as stock returns, see [2], [10], [19], and [23]. Here, we consider three sets of real data for illustrating an application of the stable distribution. The data are daily price returns of the major European stock indices including Switzerland SMI, France CAC, and UK FTSE. Following common practice for daily closing prices, we consider the transformed prices for business days as ; for . We then fitted the distribution to the transformed data by the four methods including ML, EM, SQ, and CF. We obtained three sets of data from package developed for (R Core Team [21]) environment which include observations. The results after applying above four approaches are given in Table 1. It should be noted that, for implementing the EM algorithm, the started , , , and are well away from the estimated values.
Estimated parameter  

Data set  Approach  Loglikelihood  KS  
SMI  EM  1.76460  0.15189  0.00541  0.00106  6168.845  0.03335 
ML  1.74707  0.19603  0.00543  0.00126  6168.528  0.02531  
SQ  1.60081  0.06607  0.00512  0.00096  6161.445  0.03350  
CF  1.81778  0.23881  0.00549  0.00128  6167.198  0.02456  
CAC  EM  1.84712  0.04423  0.00707  0.00054  5780.248  0.03018 
ML  1.86714  0.08863  0.00713  0.00062  5780.415  0.03256  
SQ  1.76230  0.09998  0.00685  0.00009  5773.361  0.04129  
CF  1.90333  0.08305  0.00712  0.00050  5779.068  0.03151  
FTSE  EM  1.86871  0.03927  0.00514  0.00045  6396.488  0.02097 
ML  1.86640  0.07575  0.00510  0.00049  6396.572  0.02237  
SQ  1.76710  0.05947  0.00498  0.00003  6389.608  0.03927  
CF  1.90178  0.08753  0.00512  0.00049  6395.707  0.02231 
It is clear from Table 1 that the proposed EM algorithm outperforms the M, SQ, and CF approaches with respect to either the loglikelihood value or the KS statistic. The time series plots of iterations and fitted pdf to the histogram for returns of CAC shares are depicted in Figures 3. We emphasize that both of loglikelihood and KS statistics reporets in Table 1 are evaluated using . Also the fitted PDF to the histograms in Figure 3 are drawn using when the PDF parameters are estimated through the EM algorithm.
4 Conclusion
We derive an identity for stable random variable that is scalelocation normal mixture representation. Based on this representation, we propose some EM algorithm to estimate the parameters of stable distribution. The proposed EM algorithm works very good for admissible ranges of the parameters, i.e., , , , and
. The steps of the proposed EM algorithm are: expectation, maximization, conditionally maximization, and profile loglikelihood maximization. Simulation studies reveal that the EMbased estimators of the tail thickness, skewness, and scale parameters outperform the corresponding maximum likelihood (ML) estimators when scale parameter is large. Also, the EMbased estimator of the scale parameter outperforms the corresponding ML estimator when scale parameter is small. This is not surprising since
computes the evaluated ML estimators not the exact ML estimators. The sample quantile and empirical characteristic function approaches were eliminated by competitions since the EM approach outperforms them. The simulations reveal that proposed EM algorithm is robust with respect to the initial values. Three sets of real data are used to demonstrate that the proposed EM estimators are close to that of the ML in the sense of goodnessoffit measures including the loglikelihood and KolmogorovSmirnov statistics. A great advantage of the proposed EM algorithm over the ML method is that the proposed EM algorithm can be applied to estimate the parameters of mixture of stable distributions. Further, since representation (4) can be adopted for the multivariate case, the proposed EM can be applied for the multivariate stable distribution and hence is an appropriate approach for modelling the returns of the dependent assets which have multivariate stable distribution. Two above privileges of the proposed EM algorithm can be considered as a possible future work.Appendix 1: Proof of Proposition 1.1
Let and denote two independent stable random variables. Define . We have:
where in above, to arrive at the expression after third equality, we use this fact that if , then . The expression after the last equality is the chf of . Now, with taking account into the fact that if , then can be represented as a Gaussian scale mixture model, i.e., where , see [25, p. 20]. Finally, set
, , and to see that . The proof is complete.
Appendix 2: Proof of Proposition 1.2
Suppose and . Define , to see that
where the last integral in above, i.e., the Laplace transform of random variable is ; for , see [25, pp. 15]. This means that and consequently
(10) 
where . On the other hand, as noted in Appendix 1, we can write where and are independent. It follows from (10) that
The result follows.
Appendix 3: Proof of Proposition 1.3
As the main part of the CMstep, implementing the SEM algorithm requires to simulate from posterior distribution of given and . For this purpose, we use the MetropolisHasting algorithm. As the candidate, we use the Weibull distribution with the shape parameter . Hence, the acceptance rate , becomes
Employing the MetropolisHasting algorithm, has a little chance for acceptance in each iteration when gets large. Therefore, we use a rejectionacceptance sampling scheme in th cycle of the SEM algorithm at the th iteration of the proposed EM algorithm to generate from the posterior distribution of given and ; for . For this, we notice that and the pdf is bounded by some value independent of , i.e.,
So, the rejectionacceptance sampling scheme to generate from is given by the following.

Generate a sample from , say .

If , then accept as a realization of ; otherwise, start from step 1.
Appendix 3: Evaluating
At th iteration of the proposed EM algorithm, for th observed value , define three matrices such as A, B, and C. Elements of matrix A are independent realizations from and elements of matrix B are coming from .
Assume that and , respectively, are the th element of the th row of matrices A and B, then the th element of the th row of matrix C, i.e., is . Now,
Also, is approximated as
It should be noted that the constant K must be large enough. Using , the quantities ; for and , are approximated very accurately.
References
 [1] Broniatowski, M., Celeux, G., and Diebolt, J. (1983). Reconnaissance de Mélanges de densités par un algorithme dápprentissage probabiliste, Data analysis and informatics, 3, 359374.

[2]
Buckle, D. J. (1995). Bayesian inference for stable distributions,
Journal of the American Statistical Association, 90(430), 605613.  [3] Celeux, G. and Diebolt, J. (1985). The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for mixture problem, Computational statistics quarterly, 2 (1), 7382.
 [4] Chambers, J., Mallows, C., and Stuck, B. (1976). A method for simulating stable random variables, Journal of the American Statistical Association, 71, 340–344.
 [5] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, 39, 138.
 [6] Ip, E. H. S. (1994). A stochastic EM estimator for handling missing data, Unpublished Doctoral dissertation, Department of Statistics, Stanford University.
 [7] Kogon, S. M. and Williams, D. B. (1998). Characteristic function based estimation of stable parameters, In: Adler R, Feldman R, Taqqu M, Eds. A Practical Guide to Heavy Tailed Data. Boston: Birkhäuser, 311338.
 [8] Little, R. J. A., and Rubin, D.B. (1983). Incomplete data, in Encyclopedia of Statistical Sciences, S. Kotz and N.L. Johnson, eds., Vol. 4, John Wiley, New York, pp. 4653.
 [9] Liu, C. and Rubin, D. B. (1994). The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence, Biometrika, 81, 633648.
 [10] Mandelbrot, B. B. (1963). The Variation of Certain Speculative Prices. Journal of Business, 36, 394419.
 [11] Mandelbrot, B. B. and Hudson, R. L. (2007). The Misbehavior of Markets: A Fractal View of Financial Turbulence, Basic Books, Annotated Edition.
 [12] McCulloch, J. H. (1986). Simple consistent estimators of stable distribution parameters. Communications in StatisticsSimulation and Computation, 5, 11091136.
 [13] McLachlan, G. J. and Krishnan, T. (2008). The EM Algorithm and Extensions, second edition, John Wiley.
 [14] Meng, X. L. and Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework, Biometrika, 80, 267278.
 [15] Mittnik, S. and Paolella, M. S. (2003). Prediction of financial downside risk with heavy tailed conditional distributions, In: Rachev, S.T. (Ed.), Handbook of Heavy Tailed Distributions in Finance, Elsevier Science, Amsterdam.
 [16] Nolan, J. P. (1998). Parameterizations and modes of stable distributions, Statistics and Probability Letters, 38, 187195.
 [17] Nolan, J. P. (2001). Maximum likelihood estimation of stable parameters, In BarndorffNielsen, O. E., Mikosch, T., and Resnick, I. (Eds.), Lévy Processes: Theory and Applications, 379400, Birkhöuser, Boston.
 [18] Nolan, J. P. (2003). Modeling financial distributions with stable distributions, Volume 1 of Handbooks in Finance, Chapter 3, pp. 105–130. Amsterdam: Elsevier.
 [19] Nolan, J. P. (2013). Multivariate elliptically contoured stable distributions: theory and estimation, Computational Statistics, 28(5), 20672089.
 [20] Ortobelli, S., Rachev, S. T., and Fabozzi, F. J. (2010). Risk management and dynamic portfolio selection with stable Paretian distributions. Journal of Empirical Finance, 17(2), 195211.
 [21] R Core Team (2016). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.Rproject.org/.
 [22] Rachev, S. T. (2003). Handbook of Heavy Tailed Distributions in Finance. Amsterdam: Elsevier.
 [23] Rachev, S. T., Menn, C., and Fabrizzi, F. J. (2005). Fattailed and Skewed Asset Return Distributions: Implications for Risk Management, Portfolio Selection, and Option Pricing, John Wiley, Hoboken, New Jersey.
 [24] Rachev, S. T. and Mittnik, S. (2000). Stable Paretian Models in Finance, John Wiley, Chichester.
 [25] Samorodnitsky, G. and Taqqu, M. S. (1994). Stable NonGaussian Random Processes: Stochastic Models and Infinite Variance, Chapman and Hall, London.
 [26] Zolotarev, V. M. (1986). OneDimensional Stable Distributions, American Mathematical Society, Providence, R. I.
Comments
There are no comments yet.