The estimation of income distributions has been widely recognized for playing an important role in the measurement of inequality, poverty and welfare over time and space. The comprehensive contents regarding the different parametric models for income distributions and and various estimation methods are available inKleiber and Kotz (2003) and Chotikapanich (2008). Information on income are generally obtained by household expenditure and income surveys in many countries. However, the availability of raw data is usually severely limited because of several problems such as the difficulty in data management and confidentiality of individual income data. Instead, the income data is typically provided as summary statistics (e.g. mean income and Gini index) and grouped data in the form of frequency distributions over some predefined income classes. Based on such limited data, many statistical methods to estimate income distributions have been proposed so far (e.g. Chotikapanich et al., 2007, 2012; Hajargasht et al., 2012; Griffiths and Hajargasht, 2015; Kleiber and Kotz, 2003; Wu and Perloff, 2007). A standard approach for estimating income distributions is to assume a parametric family of income distributions to approximate the true income distribution and estimate its unknown parameters based on the (limited) data. There exists a wide variety of families of distributions available (e.g. Hajargasht and Griffiths, 2013; Kleiber and Kotz, 2003; McDonald, 1984; McDonald and Xu, 1995; Singh and Maddala, 1979). The existing approaches predominantly focus on the income distribution using the data only from a single area, e.g. country or state, in a single period. Although there is some development in the direction of estimating a series of income distributions using time series data (Nishino et al., 2012; Nishino and Kakamu, 2015), there exists no work which estimates income distributions of multiple areas simultaneously taking spatial dependencies into account.
Estimating income distributions and associated inequality measures over multiple local areas such as municipalities and comparing them is particularly crucial to capture local economic status for local policy making. Even in this multiple income distribution scenario, the conventional methods would still separately estimate the area-wise income distributions based on area-wise grouped data. The separate estimation would be reasonable if the study is on a national scale such as Chotikapanich et al. (2007). However, the conventional methods would encounter the following problems. One of the biggest problems is that the sample sizes can be very small in some local areas because of small populations and for such areas the estimates of the income distributions can be inaccurate and unstable. Furthermore, there are situations where a survey does not take sample from some of the areas. This is particularly the case for the grouped data obtained from Housing and Land Survey (HLS) of Japan in 2013 where 634 out of 1899 municipalities in Japan were not sampled. It is impossible to infer the income distributions for the non-sampled areas unless there is a statistical model that connects the sampled areas and non-sampled areas. Moreover, although the income distributions of the neighboring local areas would exhibit similar tendency, the conventional estimates would exhibit a large variation among the neighboring areas due to the aforementioned reason. It would be natural to take account of geographical information. As these issues cannot be addressed by simply using the separate estimation, a more sophisticated method is required.
In this paper, we propose a simultaneous estimation and inference method for area-wise spatial income distributions taking account of geographical relationship from grouped data. Specifically, we assume parametric families of income distributions with different parameter values over different areas and introduce a state-space model with a spatial stochastic structure for the spatially varying parameters. Since the propose method takes account of geographical information through a latent stochastic structure, the estimation accuracy in areas with small sample sizes can be improved by borrowing strength from adjacent areas. Furthermore, income distributions in non-sampled areas can be estimated using the information from adjacent areas, which would be a significant advantage over the conventional separate estimation method. Also, the latent structure tends to produce similar parameter values for adjacent areas, so that the resulting estimated income distributions exhibit the same tendency with the improved interpretability. We develop a Bayesian estimation approach to the proposed model and provide the posterior computation algorithm based on Markov Chain Monte Carlo (MCMC), which facilitates the estimation as well as inference not only for area-wise parameters but also important summary measures such as mean incomes and Gini indices.
This paper is organized as follows. Section 2, introduces the proposed method and provide Bayesian estimating procedures. Section 4, the proposed method is demonstrated to HLS data to estimate the income distributions across Japan and confirms that it produces the spatially smoothed estimates of the area-wise means and Gini indices. In Section 3, we conduct simulation studies to compare the performance of the proposed and the separate estimation method. Finally, some conclusion and discussion are given in Section 5.
2 Estimation and inference for area-wise income distributions
2.1 Grouped income data and multinomial likelihood
Suppose there are areas and we are interested in the income distributions . Instead of individual incomes, we can only observe the grouped data which comprise the number of individuals in the th income group for where is the number of income groups. The income groups are defined as , with and . In this paper, we assume that all the income distributions have the same parametric form but they could have different parameter values. Let
be a vector of parameters that characterizes, that is,
. There exist several parametric distribution families for income models, e.g. the generalized beta distribution and its special and limiting cases the beta-2, Singh-Maddala, Dagum, generalized gamma distribution and lognormal distribution. Once the parametric distribution is specified, the summary measures such as the mean and variance as well as the inequality measures such as Lorenz curve and Gini index can be immediately obtained. In Table 1 inHajargasht et al. (2012), the several parametric distributions and forms of the important measures are summarized. The dimension of the area-wise parameter depends on the choice of the parametric distribution for and typically ranges from 2 to 4. The standard approach to estimating from the grouped data in the th area is based on the multinomial likelihood function which is proportional to
By maximizing the above function with respect to , the maximum likelihood estimator of is obtained.
In order to carry out the simultaneous estimation and inference for the multiple ’s, we treat them as latent variables and introduce a stochastic latent structure for them. It should be noted that the parameter spaces of each element of can vary depending on the choice of . Hence, we first assume that there exists a function such that can be expressed as for some . This assumption is quite reasonable since it is satisfied by all the distributions given in Table 1 of Hajargasht et al. (2012)
. For instance, when we use the log-normal distribution with the mean parameterand variance parameter , a reasonable choice of and would be and . Since ’s are on the whole real line, we can readily introduce spatial correlation among .
2.2 Latent models for area-wise parameters
In order to express the area-wise heterogeneity and spatial similarity (parameter values would be similar for adjacent areas), we introduce the latent model that defines a structure among area-wise parameters. To this end, the stochastic structures for both the area-wise parameters themselves and differences of the parameters are introduced. Specifically, we assume that the following prior for :
independently for where is equal to one if the area is adjacent the area () and zero otherwise (). The first part has a role to shrink towards the grand mean and the second part shrinks the difference between and of two adjacent areas toward . The specification of the second part is included in the class of pair-wise difference priors (Besag et al., 1995) whose form is the same as the well-known conditional autoregressive prior (Besag, 1974; Besag and Kooperberg, 1995). In (2), and are precision parameters in the first and second parts, and is the normalizing constant relevant to the precision parameters, that is, with , where , is the number of adjacent areas and is an matrix with the th entry equal to .
The specification of the latent stochastic structure of the area-wise parameter () given in (2) combined with the multinomial likelihood functions (1) gives the state-space model for the grouped data. The unknown structural parameters in the proposed model are the grand mean and precision parameters and for . We develop an efficient Bayesian approach to estimate the area-wise parameter by assigning prior distributions for the unknown parameters.
2.3 Posterior computation
We consider the conditionally conjugate priors for the unknown parameters:, and with and specified by the analyst. In this paper, we set and
as default choice. The likelihood associated with the joint distribution ofand is given by
where with defined in (1). We now outline the posterior sampling algorithm for the proposed model. The details are presented later, but in brief, the algorithm proceeds by repeating the following four steps:
For , update the -dimensional vector of the latent area-wise parameters using the independent Metropolis-Hastings (MH) algorithm. The detail is provided below.
For , update the overall mean from the full conditional posterior where , .
For , update the precision parameters using the Metropolis adjusted Langevin algorithm whose details are provided below.
In Step 1, we need to sample from the full conditional distribution of which is proportional to
This form is not in a standard form due to the complicated multinomial likelihood of the form (1). In order to carry out independent MH algorithm efficiently, we construct a proposal distribution by approximating the full conditional distribution by a normal distribution. To this end, is approximated by the -dimensional normal density function with the mean vector and precision matrix set to the mode and negative Hessian matrix at the mode of denoted by and , respectively. Then the full conditional distribution proportional to (3) can be approximated by , where and with , and . A proposal is generated from this approximated full conditional distribution independent to the current value
and is accepted with the probability.
In Step-3, with the current values and , the candidates of and are generated from the following Langevin diffusion:
where , and is the negative log-posterior distribution of . The partial derivatives of are obtained as
where and are prior distribution for and , respectively. The proposed values are accepted with the probability with
where is a multivariate normal density with the mean vector and variance-covariance matrix , and includes the relevant terms of the full conditional distribution given by
Based on the posterior samples of
, the estimates of the area-wise income distributions and important summary measures such as area-wise means and Gini indices can be computed. The uncertainty quantification (e.g. the posterior standard error and credible interval) of the estimates can be easily carried out based on the posterior samples. Moreover, we can easily incorporate the sampling steps of the areal parameters for non-sampled areas. Letdenotes the index of the non-sampled area, so that represents the parameters of the income distribution in the area . Since there is no data in the area , in (3) does not exist and the full conditional distribution of is simply with , , if and 0 otherwise, and , the number of areas adjacent to the area .
2.4 Laplace-type distributions for pair-wise difference
While the pair-wise difference prior in the form given in (2) is widely used and can produce spatially smoothed estimates of parameters, we may use other forms of pair-wise difference to achieve the following. When the income distributions of the two adjacent areas are quite similar with the similar values of and with , setting would be more interpretable. Moreover, when the difference between and for is sufficiently large, the pair-wise difference prior (2) based on the normal distribution might produce over-smoothed estimates of these parameters. In order to address these issues, we introduce the following latent structure:
where is the normalizing constant. Note that the second part will be included in the general pair-wise difference prior given in Besag et al. (1995) when . For general , some similar forms have appeared in Bayesian group Lasso (Xu and Ghosh, 2015) and Bayesian elastic net (Li and Lin, 2010). We use the same gamma prior for : . Using the identity,
where is the set of ’s and the distribution of is set such as the joint distribution of and is the following form:
for . Since the joint distribution of is the multivariate normal given ’s, which makes the posterior computation much easier, that is, sampling from the full conditional posterior of can be carried out in almost the same way as used in Section 2.3. Based on the above hierarchical expression, can be expressed as
where is the density function of , that is, , , is an matrix whose -element is for
and , row sum of . The integral can be evaluated via simple Monte Carlo method generating a number of random samples of from .
The full conditional distribution of using the hierarchy is given by
so that we can use the similar sampling strategy used in the previous section. The sampling algorithm for the Laplace-type prior is given as follows.
Generate samples of via random-walk MH algorithm where a candidate value of is generated from with the current value and step size and is accepted with the probability where and is the full conditional distribution given by
and is the prior distribution of .
For , update the -dimensional vector of the latent area-wise parameters using the independent MH algorithm. The proposal is first generated from where and and is accepted with the probability for the current value .
Generate samples of in the same way as Section 2.3.
For the non-sample area , again, the posterior samples of parameters can be generated. For the current value , first generate for is generated in the same way as Step 2 and define as for and otherwise. Then is updated by sampling from where and .
3 Simulation studies
Here the performance of the proposed methods is investigated via the Monte Carlo simulations. In this study, we use the log-normal distribution as the underlying income distributions where the true parameters vary over areas. The number of areas is set to
. To generate a synthetic dataset, we first generated the latitude and longitude of the areas from the uniform distribution on. Based on the location , we constructed an adjacent matrix whose diagonal elements are and -element () is 1 if and 0 otherwise. The average number of adjacent areas is about , which is similar to that of the adjacent matrix of the Japanese municipalities used in Section 4. In the log-normal distribution, the mean and variance permeates are defined as and , respectively, where is the (transformed) area-wise parameter on . We consider a situation where the area-wise parameters relatively smoothly change over the locations as observed in the results in Section 4. Specifically, the following three scenarios for are used :
where and . In Scenario (A), both mean and variance parameters change smoothly over areas, whereas in Scenario (B) the mean parameter has two discontinuous regions in which the mean parameter can be smaller or larger by 0.3. In Scenario (C), on the other hand, the areas are divided into four groups and the areas in the same groups have the same parameter values. For each area , we generated samples from the log-normal distribution with the mean and variance where was randomly generated from the discrete uniform distribution on . Then the grouped data is constructed with the boundaries .
The proposed pair-wise difference (PWD) and pair-wise Laplace (PWL) priors are applied to the simulated dataset. The default values of the hyperparameters are used for both methods and we generated 2000 posterior samples of’s after discarding 500 samples as a burn-in period. The posterior means of ’s are computed as the point estimates of ’s and credible intervals of are constructed based on the posterior samples. For comparison, we also considered the area-wise maximum likelihood (AML) method which estimates the area-wise parameters by separately maximizing the multinomial likelihood as a crude alternative for estimating .
To evaluate the performance in the point estimation and interval estimation, we calculated the mean squared errors , coverage probability , and average length of the credible intervals with and where is a point estimate and is a credible interval in the th iteration. Figures 1 and 2 report the boxplots of , and for , respectively. From the reported MSE, the proposed PWD and PWL clearly outperform the crude AML in terms of point estimation. Moreover, both PWD and PWL produced some reasonable credible intervals whose coverage probabilities are approximately or larger than the nominal level of
whereas AML produced inefficient confidence intervals that tend to be wider with short coverage. Comparing PWD and PWL in terms of MSE, PWD performs better than PWL in both Scenario (A) while the opposite result was obtained in Scenario (C). Note that the whole regions are divided into the finite number of groups in Scenario (C). In such a case, PWL method seems to work better than PWD, which is consistent to the motivation of using PWL. In terms of interval estimation, the coverage probabilities of PWL are generally close to the nominal level and the average lengths of PWL are smaller than PWD in all cases, which may reflect the efficiency of the Laplace formulation in PWL. The result would imply that PWL is more efficient than PWD in terms of interval estimation.
4 Application to Japanese municipality-wise grouped income data
The proposed method is demonstrated using the Japanese municipality-wise grouped data where we are interested in the spatial distribution of the local economic status of 1897 municipalities in Japan. The grouped income data is obtained from Housing and Land Survey (HLS) of Japan in 2013. Our dataset contains information on the number of working households that fall in the income class in 1263 municipalities out of 1897 municipalities. Note that the remaining 634 municipalities are not sampled in this data. The group boundaries are given by (million yen). Since the exact sample size in each area is not known, the present study assumes that of populations are sampled, which could be a conservative value since most survey have larger sample sizes. See Kawakubo and Kobayashi (2019) for more details of the HLS data.
First the AML method is applied using the log-normal distributions, LN. Then the average income, , and Gini index, in each area are computed based on the maximum likelihood estimates of and . The results are shown in Figure 6 where the estimated values under AML are highly variable over the areas. Since some areas with small populations have very small sample sizes, the maximum likelihood estimates based only on the area-specific grouped data are unstable and inaccurate. Moreover, there would be no reasonable way for the AML method to estimate the parameters in the non-sampled areas, hence the practicality of the AML method is clearly limited.
In order to improve the estimation accuracy as well as to estimate the means and Gini indices in both sampled and non-sampled municipalities, we apply the six proposed methods based on the combinations of the popular three parametric income distributions, log-normal (LN), Singh-Madala (SM) and Dagum (DG) distributions, and the two different latent distributions, PWD and PWL priors. Note that the density function of the SM and DG distributions are given by
where and are shape parameters and is a scale parameter. We used for the parameter transformation in the proposed model. With the default choices of the hyperparameters, we generated 5000 posterior samples of the area-wise parameters and structural parameters after discarding the first 500 samples.
The models are compared based on the following quantities mimicking posterior predictive loss(Gelfand and Ghosh, 1998):
where is the number of households belong to the th group in the th area, and respectively are the mean and variance of the posterior predictive distribution for under model . Table 1 presents the values of PPL for the six models. It is observed that the SM distribution would be preferable choice among the three candidate distributions. The SM distribution is known to provide good fit to income data in many countries especially where the income distributions are more characterized by the upper tails than the lower tails (Kleiber and Kotz, 2003; Kakamu, 2019). Our result in which the SM distributions is supported by the data agrees with the past findings (McDonald, 2008; Atoda et.al., 1988). On the other hand, the DG distribution which has more rich characteristics in the lower tail may not be suitable for the Japanese income data. Comparing the two latent structures, the values of PWD and PWL are relatively comparable. The lognormal distribution is a simple income model, but it is known to provide poor fit in many cases due to its limitation in flexibility. In what follows, we will focus on the results of the proposed method based on the SM distribution in what follows.
We reported the posterior mean and credible intervals of the structural parameters in Table 2, which shows that the results for the grand means of area-wise parameters are almost the same between PWD and PWL. Comparing the values of ’s and ’s, there would be some spatial correlations among municipalities as we expected.
Based on the posterior samples of area-wise parameters, the posterior means of the area-wise means and Gini indices in the sampled areas are computed. Figure 3 shows the differences in the estimates of the average incomes and Gini indices between the proposed PWD and AML methods against the sample sizes in the sampled areas. It is observed that the proposed method produces almost the same estimates as AML in areas with large samples while a difference of estimates gets larger according to the sample size. The figure also indicates that in the areas with the small and moderate sample sizes, AML tends to produce the estimates slightly smaller than those under the proposed SM-PWD. In order to see the uncertainty of the estimates based on the proposed method, we calculated the 95% credible intervals of the average incomes and Gini indices and the lengths of the intervals. Figure 4 confirms that the proposed method produces the credible intervals with the reasonable lengths in all sampled areas. The figure also shows that the intervals becomes wider (uncertainty gets larger) as the sample sizes decrease. We next computed the area-specific density functions under SM-PWD using the posterior mean of the area-specific parameters of the SM distribution. We selected 5 areas whose average incomes are ranked at the first, 100th, 200th, 500th and 1000th in descending order. The density of the national income distribution is also computed using the global parameters. The six estimated densities are presented in Figure 5. It is confirmed that the estimated area-wise income distributions vary considerably over the areas and are significantly different from the national income distribution in some areas. Especially, the estimated income distribution in Minato area, which is located in central Tokyo and has the largest estimated average income, exhibits much heavier tail than those of the other areas such as Mizumaki area in northern Kyushu whose estimated average income is the lowest among the five areas.
We now focus on the average incomes and Gini indices in all areas (both sampled and non-samples areas). We additionally generated 5000 samples of the area-wise parameters in the non-sampled areas and computed the posterior means. The spatial distribution of the estimates based on the proposed methods are presented in Figure 6. The figure shows that the proposed methods can produce spatially smoothed estimates of the average incomes and Gini indices not only in sampled areas but also non-sampled areas. This is the greatest advantage of the proposed method over the crude AML method. Comparing the results under the two different pair-wise priors, it is observed that PWL tends to produce spatially more smoothed estimates than PWD, and PWL detects some local hotspots. These phenomena may come from the properties of PWL; it forces income distributions in adjacent areas to be the same when a difference between the parameter values is small while it does not shrink the difference so much when the difference is significantly large.
|# area-wise parameters||2||3||3|
|State space distribution||PWD||PWL||PWD||PWL||PWD||PWL|
|Posterior predictive loss||1479||1479||679||676||705||2473|
5 Conclusion and Discussion
We have developed an effective method for estimating the area-wise spatial income distributions taking account of the geographical relationship. Through the demonstration using the real and simulated datasets, the proposed method can produce spatially smoothed estimates of the income distributions and outperforms the crude area-wise separate method.
While we are concerned only with the spatially varying income distributions in the present study, the grouped income data are available over different time periods, in which case, it would be desirable to estimate patio-temporal income distributions at the same time. Our method can be generalized to a spatial-temporal method by introducing serial correlation of area-wise parameters, which would be an interesting future work. Moreover, if we have additional information (e.g. sample mean) other than the grouped data, using the information would improve the estimation accuracy of parameters. We may use the generalized method of moments (GMM) as considered inHajargasht et al. (2012), especially the Bayesian GMM method (Yin, 2006) to use in the framework of the proposed method. The detailed investigation is left to a future study.
This work was supported by JSPS KAKENHI Grant Numbers 18K12754, 18K12757, 19K13667 and Japan Center for Economic Research.
- Atoda et.al. (1988) Atoda, N., Suruga, T. and Tachibanaki, T. (1988). Statistical inference of functional forms for income distribution. The Economic Studies Quarterly, 39, 14-40.
- Besag (1974) Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B, 36, 192-236.
- Besag and Kooperberg (1995) Besag, J. and Kooperberg, C. (1995). On conditional and intrinsic autoregression. Biometrika, 82, 733-746.
- Besag et al. (1995) Besag, J., Green, P., Higdon, D. and Mengersen, K. (1995). Bayesian computation and stochastic systems. Statistical Science, 10, 3-41.
- Chhikara and Folks (1989) Chhikara, R. S. and Folks, L. (1989). The Inverse Gaussian Distribution: Theory, Methodology, and Applications, New York, Marcel Dekker.
- Chotikapanich (2008) Chotikapanich, D. (ed.) (2008). Modeling Income Distributions and Lorenz Curves, New York, Springer.
- Chotikapanich and Griffiths (2002) Chotikapanich, D. and Griffiths, W.E. (2002). Estimating Lorenz curves using a Dirichlet distribution. Journal of Business & Economic Statistics, 20, 290-295.
- Chotikapanich et al. (2007) Chotikapanich, D., Griffiths, W. E., and Rao, D. S. P. (2007). Estimating and combining national income distributions using limited data. Journal of Business and Economic Statistics, 25, 97-109.
- Chotikapanich et al. (2012) Chotikapanich, D., Griffiths, W. E., Rao, D. S. P., and Valencia, V. (2012). Global income distributions and inequality, 1993 and 2000: incorporating country-level inequality modelled with beta distributions. The Review of Economics and Statistics, 94, 52-73.
- Chotikapanich and Griffiths (2000) Chotikapanich, D. and Griffiths, W.E. (2000). Posterior distributions for the Gini coefficient using grouped data. Australian and New Zealand Journal of Statistics, 42, 383-392.
- Gelfand and Ghosh (1998) Gelfand, A. E. and Ghosh, S. K. (1998). Model choice: a minimum posterior predictive loss approach. Biometrika, 85, 1-11.
- Griffiths and Hajargasht (2015) Griffiths, W. and Hajargasht, G. (2015). On GMM estimation of distributions from grouped data. Economics Letters, 126, 122-126.
- Hajargasht and Griffiths (2013) Hajargasht, G. and Griffiths, W. (2013). Pareto-lognormal distributions: Inequality, poverty, and estimation from grouped income data. Economic Modeling, 33, 593-604.
- Hajargasht et al. (2012) Hajargasht, G., Griffiths, W.E., Brice, J., Rao, D.S.P. and Chotikapanich, D. (2012). Inference for income distributions using grouped data. Journal of Business & Economic Statistics, 30, 563-575.
- Kakamu (2019) Kakamu, K. (2019). Simulation Studies Comparing Dagumand Singh–Maddala Income Distribution. Computational Economics, 48: 593-605.
- Kawakubo and Kobayashi (2019) Kawakubo, Y. and Kobayashi, G. (2019). Small area estimation for grouped data. arXiv:1903.07239.
- Kleiber and Kotz (2003) Kleiber, C., and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences, New York, Wiley.
- Li and Lin (2010) Li, Q. and Lin, N. (2010). The Bayesian elastic net. Bayesian Analysis, 5, 151-170.
- McDonald (1984) McDonald, J.B. (1984). Some generalized functions for the size distribution of income. Econometrica, 52, 647-663.
- McDonald (2008) McDonald, J.B. (2008). Some Generalized Functions for the Size Distribution of Income. In Modeling Income Distributions and Lorenz Curves (ed. Chotikapanich, D.), New York, Springer.
- McDonald and Xu (1995) McDonald, J.B. and Xu, Y. J. (1995). A generalization of the beta distribution with applications. Journal of Econometrics, 66, 133-152.
- Nishino et al. (2012) Nishino, H., Kakamu, K., Oga, T., (2012). Bayesian estimation of persistent income inequality by lognormal stochastic volatility model. Journal of Income Distribution, 21, 88-101.
- Nishino and Kakamu (2015) Nishino, H. and Kakamu, K. (2015). A random walk stochastic volatility model for income inequality. Japan and the World Economy, 36, 21-28.
- Park and Casella (2008) Park, T. and Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103, 681-686.
- Singh and Maddala (1979) Singh, S.K. and Maddala, G.S. (1976). A function for size distribution of income. Econometrica, 47, 1513-1525.
- Wu and Perloff (2007) Wu, X. and Perloff, J. (2007). GMM estimation of a maximum entropy distribution with interval data. Journal of Econometrics, 138, 532-546.
- Xu and Ghosh (2015) Xu, X. and Ghosh, M. (2015). Bayesian variable selection and estimation for group Lasso. Bayesian Analysis, 10, 909-936.
- Yin (2006) Yin, G. (2006). Bayesian generalized method of moments. (2006). Bayesian Analysis, 4, 191-208.