# On Measuring the Variability of Small Area Estimators in a Multivariate Fay-Herriot Model

This paper is concerned with small area estimation in the multivariate Fay-Herriot model where the covariance matrix of the random effects is fully unknown. The covariance matrix is estimated by a Prasad-Rao type consistent estimator, and the empirical best linear unbiased predictor (EBLUP) of a vector of small area characteristics is provided. When the uncertainty of the EBLUP is measured in terms of a mean squared error matrix (MSEM), a second-order approximation of the MSEM of the EBLUP and a second-order unbiased estimator of the MSEM are derived analytically in closed forms. The performance is investigated through numerical and empirical studies.


## 1 Introduction

Mixed effects models and their model-based estimators have been recognized as useful methods in statistical inference. In particular, small area estimation is an important application of mixed effects models. Although direct design-based estimates for small area means have large standard errors because of small sample sizes from small areas, the empirical best linear unbiased predictors (EBLUP) induced from mixed effects models provide reliable estimates by “borrowing strength” from neighboring areas and by using data of auxiliary variables. Such model-based methods for small area estimation have been studied extensively and actively from both theoretical and applied aspects, mostly for handling univariate survey data. For comprehensive reviews of small area estimation, see Ghosh and Rao (1994), Datta and Ghosh (2012), Pfeffermann (2013) and Rao and Molina (2015).

When multivariate data with correlations are observed from small areas for estimating multi-dimensional characteristics, like poverty and unemployment indicators, Fay (1987) suggested a multivariate extension of the univariate Fay-Herriot model, called a multivariate Fay-Herriot model, to produce reliable estimates of median incomes for four-, three- and five-person families. Fuller and Harter (1987) also considered a multivariate modeling for estimating a finite population mean vector. Datta, Day and Basawa (1999) provided unified theories in empirical linear unbiased prediction or empirical Bayes estimation in general multivariate mixed linear models. Datta, Day and Maiti (1998) suggested a hierarchical Bayesian approach to multivariate small area estimation. Datta, Day and Basawa (1999) showed the interesting result that the multivariate modeling produces more efficient predictors than the conventional univariate modeling. Porter, Wikle and Holan (2015) used the multivariate Fay-Herriot model for modeling spatial data. Ngaruye, von Rosen and Singull (2016) applied a multivariate mixed linear model to crop yield estimation in Rwanda.

Although Datta, Day and Basawa (1999) developed the general and unified theories concerning the empirical best linear unbiased predictors (EBLUP) and their uncertainty, it is definitely more helpful and useful to provide concrete forms with closed expressions for the EBLUP, the second-order approximation of the mean squared error matrix (MSEM) and the second-order unbiased estimator of the MSEM. Recently, Benavent and Morales (2016) treated the multivariate Fay-Herriot model with the covariance matrix of random effects depending on unknown parameters. As structures in the covariance matrix, they considered diagonal, AR(1) and related structures and employed the residual maximum likelihood (REML) method for estimating the unknown parameters embedded in the covariance matrix. A second-order approximation and estimation of the MSEM were also derived. In some applications, however, we cannot assume specific structures without prior knowledge or information on covariance matrices.

In this paper, we treat the multivariate Fay-Herriot model where the covariance matrix of the random effects is fully unknown. This situation has been studied by Fay (1987), Fuller and Harter (1987) and Datta, Day and Maiti (1998), and is useful when statisticians have little knowledge of the correlation structure. As specific estimators of the covariance matrix, we employ Prasad-Rao type estimators with closed forms and use modified versions which are restricted to the space of nonnegative definite matrices. The empirical best linear unbiased predictors are provided based on the Prasad-Rao type estimators, and second-order approximations of their mean squared error matrices and second-order unbiased estimators of the MSEM are derived with closed expressions. These are multivariate extensions of the results given by Prasad and Rao (1990) and Datta, Rao and Smith (2005) for the univariate case.

The paper is organized as follows: Section 2 gives the Prasad-Rao type estimators and their nonnegative-definite modifications for the covariance matrix of the random effects, and shows their consistency. In Sections 3 and 4, the second-order approximation of the MSEM of the EBLUP and the second-order unbiased estimator of the MSEM are derived in closed forms. The performances of the EBLUP and the MSEM estimator are investigated in Section 5. This numerical study illustrates that the proposals perform well in the low-dimensional case. However, a $k\times k$ covariance matrix has $k(k+1)/2$ unknown parameters, and we need more data so as to maintain the performances of the proposals for higher-dimensional cases.

Finally, it is noted that empirical best linear unbiased predictors for small area means are empirical Bayes estimators and related to the so-called James-Stein estimators. In this sense, the prediction in the multivariate Fay-Herriot model corresponds to the empirical Bayes estimation of a mean matrix of a multivariate normal distribution, which is related to the estimation of a precision matrix from a theoretical aspect as discussed in Efron and Morris (1976). In this framework, several types of estimators have been suggested for estimating the precision matrix, and it is an interesting question whether those estimators provide improvements in multivariate small area estimation.

## 2 Empirical Best Linear Unbiased Prediction

In this paper, we assume that the area-level data $(y_1, X_1), \ldots, (y_m, X_m)$ are observed, where $m$ is the number of small areas, $y_i$ is a $k$-variate vector of direct survey estimates and $X_i$ is a $k \times p$ matrix of covariates associated with $y_i$ for the $i$-th area. Then, the multivariate Fay-Herriot model is described as

$$y_i = X_i\beta + v_i + \varepsilon_i, \quad i=1,\ldots,m, \tag{1}$$

where $\beta$ is a $p$-variate vector of unknown regression coefficients, $v_i$ is a $k$-variate vector of random effects depending on the $i$-th area and $\varepsilon_i$ is a $k$-variate vector of sampling errors. It is assumed that $v_1,\ldots,v_m$ and $\varepsilon_1,\ldots,\varepsilon_m$ are mutually independently distributed as

$$v_i \sim N_k(0, \Psi) \quad\text{and}\quad \varepsilon_i \sim N_k(0, D_i),$$

where $\Psi$ is an unknown and nonsingular covariance matrix and $D_1, \ldots, D_m$ are known covariance matrices. This is a multivariate extension of the so-called Fay-Herriot model suggested by Fay and Herriot (1979).
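As a concrete illustration, data from model (1) under the normality assumptions above can be simulated as follows. This is a minimal sketch: the dimensions, the value of $\Psi$, the coefficients and the matrices $D_i$ are hypothetical choices made only for this example.

```python
import numpy as np

# Simulate the multivariate Fay-Herriot model (1):
#   y_i = X_i beta + v_i + eps_i,  v_i ~ N_k(0, Psi),  eps_i ~ N_k(0, D_i).
# All dimensions and parameter values below are illustrative assumptions.
rng = np.random.default_rng(0)
m, k, p = 50, 2, 6                               # areas, characteristics, coefficients

Psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])                     # unknown in practice
beta = np.arange(1.0, p + 1.0)
X = rng.normal(size=(m, k, p))                   # X_i: k x p covariate matrix
D = np.stack([np.diag(rng.uniform(0.5, 1.5, size=k)) for _ in range(m)])  # known D_i

v = rng.multivariate_normal(np.zeros(k), Psi, size=m)             # random effects v_i
eps = np.stack([rng.multivariate_normal(np.zeros(k), D[i]) for i in range(m)])
y = np.einsum('ikp,p->ik', X, beta) + v + eps    # direct survey estimates y_i
```

The array convention `X[i]` $= X_i$ of shape $(k, p)$ is used in the later sketches as well.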

For example, we consider the crop data of Battese, Harter and Fuller (1988), who analyze the data in the nested error regression model. For the $i$-th county, let $y_{i1}$ and $y_{i2}$ be survey data of average areas of corn and soybean, respectively. Also let $x_{i1}$ and $x_{i2}$ be satellite data of average areas of corn and soybean, respectively. In this case, $y_i$, $X_i$ and $\beta$ correspond to

$$y_i = (y_{i1}, y_{i2})^\top, \quad X_i = \begin{pmatrix} 1 & x_{i1} & x_{i2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & x_{i1} & x_{i2} \end{pmatrix}, \quad \beta = (\beta_1, \ldots, \beta_6)^\top$$

for $k=2$ and $p=6$. Battese, Harter and Fuller (1988) applied a univariate nested error regression model to each of $y_{i1}$ and $y_{i2}$, while we can use the multivariate model (1) for analyzing both data simultaneously.

We now express model (1) in a matrix form. Let $y = (y_1^\top, \ldots, y_m^\top)^\top$, $X = (X_1^\top, \ldots, X_m^\top)^\top$, $v = (v_1^\top, \ldots, v_m^\top)^\top$ and $\varepsilon = (\varepsilon_1^\top, \ldots, \varepsilon_m^\top)^\top$. Then, model (1) is expressed as

$$y = X\beta + v + \varepsilon, \tag{2}$$

where $v \sim N_{mk}(0, I_m \otimes \Psi)$ and $\varepsilon \sim N_{mk}(0, D)$ for $D = \mathrm{block\,diag}(D_1, \ldots, D_m)$.

For the $a$-th area, we want to predict the quantity $\theta_a = X_a\beta + v_a$, which is the conditional mean of $y_a$ given $v_a$. A reasonable estimator can be derived from the conditional expectation $E[\theta_a \mid y_a]$. The conditional distribution of $v_i$ given $y_i$ and the marginal distribution of $y_i$ are

$$v_i \mid y_i \sim N_k\big(v_i^*(\beta,\Psi),\, (\Psi^{-1}+D_i^{-1})^{-1}\big), \quad y_i \sim N_k(X_i\beta,\, \Psi+D_i), \quad i=1,\ldots,m, \tag{3}$$

where

$$v_i^*(\beta,\Psi) = \Psi(\Psi+D_i)^{-1}(y_i - X_i\beta) = \{I_k - D_i(\Psi+D_i)^{-1}\}(y_i - X_i\beta). \tag{4}$$

Thus, we get the estimator

$$\theta_a^*(\beta,\Psi) = X_a\beta + E[v_a \mid y_a] = X_a\beta + v_a^*(\beta,\Psi) = y_a - D_a(\Psi+D_a)^{-1}(y_a - X_a\beta),$$

which corresponds to the Bayes estimator of $\theta_a$ in the Bayesian framework.

When $\Psi$ is known, the maximum likelihood estimator or generalized least squares estimator of $\beta$ is

$$\hat\beta(\Psi) = \{X^\top(I_m\otimes\Psi+D)^{-1}X\}^{-1}X^\top(I_m\otimes\Psi+D)^{-1}y = \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i. \tag{5}$$

Substituting $\hat\beta(\Psi)$ into $\theta_a^*(\beta,\Psi)$ yields the estimator

$$\hat\theta_a(\Psi) = y_a - D_a(\Psi+D_a)^{-1}\{y_a - X_a\hat\beta(\Psi)\}. \tag{6}$$

Datta, Day and Basawa (1999) showed that $\hat\theta_a(\Psi)$ is the best linear unbiased predictor (BLUP) of $\theta_a$. It can also be demonstrated that $\hat\theta_a(\Psi)$ is the Bayes estimator against the uniform prior distribution of $\beta$ as well as the empirical Bayes estimator as shown above, which is called the Bayes empirical Bayes estimator.
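A direct implementation of the GLS estimator (5) and the BLUP (6) can be sketched as follows, under the array convention that `X` stacks the $k \times p$ matrices $X_i$ into shape $(m, k, p)$ and `D` stacks the known matrices $D_i$; the function names are ours, not the paper's.

```python
import numpy as np

def gls_beta(y, X, D, Psi):
    """GLS estimator (5): {sum_i X_i' S_i^{-1} X_i}^{-1} sum_i X_i' S_i^{-1} y_i,
    where S_i = Psi + D_i."""
    p = X.shape[2]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for Xi, Di, yi in zip(X, D, y):
        Si_inv = np.linalg.inv(Psi + Di)
        A += Xi.T @ Si_inv @ Xi
        b += Xi.T @ Si_inv @ yi
    return np.linalg.solve(A, b)

def blup(y, X, D, Psi, a):
    """BLUP (6) for area a: shrinks the direct estimate y_a toward the
    regression estimate X_a beta_hat; shrinkage grows with D_a."""
    beta_hat = gls_beta(y, X, D, Psi)
    resid = y[a] - X[a] @ beta_hat
    return y[a] - D[a] @ np.linalg.solve(Psi + D[a], resid)
```

Two sanity checks follow from the form of (6): when $D_a = 0$ the predictor returns the direct estimate $y_a$, and as $\Psi \to 0$ it collapses to the synthetic estimate $X_a\hat\beta(\Psi)$.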

Concerning estimation of $\Psi$, it is noted that $E[(y_i - X_i\beta)(y_i - X_i\beta)^\top] = \Psi + D_i$ for $i=1,\ldots,m$, which implies that $\Psi = m^{-1}\sum_{i=1}^m \{E[(y_i - X_i\beta)(y_i - X_i\beta)^\top] - D_i\}$. Substituting the ordinary least squares estimator $\tilde\beta = (X^\top X)^{-1}X^\top y$ into $\beta$, we get the consistent estimator

$$\hat\Psi_0 = \frac{1}{m}\sum_{i=1}^m \{(y_i - X_i\tilde\beta)(y_i - X_i\tilde\beta)^\top - D_i\}. \tag{7}$$

Taking the expectation of $\hat\Psi_0$, we can see that $E[\hat\Psi_0] = \Psi + \mathrm{Bias}_{\hat\Psi_0}(\Psi)$, where

$$\begin{aligned}\mathrm{Bias}_{\hat\Psi_0}(\Psi) = {}& \frac{1}{m}\sum_{i=1}^m X_i(X^\top X)^{-1}\Big\{\sum_{j=1}^m X_j^\top(\Psi+D_j)X_j\Big\}(X^\top X)^{-1}X_i^\top \\ & - \frac{1}{m}\sum_{i=1}^m (\Psi+D_i)X_i(X^\top X)^{-1}X_i^\top - \frac{1}{m}\sum_{i=1}^m X_i(X^\top X)^{-1}X_i^\top(\Psi+D_i). \tag{8}\end{aligned}$$

Substituting $\hat\Psi_0$ into $\Psi$, we get the bias-corrected estimator given by

$$\hat\Psi_1 = \hat\Psi_0 - \mathrm{Bias}_{\hat\Psi_0}(\hat\Psi_0). \tag{9}$$

For notational convenience, we use the same notation $\hat\Psi$ for $\hat\Psi_0$ and $\hat\Psi_1$ without any confusion. It is noted that both estimators are not necessarily nonnegative definite. In this case, there exist an orthogonal matrix $H$ and a diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$ such that $\hat\Psi = H\Lambda H^\top$. Let $\Lambda^+ = \mathrm{diag}(\max(\lambda_1, 0), \ldots, \max(\lambda_k, 0))$, and let

$$\hat\Psi^+ = H\Lambda^+ H^\top.$$

Replacing $\Psi$ in $\hat\theta_a(\Psi)$ with the estimator $\hat\Psi^+$, the resulting estimator is the empirical Bayes (EB) estimator

$$\hat\theta_a^{EB} = \hat\theta_a(\hat\Psi^+). \tag{10}$$
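The estimation pipeline of this section, from $\hat\Psi_0$ in (7) through the bias correction (9) to the eigenvalue truncation $\hat\Psi^+$, can be sketched as follows. This is a minimal implementation under the convention that `X` has shape $(m, k, p)$ stacking the $X_i$ and `D` stacks the known $D_i$; the function name is ours.

```python
import numpy as np

def prasad_rao_psi(y, X, D, bias_correct=True):
    """Prasad-Rao type estimator (7), optionally bias-corrected as in (9),
    then truncated to the nonnegative definite matrix Psi_hat^+."""
    m, k, p = X.shape
    Xf = X.reshape(m * k, p)                        # stacked design matrix X
    G = np.linalg.inv(Xf.T @ Xf)                    # (X'X)^{-1}
    beta_ols = G @ (Xf.T @ y.reshape(m * k))        # OLS estimator beta_tilde
    R = y - np.einsum('ikp,p->ik', X, beta_ols)     # residuals y_i - X_i beta_tilde

    # estimator (7): (1/m) sum_i {r_i r_i' - D_i}
    Psi0 = sum(np.outer(r, r) for r in R) / m - D.mean(axis=0)

    def bias(Psi):
        # bias formula (8) evaluated at Psi
        M = sum(Xi.T @ (Psi + Di) @ Xi for Xi, Di in zip(X, D))
        B = np.zeros((k, k))
        for Xi, Di in zip(X, D):
            Hi = Xi @ G @ Xi.T                      # X_i (X'X)^{-1} X_i'
            B += Xi @ G @ M @ G @ Xi.T
            B -= (Psi + Di) @ Hi + Hi @ (Psi + Di)
        return B / m

    Psi_hat = Psi0 - bias(Psi0) if bias_correct else Psi0
    lam, H = np.linalg.eigh(Psi_hat)                # Psi_hat = H Lambda H'
    return H @ np.diag(np.maximum(lam, 0.0)) @ H.T  # Psi_hat^+
```

Consistent with Theorem 1 below, in simulation the output approaches the true $\Psi$ as $m$ grows while always remaining nonnegative definite.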

To guarantee asymptotic properties of $\hat\theta_a^{EB}$, we assume the following conditions:

(H1) $k$ and $p$ are bounded, and $m \to \infty$.

(H2) There exist positive constants $\underline{d}$ and $\overline{d}$ such that $D_1, \ldots, D_m$ do not depend on $m$ and satisfy $\underline{d}\, I_k \le D_i \le \overline{d}\, I_k$ for $i=1,\ldots,m$.

(H3) $X^\top X$ is nonsingular and $m^{-1}X^\top X$ converges to a positive definite matrix.

###### Theorem 1

Under conditions (H1)-(H3), the following properties hold for $\hat\Psi_0$ and $\hat\Psi_1$:

(1) $E[\hat\Psi_0] - \Psi = O(m^{-1})$, which means that $\hat\Psi_0$ has a second-order bias, while $\hat\Psi_1$ is a second-order unbiased estimator of $\Psi$.

(2) $\hat\Psi_0 - \Psi = O_p(m^{-1/2})$ and $\hat\Psi_1 - \Psi = O_p(m^{-1/2})$.

(3) The nonnegative definite matrix $\hat\Psi^+$ is consistent for large $m$, and $P(\hat\Psi^+ \neq \hat\Psi) = O(m^{-K})$ for any $K > 0$.

Proof.  We begin by writing $\hat\Psi_0 - \Psi$ as

$$\begin{aligned}\hat\Psi_0 - \Psi = {}& \frac{1}{m}\sum_{i=1}^m\{(y_i - X_i\beta)(y_i - X_i\beta)^\top - (\Psi+D_i)\} + \frac{1}{m}\sum_{i=1}^m X_i(\tilde\beta-\beta)(\tilde\beta-\beta)^\top X_i^\top \\ & - \frac{1}{m}\sum_{i=1}^m (y_i - X_i\beta)(\tilde\beta-\beta)^\top X_i^\top - \frac{1}{m}\sum_{i=1}^m X_i(\tilde\beta-\beta)(y_i - X_i\beta)^\top,\end{aligned}$$

which yields the bias given in (8). It is easy to check that the bias is of order $O(m^{-1})$.

For (2), it is noted that $\hat\Psi - \Psi$ is approximated as

$$\begin{aligned}\hat\Psi - \Psi &= \frac{1}{m}\sum_{i=1}^m\{(y_i - X_i\beta)(y_i - X_i\beta)^\top - (\Psi+D_i)\} + O_p(m^{-1}) \\ &= \frac{1}{m}\sum_{i=1}^m\{u_i u_i^\top - (\Psi+D_i)\} + O_p(m^{-1}), \tag{11}\end{aligned}$$

where $u_i = y_i - X_i\beta$, having $u_i \sim N_k(0, \Psi+D_i)$. It is here noted that $u_i$ for $i=1,\ldots,m$ are mutually independent and $E[u_i u_i^\top] = \Psi + D_i$ for $i=1,\ldots,m$. Then the consistency follows because $m^{-1}\sum_{i=1}^m\{u_i u_i^\top - (\Psi+D_i)\}$ converges to zero in probability under condition (H2). Using condition (H2) and finiteness of moments of normal random variables, we can show that $m^{-1/2}\sum_{i=1}^m\{u_i u_i^\top - (\Psi+D_i)\}$ converges to a multivariate normal distribution, which implies that $\hat\Psi - \Psi = O_p(m^{-1/2})$.
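The $m^{-1/2}$ rate of the leading term in (11) can be checked numerically. In this sketch we draw $u_i \sim N_k(0, \Psi + D_i)$ directly; the values of $\Psi$ and $D_i$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 2
Psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])

def leading_term_norm(m):
    """Frobenius norm of m^{-1} sum_i {u_i u_i' - (Psi + D_i)},
    with u_i ~ N_k(0, Psi + D_i) mutually independent."""
    S = np.zeros((k, k))
    for _ in range(m):
        Di = np.diag(rng.uniform(0.5, 1.5, size=k))   # illustrative known D_i
        u = rng.multivariate_normal(np.zeros(k), Psi + Di)
        S += np.outer(u, u) - (Psi + Di)
    return np.linalg.norm(S / m)

# the norm should shrink roughly like 1/sqrt(m) as m grows
errs = {m: leading_term_norm(m) for m in (100, 1000, 10000)}
```

Increasing $m$ by a factor of 100 should shrink the norm by roughly a factor of 10, in line with the $O_p(m^{-1/2})$ claim of Theorem 1 (2).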

We next verify that $\hat\beta(\hat\Psi) - \beta = O_p(m^{-1/2})$. Note that $\hat\beta(\hat\Psi) - \beta$ is decomposed as $\{\hat\beta(\Psi) - \beta\} + \{\hat\beta(\hat\Psi) - \hat\beta(\Psi)\}$. For the first term, it is noted that

$$\hat\beta(\Psi) - \beta = \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}(y_i - E[y_i]). \tag{12}$$

Then, $E[\hat\beta(\Psi) - \beta] = 0$ and $\mathrm{Cov}(\hat\beta(\Psi)) = \{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\}^{-1} = O(m^{-1})$, and this implies $\hat\beta(\Psi) - \beta = O_p(m^{-1/2})$. We next evaluate $\hat\beta(\hat\Psi) - \hat\beta(\Psi)$ as

$$\begin{aligned}\hat\beta(\hat\Psi) - \hat\beta(\Psi) = {}& \Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}y_i - \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i \\ = {}& \Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top\{(\hat\Psi+D_i)^{-1} - (\Psi+D_i)^{-1}\}y_i \\ & + \Big[\Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1} - \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\Big]\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i \\ = {}& I_1 + I_2. \tag{13}\end{aligned}$$

First, $I_1$ is written as

$$I_1 = -\Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}(\hat\Psi-\Psi)(\Psi+D_i)^{-1}y_i, \tag{14}$$

which is of order $O_p(m^{-1/2})$, because $\hat\Psi - \Psi = O_p(m^{-1/2})$ and $m^{-1}\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i$ converges in probability to a positive definite matrix. Next, $I_2$ is rewritten as

$$\begin{aligned}I_2 = {}& -\Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top\{(\hat\Psi+D_i)^{-1} - (\Psi+D_i)^{-1}\}X_i \\ & \times \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i \\ = {}& \Big\{\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}(\hat\Psi-\Psi)(\Psi+D_i)^{-1}X_i \\ & \times \Big\{\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}X_i\Big\}^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i, \tag{15}\end{aligned}$$

which is of order $O_p(m^{-1/2})$, because $\hat\Psi - \Psi = O_p(m^{-1/2})$, $m^{-1}\sum_{i=1}^m X_i^\top(\hat\Psi+D_i)^{-1}X_i = O_p(1)$, and $m^{-1}\sum_{i=1}^m X_i^\top(\Psi+D_i)^{-1}y_i = O_p(1)$. Thus, we have $\hat\beta(\hat\Psi) - \hat\beta(\Psi) = O_p(m^{-1/2})$, and it is concluded that $\hat\beta(\hat\Psi) - \beta = O_p(m^{-1/2})$.

For (3), let $\hat\lambda_1 \ge \cdots \ge \hat\lambda_k$ be the eigenvalues of $\hat\Psi$, and let $\lambda_1 \ge \cdots \ge \lambda_k$ be the eigenvalues of $\Psi$. Then, for $j = 1, \ldots, k$,

$$P(\hat\lambda_j < 0) = P(\hat\lambda_j - \lambda_j < -\lambda_j) = P(-(\hat\lambda_j - \lambda_j) > \lambda_j) \le P(|\sqrt{m}(\hat\lambda_j - \lambda_j)| > \sqrt{m}\,\lambda_j).$$

Note that $\hat\lambda_j - \lambda_j = O_p(m^{-1/2})$. It follows from the Markov inequality that for any $K > 0$,

$$P(|\sqrt{m}(\hat\lambda_j - \lambda_j)| > \sqrt{m}\,\lambda_j) \le \frac{E[\{|\sqrt{m}(\hat\lambda_j - \lambda_j)|\}^{2K}]}{(\sqrt{m}\,\lambda_j)^{2K}} = O(m^{-K}),$$

because $E[\{\sqrt{m}(\hat\lambda_j - \lambda_j)\}^{2K}] = O(1)$ from property (2).

## 3 Second-order Approximation of Mean Squared Error Matrix

Uncertainty of the empirical Bayes estimator $\hat\theta_a^{EB}$ in (10) is measured by the mean squared error matrix (MSEM), defined as $\mathrm{MSEM}(\hat\theta_a^{EB}) = E[(\hat\theta_a^{EB} - \theta_a)(\hat\theta_a^{EB} - \theta_a)^\top]$. It is noted that

$$\hat\theta_a^{EB} - \theta_a = \{\theta_a^*(\beta,\Psi) - \theta_a\} + \{\hat\theta_a(\Psi) - \theta_a^*(\beta,\Psi)\} + \{\hat\theta_a^{EB} - \hat\theta_a(\Psi)\}$$

and that $E[\theta_a - \theta_a^*(\beta,\Psi) \mid y] = 0$, so that the cross-product terms involving the first component have zero expectation. The following lemma is useful for evaluating the mean squared error matrix.

###### Lemma 1

$\hat\beta(\Psi)$ is independent of $y - X\hat\beta(\Psi)$ or $\hat\Psi$. Also, $\hat\theta_a^{EB} - \hat\theta_a(\Psi)$ is independent of $\hat\beta(\Psi)$.

Proof.  The covariance of $y - X\hat\beta(\Psi)$ and $\hat\beta(\Psi)$ is

$$\begin{aligned} E[(y - X\hat\beta(\Psi))(\hat\beta(\Psi) - \beta)^\top]\{X^\top(I_m\otimes\Psi+D)^{-1}X\} &= E[(y - X\hat\beta(\Psi))(y - X\beta)^\top](I_m\otimes\Psi+D)^{-1}X \\ &= [(I_m\otimes\Psi+D) - X\{X^\top(I_m\otimes\Psi+D)^{-1}X\}^{-1}X^\top](I_m\otimes\Psi+D)^{-1}X \\ &= 0. \end{aligned}$$

This implies that $\hat\beta(\Psi)$ is independent of $y - X\hat\beta(\Psi)$ or $\hat\Psi$, because $\hat\Psi$ is a function of $y - X\tilde\beta = \{I_{mk} - X(X^\top X)^{-1}X^\top\}(y - X\hat\beta(\Psi))$. It is also noted that

$$\hat\theta_a^{EB} - \hat\theta_a(\Psi) = -D_a(\hat\Psi+D_a)^{-1}\{y_a - X_a\hat\beta(\hat\Psi)\} + D_a(\Psi+D_a)^{-1}\{y_a - X_a\hat\beta(\Psi)\}