# Estimation in a simple linear regression model with measurement error

This paper deals with the problem of estimating the slope parameter in a simple linear regression model whose independent variables are subject to functional measurement errors. As is well known, measurement errors in the independent variables bias the ordinary least squares estimator. A general procedure for bias reduction is presented in a finite sample situation, and some exact bias-reduced estimators are proposed. Also, it is shown that certain truncation procedures improve the mean square errors of the ordinary least squares and the bias-reduced estimators.


## 1 Introduction

The linear regression model with measurement errors in the independent variables is of practical importance, and many theoretical and experimental approaches have been studied extensively for a long time. Adcock (1877, 1878) first treated estimation of the slope in a simple linear measurement error model and derived the maximum likelihood (ML) estimator, which nowadays is known as the orthogonal regression estimator (see Anderson (1984)). Reiersøl (1950) investigated identifiability, which is related to the possibility of constructing a consistent estimator. For efficient estimation, see Bickel and Ritov (1987); for consistent estimation based on shrinkage estimators, see Whittemore (1989) and Guo and Ghosh (2012). A multivariate generalization of the univariate linear measurement error model has been considered by Gleser (1981). See Anderson (1984), Fuller (1987) and Cheng and Van Ness (1999) for a systematic overview of theoretical developments in the estimation of linear measurement error models.

Even though many estimation procedures for the slope have been developed and proposed, each procedure generally has both theoretical merits and demerits. The ML estimator possesses consistency and asymptotic normality; however, its first moment does not exist, and it is hard to investigate finite-sample properties of the ML procedure theoretically. Besides the ML procedure, the best-known procedure may be the least squares (LS) procedure. The ordinary LS estimator has finite moments up to some order, but it is not even asymptotically unbiased. The asymptotic bias of the LS estimator is called attenuation bias in the literature (see Fuller (1987)).

This paper addresses a simple linear measurement error model in a finite sample setup, and discusses the problem of reducing the bias and the mean square error (MSE) of slope estimators. Suppose that the $Y_i$ and the $X_{ij}$ are observable variables for $i=1,\dots,n$ and $j=1,\dots,r$, where $n$ is the number of groups and $r$ is the sample size of each group. Suppose also that the $Y_i$ and the $X_{ij}$ follow the model

$$Y_i=\alpha_0+\beta\gamma_i+\delta_i,\qquad X_{ij}=\gamma_i+\varepsilon_{ij},\tag{1.1}$$

where $\alpha_0$ and $\beta$ are, respectively, unknown intercept and slope parameters, the $\gamma_i$ are unobservable latent variables, and the $\delta_i$ and the $\varepsilon_{ij}$ are random error terms. Assume that the $\delta_i$ and the $\varepsilon_{ij}$ are mutually independent and distributed as $\mathcal{N}(0,\tau^2)$ and $\mathcal{N}(0,\sigma_\varepsilon^2)$, respectively, where $\tau^2$ and $\sigma_\varepsilon^2$ are unknown. It is important to note that the error variance in the independent variables, $\sigma_\varepsilon^2$, can be estimated because the $X_{ij}$ are replicated within each group.
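To illustrate, the replicated design above can be simulated in a few lines. The sketch below is a minimal illustration with arbitrary parameter values (not taken from the paper); it recovers the error variance from the pooled within-group variation, which is what makes that variance estimable here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 200, 4            # n groups, r replicated observations per group
alpha0, beta = 1.0, 2.0
tau, sigma_eps = 0.5, 1.0

# latent values (treated as fixed in the functional model)
gamma = rng.normal(0.0, 2.0, size=n)
Y = alpha0 + beta * gamma + rng.normal(0.0, tau, size=n)
X = gamma[:, None] + rng.normal(0.0, sigma_eps, size=(n, r))

# Replication makes the error variance estimable:
# the pooled within-group variance is unbiased for sigma_eps**2.
within_var = np.sum((X - X.mean(axis=1, keepdims=True)) ** 2) / (n * (r - 1))
print(within_var)  # approximately sigma_eps**2 = 1.0
```

Without replication ($r=1$) the within-group sum of squares vanishes and the error variance would not be identifiable from the data alone.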

For the latent variables $\gamma_i$ in model (1.1), there are two different points of view: the $\gamma_i$ are considered either as unknown fixed values or as random variables. In the former case, (1.1) is referred to as a functional model and, in the latter case, as a structural model (Kendall and Stuart (1979), Anderson (1984) and Cheng and Van Ness (1999)). In this paper, we assume the functional model and develop a finite-sample theory of estimating the slope $\beta$.

The remainder of this paper is organized as follows. In Section 2, we simplify the estimation problem in model (1.1) and define a broad class of slope estimators including the LS estimator, the method of moments estimator, and an estimator of Stefanski (1985). Section 2 also presents some technical lemmas used for evaluating moments. Section 3 presents a unified method of reducing the bias of estimators in this broad class as well as that of the LS estimator. In Section 4, we handle the problem of reducing the MSEs of slope estimators. It is revealed that slope estimation under the MSE criterion is closely related to the statistical control problem (see Zellner (1971) and Aoki (1989)) and also to the multivariate calibration problem (see Osborne (1991), Brown (1993) and Sundberg (1999)). Our approach to the MSE reduction is carried out in a way similar to Kubokawa and Robert (1994), and a general method is established for improving several estimators such as the LS estimator and Guo and Ghosh's (2012) estimator. Section 5 illustrates the numerical performance of the biases and the MSEs of alternative estimators. In Section 6, we make some remarks on our results and related topics.

## 2 Simplification of the estimation problem

### 2.1 Reparametrized model

Define $\bar{X}_i=r^{-1}\sum_{j=1}^{r}X_{ij}$ for $i=1,\dots,n$. Consider the regression of the $Y_i$ on the $\bar{X}_i$. The LS estimator of $(\alpha_0,\beta)$ is defined as the unique solution of

$$\min_{-\infty<\beta<\infty,\ -\infty<\alpha_0<\infty}\ \sum_{i=1}^{n}\left(Y_i-\alpha_0-\beta\bar{X}_i\right)^2.$$

Denote by $(\hat{\alpha}_0^{LS},\hat{\beta}^{LS})$ the resulting ordinary LS estimator of $(\alpha_0,\beta)$. Then $\hat{\beta}^{LS}$ and $\hat{\alpha}_0^{LS}$ are given, respectively, by

$$\hat{\beta}^{LS}=\frac{\sum_{i=1}^{n}(\bar{X}_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(\bar{X}_i-\bar{X})^2},\qquad \hat{\alpha}_0^{LS}=\bar{Y}-\hat{\beta}^{LS}\bar{X},$$

where $\bar{X}=n^{-1}\sum_{i=1}^{n}\bar{X}_i$ and $\bar{Y}=n^{-1}\sum_{i=1}^{n}Y_i$.

Let $\boldsymbol{Y}=(Y_1,\dots,Y_n)^t$, $\boldsymbol{X}=(\bar{X}_1,\dots,\bar{X}_n)^t$ and $\boldsymbol{\gamma}=(\gamma_1,\dots,\gamma_n)^t$. Define

$$S=\frac{1}{r}\sum_{i=1}^{n}\sum_{j=1}^{r}(X_{ij}-\bar{X}_i)^2.$$

Denote by $I_n$ the identity matrix of order $n$ and by $1_n$ the $n$-dimensional vector consisting of ones. It is then observed that

$$\boldsymbol{Y}\sim\mathcal{N}_n(\alpha_0 1_n+\beta\boldsymbol{\gamma},\,\tau^2 I_n),\qquad \boldsymbol{X}\sim\mathcal{N}_n(\boldsymbol{\gamma},\,\sigma^2 I_n),\qquad S\sim\sigma^2\chi^2_m,\tag{2.1}$$

for $\sigma^2=\sigma_\varepsilon^2/r$ and $m=n(r-1)$. Note that $\boldsymbol{Y}$, $\boldsymbol{X}$ and $S$ are mutually independent.

Furthermore, let $Q$ be an $n\times n$ orthogonal matrix whose first row is $n^{-1/2}1_n^t$. Denote $Q\boldsymbol{Y}=(Z_0,\boldsymbol{Z}^t)^t$ and $Q\boldsymbol{X}=(U_0,\boldsymbol{U}^t)^t$. Define $p=n-1$ and $Q\boldsymbol{\gamma}=(\xi_0,\boldsymbol{\xi}^t)^t$, where $\boldsymbol{Z}$, $\boldsymbol{U}$ and $\boldsymbol{\xi}$ are $p$-dimensional vectors. Then model (2.1) can be replaced with

$$Z_0\sim\mathcal{N}(\sqrt{n}\,\alpha_0+\beta\xi_0,\,\tau^2),\quad \boldsymbol{Z}\sim\mathcal{N}_p(\beta\boldsymbol{\xi},\,\tau^2 I_p),\quad U_0\sim\mathcal{N}(\xi_0,\,\sigma^2),\quad \boldsymbol{U}\sim\mathcal{N}_p(\boldsymbol{\xi},\,\sigma^2 I_p),\quad S\sim\sigma^2\chi^2_m.\tag{2.2}$$

These five statistics, $Z_0$, $\boldsymbol{Z}$, $U_0$, $\boldsymbol{U}$ and $S$, are mutually independent, and $\alpha_0$, $\beta$, $\xi_0$, $\boldsymbol{\xi}$, $\tau^2$ and $\sigma^2$ are unknown parameters. Throughout this paper, we suppose that $p\ge 3$.

From the reparametrized model (2.2), the ordinary LS estimators $\hat{\beta}^{LS}$ and $\hat{\alpha}^{LS}$ can be rewritten, respectively, as

$$\hat{\beta}^{LS}=\frac{\boldsymbol{U}^t\boldsymbol{Z}}{\|\boldsymbol{U}\|^2},\qquad \hat{\alpha}^{LS}=Z_0-\hat{\beta}^{LS}U_0.\tag{2.3}$$

Hereafter, we mainly deal with the problem of estimating $\beta$ in the reparametrized model (2.2). Denote the bias and the MSE of an estimator $\hat{\beta}$, respectively, by

$$\mathrm{Bias}(\hat{\beta};\beta)=E[\hat{\beta}]-\beta,\qquad \mathrm{MSE}(\hat{\beta};\beta)=E[(\hat{\beta}-\beta)^2],$$

where the expectation is taken with respect to (2.2). The bias of an estimator $\hat{\beta}_1$ is said to be smaller than that of another estimator $\hat{\beta}_2$ if $|\mathrm{Bias}(\hat{\beta}_1;\beta)|\le|\mathrm{Bias}(\hat{\beta}_2;\beta)|$ for any $\beta$. Similarly, if $\mathrm{MSE}(\hat{\beta}_1;\beta)\le\mathrm{MSE}(\hat{\beta}_2;\beta)$ for any $\beta$, then the MSE of $\hat{\beta}_1$ is said to be better than that of $\hat{\beta}_2$, or $\hat{\beta}_1$ is said to dominate $\hat{\beta}_2$.

### 2.2 A class of estimators

If $\lim_{n\to\infty}n^{-1}\|\boldsymbol{\xi}\|^2=\sigma_\xi^2$, where $\sigma_\xi^2$ is a positive value, it follows that $n^{-1}\boldsymbol{U}^t\boldsymbol{Z}\to\sigma_\xi^2\beta$ and $n^{-1}\|\boldsymbol{U}\|^2\to\sigma_\xi^2+\sigma^2$ in probability as $n$ tends to infinity, and hence

$$\hat{\beta}^{LS}\to\frac{\sigma_\xi^2}{\sigma_\xi^2+\sigma^2}\,\beta\quad\text{in probability}\quad(n\to\infty).\tag{2.4}$$

This implies that the ordinary LS estimator is inconsistent and, more precisely, that it is asymptotically biased toward zero. This phenomenon is called attenuation bias (see Fuller (1987)).
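The attenuation limit is easy to reproduce numerically. The following sketch is illustrative only (arbitrary parameter values, latent values drawn randomly so that their empirical variance stabilizes); it shows the LS slope settling near the attenuated limit rather than near the true slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 100_000, 2.0
sigma_xi, sigma = 1.0, 1.0   # spread of the latent values and error s.d.

xi = rng.normal(0.0, sigma_xi, size=n)        # latent regressor values
x = xi + rng.normal(0.0, sigma, size=n)       # observed with measurement error
y = beta * xi + rng.normal(0.0, 0.5, size=n)  # response (zero intercept for simplicity)

b_ls = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
limit = sigma_xi**2 / (sigma_xi**2 + sigma**2) * beta
print(b_ls, limit)  # b_ls is close to the attenuated limit 1.0, not to beta = 2.0
```

With equal latent spread and error standard deviation, the attenuation factor is $1/2$, so the LS slope estimates roughly half the true slope.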

To reduce the influence of attenuation bias, various alternatives to $\hat{\beta}^{LS}$ have been proposed in the literature. For example, a typical alternative is the method of moments estimator

$$\hat{\beta}^{MM}=\frac{\boldsymbol{U}^t\boldsymbol{Z}/p}{\|\boldsymbol{U}\|^2/p-S/m}.\tag{2.5}$$

The method of moments estimator converges to $\beta$ in probability as $n$ goes to infinity, but does not have finite moments. Noting that $\hat{\beta}^{MM}=\{1-(p/m)S/\|\boldsymbol{U}\|^2\}^{-1}\hat{\beta}^{LS}$ and also using the Maclaurin expansion $(1-x)^{-1}=1+x+x^2+\cdots$, we obtain the $\ell$-th order corrected estimator of the form

$$\hat{\beta}^{ST}_\ell=\left\{1+\frac{p}{m}\frac{S}{\|\boldsymbol{U}\|^2}+\cdots+\left(\frac{p}{m}\frac{S}{\|\boldsymbol{U}\|^2}\right)^{\ell}\right\}\hat{\beta}^{LS}.\tag{2.6}$$

The above estimator can also be derived using the same arguments as in Stefanski (1985), who approached the bias correction from Huber's (1981) M-estimation. However, it is still not known whether or not the bias of $\hat{\beta}^{ST}_\ell$ is smaller than that of $\hat{\beta}^{LS}$ in a finite sample situation.

Convergence (2.4) is equivalent to the statement that $\{1+\sigma^2/\sigma_\xi^2\}\hat{\beta}^{LS}$ converges to $\beta$ in probability as $n$ goes to infinity. Replacing $\sigma^2/\sigma_\xi^2$ with a suitable function of $\|\boldsymbol{U}\|^2/S$ yields a general class of estimators,

$$\hat{\beta}^{\phi}=\left\{1+\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\right\}\hat{\beta}^{LS}.\tag{2.7}$$

Note that $\hat{\beta}^{LS}$ and $\hat{\beta}^{ST}_\ell$ belong to the class (2.7). In this paper, we search for a bias-reduced or an MSE-reduced estimator within (2.7) as an alternative to $\hat{\beta}^{LS}$.
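In computational terms, every member of the class (2.7) is a multiplicative adjustment of the LS estimator built from the three statistics $\boldsymbol{U}^t\boldsymbol{Z}$, $\|\boldsymbol{U}\|^2$ and $S$. A minimal sketch (the helper names and test values below are hypothetical, chosen only for illustration):

```python
def beta_ls(UtZ, U2):
    # ordinary LS estimator (2.3): U'Z / ||U||^2
    return UtZ / U2

def beta_mm(UtZ, U2, S, p, m):
    # method of moments estimator (2.5): denominator corrected by S/m
    return (UtZ / p) / (U2 / p - S / m)

def beta_st(UtZ, U2, S, p, m, ell):
    # ell-th order corrected estimator (2.6): truncated Maclaurin expansion
    # of 1 / (1 - (p/m) S/||U||^2); it approaches beta_mm as ell grows
    x = (p / m) * S / U2
    return sum(x**j for j in range(ell + 1)) * beta_ls(UtZ, U2)

# sanity check on arbitrary values
UtZ, U2, S, p, m = 5.0, 10.0, 2.0, 4, 12
print(beta_ls(UtZ, U2))           # 0.5
print(beta_mm(UtZ, U2, S, p, m))  # approx 0.5357
```

The ordering $\hat{\beta}^{LS}\le\hat{\beta}^{ST}_\ell\le\hat{\beta}^{MM}$ (for positive statistics and a positive LS estimate) reflects the fact that each correction term undoes part of the attenuation.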

### 2.3 Some useful lemmas

Next, we provide some technical lemmas which form the basis for evaluating the bias and the MSE of the estimators in the class (2.7).

###### Lemma 2.1

Let $\boldsymbol{U}\sim\mathcal{N}_p(\boldsymbol{\xi},I_p)$ and $S\sim\chi^2_m$ be independent. Let $\phi$ be a function on the positive real line. Define $\lambda=\|\boldsymbol{\xi}\|^2/2$ and denote by $P_\lambda(k)=e^{-\lambda}\lambda^k/k!$ the Poisson probabilities for $k=0,1,2,\dots$. Let $g_d$ be the p.d.f. of the $\chi^2_d$ distribution.

1. If $E\big[|\phi(\|\boldsymbol{U}\|^2/S)|\,|\boldsymbol{U}^t\boldsymbol{\xi}|/\|\boldsymbol{U}\|^2\big]<\infty$, then we have

$$E\left[\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]=\sum_{k=0}^{\infty}\frac{2\lambda}{p+2k}P_\lambda(k)\,I_1(k|\phi),$$

where $I_1(k|\phi)=\int_0^\infty\!\int_0^\infty\phi(w/s)\,g_{p+2k}(w)\,g_m(s)\,\mathrm{d}w\,\mathrm{d}s$.

2. If $E\big[|\phi(\|\boldsymbol{U}\|^2/S)|\,(\boldsymbol{U}^t\boldsymbol{\xi})^2/\|\boldsymbol{U}\|^4\big]<\infty$, then we have

$$E\left[\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\frac{(\boldsymbol{U}^t\boldsymbol{\xi})^2}{\|\boldsymbol{U}\|^4}\right]=\sum_{k=0}^{\infty}\frac{2\lambda(1+2k)}{p+2k}P_\lambda(k)\,I_2(k|\phi),$$

where $I_2(k|\phi)=\int_0^\infty\!\int_0^\infty w^{-1}\phi(w/s)\,g_{p+2k}(w)\,g_m(s)\,\mathrm{d}w\,\mathrm{d}s$.

When $\phi\equiv1$, (i) and (ii) of Lemma 2.1 are, respectively,

$$E\left[\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]=E\left[\frac{2\lambda}{p+2K}\right]\quad\text{for }p\ge2,\tag{2.8}$$

$$E\left[\frac{(\boldsymbol{U}^t\boldsymbol{\xi})^2}{\|\boldsymbol{U}\|^4}\right]=E\left[\frac{2\lambda(1+2K)}{(p+2K)(p+2K-2)}\right]\quad\text{for }p\ge3,\tag{2.9}$$

where $K$ is the Poisson random variable with mean $\lambda$. Identities (2.8) and (2.9) have been given, for example, in Nishii and Krishnaiah (1988, Lemma 3).
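Identity (2.8) lends itself to a direct numerical check. The sketch below (with arbitrary choices of $p$ and $\boldsymbol{\xi}$, assumed only for illustration) compares a Monte Carlo estimate of the left-hand side with a truncated series for the Poisson expectation on the right-hand side:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(2)
p = 5
xi = np.ones(p)           # any fixed mean vector with p >= 2
lam = xi @ xi / 2.0       # lambda = ||xi||^2 / 2

# left-hand side: E[U'xi / ||U||^2] with U ~ N_p(xi, I_p), by Monte Carlo
U = xi + rng.standard_normal((400_000, p))
lhs = np.mean(U @ xi / np.sum(U**2, axis=1))

# right-hand side: E[2*lambda / (p + 2K)], K ~ Poisson(lambda), truncated sum
rhs = sum(exp(-lam) * lam**k / factorial(k) * 2 * lam / (p + 2 * k)
          for k in range(60))
print(lhs, rhs)  # the two values agree to Monte Carlo accuracy
```

The truncation at $k=60$ is harmless here because the Poisson tail beyond that point is numerically negligible for moderate $\lambda$.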

Proof of Lemma 2.1.  (i) Denote

$$E_1=E\left[\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right].$$

It turns out that

$$E_1=(2\pi)^{-p/2}\int_0^\infty\!\int_{\mathbb{R}^p}\phi\!\left(\frac{\|\boldsymbol{u}\|^2}{s}\right)\frac{\boldsymbol{u}^t\boldsymbol{\xi}}{\|\boldsymbol{u}\|^2}\,e^{-\|\boldsymbol{u}-\boldsymbol{\xi}\|^2/2}\,\mathrm{d}\boldsymbol{u}\,g_m(s)\,\mathrm{d}s.$$

Denote $c_0=(2\pi)^{-p/2}e^{-\lambda}$. Let $P$ be a $p\times p$ orthogonal matrix whose first row is $\boldsymbol{\xi}^t/\|\boldsymbol{\xi}\|$. Making the orthogonal transformation $\boldsymbol{u}\to P\boldsymbol{u}$ gives that

$$E_1=c_0\int_0^\infty\!\int_{\mathbb{R}^p}\phi\!\left(\frac{\|\boldsymbol{u}\|^2}{s}\right)\frac{u_1\|\boldsymbol{\xi}\|}{\|\boldsymbol{u}\|^2}\,e^{-\|\boldsymbol{u}\|^2/2+u_1\|\boldsymbol{\xi}\|}\,\mathrm{d}\boldsymbol{u}\,g_m(s)\,\mathrm{d}s.\tag{2.10}$$

Now, for $p\ge2$, we make the following polar coordinate transformation

$$\boldsymbol{u}=\begin{pmatrix}u_1\\u_2\\u_3\\\vdots\\u_{p-1}\\u_p\end{pmatrix}=\rho\begin{pmatrix}\cos\varphi\\\sin\varphi\cos\varphi_2\\\sin\varphi\sin\varphi_2\cos\varphi_3\\\vdots\\\sin\varphi\sin\varphi_2\sin\varphi_3\cdots\sin\varphi_{p-2}\cos\varphi_{p-1}\\\sin\varphi\sin\varphi_2\sin\varphi_3\cdots\sin\varphi_{p-2}\sin\varphi_{p-1}\end{pmatrix},$$

where $\rho>0$, $0\le\varphi,\varphi_2,\dots,\varphi_{p-2}\le\pi$ and $0\le\varphi_{p-1}<2\pi$. The Jacobian of the transformation is given by $\rho^{p-1}\sin^{p-2}\varphi\prod_{i=2}^{p-2}\sin^{p-i-1}\varphi_i$, so (2.10) can be rewritten as

$$E_1=c_1\int_0^\infty\!\int_0^\infty\!\int_0^\pi \phi\!\left(\frac{\rho^2}{s}\right)\frac{\|\boldsymbol{\xi}\|\cos\varphi}{\rho}\,e^{-\rho^2/2+\rho\|\boldsymbol{\xi}\|\cos\varphi}\,\rho^{p-1}\sin^{p-2}\varphi\,\mathrm{d}\varphi\,\mathrm{d}\rho\,g_m(s)\,\mathrm{d}s,$$

with

$$c_1=c_0\int_0^{2\pi}\mathrm{d}\varphi_{p-1}\prod_{i=2}^{p-2}\int_0^\pi\sin^{p-i-1}\varphi_i\,\mathrm{d}\varphi_i.$$

Note here that, for an even integer $b$,

$$\int_0^\pi\sin^a\varphi\cos^b\varphi\,\mathrm{d}\varphi=\frac{\Gamma[(a+1)/2]\,\Gamma[(b+1)/2]}{\Gamma[(a+b+2)/2]}$$

and, for an odd $b$, the above definite integral is zero. Thus, it is seen that

$$c_1=\frac{2^{1-p/2}\pi^{-1/2}e^{-\lambda}}{\Gamma[(p-1)/2]}$$

and

$$\int_0^\pi e^{\rho\|\boldsymbol{\xi}\|\cos\varphi}\cos\varphi\sin^{p-2}\varphi\,\mathrm{d}\varphi=\sum_{j=0}^{\infty}\frac{\rho^j\|\boldsymbol{\xi}\|^j}{j!}\int_0^\pi\cos^{j+1}\varphi\sin^{p-2}\varphi\,\mathrm{d}\varphi=\sum_{k=0}^{\infty}\frac{\rho^{2k+1}}{k!}\,\frac{\lambda^k\pi^{1/2}\|\boldsymbol{\xi}\|\,\Gamma[(p-1)/2]}{2^k(p+2k)\Gamma[(p+2k)/2]},$$

so that

$$E_1=\sum_{k=0}^{\infty}\frac{2\lambda}{p+2k}P_\lambda(k)\int_0^\infty\!\int_0^\infty\phi\!\left(\frac{\rho^2}{s}\right)\frac{\rho^{p+2k-1}e^{-\rho^2/2}}{\Gamma[(p+2k)/2]\,2^{p/2+k-1}}\,\mathrm{d}\rho\,g_m(s)\,\mathrm{d}s.$$

The change of variables $w=\rho^2$ then yields the expression in (i), which completes the proof of (i).

(ii) Denote

$$E_2=E\left[\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\frac{(\boldsymbol{U}^t\boldsymbol{\xi})^2}{\|\boldsymbol{U}\|^4}\right].$$

Using the same arguments as in the proof of (i), we obtain

$$E_2=c_1\int_0^\infty\!\int_0^\infty\!\int_0^\pi \phi\!\left(\frac{\rho^2}{s}\right)\frac{\|\boldsymbol{\xi}\|^2\cos^2\varphi}{\rho^2}\,e^{-\rho^2/2+\rho\|\boldsymbol{\xi}\|\cos\varphi}\,\rho^{p-1}\sin^{p-2}\varphi\,\mathrm{d}\varphi\,\mathrm{d}\rho\,g_m(s)\,\mathrm{d}s.$$

Since

$$\int_0^\pi e^{\rho\|\boldsymbol{\xi}\|\cos\varphi}\cos^2\varphi\sin^{p-2}\varphi\,\mathrm{d}\varphi=\sum_{k=0}^{\infty}\frac{\rho^{2k}}{k!}\,\frac{\lambda^k\pi^{1/2}(1+2k)\,\Gamma[(p-1)/2]}{2^k(p+2k)\Gamma[(p+2k)/2]},$$

it is observed that

$$E_2=\sum_{k=0}^{\infty}P_\lambda(k)\,\frac{2\lambda(1+2k)}{p+2k}\,I_2(k|\phi),$$

where

$$I_2(k|\phi)=\int_0^\infty\!\int_0^\infty\frac{1}{\rho^2}\,\phi\!\left(\frac{\rho^2}{s}\right)\frac{\rho^{p+2k-1}e^{-\rho^2/2}}{\Gamma[(p+2k)/2]\,2^{p/2+k-1}}\,\mathrm{d}\rho\,g_m(s)\,\mathrm{d}s=\int_0^\infty\!\int_0^\infty\frac{1}{w}\,\phi\!\left(\frac{w}{s}\right)g_{p+2k}(w)\,\mathrm{d}w\,g_m(s)\,\mathrm{d}s.$$

Hence the proof of (ii) is complete. ∎

###### Lemma 2.2

Let $\boldsymbol{U}\sim\mathcal{N}_p(\boldsymbol{\xi},\sigma^2 I_p)$. Let $i$ be a natural number such that $2i<p$. Denote by $K$ the Poisson random variable with mean $\lambda=\|\boldsymbol{\xi}\|^2/(2\sigma^2)$. Then we have

$$E\left[\frac{\sigma^{2i}}{\|\boldsymbol{U}\|^{2i}}\right]=E\left[\prod_{j=1}^{i}\frac{1}{p+2K-2j}\right]<\infty.$$

Proof.  We employ the same notation as in Lemma 2.1. Note that, when $\boldsymbol{U}\sim\mathcal{N}_p(\boldsymbol{\xi},\sigma^2 I_p)$, $\|\boldsymbol{U}\|^2/\sigma^2$ follows the noncentral chi-square distribution with $p$ degrees of freedom and noncentrality parameter $2\lambda$. Since the p.d.f. of the noncentral chi-square distribution is given by $\sum_{k=0}^\infty P_\lambda(k)g_{p+2k}(w)$, it is seen that

$$E\left[\frac{\sigma^{2i}}{\|\boldsymbol{U}\|^{2i}}\right]=\sum_{k=0}^{\infty}P_\lambda(k)\int_0^\infty w^{-i}g_{p+2k}(w)\,\mathrm{d}w=\sum_{k=0}^{\infty}P_\lambda(k)\prod_{j=1}^{i}\frac{1}{p+2k-2j}=E\left[\prod_{j=1}^{i}\frac{1}{p+2K-2j}\right]$$

for $2i<p$. If $2i<p$, then $p+2k-2j\ge p-2i>0$ for all $k\ge0$ and $1\le j\le i$, so that the expectation is finite. Thus the proof is complete. ∎
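Lemma 2.2 admits the same kind of numerical verification as the earlier moment identities. The sketch below is illustrative only; the values of $p$, $i$, $\sigma$ and $\boldsymbol{\xi}$ are arbitrary, with $2i<p$ and $p$ taken large enough that the Monte Carlo average is well behaved.

```python
import numpy as np
from math import exp, factorial, prod

rng = np.random.default_rng(3)
p, i, sigma = 9, 2, 1.0
xi = np.full(p, 0.5)
lam = xi @ xi / (2 * sigma**2)

# left-hand side: E[sigma^(2i) / ||U||^(2i)] with U ~ N_p(xi, sigma^2 I_p)
U = xi + sigma * rng.standard_normal((400_000, p))
lhs = np.mean(sigma**(2 * i) / np.sum(U**2, axis=1)**i)

# right-hand side: E[prod_{j=1}^{i} 1/(p + 2K - 2j)], K ~ Poisson(lam)
rhs = sum(exp(-lam) * lam**k / factorial(k)
          * prod(1.0 / (p + 2*k - 2*j) for j in range(1, i + 1))
          for k in range(60))
print(lhs, rhs)
```

The two values agree to Monte Carlo accuracy; the condition $2i<p$ is what keeps both the inverse moment and the simulation average finite.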

The following lemma is given in Hudson (1978).

###### Lemma 2.3

Let $K$ be a Poisson random variable with mean $\lambda$. Let $h$ be a function satisfying $E[|h(K)|]<\infty$ and $h(-1)=0$. Then we have $E[\lambda h(K)]=E[K\,h(K-1)]$.

## 3 Bias reduction

In this section, some results are presented for the bias reduction in slope estimation. First, we give an alternative expression for the bias of the LS estimator $\hat{\beta}^{LS}$.

###### Lemma 3.1

Let $K$ be a Poisson random variable with mean $\lambda=\|\boldsymbol{\xi}\|^2/(2\sigma^2)$. If $p\ge2$, then the bias of $\hat{\beta}^{LS}$ is finite. Furthermore, if $p\ge3$, the bias of $\hat{\beta}^{LS}$ can be expressed as

$$\mathrm{Bias}(\hat{\beta}^{LS};\beta)=-E\left[\frac{p-2}{p+2K-2}\right]\beta.$$

Proof.  Using identity (2.8) gives that, for $p\ge2$,

$$\mathrm{Bias}(\hat{\beta}^{LS};\beta)=E\left[\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]\beta-\beta=E\left[\frac{2\lambda}{p+2K}\right]\beta-\beta.\tag{3.1}$$

If $p\ge3$, we apply Lemma 2.3 to (3.1) so as to obtain

$$\mathrm{Bias}(\hat{\beta}^{LS};\beta)=E\left[\frac{2K}{p+2K-2}\right]\beta-\beta=-E\left[\frac{p-2}{p+2K-2}\right]\beta.$$

Hence the proof is complete. ∎
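The exact finite-sample bias expression in Lemma 3.1 can be verified by simulating the reparametrized model (2.2). The sketch below uses arbitrary parameter values, assumed only for illustration:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(4)
p, beta, sigma, tau = 6, 2.0, 1.0, 0.7
xi = np.full(p, 1.0)
lam = xi @ xi / (2 * sigma**2)

# simulate U ~ N_p(xi, sigma^2 I), Z ~ N_p(beta*xi, tau^2 I), average beta_LS
U = xi + sigma * rng.standard_normal((400_000, p))
Z = beta * xi + tau * rng.standard_normal((400_000, p))
beta_ls = np.sum(U * Z, axis=1) / np.sum(U**2, axis=1)
bias_mc = beta_ls.mean() - beta

# Lemma 3.1: Bias = -E[(p-2)/(p+2K-2)] * beta with K ~ Poisson(lam)
bias_exact = -sum(exp(-lam) * lam**k / factorial(k) * (p - 2) / (p + 2*k - 2)
                  for k in range(80)) * beta
print(bias_mc, bias_exact)
```

Both quantities are negative for positive $\beta$, which is the finite-sample counterpart of attenuation toward zero.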

Let $\ell$ be a nonnegative integer. Define a simple modification of $\hat{\beta}^{ST}_\ell$, given in (2.6), as

$$\hat{\beta}^{BR}_\ell=\left\{1+\sum_{j=1}^{\ell}\frac{a_j}{b_j}\left(\frac{S}{\|\boldsymbol{U}\|^2}\right)^{j}\right\}\hat{\beta}^{LS},\tag{3.2}$$

where $a_j=\prod_{i=1}^{j}(p-2i)$ and $b_j=\prod_{i=1}^{j}(m+2i-2)$ for $j=1,\dots,\ell$, and the empty sum for $\ell=0$ is zero, so that $\hat{\beta}^{BR}_0=\hat{\beta}^{LS}$. We then obtain the following lemma.

###### Lemma 3.2

Let $K$ be a Poisson random variable with mean $\lambda=\|\boldsymbol{\xi}\|^2/(2\sigma^2)$. Assume that $p\ge2\ell+3$. Then the bias of $\hat{\beta}^{BR}_\ell$ can be expressed as

$$\mathrm{Bias}(\hat{\beta}^{BR}_\ell;\beta)=-E\left[\prod_{j=1}^{\ell+1}\frac{p-2j}{p+2K-2j}\right]\beta.$$

Proof.  We prove the case $\ell\ge1$ because the case $\ell=0$ is equivalent to Lemma 3.1. Note that

$$E[\hat{\beta}^{BR}_\ell]=E[\hat{\beta}^{LS}]+E\left[\sum_{j=1}^{\ell}\frac{a_j}{b_j}\left(\frac{S}{\|\boldsymbol{U}\|^2}\right)^{j}\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]\beta,$$

which implies from Lemma 3.1 that

$$\mathrm{Bias}(\hat{\beta}^{BR}_\ell;\beta)=-E\left[\frac{p-2}{p+2K-2}\right]\beta+E\left[\sum_{j=1}^{\ell}\frac{a_j}{b_j}\left(\frac{S}{\|\boldsymbol{U}\|^2}\right)^{j}\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]\beta.\tag{3.3}$$

Since $\int_0^\infty s^j g_m(s)\,\mathrm{d}s=b_j$ for $j=1,\dots,\ell$, using (i) of Lemma 2.1 and Lemma 2.2 gives

$$E\left[\frac{a_j}{b_j}\left(\frac{S}{\|\boldsymbol{U}\|^2}\right)^{j}\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]=\frac{a_j}{b_j}\sum_{k=0}^{\infty}\frac{2\lambda}{p+2k}P_\lambda(k)\int_0^\infty\!\int_0^\infty\frac{s^j}{w^j}\,g_{p+2k}(w)\,g_m(s)\,\mathrm{d}w\,\mathrm{d}s=a_j\sum_{k=0}^{\infty}\frac{2\lambda}{p+2k}P_\lambda(k)\int_0^\infty\frac{1}{w^j}\,g_{p+2k}(w)\,\mathrm{d}w=E\left[\frac{2\lambda}{p+2K}\prod_{i=1}^{j}\frac{p-2i}{p+2K-2i}\right]\tag{3.4}$$

for $p\ge2j+1$. Applying Lemma 2.3 to (3.4) gives that, for $p\ge2j+3$,

$$E\left[\frac{a_j}{b_j}\left(\frac{S}{\|\boldsymbol{U}\|^2}\right)^{j}\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right]=E\left[\frac{2K}{p+2K-2}\prod_{i=1}^{j}\frac{p-2i}{p+2K-2i-2}\right],$$

which is substituted into (3.3) to obtain

$$\mathrm{Bias}(\hat{\beta}^{BR}_\ell;\beta)=-E\left[\frac{p-2}{p+2K-2}-\frac{2K}{p+2K-2}\sum_{j=1}^{\ell}\prod_{i=1}^{j}\frac{p-2i}{p+2K-2i-2}\right]\beta.$$

It is here observed that

$$\frac{p-2}{p+2K-2}-\frac{2K}{p+2K-2}\sum_{j=1}^{\ell}\prod_{i=1}^{j}\frac{p-2i}{p+2K-2i-2}=\prod_{j=1}^{2}\frac{p-2j}{p+2K-2j}-\frac{2K}{p+2K-2}\sum_{j=2}^{\ell}\prod_{i=1}^{j}\frac{p-2i}{p+2K-2i-2}=\cdots=\prod_{j=1}^{\ell+1}\frac{p-2j}{p+2K-2j},$$

which yields that, for $p\ge2\ell+3$,

$$\mathrm{Bias}(\hat{\beta}^{BR}_\ell;\beta)=-E\left[\prod_{j=1}^{\ell+1}\frac{p-2j}{p+2K-2j}\right]\beta.$$

Hence the proof is complete. ∎

###### Example 3.1

If $\ell$ is a nonnegative integer and $p\ge2\ell+3$, it follows that, for any nonnegative integer $k$,

$$0<\prod_{j=1}^{\ell+1}\frac{p-2j}{p+2k-2j}\le\prod_{j=1}^{\ell}\frac{p-2j}{p+2k-2j}\le\cdots\le\frac{p-2}{p+2k-2}.$$

Combining Lemmas 3.1 and 3.2 immediately yields that, for any $\beta$,

$$|\mathrm{Bias}(\hat{\beta}^{BR}_\ell;\beta)|\le|\mathrm{Bias}(\hat{\beta}^{BR}_{\ell-1};\beta)|\le\cdots\le|\mathrm{Bias}(\hat{\beta}^{BR}_1;\beta)|\le|\mathrm{Bias}(\hat{\beta}^{LS};\beta)|$$

if $p\ge2\ell+3$.
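The monotone decrease of the bias factors in Example 3.1 is a purely deterministic statement about the products, since each additional factor $(p-2j)/(p+2k-2j)$ lies in $(0,1]$. A small sketch (arbitrary choices of $p$ and $k$, for illustration only) confirms this:

```python
from math import prod

def bias_factor(p, k, ell):
    # prod_{j=1}^{ell+1} (p - 2j) / (p + 2k - 2j), the factor multiplying -beta
    return prod((p - 2*j) / (p + 2*k - 2*j) for j in range(1, ell + 2))

p = 11  # requires p >= 2*ell + 3 for the largest ell used below
for k in (0, 1, 5):
    factors = [bias_factor(p, k, ell) for ell in range(4)]
    # each extra correction term shrinks the bias factor (weakly; strictly if k > 0)
    assert all(0 < b2 <= b1 for b1, b2 in zip(factors, factors[1:]))
print("monotone")
```

At $k=0$ every factor equals one, which is why the ordering is weak rather than strict in general.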

The following theorem specifies a general condition under which $\hat{\beta}^{\phi}$, given in (2.7), reduces the bias of $\hat{\beta}^{LS}$ in a finite sample setup.

###### Theorem 3.1

Assume that $p\ge2\ell_0+3$ for a fixed natural number $\ell_0$. Let the $a_j$ and the $b_j$ be defined as in (3.2). Assume that $\phi$ is bounded as $0\le\phi(w)\le\sum_{j=1}^{\ell_0}(a_j/b_j)w^{-j}$ for any $w>0$. Then we have $|\mathrm{Bias}(\hat{\beta}^{\phi};\beta)|\le|\mathrm{Bias}(\hat{\beta}^{LS};\beta)|$ for any $\beta$.

Proof.  Using the same arguments as in (3.3), we can express the bias as $\mathrm{Bias}(\hat{\beta}^{\phi};\beta)=(-E_0+E_\phi)\beta$, where

$$E_0=E\left[\frac{p-2}{p+2K-2}\right],\qquad E_\phi=E\left[\phi\!\left(\frac{\|\boldsymbol{U}\|^2}{S}\right)\frac{\boldsymbol{U}^t\boldsymbol{\xi}}{\|\boldsymbol{U}\|^2}\right].$$

From Lemma 3.1, it suffices to show that $0\le E_\phi\le2E_0$ or, equivalently, that

$$-2E_0\le-2E_0+E_\phi\le0.$$