Estimation in a simple linear regression model with measurement error

04/09/2018
by Hisayuki Tsukuma, et al.

This paper deals with the problem of estimating the slope parameter in a simple linear regression model in which the independent variables are subject to functional measurement errors. As is well known, measurement errors in the independent variables bias the ordinary least squares estimator. A general procedure for bias reduction in a finite-sample setting is presented, and some exact bias-reduced estimators are proposed. It is also shown that certain truncation procedures improve the mean square errors of the ordinary least squares and bias-reduced estimators.


1 Introduction

The linear regression model with measurement errors in the independent variables is of practical importance, and theoretical and applied approaches to it have been studied extensively for a long time. Adcock (1877, 1878) first treated estimation of the slope in a simple linear measurement error model and derived the maximum likelihood (ML) estimator, which is nowadays known as the orthogonal regression estimator (see Anderson (1984)). Reiersøl (1950) investigated identifiability, which is related to the possibility of constructing a consistent estimator. For efficient estimation, see Bickel and Ritov (1987); for consistent estimation based on shrinkage estimators, see Whittemore (1989) and Guo and Ghosh (2012). A multivariate generalization of the univariate linear measurement error model was considered by Gleser (1981). See Anderson (1984), Fuller (1987) and Cheng and Van Ness (1999) for systematic overviews of theoretical developments in the estimation of linear measurement error models.

Although many estimation procedures for the slope have been developed, each procedure generally has both theoretical merits and demerits. The ML estimator possesses consistency and asymptotic normality. However, the first moment of the ML estimator does not exist, and it is hard to investigate finite-sample properties of the ML procedure theoretically. Besides the ML procedure, the best-known procedure is perhaps the least squares (LS) procedure. The ordinary LS estimator has finite moments up to some order, but it is not asymptotically unbiased. The asymptotic bias of the LS estimator is called attenuation bias in the literature (see Fuller (1987)).

This paper addresses a simple linear measurement error model in a finite-sample setup and discusses the problem of reducing the bias and the mean square error (MSE) of slope estimators. Suppose that the y_{ij} and the X_{ij} are observable variables for i = 1, …, n and j = 1, …, m, where n is the number of groups and m is the sample size of each group. Suppose also that the y_{ij} and the X_{ij} follow the model

(1.1)   y_{ij} = α + β x_i + ε_{ij},   X_{ij} = x_i + δ_{ij},

where α and β are, respectively, unknown intercept and slope parameters, the x_i are unobservable latent variables, and the ε_{ij} and the δ_{ij} are random error terms. Assume that the ε_{ij} and the δ_{ij} are mutually independent and distributed as N(0, σ_ε²) and N(0, σ_δ²), respectively, where σ_ε² and σ_δ² are unknown. It is important to note that the error variance in the independent variables, σ_δ², can be estimated from the within-group replication.

For the latent variables in model (1.1), there are two different points of view: the x_i may be regarded as unknown fixed values or as random variables. In the former case, (1.1) is referred to as a functional model and, in the latter case, as a structural model (Kendall and Stuart (1979), Anderson (1984) and Cheng and Van Ness (1999)). In this paper, we assume the functional model and develop a finite-sample theory for estimating the slope β.
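
To fix ideas, the following sketch simulates data from a functional model of this form and estimates the error variance of the independent variables from the within-group replication, which is what makes σ_δ² estimable. The notation and the pooled within-group variance estimator are illustrative assumptions for this sketch, not formulas taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values (assumptions, not taken from the paper)
n, m = 20, 5              # n groups, m replicates per group
alpha, beta = 1.0, 2.0    # intercept and slope
sigma_eps, sigma_delta = 0.5, 0.8
x = rng.uniform(-2.0, 2.0, size=n)   # fixed latent values (functional model)

# Observables: y_ij = alpha + beta * x_i + eps_ij and X_ij = x_i + delta_ij
y = alpha + beta * x[:, None] + rng.normal(0.0, sigma_eps, size=(n, m))
X = x[:, None] + rng.normal(0.0, sigma_delta, size=(n, m))

# The within-group variation of X involves only the measurement error, so
# sigma_delta^2 can be estimated by the pooled within-group sample variance.
sigma_delta2_hat = np.sum((X - X.mean(axis=1, keepdims=True)) ** 2) / (n * (m - 1))
print("true sigma_delta^2:", sigma_delta ** 2, " estimate:", sigma_delta2_hat)
```

The pooled within-group variance works here because, for fixed x_i, the spread of the X_{ij} around their group mean is due to the measurement error alone.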

The remainder of this paper is organized as follows. In Section 2, we simplify the estimation problem in model (1.1) and define a broad class of slope estimators that includes the LS estimator, the method of moments estimator, and an estimator of Stefanski (1985). Section 2 also provides some technical lemmas used for evaluating moments. Section 3 presents a unified method for reducing the bias of estimators in this broad class, as well as that of the LS estimator. In Section 4, we handle the problem of reducing the MSEs of slope estimators. It is revealed that slope estimation under the MSE criterion is closely related to the statistical control problem (see Zellner (1971) and Aoki (1989)) and also to the multivariate calibration problem (see Osborne (1991), Brown (1993) and Sundberg (1999)). Our approach to MSE reduction proceeds in a similar way to Kubokawa and Robert (1994), and a general method is established for improving several estimators, such as the LS estimator and Guo and Ghosh's (2012) estimator. Section 5 illustrates the numerical performance of the biases and MSEs of alternative estimators. In Section 6, we give some remarks on our results and related topics.

2 Simplification of the estimation problem

2.1 Reparametrized model

Define ȳ_i = m⁻¹ Σ_{j=1}^{m} y_{ij} and X̄_i = m⁻¹ Σ_{j=1}^{m} X_{ij} for i = 1, …, n. Consider the regression of the ȳ_i on the X̄_i. The LS estimators of the intercept and the slope are defined as the unique solution of

min_{a, b} Σ_{i=1}^{n} (ȳ_i − a − b X̄_i)².

Denote by α̂_LS and β̂_LS the resulting ordinary LS estimators of α and β. Then α̂_LS and β̂_LS are given, respectively, by

α̂_LS = ȳ − β̂_LS X̄,   β̂_LS = Σ_{i=1}^{n} (X̄_i − X̄)(ȳ_i − ȳ) / Σ_{i=1}^{n} (X̄_i − X̄)²,

where ȳ = n⁻¹ Σ_{i=1}^{n} ȳ_i and X̄ = n⁻¹ Σ_{i=1}^{n} X̄_i.

Let , and . Define

Denote by I the identity matrix of the appropriate order and by 1 the vector of ones of the appropriate dimension. It is then observed that

(2.1)

for and . Note that , and are mutually independent.

Furthermore, let be an orthogonal matrix whose first row is . Denote and . Define , and , where , and are -dimensional vectors. Then model (2.1) can be replaced with

(2.2)

These five statistics, and , are mutually independent, and , , , , and are unknown parameters. Throughout this paper, we suppose that .

From reparametrized model (2.2), the ordinary LS estimators and can be rewritten, respectively, as

(2.3)

Hereafter, we mainly deal with the problem of estimating the slope in reparametrized model (2.2). Denote the bias and the MSE of an estimator β̂ of the slope β, respectively, by

Bias(β̂) = E[β̂ − β]   and   MSE(β̂) = E[(β̂ − β)²],

where the expectation is taken with respect to (2.2). The bias of an estimator β̂₁ is said to be smaller than that of another estimator β̂₂ if |Bias(β̂₁)| ≤ |Bias(β̂₂)| for any value of the unknown parameters. Similarly, if MSE(β̂₁) ≤ MSE(β̂₂) for any value of the unknown parameters, then the MSE of β̂₁ is said to be better than that of β̂₂, or β̂₁ is said to dominate β̂₂.
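
Given these definitions, the finite-sample bias and MSE of any slope estimator can be approximated by simulation. The small helper below is purely illustrative; the function names and its interface are assumptions of this sketch.

```python
import numpy as np

def bias_and_mse(estimator, simulate, beta, reps=10_000, seed=0):
    """Monte Carlo approximation of Bias(est) = E[est - beta] and
    MSE(est) = E[(est - beta)^2] for a slope estimator.

    estimator : function mapping a simulated data set to a slope estimate.
    simulate  : function rng -> one data set drawn from the model.
    beta      : true slope used in the simulation.
    """
    rng = np.random.default_rng(seed)
    est = np.array([estimator(simulate(rng)) for _ in range(reps)])
    return np.mean(est - beta), np.mean((est - beta) ** 2)
```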

2.2 A class of estimators

If the empirical variance of the latent values x_i converges to a positive limit, it follows that the numerator and the denominator of β̂_LS each converge in probability as n tends to infinity, and hence

(2.4)

This implies that the ordinary LS estimator is inconsistent and, more precisely, that it is asymptotically biased toward zero. This phenomenon is called attenuation bias (see Fuller (1987)).
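
For the group-mean regression, the textbook form of this limit is β̂_LS → β·s_x²/(s_x² + σ_δ²/m) in probability, where s_x² denotes the limiting empirical variance of the latent x_i; this exact normalization is an assumption here, since the displayed form of (2.4) is not reproduced above. A minimal Monte Carlo sketch of the attenuation:

```python
import numpy as np

rng = np.random.default_rng(1)

m = 5                                 # replicates per group (illustrative)
beta, sigma_delta = 2.0, 0.8

def ls_slope(n):
    """LS slope from regressing the group means of y on the group means of X."""
    x = rng.uniform(-2.0, 2.0, size=n)                                 # latent values
    ybar = 1.0 + beta * x + rng.normal(0, 0.5 / np.sqrt(m), size=n)    # mean of m y-replicates
    Xbar = x + rng.normal(0, sigma_delta / np.sqrt(m), size=n)         # mean of m X-replicates
    return np.sum((Xbar - Xbar.mean()) * (ybar - ybar.mean())) / np.sum((Xbar - Xbar.mean()) ** 2)

for n in (50, 500, 5000):
    print(n, round(np.mean([ls_slope(n) for _ in range(200)]), 3))

# Assumed textbook limit: beta * s_x^2 / (s_x^2 + sigma_delta^2 / m),
# where s_x^2 is the limiting empirical variance of the latent x_i
# (here the variance of Uniform(-2, 2), i.e. 4/3).
s_x2 = 4.0 / 3.0
print("assumed limit:", beta * s_x2 / (s_x2 + sigma_delta ** 2 / m))
```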

To reduce the influence of the attenuation bias, various alternatives to β̂_LS have been proposed in the literature. For example, a typical alternative is the method of moments estimator

(2.5)

The method of moments estimator converges to β in probability as n goes to infinity, but it does not have finite moments. Rewriting the method of moments estimator in terms of β̂_LS and using the Maclaurin expansion 1/(1 − t) = Σ_{j=0}^{∞} t^j, we obtain the k-th order corrected estimator of the form

(2.6)

The above estimator can also be derived by the same arguments as in Stefanski (1985), who approached the bias correction via Huber's (1981) M-estimation. However, it is still not known whether or not the bias of the corrected estimator is smaller than that of β̂_LS in a finite-sample situation.
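
A sketch of these two corrections, under an assumed normalization (the exact forms of (2.5) and (2.6) are not reproduced above): the method of moments estimator subtracts an estimate of the measurement error contribution from the denominator of β̂_LS, which can be written as β̂_LS/(1 − t̂), and the k-th order corrected estimator truncates the geometric series 1/(1 − t̂) = Σ_{j≥0} t̂^j after k + 1 terms. The helper name and the choice t̂ = n·v̂/s_xx are illustrative.

```python
import numpy as np

def slope_estimators(Xbar, ybar, v_hat, k=2):
    """LS, method-of-moments, and k-th order corrected slope estimates.

    Xbar, ybar : group means of the observed covariate and the response.
    v_hat      : estimate of the measurement error variance of a group mean
                 (the normalization is an assumption of this sketch).
    """
    sxx = np.sum((Xbar - Xbar.mean()) ** 2)
    sxy = np.sum((Xbar - Xbar.mean()) * (ybar - ybar.mean()))
    n = len(Xbar)

    beta_ls = sxy / sxx                                    # ordinary LS slope
    t = n * v_hat / sxx                                    # estimated attenuation ratio
    beta_mm = beta_ls / (1.0 - t)                          # moment-corrected denominator
    beta_k = beta_ls * sum(t ** j for j in range(k + 1))   # truncated geometric series
    return beta_ls, beta_mm, beta_k

# Small usage example with synthetic group means
rng = np.random.default_rng(2)
x = rng.uniform(-2.0, 2.0, size=200)
Xbar = x + rng.normal(0.0, 0.3, size=200)
ybar = 1.0 + 2.0 * x + rng.normal(0.0, 0.2, size=200)
print(slope_estimators(Xbar, ybar, v_hat=0.09, k=3))
```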

Convergence (2.4) states, equivalently, that a certain statistic appearing in β̂_LS converges in probability as n goes to infinity. Replacing this statistic with a suitable function of the data yields a general class of estimators,

(2.7)

Note that β̂_LS and the corrected estimators given above belong to the class (2.7). In this paper, we search for a bias-reduced or an MSE-reduced estimator within (2.7) as an alternative to β̂_LS.

2.3 Some useful lemmas

Next, we provide some technical lemmas which form the basis for evaluating the bias and the MSE of estimators in the class (2.7).

Lemma 2.1

Let and . Let be a function on the positive real line. Define and denote by the Poisson probabilities for Let be the p.d.f. of .

  1. If then we have

    where .

  2. If then we have

    where .

When , (i) and (ii) of Lemma 2.1 are, respectively,

(2.8)
(2.9)

where is the Poisson random variable with mean . Identities (2.8) and (2.9) have been given, for example, in Nishii and Krishnaiah (1988, Lemma 3).

Proof of Lemma 2.1.  (i) Denote

Let . It turns out that

Denote . Let be an orthogonal matrix whose first row is . Making the orthogonal transformation gives that

(2.10)

Now, for , we make the following polar coordinate transformation

where , , and . The Jacobian of the transformation is given by , so (2.10) can be rewritten as

with

Note here that, for an even ,

and, for an odd

, the above definite integral is zero. Thus, it is seen that

and

so that

A change of variables completes the proof of (i).

(ii) Denote

Using the same arguments as in the proof of (i), we obtain

Since

it is observed that

where

Hence the proof of (ii) is complete. ∎

Lemma 2.2

Let . Let be a natural number such that . Denote by the Poisson random variable with mean . Then we have

Proof.  We employ the same notation as in Lemma 2.1. Note that, when ,

follows the noncentral chi-square distribution with

degrees of freedom and noncentrality parameter . Since the p.d.f. of the noncentral chi-square distribution is given by , it is seen that

for . If , then , so that for . Thus the proof is complete. ∎
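
The Poisson random variable appearing in (2.8), (2.9) and Lemma 2.2 comes from the standard fact, used in the proof above, that a noncentral chi-square distribution is a Poisson mixture of central chi-square distributions: if K ~ Poisson(λ/2) and S | K ~ χ²_{p+2K}, then S follows the noncentral chi-square distribution with p degrees of freedom and noncentrality λ. A minimal numerical check of this representation (the test function g and the parameter values are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
df, nc = 4, 6.0                   # degrees of freedom and noncentrality (illustrative)
g = lambda s: 1.0 / (1.0 + s)     # an arbitrary bounded test function

# Direct Monte Carlo under the noncentral chi-square distribution.
direct = g(stats.ncx2.rvs(df, nc, size=200_000, random_state=rng)).mean()

# Poisson mixture: draw K ~ Poisson(nc / 2), then S | K ~ chi-square(df + 2K).
k = stats.poisson.rvs(nc / 2.0, size=200_000, random_state=rng)
mixture = g(stats.chi2.rvs(df + 2 * k, random_state=rng)).mean()

# The same expectation written as a Poisson-weighted sum of central
# chi-square expectations, truncated at a large index.
weights = stats.poisson.pmf(np.arange(60), nc / 2.0)
series = sum(w * stats.chi2.expect(g, args=(df + 2 * j,)) for j, w in enumerate(weights))

print(direct, mixture, series)    # all three agree up to Monte Carlo error
```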

The following lemma is given in Hudson (1978).

Lemma 2.3

Let be a Poisson random variable with mean . Let be a function satisfying and . Then we have .
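
In its usual form, Hudson's (1978) identity for the Poisson distribution states that, for X ~ Poisson(λ) and a function g for which the expectations exist, E[λ·g(X)] = E[X·g(X − 1)]; the precise regularity conditions of Lemma 2.3 are not reproduced above. A quick Monte Carlo check of this form (the test function is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)
lam = 3.5
x = rng.poisson(lam, size=1_000_000)

g = lambda t: 1.0 / (t + 2.0)    # arbitrary bounded test function

lhs = np.mean(lam * g(x))        # E[lambda * g(X)]
rhs = np.mean(x * g(x - 1.0))    # E[X * g(X - 1)]; the X = 0 terms vanish
print(lhs, rhs)                  # the two averages agree up to Monte Carlo error
```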

3 Bias reduction

In this section, some results are presented on bias reduction in slope estimation. First, we give an alternative expression for the bias of the LS estimator β̂_LS.

Lemma 3.1

Let be a Poisson random variable with mean . If , then the bias of is finite. Furthermore, if , the bias of can be expressed as

Proof.  Using identity (2.8) gives that for

(3.1)

If , we apply Lemma 2.3 to (3.1) so as to obtain

Hence the proof is complete. ∎

Let be a nonnegative integer. Define a simple modification of , given in (2.6), as

(3.2)

where and for , and . We then obtain the following lemma.

Lemma 3.2

Let be a Poisson random variable with mean . Assume that . If , then can be expressed as

Proof.  We prove only the nontrivial case, since the remaining case is equivalent to Lemma 3.1. Note that

which implies from Lemma 3.1 that

(3.3)

Since for when , using (i) of Lemma 2.1 and Lemma 2.2 gives

(3.4)

for . Applying Lemma 2.3 to (3) gives that for

which is substituted into (3.3) to obtain

It is here observed that

which yields that, for ,

Hence the proof is complete. ∎

Example 3.1

If is a nonnegative integer and , it follows that

Combining Lemmas 3.1 and 3.2 immediately yields that, for any ,

if .

The following theorem specifies a general condition under which the estimator given in (2.7) reduces the bias of β̂_LS in a finite-sample setup.

Theorem 3.1

Assume that . Let the and the be defined as in (3.2). Assume that is bounded as for any and a fixed natural number . If , then we have for any .

Proof.  Using the same arguments as in (3.3), we can express as , where

From Lemma 3.1, it suffices to show that or, equivalently, that