New Cramér-Rao-Type Bound for Constrained Parameter Estimation

02/07/2018
by Eyal Nitzan, et al.
Ben-Gurion University of the Negev

Non-Bayesian parameter estimation under parametric constraints is encountered in numerous applications in signal processing, communications, and control. Mean-squared-error (MSE) lower bounds are widely used as performance benchmarks and for system design. The well-known constrained Cramér-Rao bound (CCRB) is a lower bound on the MSE of estimators that satisfy some unbiasedness conditions. In many constrained estimation problems, these unbiasedness conditions are too strict and popular estimators, such as the constrained maximum likelihood estimator, do not satisfy them. In addition, MSE performance can be uniformly improved by implementing estimators that do not satisfy these conditions. As a result, the CCRB is not a valid bound on the MSE of such estimators. In this paper, we propose a new definition for unbiasedness in constrained settings, denoted by C-unbiasedness, which is based on using Lehmann-unbiasedness with a weighted MSE (WMSE) risk and taking into account the parametric constraints. In addition, a Cramér-Rao-type bound on the WMSE of C-unbiased estimators, denoted as Lehmann-unbiased CCRB (LU-CCRB), is derived. It is shown that in general, C-unbiasedness is less restrictive than the CCRB unbiasedness conditions. Thus, the LU-CCRB is valid for a larger set of estimators than the CCRB and C-unbiased estimators with lower WMSE than the corresponding CCRB may exist. In the simulations, we examine linear and nonlinear estimation problems under nonlinear parametric constraints in which the constrained maximum likelihood estimator is shown to be C-unbiased and the LU-CCRB is an informative bound on its WMSE, while the corresponding CCRB is not valid.

I Introduction

In the non-Bayesian framework, the Cramér-Rao bound (CRB) [1, 2, 3] provides a lower bound on the mean-squared-error (MSE) matrix of any mean-unbiased estimator and is used as a benchmark for parameter estimation performance analysis. In some cases, scalar risks for multi-parameter estimation are of interest, for example due to tractability or complexity issues. Corresponding Cramér-Rao-type bounds for this case can be found, e.g., in [4, 5, 6]. In constrained parameter estimation [7], the unknown parameter vector satisfies given parametric constraints. In some cases, the CRB for constrained parameter estimation can be obtained by a reparameterization of the original problem. However, this approach may be intractable and may hinder insights into the original unconstrained problem [8]. In addition, mean-unbiased estimators may not exist for the reparameterized problem, as occurs in cases where the resulting distribution is periodic [9], [10].
In the pioneering work in [7], the constrained CRB (CCRB) was derived for constrained parameter estimation without reparameterizing the original problem. A simplified derivation of the CCRB was presented in [11]. The CCRB was extended for various cases, such as parameter estimation with a singular Fisher information matrix (FIM) in [8], complex vector parameter estimation in [12], biased estimation in [13], and sparse parameter vector estimation in [14]. Alternative derivations of the CCRB from a model fitting perspective and via norm minimization were presented in [15] and [16], respectively. A hybrid Bayesian and non-Bayesian CCRB and the CCRB under misspecified models were derived in [17] and [18], respectively. Computations of the CCRB in various applications can be found, for example, in [19, 20, 21, 22, 23, 24]. In addition to the CCRB, Cramér-Rao-type bounds for estimation of parameters constrained to lie on a manifold were derived in [25, 26, 27, 28]. The constrained Bhattacharyya bound was derived in [29] and the constrained Hammersley-Chapman-Robbins (HCR) bound was derived in [7] by using the classical HCR bound in which the test-points were taken from the constrained set.
A popular estimator for constrained parameter estimation is the constrained maximum likelihood (CML) estimator [11, 15, 30, 31, 32, 33, 34, 35, 36]. This estimator is obtained by maximizing the likelihood function subject to the parametric constraints. It is shown in [11, 15], for nonsingular and singular FIM, respectively, that if there exists a mean-unbiased estimator satisfying the constraints that achieves the CCRB, then this estimator is a stationary point of the constrained likelihood maximization. Asymptotic properties of the CML estimator can be found in [30, 31, 32, 33, 34, 35, 36] under different assumptions. In particular, under mild assumptions, the CML estimator asymptotically satisfies the CCRB unbiasedness conditions and attains the CCRB for both linear and nonlinear constraints, as shown in [33] and [36], respectively. However, in the non-asymptotic region the CML estimator may not satisfy the CCRB unbiasedness conditions [37, 38] and therefore, the CCRB may not be an informative lower bound for CML performance in the non-asymptotic region. Other estimation methods for constrained parameter estimation are based on minimax criteria, e.g. [39, 40], and least squares criteria, e.g. [36, 41, 42].
It is well known that unrestricted minimization of the non-Bayesian MSE yields the trivial, parameter-dependent estimator. In order to avoid this loophole, mean-unbiasedness of estimators is usually imposed [43, 44], i.e. only estimators with zero bias are considered. In early works on constrained parameter estimation [8, 11, 15, 36], the CCRB was assumed to be a lower bound for estimators that satisfy the constraints and have zero bias in the constrained set. It was shown in [37] that the zero-bias requirement may be too strict. In addition, it was shown (e.g. [13, 35]) that the CCRB can be derived without requiring the estimator to satisfy the constraints. The unbiasedness conditions of the CCRB were thoroughly discussed in [14] and were shown to be less restrictive than the unbiasedness conditions of the conventional CRB. However, the CCRB unbiasedness conditions may still be too strict for commonly-used estimators, such as the CML estimator.
In this paper, the concept of unbiasedness in the Lehmann sense under parametric constraints, named C-unbiasedness, is developed. Lehmann-unbiasedness [43, 45] generalizes mean-unbiasedness to arbitrary cost functions and arbitrary parameter spaces. It has been used in various works for the derivation of performance bounds under different cost functions [46, 47, 48, 49]. Using the C-unbiasedness concept, we derive a new constrained Cramér-Rao-type lower bound, named Lehmann-unbiased CCRB (LU-CCRB), on the weighted MSE (WMSE) [50, 39, 51, 52] of any C-unbiased estimator. It is shown that for linear constraints and/or in the asymptotic region, the proposed LU-CCRB coincides with the corresponding CCRB. In the simulations, the CML estimator is shown to be C-unbiased for an orthogonal linear estimation problem under a norm constraint and for complex amplitude estimation with an amplitude constraint and unknown frequency. Therefore, the LU-CCRB is a lower bound on CML performance in these cases. In contrast, the corresponding CCRB on the WMSE is not a lower bound in the considered cases in the non-asymptotic region, and is shown to be significantly higher than the WMSE of the CML estimator. These results demonstrate that the LU-CCRB provides an informative WMSE lower bound in cases where the corresponding CCRB on the WMSE, and consequently also the matrix CCRB, are not lower bounds.
The WMSE is a scalar risk for multi-parameter estimation that allows the consideration of any weighted sum of squared linear combinations of the estimation errors. In particular, the MSE matrix trace is a special case of the WMSE. Unlike the CCRB, which is a matrix lower bound, the proposed LU-CCRB is a family of scalar bounds, which provides a different lower bound for each weighted sum of squared linear combinations of the estimation errors under the corresponding C-unbiasedness condition. An early derivation of C-unbiasedness and lower bounds on a projected MSE matrix appear in the conference paper [53]. In this work, we focus on the WMSE rather than the projected MSE.
The remainder of the paper is organized as follows. In Section II, we define the notations and present relevant background for this paper. The C-unbiasedness and the LU-CCRB are derived in Sections III and IV, respectively. Our simulations appear in Section V. In Section VI, we give our conclusions.

II Notations and background

II-A Notations and constrained model

Throughout this paper, we denote vectors by boldface lowercase letters and matrices by boldface uppercase letters. The $m$th element of the vector $\mathbf{a}$ and the $(m,q)$th element of the matrix $\mathbf{A}$ are denoted by $a_m$ and $A_{m,q}$, respectively. A subvector of $\mathbf{a}$ comprising the entries indexed by a set $\mathcal{I}$ is denoted by $\mathbf{a}_{\mathcal{I}}$. The identity matrix of dimension $M \times M$ is denoted by $\mathbf{I}_M$ and $\mathbf{0}$ denotes a vector/matrix of zeros. The notations $\mathrm{tr}(\cdot)$ and $\mathrm{vec}(\cdot)$ denote the trace and vectorization operators, where the vectorization operator stacks the columns of its input matrix into a column vector. The notations $(\cdot)^T$, $(\cdot)^{-1}$, and $(\cdot)^{\dagger}$ denote the transpose, inverse, and Moore-Penrose pseudo-inverse, respectively. The notation $\mathbf{A} \succeq \mathbf{B}$ implies that $\mathbf{A} - \mathbf{B}$ is a positive semidefinite matrix. The column and null spaces of a matrix $\mathbf{A}$ are denoted by $\mathcal{R}(\mathbf{A})$ and $\mathcal{N}(\mathbf{A})$, respectively. The matrices $\mathbf{P}_{\mathbf{A}} \triangleq \mathbf{A}\mathbf{A}^{\dagger}$ and $\mathbf{P}_{\mathbf{A}}^{\perp} \triangleq \mathbf{I} - \mathbf{P}_{\mathbf{A}}$ are the orthogonal projection matrices onto $\mathcal{R}(\mathbf{A})$ and its orthogonal complement, respectively [54]. The notation $\mathbf{A} \otimes \mathbf{B}$ is the Kronecker product of the matrices $\mathbf{A}$ and $\mathbf{B}$. The gradient of a vector function $\mathbf{g}(\boldsymbol{\theta})$ of $\boldsymbol{\theta}$, $\nabla_{\boldsymbol{\theta}} \mathbf{g}(\boldsymbol{\theta})$, is a matrix in which $[\nabla_{\boldsymbol{\theta}} \mathbf{g}(\boldsymbol{\theta})]_{m,q} = \partial g_m(\boldsymbol{\theta}) / \partial \theta_q$. The real and imaginary parts of an argument are denoted by $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$, respectively, and $j \triangleq \sqrt{-1}$. The notation $\angle a$ stands for the phase of a complex scalar $a$, which is assumed to be restricted to the interval $[-\pi, \pi)$.
Let $(\Omega_{\mathbf{x}}, \mathcal{F}, P_{\boldsymbol{\theta}})$ denote a probability space, where $\Omega_{\mathbf{x}}$ is the observation space, $\mathcal{F}$ is the $\sigma$-algebra on $\Omega_{\mathbf{x}}$, and $\{P_{\boldsymbol{\theta}}\}$ is a family of probability measures parameterized by the deterministic unknown parameter vector $\boldsymbol{\theta} \in \mathbb{R}^M$. Each probability measure $P_{\boldsymbol{\theta}}$ is assumed to have an associated probability density function (pdf), $f(\mathbf{x};\boldsymbol{\theta})$, such that the expected value of any measurable function $\mathbf{g}(\mathbf{x})$ with respect to (w.r.t.) $P_{\boldsymbol{\theta}}$ satisfies $\mathrm{E}_{\boldsymbol{\theta}}[\mathbf{g}(\mathbf{x})] = \int_{\Omega_{\mathbf{x}}} \mathbf{g}(\mathbf{x}) f(\mathbf{x};\boldsymbol{\theta})\,\mathrm{d}\mathbf{x}$. For simplicity of notation, we omit $\boldsymbol{\theta}$ from the notation of expectation and denote it by $\mathrm{E}[\cdot]$ whenever the value of $\boldsymbol{\theta}$ is clear from the context. The conditional expectation given an event $A$ and parameterized by $\boldsymbol{\theta}$ is denoted by $\mathrm{E}[\cdot\,|\,A;\boldsymbol{\theta}]$.
We suppose that $\boldsymbol{\theta}$ is restricted to the set

$\Omega_{\boldsymbol{f}} \triangleq \{\boldsymbol{\theta} \in \mathbb{R}^M : \boldsymbol{f}(\boldsymbol{\theta}) = \mathbf{0}\},$   (1)

where $\boldsymbol{f}: \mathbb{R}^M \to \mathbb{R}^K$ is a continuously differentiable function. It is assumed that $K < M$ and that the matrix $\mathbf{F}(\boldsymbol{\theta}) \triangleq \nabla_{\boldsymbol{\theta}} \boldsymbol{f}(\boldsymbol{\theta}) \in \mathbb{R}^{K \times M}$ has full row rank for any $\boldsymbol{\theta} \in \Omega_{\boldsymbol{f}}$, i.e. the constraints are not redundant. Thus, for any $\boldsymbol{\theta} \in \Omega_{\boldsymbol{f}}$ there exists a matrix $\mathbf{U}(\boldsymbol{\theta}) \in \mathbb{R}^{M \times (M-K)}$, such that

$\mathbf{F}(\boldsymbol{\theta})\,\mathbf{U}(\boldsymbol{\theta}) = \mathbf{0}$   (2)

and

$\mathbf{U}^T(\boldsymbol{\theta})\,\mathbf{U}(\boldsymbol{\theta}) = \mathbf{I}_{M-K}.$   (3)

The case $K = 0$ implies an unconstrained estimation problem in which $\mathbf{U}(\boldsymbol{\theta}) = \mathbf{I}_M$. Under the assumption that each element of $\mathbf{U}(\boldsymbol{\theta})$ is differentiable w.r.t. $\boldsymbol{\theta}$, we define

$\mathbf{V}_m(\boldsymbol{\theta}) \triangleq \nabla_{\boldsymbol{\theta}} \mathbf{u}_m(\boldsymbol{\theta}),\quad m = 1, \ldots, M-K,$   (4)

where $\mathbf{u}_m(\boldsymbol{\theta})$ is the $m$th column of $\mathbf{U}(\boldsymbol{\theta})$.
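As a concrete illustration, an orthonormal null-space matrix $\mathbf{U}(\boldsymbol{\theta})$ satisfying (2)-(3) can be computed numerically from the constraint gradient via an SVD. The following sketch is our own illustration, not from the paper; the norm-constraint example and the rank tolerance 1e-12 are assumptions for the demo.

```python
import numpy as np

def null_space_basis(F):
    """Orthonormal basis U for the null space of the constraint gradient F,
    i.e. F @ U = 0 (cf. (2)) and U.T @ U = I (cf. (3))."""
    # Rows of Vt beyond rank(F) span the null space of F.
    _, s, Vt = np.linalg.svd(F)
    rank = int(np.sum(s > 1e-12))
    return Vt[rank:].T  # M x (M - K) matrix with orthonormal columns

# Norm constraint f(theta) = theta^T theta - r^2 (K = 1),
# whose gradient is F(theta) = 2 theta^T.
theta = np.array([3.0, 4.0])            # a feasible point with ||theta|| = 5
F = 2.0 * theta[np.newaxis, :]          # 1 x 2 constraint gradient
U = null_space_basis(F)

print(np.allclose(F @ U, 0.0))                    # True: (2) holds
print(np.allclose(U.T @ U, np.eye(U.shape[1])))   # True: (3) holds
```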
An estimator of $\boldsymbol{\theta}$ based on a random observation vector $\mathbf{x} \in \Omega_{\mathbf{x}}$ is denoted by $\hat{\boldsymbol{\theta}}(\mathbf{x})$, where $\hat{\boldsymbol{\theta}}: \Omega_{\mathbf{x}} \to \mathbb{R}^M$ does not necessarily satisfy the constraints. For the sake of simplicity, in the following $\hat{\boldsymbol{\theta}}(\mathbf{x})$ is replaced by $\hat{\boldsymbol{\theta}}$. The bias of an estimator is denoted by

$\mathbf{b}(\boldsymbol{\theta}) \triangleq \mathrm{E}_{\boldsymbol{\theta}}[\hat{\boldsymbol{\theta}}] - \boldsymbol{\theta}.$   (5)

Under the assumption that each element of $\mathbf{b}(\boldsymbol{\theta})$ is differentiable w.r.t. $\boldsymbol{\theta}$, we define the bias gradient

$\mathbf{B}(\boldsymbol{\theta}) \triangleq \nabla_{\boldsymbol{\theta}} \mathbf{b}(\boldsymbol{\theta}).$   (6)
II-B WMSE and CCRB

In this paper, we are interested in estimation under a weighted squared-error (WSE) cost function [50, 39, 51, 52],

$C_{\mathbf{W}}(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}),$   (7)

where $\mathbf{W} \in \mathbb{R}^{M \times M}$ is a positive semidefinite weighting matrix. The WMSE risk is obtained by taking the expectation of (7) and is given by

$\mathrm{E}_{\boldsymbol{\theta}}\left[(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})\right].$   (8)
The WMSE is in fact a family of scalar risks for estimation of an unknown parameter vector, where each choice of $\mathbf{W}$ yields a different risk. Therefore, the WMSE allows flexibility in the design of estimators and the derivation of performance bounds. For example, by choosing $\mathbf{W} = \mathbf{I}_M$ we obtain the special case of the MSE matrix trace. As another example, one may wish to consider the estimation of each element of the unknown parameter vector separately. Moreover, $\mathbf{W}$ can compensate for possibly different units of the parameter vector elements. Another example is estimation in the presence of nuisance parameters, where we are only interested in the MSE for estimation of a subvector of the unknown parameter vector (see e.g. [38], [43, p. 461]) and thus, $\mathbf{W}$ includes zero elements for the nuisance parameters.
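To make the role of $\mathbf{W}$ concrete, here is a small numerical sketch (our own illustration, with made-up estimates) of the empirical WMSE for two weighting choices: $\mathbf{W} = \mathbf{I}$, which gives the MSE-matrix trace, and a diagonal $\mathbf{W}$ that zeroes out a nuisance parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.0, 2.0, 3.0])                   # true parameter
# Hypothetical estimates: truth plus noise with per-component spread.
est = theta + rng.standard_normal((100000, 3)) * np.array([0.1, 0.2, 0.5])

def wmse(est, theta, W):
    """Empirical WMSE (8): average of err^T W err over the trials."""
    err = est - theta
    return float(np.mean(np.einsum('ni,ij,nj->n', err, W, err)))

print(wmse(est, theta, np.eye(3)))                  # MSE-matrix trace
print(wmse(est, theta, np.diag([1.0, 1.0, 0.0])))   # third entry treated as nuisance
```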
Let

$\boldsymbol{\upsilon}(\mathbf{x};\boldsymbol{\theta}) \triangleq \nabla_{\boldsymbol{\theta}}^T \log f(\mathbf{x};\boldsymbol{\theta})$   (9)

and let the FIM be

$\mathbf{J}(\boldsymbol{\theta}) \triangleq \mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\upsilon}(\mathbf{x};\boldsymbol{\theta})\,\boldsymbol{\upsilon}^T(\mathbf{x};\boldsymbol{\theta})\right].$   (10)

At $\boldsymbol{\theta}_0 \in \Omega_{\boldsymbol{f}}$, under the assumption

(11)

the CCRB is given by [8, 13]

$\mathrm{CCRB}(\boldsymbol{\theta}_0) = \mathbf{U}\left(\mathbf{U}^T \mathbf{J}(\boldsymbol{\theta}_0)\,\mathbf{U}\right)^{\dagger}\mathbf{U}^T,$   (12)

where $\mathbf{U} = \mathbf{U}(\boldsymbol{\theta}_0)$ here and in the following.

The CCRB is an MSE matrix lower bound that can be reformulated as a WMSE lower bound by multiplying the bound by the weighting matrix, $\mathbf{W}$, and taking the trace. That is, based on the matrix CCRB from (12) we obtain the following WMSE lower bound:

$\mathrm{CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0) = \mathrm{tr}\left(\mathbf{W}\,\mathbf{U}\left(\mathbf{U}^T \mathbf{J}(\boldsymbol{\theta}_0)\,\mathbf{U}\right)^{\dagger}\mathbf{U}^T\right).$   (13)

Computations of the CCRB on the WMSE with different weighting matrices can be found in e.g. [8, 36, 38]. In the following, we refer to the WMSE lower bound in (13) as the CCRB for the considered choice of weighting matrix.
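Given the FIM and a null-space matrix, the CCRB in (12) and its WMSE form (13) are direct to compute. Below is a minimal numpy sketch; the FIM values are toy numbers assumed for illustration only.

```python
import numpy as np

def ccrb_matrix(J, U):
    """Matrix CCRB (12): U (U^T J U)^+ U^T, with (.)^+ the pseudo-inverse."""
    return U @ np.linalg.pinv(U.T @ J @ U) @ U.T

def ccrb_wmse(J, U, W):
    """CCRB on the WMSE (13): tr(W CCRB)."""
    return float(np.trace(W @ ccrb_matrix(J, U)))

# Norm constraint at theta = (3, 4): U spans the tangent direction.
theta = np.array([3.0, 4.0])
_, _, Vt = np.linalg.svd(2.0 * theta[np.newaxis, :])
U = Vt[1:].T                        # 2 x 1 orthonormal null-space basis
J = np.diag([2.0, 5.0])             # assumed FIM, for illustration only
print(ccrb_wmse(J, U, np.eye(2)))   # CCRB on the MSE-matrix trace
```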
It is known that the CRB is a local bound, which is a lower bound for estimators whose bias and bias gradient vanish at a considered point (see e.g. [14]), that is, locally mean-unbiased estimators in the vicinity of this point. In [14], local $\Omega_{\boldsymbol{f}}$-unbiasedness is defined as follows:

Definition 1.

The estimator $\hat{\boldsymbol{\theta}}$ is said to be a locally $\Omega_{\boldsymbol{f}}$-unbiased estimator in the vicinity of $\boldsymbol{\theta}_0 \in \Omega_{\boldsymbol{f}}$ if it satisfies

$\mathbf{b}(\boldsymbol{\theta}_0) = \mathbf{0}$   (14)

and

$\mathbf{B}(\boldsymbol{\theta}_0)\,\mathbf{U}(\boldsymbol{\theta}_0) = \mathbf{0}.$   (15)

It is shown in [14] that the CCRB is a lower bound for locally $\Omega_{\boldsymbol{f}}$-unbiased estimators, where local $\Omega_{\boldsymbol{f}}$-unbiasedness is a weaker restriction than local mean-unbiasedness. As a result, the CCRB is always lower than or equal to the CRB. In the following section, we derive a different unbiasedness definition for constrained parameter estimation, named C-unbiasedness, whose local version is less restrictive than local $\Omega_{\boldsymbol{f}}$-unbiasedness.

III Unbiasedness under constraints

In non-Bayesian parameter estimation, direct minimization of the risk w.r.t. the estimator results in a trivial estimator. Accordingly, one needs to exclude such estimators by additional restrictions on the considered set of estimators. A common restriction on estimators is mean-unbiasedness, which is used for derivation of the CRB. In the following, we propose a novel unbiasedness restriction for constrained parameter estimation, named C-unbiasedness, which is based on Lehmann’s definition of unbiasedness. It is shown that local C-unbiasedness is a weaker restriction than the local unbiasedness restrictions of the CCRB. Therefore, local C-unbiasedness allows for a larger set of estimators to be considered.

III-A Lehmann-unbiasedness

Lehmann [43, 45] proposed a generalization of the unbiasedness concept based on the considered cost function and parameter space, as presented in the following definition.

Definition 2.

The estimator $\hat{\boldsymbol{\theta}}$ is said to be a uniformly unbiased estimator of $\boldsymbol{\theta}$ in the Lehmann sense [43, 45] w.r.t. the cost function $C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})$ if

$\mathrm{E}_{\boldsymbol{\theta}}\left[C(\hat{\boldsymbol{\theta}}, \boldsymbol{\eta})\right] \geq \mathrm{E}_{\boldsymbol{\theta}}\left[C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})\right],\quad \forall \boldsymbol{\eta}, \boldsymbol{\theta} \in \Omega,$   (16)

where $\Omega$ is the parameter space.

The Lehmann-unbiasedness definition implies that an estimator is unbiased if on average it is “closest” to the true parameter, $\boldsymbol{\theta}$, rather than to any other value in the parameter space, $\Omega$. The measure of closeness between the estimator and the parameter is the cost function, $C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})$. For example, it is shown in [45] that under the scalar squared-error cost function, $C(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^2$, the Lehmann-unbiasedness in (16) reduces to the conventional mean-unbiasedness, $\mathrm{E}_{\theta}[\hat{\theta}] = \theta$, $\forall \theta \in \Omega$. Lehmann-unbiasedness conditions for various cost functions can be found in [46, 47, 48, 49].
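The squared-error case can be checked numerically: the risk $\mathrm{E}[(\hat{\theta} - \eta)^2]$ is minimized at $\eta = \mathrm{E}[\hat{\theta}]$, so (16) holds at every $\theta$ exactly when the estimator mean equals $\theta$. A quick Monte-Carlo sketch (our own illustration, with an assumed Gaussian estimator):

```python
import numpy as np

rng = np.random.default_rng(1)
# Samples of a scalar estimator whose mean is the true value theta = 2.
theta_hat = 2.0 + 0.5 * rng.standard_normal(200000)

# Risk E[(theta_hat - eta)^2] as a function of the candidate value eta.
etas = np.linspace(0.0, 4.0, 401)
risk = np.mean((theta_hat[:, None] - etas[None, :]) ** 2, axis=0)

# The minimizing eta is (numerically) the estimator's mean, so the
# Lehmann condition (16) under squared error is mean-unbiasedness.
print(etas[np.argmin(risk)], theta_hat.mean())
```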
In non-Bayesian estimation theory, two types of unbiasedness are usually considered: uniform unbiasedness, in which the estimator is unbiased at any point in the parameter space, and local unbiasedness (see e.g. [55]), in which the estimator is assumed to be unbiased only in the vicinity of the true parameter $\boldsymbol{\theta}_0$. In the following definition, we extend the original uniform definition of Lehmann-unbiasedness in (16) to local Lehmann-unbiasedness.

Definition 3.

The estimator $\hat{\boldsymbol{\theta}}$ is said to be a locally Lehmann-unbiased estimator in the vicinity of $\boldsymbol{\theta}_0 \in \Omega$ w.r.t. the cost function $C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})$ if

$\mathrm{E}_{\boldsymbol{\theta}_0}\left[C(\hat{\boldsymbol{\theta}}, \boldsymbol{\eta})\right] \geq \mathrm{E}_{\boldsymbol{\theta}_0}\left[C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}_0)\right]$   (17)

for any $\boldsymbol{\eta} \in \Omega$ s.t. $\|\boldsymbol{\eta} - \boldsymbol{\theta}_0\|$ is sufficiently small.

III-B Uniform C-unbiasedness

In the following, uniform C-unbiasedness is derived by combining the uniform Lehmann-unbiasedness condition from (16) w.r.t. the WSE cost function with the parametric constraints.

Proposition 1.

A necessary condition for an estimator $\hat{\boldsymbol{\theta}}$ to be a uniformly unbiased estimator of $\boldsymbol{\theta}$ in the Lehmann sense w.r.t. the WSE cost function and under the constrained set in (1) is

$\mathbf{U}^T(\boldsymbol{\theta})\,\mathbf{W}\,\mathbf{b}(\boldsymbol{\theta}) = \mathbf{0},\quad \forall \boldsymbol{\theta} \in \Omega_{\boldsymbol{f}}.$   (18)

Proof.

By substituting $\Omega = \Omega_{\boldsymbol{f}}$ and the WSE cost function from (7) in (16), one obtains that Lehmann-unbiasedness in the constrained setting is reduced to:

$\mathrm{E}_{\boldsymbol{\theta}}\left[(\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})\right] \geq \mathrm{E}_{\boldsymbol{\theta}}\left[(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})\right],\quad \forall \boldsymbol{\eta}, \boldsymbol{\theta} \in \Omega_{\boldsymbol{f}}.$   (19)

The condition in (19) is equivalent to requiring $\boldsymbol{\eta}^{\ast} = \boldsymbol{\theta}$, where $\boldsymbol{\eta}^{\ast}$ is the minimizer of the following constrained minimization problem

$\min_{\boldsymbol{\eta} \in \Omega_{\boldsymbol{f}}}\ \mathrm{E}_{\boldsymbol{\theta}}\left[(\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})\right].$   (20)

By using a necessary condition for constrained minimization (see e.g. Eq. (1.62) in [56]), it can be shown that the minimizer of (20), $\boldsymbol{\eta}^{\ast}$, must satisfy

$\mathbf{U}^T(\boldsymbol{\eta}^{\ast})\,\nabla_{\boldsymbol{\eta}}^T\, \mathrm{E}_{\boldsymbol{\theta}}\left[(\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})^T \mathbf{W} (\hat{\boldsymbol{\theta}} - \boldsymbol{\eta})\right]\Big|_{\boldsymbol{\eta} = \boldsymbol{\eta}^{\ast}} = \mathbf{0}.$   (21)

Under the assumption that integration w.r.t. $\mathbf{x}$ and derivatives w.r.t. $\boldsymbol{\eta}$ can be reordered, the condition in (21) is equivalent to

$\mathbf{U}^T(\boldsymbol{\eta}^{\ast})\,\mathbf{W}\, \mathrm{E}_{\boldsymbol{\theta}}\left[\hat{\boldsymbol{\theta}} - \boldsymbol{\eta}^{\ast}\right] = \mathbf{0}.$   (22)

Finally, by substituting $\boldsymbol{\eta}^{\ast} = \boldsymbol{\theta}$ and (5) in (22), one obtains (18). ∎

An estimator that satisfies (18) is said to be uniformly C-unbiased. Uniform C-unbiasedness is a necessary condition for uniform Lehmann-unbiasedness w.r.t. the WSE cost function under the constrained set in (1). It can be seen that if an estimator has zero mean-bias in the constrained set, i.e. $\mathbf{b}(\boldsymbol{\theta}) = \mathbf{0}$, $\forall \boldsymbol{\theta} \in \Omega_{\boldsymbol{f}}$, then it satisfies (18), but not vice versa. Thus, the uniform C-unbiasedness condition is a weaker condition than requiring mean-unbiasedness in the constrained set.
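A numerical way to see the gap between the two notions, under our reading of (18) as $\mathbf{U}^T(\boldsymbol{\theta})\mathbf{W}\mathbf{b}(\boldsymbol{\theta}) = \mathbf{0}$: an estimator with a purely radial bias under a norm constraint violates mean-unbiasedness but satisfies the C-unbiasedness condition, since its bias is orthogonal to the columns of $\mathbf{U}(\boldsymbol{\theta})$. The shrinkage estimator below is a made-up example, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(2)
theta = np.array([3.0, 4.0])                 # true parameter, ||theta|| = 5
_, _, Vt = np.linalg.svd(2.0 * theta[np.newaxis, :])
U = Vt[1:].T                                 # tangent direction of the norm constraint
W = np.eye(2)

# Hypothetical estimator: shrinks theta radially by 5%, plus noise.
est = 0.95 * theta + 0.1 * rng.standard_normal((500000, 2))
b = est.mean(axis=0) - theta                 # Monte-Carlo bias estimate, cf. (5)

print(b)             # clearly nonzero: the estimator is not mean-unbiased
print(U.T @ W @ b)   # approx. 0: the C-unbiasedness condition (18) holds
```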

III-C Local C-unbiasedness

In this subsection, local C-unbiasedness conditions are derived by combining the local Lehmann-unbiasedness condition from (17) w.r.t. the WSE cost function with the parametric constraints.

Proposition 2.

Necessary conditions for an estimator $\hat{\boldsymbol{\theta}}$ to be a locally Lehmann-unbiased estimator in the vicinity of $\boldsymbol{\theta}_0 \in \Omega_{\boldsymbol{f}}$ w.r.t. the WSE cost function and under the constrained set in (1) are

$\mathbf{U}^T(\boldsymbol{\theta}_0)\,\mathbf{W}\,\mathbf{b}(\boldsymbol{\theta}_0) = \mathbf{0}$   (23)

and

$\left(\mathbf{u}_m^T(\boldsymbol{\theta}_0)\,\mathbf{W}\,\mathbf{B}(\boldsymbol{\theta}_0) + \mathbf{b}^T(\boldsymbol{\theta}_0)\,\mathbf{W}\,\mathbf{V}_m(\boldsymbol{\theta}_0)\right)\mathbf{U}(\boldsymbol{\theta}_0) = \mathbf{0}^T$   (24)

for $m = 1, \ldots, M-K$, where $\mathbf{V}_m(\boldsymbol{\theta})$ and $\mathbf{B}(\boldsymbol{\theta})$ are defined in (4) and (6), respectively.

Proof.

The proof is given in Appendix A. ∎

In particular, in case the MSE matrix trace is of interest, we substitute $\mathbf{W} = \mathbf{I}_M$ in (23)-(24) and the resulting local C-unbiasedness conditions are

$\mathbf{U}^T(\boldsymbol{\theta}_0)\,\mathbf{b}(\boldsymbol{\theta}_0) = \mathbf{0}$   (25)

and

$\left(\mathbf{u}_m^T(\boldsymbol{\theta}_0)\,\mathbf{B}(\boldsymbol{\theta}_0) + \mathbf{b}^T(\boldsymbol{\theta}_0)\,\mathbf{V}_m(\boldsymbol{\theta}_0)\right)\mathbf{U}(\boldsymbol{\theta}_0) = \mathbf{0}^T$   (26)

for $m = 1, \ldots, M-K$.
For any positive semidefinite matrix $\mathbf{W}$, it can be seen that if an estimator satisfies (14)-(15), then it also satisfies (23)-(24), but not vice versa. Thus, for any positive semidefinite weighting matrix $\mathbf{W}$, local C-unbiasedness is a weaker restriction than local $\Omega_{\boldsymbol{f}}$-unbiasedness and therefore, lower bounds on the WMSE of locally C-unbiased estimators may be lower than the corresponding CCRB. In Section V, we show examples in which the CML estimator is C-unbiased but not $\Omega_{\boldsymbol{f}}$-unbiased. In case some of the elements of $\boldsymbol{\theta}$ are considered as nuisance parameters, we can put zero weights on these elements in the weighting matrix $\mathbf{W}$. It can be seen that in this case, the local C-unbiasedness conditions from (23)-(24) are not affected by the bias function of a nuisance parameter estimator.

IV LU-CCRB

In this section, we derive the LU-CCRB, which is a new Cramér-Rao-type lower bound on the WMSE of locally C-unbiased estimators, where the WMSE is defined in (8). Properties of this bound are described in Subsection IV-B.

IV-A Derivation of LU-CCRB

In the following theorem, we derive the LU-CCRB on the WMSE of locally C-unbiased estimators. For the derivation we define a block matrix whose $m$th block is given by

(27)

$m = 1, \ldots, M-K$, where

(28)

is a matrix built from the derivative matrices in (4), $m = 1, \ldots, M-K$. Finally, we define the matrix

(29)

The proposed LU-CCRB is a lower bound on the WMSE of locally C-unbiased estimators, where the local C-unbiasedness conditions are given in (23)-(24). It is a family of scalar bounds, providing a different lower bound for each choice of weighting matrix.

Theorem 3.

Let $\hat{\boldsymbol{\theta}}$ be a locally C-unbiased estimator of $\boldsymbol{\theta}_0 \in \Omega_{\boldsymbol{f}}$ in the vicinity of $\boldsymbol{\theta}_0$ for a given positive semidefinite weighting matrix $\mathbf{W}$, and assume:

  1. Integration w.r.t. $\mathbf{x}$ and differentiation w.r.t. $\boldsymbol{\theta}$ at $\boldsymbol{\theta}_0$ can be interchanged.

  2. Each element of $\mathbf{U}(\boldsymbol{\theta})$ is differentiable w.r.t. $\boldsymbol{\theta}$ at $\boldsymbol{\theta}_0$.

  3. The matrix in (29) is finite.

Then,

$\mathrm{E}_{\boldsymbol{\theta}_0}\left[C_{\mathbf{W}}(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}_0)\right] \geq \mathrm{LU\text{-}CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0),$   (30)

where

(31)

Equality in (30) is obtained iff

(32)

where

(33)

for $m = 1, \ldots, M-K$.

Proof.

The proof is given in Appendix B. ∎

It can be seen that computation of the CCRB requires the evaluation of $\mathbf{U}(\boldsymbol{\theta})$ [36], while computation of the LU-CCRB requires the evaluation of $\mathbf{U}(\boldsymbol{\theta})$ and of the matrices $\mathbf{V}_m(\boldsymbol{\theta})$, $m = 1, \ldots, M-K$. The matrices $\mathbf{V}_m(\boldsymbol{\theta})$ can be evaluated numerically by using $\mathbf{F}(\boldsymbol{\theta})$ and $\mathbf{U}(\boldsymbol{\theta})$ and applying the product rule to (2) and (3).
In order to obtain a lower bound on the MSE matrix trace, we substitute $\mathbf{W} = \mathbf{I}_M$ and (3) in (31) and obtain

(34)

where

(35)

and

(36)

for $m = 1, \ldots, M-K$.

IV-B Properties of LU-CCRB

IV-B1 Relation to CCRB

In the following proposition, we give a condition under which the LU-CCRB and the corresponding CCRB coincide.

Proposition 4.

Assume that the bounds in (13) and (31) exist and that

(37)

Then,

$\mathrm{LU\text{-}CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0) = \mathrm{CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0).$   (38)
Proof.

By substituting (37) in (29), we obtain

(39)

Substituting (39) in (31) and using the equality [57, p. 22]

(40)

one obtains

(41)

By substituting the equality [57, p. 60]

(42)

with the appropriate matrices substituted in (41), we obtain

(43)

where the second equality is obtained by using trace and pseudo-inverse properties and substituting (13). ∎

It can be shown that (37) is satisfied, for example, for linear constraints,

$\mathbf{A}\boldsymbol{\theta} = \mathbf{c}.$   (44)

In this case, both the constraint gradient matrix, $\mathbf{F}(\boldsymbol{\theta}) = \mathbf{A}$, and the orthonormal null-space matrix, $\mathbf{U}$, from (2)-(3) are not functions of $\boldsymbol{\theta}$. Therefore, the derivatives of the elements of $\mathbf{U}$ w.r.t. $\boldsymbol{\theta}$ are zero, i.e. $\mathbf{V}_m = \mathbf{0}$, $m = 1, \ldots, M-K$, and it can be verified by using (27)-(28) that (37) is satisfied. Therefore, for linear constraints, the proposed LU-CCRB from (31) coincides with the corresponding CCRB from (13). In particular, for the linear Gaussian model under linear constraints, the CML estimator is an $\Omega_{\boldsymbol{f}}$-unbiased and C-unbiased estimator and achieves the CCRB [36] and the LU-CCRB.
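The intuition is that for linear constraints the tangent space does not rotate as $\boldsymbol{\theta}$ moves along the constraint set, whereas for a nonlinear constraint it does. A small sketch of this distinction (our own illustration; the SVD-based basis construction is one convenient choice):

```python
import numpy as np

def tangent_basis(F):
    """Orthonormal null-space basis of a constraint gradient F (via SVD)."""
    _, s, Vt = np.linalg.svd(F)
    return Vt[int(np.sum(s > 1e-12)):].T

# Linear constraint A theta = c: the gradient F = A is the same at every
# feasible point, so U is constant and V_m = 0 in (4), which gives (37).

# Norm constraint: F(theta) = 2 theta^T varies along the constraint set,
# so U(theta) rotates and its derivatives V_m are nonzero in general.
U1 = tangent_basis(2.0 * np.array([[3.0, 4.0]]))   # at theta = (3, 4)
U2 = tangent_basis(2.0 * np.array([[5.0, 0.0]]))   # at theta = (5, 0)
print(np.allclose(U1, U2))                         # False: the basis moved
```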

IV-B2 Order relation

In the following proposition, we show that for the general case, the proposed LU-CCRB from (31) is lower than or equal to the corresponding CCRB from (13).

Proposition 5.

Assume that the bounds in (13) and (31) exist and that (11) holds. Then,

$\mathrm{LU\text{-}CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0) \leq \mathrm{CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0).$   (45)
Proof.

The proof is given in Appendix C. ∎

The LU-CCRB requires local C-unbiasedness and the CCRB requires local $\Omega_{\boldsymbol{f}}$-unbiasedness, as mentioned in Subsections IV-A and II-B, respectively. Local $\Omega_{\boldsymbol{f}}$-unbiasedness is sufficient for local C-unbiasedness, which is the less restrictive condition, as mentioned in Subsection III-C. Therefore, the set of estimators for which the LU-CCRB is a lower bound contains the set of estimators for which the CCRB is a lower bound. This result elucidates the order relation in (45). In Section V, we show examples in which the CML estimator is C-unbiased but not $\Omega_{\boldsymbol{f}}$-unbiased. As a result, in these examples the LU-CCRB is a lower bound on the WMSE of the CML estimator, while the CCRB on the WMSE is not necessarily a lower bound in the non-asymptotic region. The considered examples indicate that C-unbiasedness and the proposed LU-CCRB are more appropriate than $\Omega_{\boldsymbol{f}}$-unbiasedness and the CCRB, respectively, for constrained parameter estimation.

IV-B3 Asymptotic properties

It is well known that under some conditions [36], the CCRB is attained asymptotically by the CML estimator. Consequently, the WMSE of the CML estimator asymptotically coincides with $\mathrm{CCRB}_{\mathbf{W}}(\boldsymbol{\theta}_0)$ from (13). Therefore, it is of interest to compare the LU-CCRB and the CCRB in the asymptotic regime, i.e. when the number of independent identically distributed (i.i.d.) observation vectors tends to infinity. In the following proposition, we show the asymptotic relation between these bounds.

Proposition 6.

Assume that the bounds in (13) and (31) exist and are nonzero. Then, given $N$ i.i.d. observation vectors,

$\lim_{N \to \infty} \frac{\mathrm{LU\text{-}CCRB}_{\mathbf{W}}^{(N)}(\boldsymbol{\theta}_0)}{\mathrm{CCRB}_{\mathbf{W}}^{(N)}(\boldsymbol{\theta}_0)} = 1.$   (46)
Proof.

For brevity, we remove the arguments of the functions that appear in this proof. Let $\mathbf{J}$ denote the FIM based on a single observation vector. Then, for $N$ i.i.d. observation vectors (see e.g. [3, p. 213])

$\mathbf{J}^{(N)} = N\mathbf{J}.$   (47)

By substituting (47) in (29), we obtain

(48)

Then, by substituting (48) in (31), we obtain the LU-CCRB on the WMSE based on $N$ i.i.d. observation vectors, which is given by

(49)

By applying (42) on the right hand side (r.h.s.) of (13) and using the properties of the trace and pseudo-inverse, it can be verified that

(50)

By using (40) and substituting (47) in (50), one obtains

(51)

Under the assumption that the single-observation FIM $\mathbf{J}$ exists, the remaining terms are bounded while the correction term is proportional to $1/N$. Therefore, letting $N \to \infty$, it can be verified from (49) and (51) that (46) is satisfied. ∎

From Proposition 6 it can be seen that the LU-CCRB and the CCRB asymptotically coincide. Consequently, under mild assumptions and similar to the CCRB, the LU-CCRB is asymptotically attained by the CML estimator. In addition, it should be noted that Proposition 6 can be generalized: the two bounds coincide in any case where the FIM increases (in a matrix inequality sense), for example due to increasing signal-to-noise ratio, while the derivative matrices remain bounded.
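The 1/N behavior is easy to see numerically: with $\mathbf{J}^{(N)} = N\mathbf{J}$ as in (47), the CCRB on the WMSE decays like $1/N$, and by Proposition 6 the LU-CCRB tracks it asymptotically. A sketch with an assumed single-observation FIM (toy numbers, our own illustration):

```python
import numpy as np

theta = np.array([3.0, 4.0])
_, _, Vt = np.linalg.svd(2.0 * theta[np.newaxis, :])
U = Vt[1:].T                        # null-space basis of the norm constraint
J1 = np.diag([2.0, 5.0])            # assumed single-observation FIM

for N in (1, 10, 100, 1000):
    JN = N * J1                     # FIM for N i.i.d. observations, as in (47)
    ccrb = np.trace(U @ np.linalg.pinv(U.T @ JN @ U) @ U.T)
    print(N, ccrb)                  # decays like 1/N
```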

V Examples

In this section, we evaluate the proposed LU-CCRB in two scenarios. In the first scenario, we consider an orthogonal linear model with a norm constraint and in the second scenario we consider complex amplitude estimation with an amplitude constraint and unknown frequency. For both scenarios, it is shown that the CCRB on the WMSE is not a lower bound on the WMSE of the CML estimator in the non-asymptotic region. In contrast, we show that the CML estimator is a C-unbiased estimator and thus, the proposed LU-CCRB is a lower bound on its WMSE. The CML estimator performance is computed using 10,000 Monte-Carlo trials.

V-A Linear model with norm constraint

We consider the following linear observation model:

$\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w},$   (52)

where $\mathbf{x} \in \mathbb{R}^N$ is an observation vector, $\mathbf{H} \in \mathbb{R}^{N \times M}$, $N \geq M$, is a known full-rank matrix, $\boldsymbol{\theta} \in \mathbb{R}^M$ is an unknown deterministic parameter vector, and $\mathbf{w}$ is a zero-mean Gaussian noise vector with known covariance matrix $\boldsymbol{\Sigma}$. It is assumed that $\boldsymbol{\theta}$ satisfies the norm constraint

$\boldsymbol{f}(\boldsymbol{\theta}) = \boldsymbol{\theta}^T\boldsymbol{\theta} - r^2 = 0,$   (53)

where $r$ is known. This constraint arises, for example, in regularization techniques [58, 59]. The CML estimator of $\boldsymbol{\theta}$ satisfies

$\hat{\boldsymbol{\theta}}_{\mathrm{CML}} = \arg\min_{\boldsymbol{\theta}:\,\boldsymbol{f}(\boldsymbol{\theta}) = 0}\ (\mathbf{x} - \mathbf{H}\boldsymbol{\theta})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \mathbf{H}\boldsymbol{\theta}),$   (54)

where $\boldsymbol{f}(\boldsymbol{\theta})$ is given in (53). By using (53), we obtain $\mathbf{F}(\boldsymbol{\theta}) = 2\boldsymbol{\theta}^T$ and thus, $\mathbf{U}(\boldsymbol{\theta})$ satisfies

$\boldsymbol{\theta}^T\mathbf{U}(\boldsymbol{\theta}) = \mathbf{0}^T,\quad \mathbf{U}^T(\boldsymbol{\theta})\,\mathbf{U}(\boldsymbol{\theta}) = \mathbf{I}_{M-1},$   (55)

where (55) stems from (2)-(3). In this example, we are interested in the trace of the MSE matrix and choose $\mathbf{W} = \mathbf{I}_M$.
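Under the additional simplifying assumptions that the noise is white ($\boldsymbol{\Sigma} = \sigma^2\mathbf{I}_N$) and $\mathbf{H}$ has orthonormal columns, the constrained likelihood maximization in (54) has a closed form: the least-squares solution scaled onto the sphere defined by (53). A minimal sketch of this special case (our own simplification, not the paper's general setting):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, r = 8, 3, 2.0
H, _ = np.linalg.qr(rng.standard_normal((N, M)))     # orthonormal columns
theta = np.array([r, 0.0, 0.0])                      # satisfies (53)
x = H @ theta + 0.3 * rng.standard_normal(N)         # model (52), white noise

theta_ls = H.T @ x                                   # unconstrained ML / LS
theta_cml = r * theta_ls / np.linalg.norm(theta_ls)  # scaled onto the sphere
print(theta_cml, np.linalg.norm(theta_cml))          # norm equals r, per (53)
```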
Under the model in (52), it can be shown that the FIM is given by

$\mathbf{J} = \mathbf{H}^T\boldsymbol{\Sigma}^{-1}\mathbf{H}.$   (56)

By using (55) and orthogonal projection matrix properties, we obtain

$\mathbf{U}(\boldsymbol{\theta})\,\mathbf{U}^T(\boldsymbol{\theta}) = \mathbf{I}_M - \frac{\boldsymbol{\theta}\boldsymbol{\theta}^T}{r^2}.$   (57)

By substituting (57) in (36), one obtains