# Median Confidence Regions in a Nonparametric Model

The problem of constructing confidence regions (CRs) for the median in the nonparametric measurement error model (NMEM) is considered. This problem arises in many settings, including inference about the median lifetime of a complex system in engineering, reliability, biomedical, and public health applications. Current methods of constructing CRs are discussed, including the T-statistic based CR and the Wilcoxon signed-rank statistic based CR, arguably the two default methods in applied work when a confidence interval about the center of a distribution is desired. Optimal equivariant CRs are developed with focus on subclasses of the class of all distributions. Applications to a real car mileage efficiency data set and Proschan's air-conditioning data set are demonstrated. Simulation studies were undertaken to compare the performances of the different CR methods. Results of these studies indicate that the sign-statistic based CR and the optimal CR focused on symmetric distributions satisfy the confidence level requirement, though they tended to have higher contents, while two of the bootstrap-based CR procedures and one of the developed adaptive CRs tended to be slightly more liberal but with smaller contents. A critical recommendation is that, under the NMEM, neither the T-statistic based nor the Wilcoxon signed-rank statistic based confidence regions should be used, since they have degraded confidence levels and/or inflated contents.

## 1 Introduction and Motivation

Given a univariate distribution function, the two most common measures of central tendency are the mean, provided it exists, and the median. The mean need not always exist, whereas the median always exists. Under symmetric distributions, and when the mean exists, the mean and the median coincide. This paper is concerned with making statistical inferences about the median of a distribution. A popular model leading to this problem is the so-called measurement error model. In this model $\Delta$ represents a quantity of interest which is unknown, and when one measures its value, the observed value is a realization of the random variable

$$X = \Delta + \epsilon, \tag{1}$$

where $\epsilon$ represents a measurement error with a continuous distribution $F$ whose median equals zero. As such, the distribution of $X$ is $F(\cdot - \Delta)$. Typically, $F$ is assumed to be a zero-mean normal distribution, but this assumption is not tenable in many situations. For instance, in dealing with event times in biomedical, reliability, engineering, economic, and social settings, the error distribution need not even be symmetric. This is also the case when dealing with economic indicators such as per capita income, retirement savings, etc. As such, a general model is to simply assume that the error distribution $F$ belongs to the class of all continuous distributions with medians equal to zero. This class will be denoted by .

Another situation where this problem arises is when dealing with a complex engineering system, such as the motherboard of a laptop computer or some technologically-advanced car (e.g., a Tesla Model S sedan). Such a system will be composed of many different components configured according to some structure function, with the components having different failure-time distributions and some of them possibly acting dependently on each other. Of main interest for such a system will be its time-to-failure (also called lifetime) denoted by . Because of the complexity of the system, it may not be feasible to analyze the distribution of by taking into account each of the failure time distributions of the components and the system’s structure function which represents the configuration of the components to form the system. Thus a simplified and practically feasible viewpoint is to assume that the system’s life distribution is some continuous distribution . One may then be interested in the median of this distribution .

Thus, in these situations, the observable random variable $X$ is assumed to have distribution $F(\cdot - \Delta)$, with $F$ a continuous distribution with median zero and $\Delta$ being the median of $X$. This will be referred to as the one-population nonparametric measurement error model, abbreviated NMEM. This is the simplest among the measurement error models. The goal is to infer about the parameter of interest $\Delta$, with $F$ acting as an infinite-dimensional nuisance parameter. We shall be interested in this paper in the construction of a confidence region (CR) for $\Delta$ based on a random sample of observations of $X$. This is a classic problem: the construction of a confidence interval for the median was already discussed in [14]. More generally, quantiles instead of just the median may be of interest, and the methods developed here could be adapted to inference about general quantiles.

Arguably, confidence regions for a parameter are preferable to point estimates since they simultaneously address how close the estimate is to the truth (measured through the content of the region) and how sure one is about such closeness (measured by the confidence coefficient). For more discussion on the desirability of confidence regions see, for instance, the introduction in [3] and chapter 5 in [2]. Of course, one could typically accompany a point estimate (PE) by an estimate of its standard error (ESE), but then the user still needs to deduce closeness and sureness based on the PE and the ESE, usually a non-trivial matter if it is to be done properly.

We introduce some notations and definitions. Let be independent and identically distributed (IID) random variables (a random sample) from , where and . The mathematical problem is to construct a confidence region (CR) for the parameter with an infinite-dimensional nuisance parameter. Denote by the range space of which will be endowed with a -field . We also denote by the Borel -field of , and this will be endowed with the -field of subsets of consisting of its countable and co-countable subsets, with this -field denoted by .

###### Definition 1

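Fix an $\alpha \in (0,1)$. Let $X = (X_1, X_2, \ldots, X_n)$ be IID from $F(\cdot - \Delta)$. A measurable mapping $\Gamma$ is called a region estimator or confidence region (CR) for $\Delta$ if $P_{(F,\Delta)}\{\Delta \in \Gamma(X)\} \ge 1 - \alpha$ for every $(F, \Delta)$.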

Remark: In later developments, we will allow the CR to also depend on a randomizer $U$, a standard uniform random variable independent of $X$. This is to be able to achieve exactly the desired confidence level $1-\alpha$. In such a case, $\Gamma(x,u)$ will be the realized CR when $X = x$ and $U = u$. However, even if we allow for randomized CRs, we will usually suppress writing the $U$ in $\Gamma(X,U)$ and simply write $\Gamma(X)$.

Aside from satisfying the desired confidence coefficient in Definition 1, the quality of a CR depends on some measure of its content. Let $\nu$ be Lebesgue measure on $\mathbb{R}$. We will measure the content of a CR $\Gamma$ for $\Delta$ via

$$C[\Gamma; (F(\cdot), \Delta)] = E_{(F(\cdot),\Delta)}\, \nu[\Gamma(X)]. \tag{2}$$
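As a concrete illustration of Definition 1 and the content measure in (2), the following sketch estimates by Monte Carlo the coverage and mean length of one familiar region, the order-statistic interval $[X_{(j)}, X_{(n-j+1)})$ associated with the sign statistic. It is not the procedure developed in this paper; the function names, the shifted-exponential error distribution, and the values $n = 20$ and $\Delta = 2$ are assumptions chosen only for illustration.

```python
import math
import random


def binom_pmf(k, n):
    """b(k; n, 1/2), the Binomial(n, 1/2) probability mass at k."""
    return math.comb(n, k) * 0.5 ** n


def sign_interval_rank(n, alpha):
    """Largest j >= 1 for which [X_(j), X_(n-j+1)) has exact coverage >= 1 - alpha.

    The median lies in that interval exactly when j <= B <= n - j, where B is
    the number of observations at or below the median, B ~ Binomial(n, 1/2).
    Returns None when no such j exists (n too small).
    """
    best = None
    for j in range(1, n // 2 + 1):
        coverage = sum(binom_pmf(k, n) for k in range(j, n - j + 1))
        if coverage >= 1 - alpha:
            best = j
    return best


def simulate(n=20, alpha=0.05, delta=2.0, reps=4000, seed=1):
    """Monte Carlo estimates of coverage and mean Lebesgue content (length)."""
    rng = random.Random(seed)
    j = sign_interval_rank(n, alpha)
    hits, total_length = 0, 0.0
    for _ in range(reps):
        # Illustrative asymmetric errors: standard exponential shifted so
        # that its median is zero (assumption, not from the paper).
        x = sorted(delta + rng.expovariate(1.0) - math.log(2.0) for _ in range(n))
        lower, upper = x[j - 1], x[n - j]  # X_(j) and X_(n-j+1)
        hits += lower <= delta < upper
        total_length += upper - lower
    return hits / reps, total_length / reps
```

Calling `simulate()` returns the pair (estimated coverage, estimated mean content); the first should sit near the exact binomial coverage, while the second depends on the error distribution, which is the point of measuring content separately from confidence level.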

In Definition 2 below we have the notion of uniformly best CRs. Our goal is to determine those CRs for $\Delta$ that possess such optimality properties.

###### Definition 2

Let be a subclass of . A CR for is a uniformly best CR for under the subclass if for any other CR ,

 C[Γ∗;(F(⋅),Δ)]≤C[Γ;(F(⋅),Δ)]

for all . If , then will be said to be the uniformly best CR for .

The major contribution of this work is the development of randomized region estimators or confidence regions, possibly approximate, for the median under the NMEM, of the form

$$\Gamma(X,U) = \left[\bigcup_{\{k \in \{0,1,\ldots,n\}:\ b(k;n,1/2) > c^{*}\hat{l}(k)\}} [X_{(k)}, X_{(k+1)})\right] \bigcup\ \{U \le \gamma\}\left[\bigcup_{\{k \in \{0,1,\ldots,n\}:\ b(k;n,1/2) = c^{*}\hat{l}(k)\}} [X_{(k)}, X_{(k+1)})\right],$$

where $\hat{l}(k)$ is an appropriate estimator of $l(k;F)$ in (7). The randomizer $U$ is a standard uniform random variable, while $c^{*}$ is the infimum over all $t$ satisfying $P\{b(B;n,1/2) > t\,\hat{l}(B)\} \le 1-\alpha$, where $B$ is a binomial random variable with parameters $n$ and $1/2$. A specific form of $\hat{l}(k)$ that leads to a reasonable CR is given by

$$\hat{l}(k) = \binom{n}{k} \int_{-\infty}^{\infty} \hat{F}(w)^{k}\,[1-\hat{F}(w)]^{n-k}\,dw,$$

where $\hat{F}$ is the empirical distribution function of the observations centered at $\hat{\Delta}$, with $\hat{\Delta}$ being the sample median. The specific CR above will be developed in section 5. Prior to the development of the specific CRs, in section 2 we utilize invariance ideas to derive the general form of the almost-optimal equivariant CR for $\Delta$ under the NMEM but still under the assumption that $F$ is known. Then, we address the question of how to deal with the fact that $F$ is not actually known, leading to the region estimator above. Two other region estimators, focused on the class of symmetric distributions and the class of negative exponential distributions but still valid for the general NMEM, will be developed in sections 3 and 4, respectively. In the simulation studies, the procedure focused on symmetric distributions performed quite robustly under varied distributions (even non-symmetric ones) in terms of coverage probability, and it had mean content superior to the procedure based on the sign statistic. Prior to studying the performance of these new region estimators, we briefly describe existing ('off-the-shelf') region estimators for the median in section 6. We then demonstrate these different region estimators by applying them to two data sets in section 7. Section 8 presents the results of simulation studies comparing the performances of these region estimators under different underlying distributions by examining their mean contents and their achieved confidence levels. These comparisons are also of major importance since they indicate which CR procedures should be preferred and which should not be used under the NMEM. Section 9 provides some concluding remarks.
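Because $\hat{F}$ is a step function, the integral defining $\hat{l}(k)$ reduces to a finite sum over the sample spacings. The following is a minimal sketch, assuming $\hat{F}$ is the ordinary empirical distribution function of the (possibly centered) observations; the helper name `l_hat` is hypothetical. Shifting all observations by a constant leaves the value unchanged, so centering at the sample median does not affect the result. For $k = 0$ and $k = n$ the integral diverges, so the sketch returns infinity there.

```python
import math


def l_hat(k, sample):
    """Plug-in value of C(n,k) * integral of Fhat(w)^k [1 - Fhat(w)]^(n-k) dw.

    Fhat is the empirical distribution function of `sample`, which equals
    i/n on [x_(i), x_(i+1)); outside [x_(1), x_(n)) the integrand vanishes
    whenever 1 <= k <= n - 1.
    """
    x = sorted(sample)
    n = len(x)
    if k == 0 or k == n:
        return math.inf  # integrand equals 1 on an unbounded half-line
    total = 0.0
    for i in range(1, n):
        p = i / n
        # x[i] - x[i - 1] is the spacing x_(i+1) - x_(i) in 1-based notation.
        total += p ** k * (1 - p) ** (n - k) * (x[i] - x[i - 1])
    return math.comb(n, k) * total
```

For instance, `l_hat(2, [0, 1, 2, 3])` evaluates the sum over the three unit spacings of a sample of size four.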

## 2 Development of Optimal CRs

### 2.1 Invariant Models and Equivariant CRs

We first review the notions of invariant statistical models and equivariant CRs (see, for instance, [9]). We do this review in a more general framework than the concrete NMEM which is the focus of this paper. We note that sufficiency and invariance were major ideas utilized by Peter Hooper in several of his papers dealing with confidence sets and prediction sets, cf., [7, 6, 8].

Let be an observable random element taking values in a sample space . The class of probability models governing is which consists of probability measures ’s on the measurable space , with a suitable -field of subsets of . Let be a functional, with being the parameter of interest. A confidence region for is a set-valued mapping , where is a class of subsets of such that

 P{τ(P)∈Γ(X)}≥1−α, ∀P∈P. (3)

Let be a family of transformations on that forms a group under an operation and with identity element . Let be a group of transformations on such that there exists a homomorphism and let be the identity element in . The statistical model is said to be -invariant if

 P{gX∈A}=¯gP{X∈A},∀g∈G;A∈F. (4)

In addition, let $\tilde{\mathcal{G}}$ be a group of transformations on the range of $\tau$ such that there exists a homomorphism $g \mapsto \tilde{g}$. The parametric functional $\tau$ is said to be equivariant if $\tau(\bar{g}P) = \tilde{g}\,\tau(P)$ for all $g$ and $P$. Employing a decision-theoretic framework, define a loss function given by

$$L(\tau, C) = 1 - I\{\tau \in C\}.$$

We shall say that the loss function is invariant if $L(\tilde{g}\tau, \tilde{g}C) = L(\tau, C)$ for every $g$, $\tau$, and $C$. Given a confidence region $\Gamma$, its risk function is

$$R(P,\Gamma) \equiv E_P\{L(\tau(P),\Gamma(X))\} = 1 - P\{\tau(P) \in \Gamma(X)\}.$$

As such, the condition (3) for a confidence region is equivalent to having $R(P,\Gamma) \le \alpha$ for every $P$. When an invariant statistical model is coupled with an invariant loss function, then we would say that the statistical problem of constructing a confidence region is invariant. A confidence region $\Gamma$ is then said to be equivariant if for every $g$ and $x$, we have that

$$\Gamma(gx) = \tilde{g}\,\Gamma(x) \equiv \{\tilde{g}t : t \in \Gamma(x)\}.$$

The Principle of Invariance then dictates that we should only utilize equivariant confidence regions.

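For an invariant confidence region problem, if $\Gamma$ is equivariant, then we have that, for every $g$,

$$\begin{aligned}
P\{\tau(P) \in \Gamma(X)\} &= E_P\{1 - L(\tau(P), \Gamma(X))\} = E_P\{1 - L(\tilde{g}\tau(P), \tilde{g}\Gamma(X))\} \\
&= E_P\{1 - L(\tau(\bar{g}P), \Gamma(gX))\} = E_{\bar{g}P}\{1 - L(\tau(\bar{g}P), \Gamma(X))\} \\
&= (\bar{g}P)\{\tau(\bar{g}P) \in \Gamma(X)\}.
\end{aligned}$$

Furthermore, if the group $\bar{\mathcal{G}}$ is transitive over $\mathcal{P}$, meaning that for any given $P_0 \in \mathcal{P}$ we have $\{\bar{g}P_0 : g \in \mathcal{G}\} = \mathcal{P}$, then it suffices to consider an arbitrary element $P_0 \in \mathcal{P}$ to determine $P\{\tau(P) \in \Gamma(X)\}$ for all $P \in \mathcal{P}$, since this equals the value using the arbitrary $P_0$.

Recall that we also need to measure the quality of a confidence region by measuring its content using the quantity $C(P,\Gamma) = E_P[\nu(\Gamma(X))]$, where $\nu$ is a measure on the range of $\tau$, e.g., Lebesgue measure. We seek those confidence regions with small $C(P,\Gamma)$. Observe that for an equivariant $\Gamma$ in an invariant statistical model, we have for every $g$ that

$$C(P,\Gamma) = E_P[\nu(\Gamma(X))] = E_P[\nu(\tilde{g}^{-1}\Gamma(gX))] = E_{\bar{g}P}[\nu(\tilde{g}^{-1}\Gamma(X))].$$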

If it so happens that for all and and for some , then there is the possibility of finding a $\Gamma$ that satisfies the required confidence level and minimizes the content. We shall call this condition quasi-invariance of $\nu$ with respect to $\tilde{\mathcal{G}}$. However, if quasi-invariance of $\nu$ does not hold, then a uniformly best confidence region may not exist. But a uniformly best confidence region on a subfamily may still exist among the class of confidence regions over . In [7, 6] quasi-invariance of the measure was imposed, but in some settings this may be unnatural, such as in the NMEM under consideration in the current paper.

### 2.2 Towards Optimal CRs for the Median

Consider now the problem of constructing a CR for the median under the NMEM:

Prior to invoking invariance, we first reduce via the Sufficiency Principle. Thus, we may assume that the observable random vector is $X_{()} = (X_{(1)}, X_{(2)}, \ldots, X_{(n)})$, the vector of order statistics, which is a complete sufficient statistic. The appropriate sample space is therefore $\mathcal{X} = \{x_{()} \in \mathbb{R}^n : x_{(1)} < x_{(2)} < \cdots < x_{(n)}\}$. A word on our notation: even though we had reduced to $X_{()}$, in the sequel, when we write $P_{(F,\Delta)}$ or $E_{(F,\Delta)}$, this means that the common distribution of the original $X_i$'s is $F(\cdot - \Delta)$. For measuring the content of a region for $\Delta$ we use Lebesgue measure $\nu$ on $\mathbb{R}$.

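The first invariance reduction is obtained through location-invariance. The problem is invariant with respect to translations, with the groups of transformations being, for every $c \in \mathbb{R}$,

$$x_{()} \mapsto x_{()} + c; \quad (F,\Delta) \mapsto (F, \Delta + c); \quad \text{and} \quad \theta \mapsto \theta + c.$$

A CR $\Gamma$ is location-equivariant if, for every $c \in \mathbb{R}$, $\Gamma(x_{()} + c) = \Gamma(x_{()}) + c$, where $A + c = \{w + c : w \in A\}$. Observe that for a location-equivariant $\Gamma$, we have for every $c \in \mathbb{R}$ that

$$\begin{aligned}
P_{(F,\Delta)}\{\Delta \in \Gamma(X_{()})\} &= P_{(F,\Delta)}\{\Delta \in \Gamma(X_{()} + c) - c\} \\
&= P_{(F,\Delta)}\{\Delta + c \in \Gamma(X_{()} + c)\} = P_{(F,\Delta+c)}\{\Delta + c \in \Gamma(X_{()})\} \\
&= P_{(F,0)}\{0 \in \Gamma(X_{()})\}
\end{aligned}$$

by taking $c = -\Delta$ to obtain the last equality. The problem has thus been reduced to considering $X_{()}$ to be the order statistics from $F$, and we seek a location-equivariant $\Gamma$ such that, for every $F$, $P_F\{0 \in \Gamma(X_{()})\} \ge 1 - \alpha$. In addition, we seek to minimize $E_F\,\nu[\Gamma(X_{()})]$ over all $F$. Note that Lebesgue measure on $\mathbb{R}$ is location-invariant, that is, $\nu(A + c) = \nu(A)$ for every $c \in \mathbb{R}$ and Borel set $A$.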

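We remark at this stage that if we know the distribution $F$, then we could determine the optimal CR for $\Delta$ under this known distribution and no further invariance reduction will be needed. To demonstrate, suppose that $F$ is the normal distribution with mean zero and variance $\sigma^2$, which could be taken to be $1$, so $F = \Phi$, with $\Phi$ the standard normal distribution function (we also let $\phi$ denote the standard normal density function). Then, we seek a location-equivariant $\Gamma^*$ satisfying $P_\Phi\{0 \in \Gamma^*(X_{()})\} \ge 1-\alpha$ and with $E_\Phi\,\nu[\Gamma^*(X_{()})]$ minimized. Under $\Phi$, the joint density function of $X_{()}$ is given by $f(x_{()}) = n! \prod_{i=1}^{n} \phi(x_{(i)})$ on $\mathcal{X}$. Thus, we want $\int_{\mathcal{X}} I\{0 \in \Gamma^*(x_{()})\}\, f(x_{()})\,dx_{()} \ge 1-\alpha$. On the other hand, we obtain

$$\begin{aligned}
E_\Phi \int_{\mathbb{R}} I\{w \in \Gamma^*(X_{()})\}\,dw &= \int_{\mathcal{X}} \int_{\mathbb{R}} I\{w \in \Gamma^*(x_{()})\}\, f(x_{()})\,dw\,dx_{()} \\
&= \int_{\mathcal{X}} \int_{\mathbb{R}} I\{0 \in \Gamma^*(x_{()} - w)\}\, f(x_{()})\,dw\,dx_{()} = \int_{\mathcal{X}} I\{0 \in \Gamma^*(x_{()})\}\, h(x_{()})\,dx_{()}
\end{aligned}$$

where

$$h(x_{()}) = \frac{n!\,(2\pi)^{-(n-1)/2}}{\sqrt{n}} \exp\left\{-\frac{1}{2}\sum_{i=1}^{n} (x_{(i)} - \bar{x})^2\right\}$$

is obtained after the obvious change-of-variables. The problem is then to find a location-equivariant $\Gamma^*$ that will minimize $\int_{\mathcal{X}} I\{0 \in \Gamma^*(x_{()})\}\, h(x_{()})\,dx_{()}$ subject to the condition that $\int_{\mathcal{X}} I\{0 \in \Gamma^*(x_{()})\}\, f(x_{()})\,dx_{()} \ge 1-\alpha$. The solution to this constrained minimization problem (see the optimization result in Theorem 2) is the well-known confidence interval for the normal mean given by

$$\Gamma^*(X_{()}) = \left[\bar{X} - \frac{z_{\alpha/2}}{\sqrt{n}},\ \bar{X} + \frac{z_{\alpha/2}}{\sqrt{n}}\right],$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of $\Phi$.

However, since $F$ is known only to belong to the class of continuous distributions with median zero, a further invariance reduction is needed. This is achieved through strictly increasing continuous transformations with $0$ as a fixed point. Let $\mathcal{M}$ denote the collection of functions $m$ that are strictly increasing and continuous on $\mathbb{R}$ with $m(0) = 0$. The groups of transformations are given by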

 x()↦(m(x(1)),m(x(2)),…,m(x(n))); F↦Fm−1; and θ↦m(θ).

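A CR $\Gamma$ is then equivariant with respect to these groups of transformations if

$$\Gamma(m(x_{(1)}), \ldots, m(x_{(n)})) = m\,\Gamma(x_{(1)}, \ldots, x_{(n)}) \equiv \{m(w) : w \in \Gamma(x_{()})\},$$

so that for every $m \in \mathcal{M}$ and $x_{()}$, we have $\Gamma(x_{()}) = m^{-1}\Gamma(m(x_{()}))$. We then have that, for every $F$ and $m \in \mathcal{M}$,

$$\begin{aligned}
P_F\{0 \in \Gamma(X_{()})\} &= P_F\{0 \in \Gamma(m^{-1}m(X_{()}))\} = P_F\{0 \in m^{-1}\Gamma(m(X_{()}))\} \\
&= P_F\{0 \in \Gamma(m(X_{()}))\} \quad \text{since } m(0) = 0 \\
&= P_{Fm^{-1}}\{0 \in \Gamma(X_{()})\}.
\end{aligned}$$

Observe, however, that

$$E_F\,\nu[\Gamma(X_{()})] = E_F\,\nu[m^{-1}\Gamma(m(X_{()}))] = E_{Fm^{-1}}\,\nu[m^{-1}\Gamma(X_{()})]$$

and we do not have in this situation quasi-invariance of the measure $\nu$ with respect to the groups of monotone transformations.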

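The group of transformations $F \mapsto Fm^{-1}$ with $m \in \mathcal{M}$ is transitive over the class of error distributions. Thus, we may simply pick an arbitrary $F_0$ in this class, which could be taken to be a uniform distribution with median zero. Indeed, for any $F$ in the class, with $m = F_0^{-1} \circ F$ we have $Fm^{-1} = F_0$. Thus,

$$P_F\{0 \in \Gamma(X_{()})\} = P_{F_0}\{0 \in \Gamma(X_{()})\} \quad \text{and} \quad E_F\,\nu[\Gamma(X_{()})] = E_{F_0}\,\nu[m^{-1}\Gamma(X_{()})].$$

We emphasize again that in the second equation we could not drop the term $m^{-1}$ nor factor it out from inside the measure. This will prevent us from obtaining a confidence region for $\Delta$ that is uniformly best over the whole class of error distributions.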

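Next, we obtain a representation of $\Gamma$ by choosing a specific member of $\mathcal{M}$ that depends on $x_{()}$. For an $x_{()}$, define $m(x_{()})$ for via

$$m(x_{()})(w) = \sum_{i=1}^{n} I\{x_{(i)} - x_{(n)} \le w\} - n,$$

and for define it such that it is strictly increasing and continuous over all of $\mathbb{R}$. Observe that for $j = 1, 2, \ldots, n$,

$$m(x_{()})(x_{(j)} - x_{(n)}) = j - n \quad \text{and} \quad m(x_{()})(-x_{(n)}) = B(x_{()}) - n,$$

where $B(x_{()}) = \sum_{i=1}^{n} I\{x_{(i)} \le 0\}$. Note that and observe that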

###### Lemma 1

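With $m(x_{()})$ defined as above, a location-equivariant (LE) and $\mathcal{M}$-equivariant (ME) $\Gamma$ has representation

$$\Gamma(x_{()}) = m(x_{()})^{-1}[\Gamma_0 - n] + x_{(n)} \tag{5}$$

where $\Gamma_0 = \Gamma(1, 2, \ldots, n)$ is some region in $\mathbb{R}$. Thus, the LE and ME $\Gamma$'s are determined by $\Gamma_0$'s, which are subsets of $\mathbb{R}$. In fact, given a $\Gamma_0$, we have

$$\Gamma(x_{()}) = \bigcup_{k \in [\Gamma_0 \cap \{0,1,\ldots,n\}]} [x_{(k)}, x_{(k+1)}) \tag{6}$$

whose Lebesgue measure is $\nu[\Gamma(x_{()})] = \sum_{k \in \Gamma_0 \cap \{0,1,\ldots,n\}} [x_{(k+1)} - x_{(k)}]$.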

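Proof: We utilize the location-equivariance (LE) and $\mathcal{M}$-equivariance (ME) of $\Gamma$. We have

$$\begin{aligned}
\Gamma(x_{()}) &= \Gamma(x_{()} - x_{(n)}) + x_{(n)} && \text{(by LE)} \\
&= m(x_{()})^{-1}\,\Gamma\big(m(x_{()})(x_{()} - x_{(n)})\big) + x_{(n)} && \text{(by ME)} \\
&= m(x_{()})^{-1}\,\Gamma(1-n, 2-n, \ldots, (n-1)-n, n-n) + x_{(n)} \\
&= m(x_{()})^{-1}\,[\Gamma(1, 2, \ldots, n-1, n) - n] + x_{(n)} && \text{(again, by LE)} \\
&= m(x_{()})^{-1}\,[\Gamma_0 - n] + x_{(n)},
\end{aligned}$$

where $\Gamma_0 = \Gamma(1, 2, \ldots, n)$. To establish (6), given a $\Gamma_0$, observe that

$$\begin{aligned}
\{0 \in \Gamma(x_{()})\} &\iff \{0 \in m(x_{()})^{-1}[\Gamma_0 - n] + x_{(n)}\} \\
&\iff \{m(x_{()})(-x_{(n)}) \in \Gamma_0 - n\} \\
&\iff \{(B(x_{()}) - n) \in (\Gamma_0 - n)\} \\
&\iff \{B(x_{()}) \in \Gamma_0\}.
\end{aligned}$$

It now follows that

$$\begin{aligned}
\{w \in \Gamma(x_{()})\} &\iff \{0 \in \Gamma(x_{()} - w)\} && \text{[by LE property]} \\
&\iff \{B(x_{()} - w) \in \Gamma_0\} && \text{[preceding result]} \\
&\iff \left\{ w \in \bigcup_{k \in \Gamma_0 \cap \{0,1,\ldots,n\}} \{v : B(x_{()} - v) = k\} \right\} \\
&\iff \left\{ w \in \bigcup_{k \in \Gamma_0 \cap \{0,1,\ldots,n\}} [x_{(k)}, x_{(k+1)}) \right\}.
\end{aligned}$$

Thus, given a $\Gamma_0$, $\Gamma(x_{()}) = \bigcup_{k \in \Gamma_0 \cap \{0,1,\ldots,n\}} [x_{(k)}, x_{(k+1)})$, establishing (6). The last result about the Lebesgue measure of $\Gamma(x_{()})$ is immediate since the intervals are disjoint.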
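The representation (6) translates directly into code: a set $\Gamma_0$ of indices selects which inter-order-statistic intervals make up the region. The sketch below assumes the conventions $x_{(0)} = -\infty$ and $x_{(n+1)} = +\infty$ and merges adjacent intervals; `region_from_gamma0` and `lebesgue_measure` are hypothetical helper names, not part of the paper.

```python
import math


def region_from_gamma0(order_stats, gamma0):
    """Union over k in gamma0 ∩ {0,...,n} of [x_(k), x_(k+1)).

    Assumes x_(0) = -inf and x_(n+1) = +inf. Returns a list of (left, right)
    pairs, each denoting the half-open interval [left, right), with
    adjacent pieces merged.
    """
    n = len(order_stats)
    x = [-math.inf] + sorted(order_stats) + [math.inf]
    intervals = []
    for k in sorted(set(gamma0) & set(range(n + 1))):
        left, right = x[k], x[k + 1]
        if intervals and intervals[-1][1] == left:
            # Consecutive indices share an endpoint, so extend the last piece.
            intervals[-1] = (intervals[-1][0], right)
        else:
            intervals.append((left, right))
    return intervals


def lebesgue_measure(intervals):
    """Total length of a list of disjoint half-open intervals."""
    return sum(right - left for left, right in intervals)
```

A contiguous index set such as $\{1, 2\}$ yields a single interval, whereas a set with gaps yields a region that is not an interval, and including $k = 0$ or $k = n$ makes the content infinite.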

### 2.3 Optimal CRs

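Next, we tackle the problem of choosing an 'optimal' (properly defined) region $\Gamma_0$, which then determines $\Gamma$ via the representation in Lemma 1. Recall that the goal is to find $\Gamma$ such that $P_{F_0}\{0 \in \Gamma(X_{()})\} \ge 1-\alpha$ and, for every $F$, $E_F\,\nu[\Gamma(X_{()})]$ is minimized or, if this is not possible, made small. Note that under $F_0$, $B = B(X_{()})$ has a binomial distribution with parameters $n$ and $1/2$, with associated probability mass function $b(k;n,1/2) = \binom{n}{k} 2^{-n}$, $k = 0, 1, \ldots, n$. We have that

$$1 - \alpha \le P_{F_0}\{0 \in \Gamma(X_{()})\} = P_{F_0}\{B \in \Gamma_0\} = \sum_{k=0}^{n} I\{k \in \Gamma_0\}\, b(k;n,1/2) = \sum_{k=0}^{n} \delta_0(k)\, b(k;n,1/2)$$

with the first equality obtained using the portion of the proof of Lemma 1 and where we define $\delta_0(k) = I\{k \in \Gamma_0\}$. The expected Lebesgue measure of $\Gamma(X_{()})$ is

$$E_F\,\nu[\Gamma(X_{()})] = \sum_{k=0}^{n} I\{k \in \Gamma_0\}\,[E_F(X_{(k+1)}) - E_F(X_{(k)})] = \sum_{k=0}^{n} \delta_0(k)\, l(k;F)$$

where, with $X_{(0)} = -\infty$ and $X_{(n+1)} = \infty$,

$$l(k;F) = E_F(X_{(k+1)}) - E_F(X_{(k)}), \quad k = 0, 1, \ldots, n. \tag{7}$$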
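The spacings $l(k;F)$ in (7) can be checked numerically for a specific $F$. The sketch below uses the standard identity $E_F(X_{(k+1)}) - E_F(X_{(k)}) = \binom{n}{k}\int F(w)^k[1-F(w)]^{n-k}\,dw$ for $1 \le k \le n-1$ and compares a midpoint-rule evaluation against the closed form $1/(n-k)$ that holds for standard exponential observations. The exponential choice, the function names, and the integration limits are illustrative assumptions; since $l(k;F)$ is unaffected by location shifts, not centering the exponential at its median is harmless here.

```python
import math


def expected_spacing_exponential(k, n):
    """E X_(k+1) - E X_(k) for n IID standard exponentials, 1 <= k <= n - 1.

    The spacing X_(k+1) - X_(k) is exponential with rate n - k.
    """
    return 1.0 / (n - k)


def expected_spacing_integral(k, n, cdf, lower, upper, steps=100_000):
    """C(n,k) * integral over [lower, upper] of F^k (1 - F)^(n-k), midpoint rule."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        p = cdf(lower + (i + 0.5) * h)
        total += p ** k * (1.0 - p) ** (n - k)
    return math.comb(n, k) * total * h


def exponential_cdf(w):
    """Standard exponential distribution function."""
    return 1.0 - math.exp(-w) if w > 0 else 0.0
```

With $n = 10$ and $k = 3$, both routes should agree on a value close to $1/7$, provided the upper integration limit is large enough for the tail to be negligible.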

Assume first that we know the values of $l(k;F)$, $k = 0, 1, \ldots, n$. We now allow for randomized confidence regions in order to achieve optimality, that is, we allow $\Gamma_0$ and $\delta_0$ to depend on a randomizer $U$, which is a standard uniform random variable independent of the $X_i$'s. We remark that in [7, 6, 8] randomized procedures were also allowed to enable achieving optimality, similarly to the Neyman-Pearson theory of most powerful tests (cf., [9]).

Define the right-continuous, non-increasing, $[0,1]$-valued function of $t$,

$$G(t) = P\{b(B;n,1/2) > t\,l(B;F)\} = \sum_{k=0}^{n} I\{b(k;n,1/2) > t\,l(k;F)\}\, b(k;n,1/2).$$

For a given $\alpha \in (0,1)$, define

$$c = \inf\{t : G(t) \le 1-\alpha\} \quad \text{and} \quad \gamma = \frac{(1-\alpha) - G(c)}{G(c-) - G(c)}.$$

Define the function $\delta_0^*$ over $\{0,1,\ldots,n\} \times [0,1]$ via

$$\delta_0^*((k,u)) = I\{b(k;n,1/2) > c\,l(k;F)\} + I\{b(k;n,1/2) = c\,l(k;F)\}\, I\{u \le \gamma\}.$$

The function $\delta_0^*$ then satisfies

###### Theorem 2

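Let $\delta_0^*$ be as defined above and let $l(k;F)$ be as defined in (7). Then $E\{\delta_0^*(B,U)\} = 1-\alpha$. Furthermore, if $\delta_0$ is any other $\{0,1\}$-valued function on $\{0,1,\ldots,n\} \times [0,1]$ with $E\{\delta_0(B,U)\} \ge 1-\alpha$, then

$$E\left\{\sum_{k=0}^{n} [\delta_0^*(k,U)\, l(k;F)]\right\} \le E\left\{\sum_{k=0}^{n} [\delta_0(k,U)\, l(k;F)]\right\},$$

where the expectation is with respect to the randomizer $U$.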

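Proof: From the form of $\delta_0^*$, we have

$$\begin{aligned}
E_F[\delta_0^*(B,U)] &= P\{b(B;n,1/2) > c\,l(B;F)\} + \gamma\, P\{b(B;n,1/2) = c\,l(B;F)\} \\
&= G(c) + \left[\frac{(1-\alpha) - G(c)}{G(c-) - G(c)}\right][G(c-) - G(c)] = 1-\alpha.
\end{aligned}$$

Let $\delta_0$ be any other function on $\{0,1,\ldots,n\} \times [0,1]$ with $E\{\delta_0(B,U)\} \ge 1-\alpha$. From the definition of $\delta_0^*$, we observe that for each $(k,u)$,

$$[b(k;n,1/2) - c\,l(k;F)]\,[\delta_0^*(k,u) - \delta_0(k,u)] \ge 0.$$

Summing over $k$ and integrating over $u$, we find that

$$E_F\{\delta_0^*(B,U) - \delta_0(B,U)\} \ge c\left[\sum_{k=0}^{n} \int_0^1 l(k;F)\,\delta_0^*(k,u)\,du - \sum_{k=0}^{n} \int_0^1 l(k;F)\,\delta_0(k,u)\,du\right].$$

Since $c > 0$ and by the condition on $\delta_0$ we have $E_F\{\delta_0^*(B,U) - \delta_0(B,U)\} \le 0$, then

$$\sum_{k=0}^{n} \int_0^1 l(k;F)\,\delta_0^*(k,u)\,du \le \sum_{k=0}^{n} \int_0^1 l(k;F)\,\delta_0(k,u)\,du$$

which completes the proof of the theorem.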

Remark: We note that this proof is similar to that of the Neyman-Pearson Lemma except for the fact that $l(\cdot;F)$ is not a distribution function.

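Therefore, the optimal $\Gamma_0$, possibly using a randomizer $U = u$, is

$$\Gamma_0^*(u) = \left[\{k \in \{0,1,\ldots,n\} : b(k;n,1/2) > c\,l(k;F)\}\right] \bigcup \left[\{u \le \gamma\} \cap \{k \in \{0,1,\ldots,n\} : b(k;n,1/2) = c\,l(k;F)\}\right].$$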
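The constants $c$ and $\gamma$ and the two index sets appearing in $\Gamma_0^*$ can be computed by scanning the distinct ratios $b(k;n,1/2)/l(k;F)$, since $G$ only changes at those values. The following is a minimal sketch under the assumptions that the $l(k;F)$ are known and that $l(0;F) = l(n;F) = \infty$; the function name and return convention are hypothetical. Exact rational inputs (`fractions.Fraction`) can be passed in to avoid floating-point ambiguity in the equality $b(k;n,1/2) = c\,l(k;F)$.

```python
import math
from fractions import Fraction


def optimal_gamma0(b, l, alpha):
    """Return (c, gamma, always, boundary) describing the rule delta*_0.

    b[k] = b(k; n, 1/2) and l[k] = l(k; F) for k = 0..n; l[k] may be math.inf.
    `always` lists the k with b(k) > c*l(k); `boundary` lists the k with
    b(k) = c*l(k), which enter the region only when U <= gamma.
    """
    n = len(b) - 1
    # Compare ratios b(k)/l(k) with t; an infinite l(k) gives ratio 0.
    ratio = [0 if math.isinf(l[k]) else b[k] / l[k] for k in range(n + 1)]

    def G(t, strict=True):
        # G(t) when strict is True, and the left limit G(t-) otherwise.
        return sum(b[k] for k in range(n + 1)
                   if (ratio[k] > t if strict else ratio[k] >= t))

    # G is a step function that drops only at the distinct positive ratios,
    # so the infimum defining c is attained at the first qualifying ratio.
    for c in sorted({r for r in ratio if r > 0}):
        if G(c) <= 1 - alpha:
            if G(c, strict=False) < 1 - alpha:
                raise ValueError("level 1 - alpha not attainable with finite content")
            gamma = (1 - alpha - G(c)) / (G(c, strict=False) - G(c))
            always = [k for k in range(n + 1) if ratio[k] > c]
            boundary = [k for k in range(n + 1) if ratio[k] == c]
            return c, gamma, always, boundary
    raise ValueError("no finite l(k) supplied")
```

As an illustration, with $n = 10$, $\alpha = 0.05$, and the exponential spacings $l(k;F) = 1/(n-k)$ for $1 \le k \le 9$, the rule always includes the indices 2 through 7 and includes the boundary indices 1 and 8 with probability $\gamma = 79/275$.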

The associated optimal confidence region for $\Delta$, possibly randomized, is

 Γ∗(X(),U)=⎡⎣⋃{k∈{0,1,…,n}: b(k;n,1/2)>cl(k;F)}[X(k),X(k+1))⎤⎦⋃ (8) ⎡⎣{U≤γ}∩