# The exponential distribution analog of the Grubbs--Weaver method

Grubbs and Weaver (JASA 42 (1947) 224--241) suggest a minimum-variance unbiased estimator for the population standard deviation of a normal random variable, where a random sample is drawn and a weighted sum of the ranges of subsamples is calculated. The optimal choice involves using as many subsamples of size eight as possible. They verified their results numerically for samples of size up to 100, and conjectured that their "rule of eights" is valid for all sample sizes. Here we examine the analogous problem where the underlying distribution is exponential and find that a "rule of fours" yields optimality; we prove this result rigorously.


## 1 Introduction

Suppose

$$X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} f(x)$$

is a random sample of size $n$, where each $X_i$ is a continuous random variable with density function $f(x)$ and (unknown) standard deviation $\sigma$. Denote the order statistics as

$$X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}.$$

The range of the sample is

$$R_n = X_{n:n} - X_{1:n}$$

and the standardized sample range is

$$W_n = \frac{R_n}{\sigma}.$$

In the case where the sample is drawn from a normal population, the minimum variance unbiased estimator of $\sigma$ is known to be the bias-corrected sample standard deviation $S/c$, with

$$S = \sqrt{\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1}}$$

and

$$c = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma\!\left(\frac{n}{2}\right)}{\Gamma\!\left(\frac{n-1}{2}\right)},$$

where

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$$

is Euler's gamma function. On the other hand, an easily calculated unbiased estimator for $\sigma$ is

$$\frac{R_n}{E(W_n)},$$

which of course has a comparatively large variance.

Grubbs and Weaver (1947) study a compromise between $S/c$ and $R_n/E(W_n)$ in the case where the random sample is drawn from a normal population with mean $\mu$ and (unknown) variance $\sigma^2$. They partition the sample of size $n$ into $m$ subsamples of sizes $n_1, n_2, \ldots, n_m$ respectively, where each $n_i \ge 2$ and $n_1 + n_2 + \cdots + n_m = n$. Then estimate $\sigma$ by $\hat{\sigma}$, where

$$\hat{\sigma} = \sum_{i=1}^m a_i R_{n_i} \qquad (1.1)$$

with $R_{n_i}$ representing the range of the $i$th subsample, which is of size $n_i$ (see Remark 2.1 below for clarification), and $a_1, \ldots, a_m$ are a set of weights chosen to guarantee that $\hat{\sigma}$ will be unbiased. They use the term "group range" to mean the range of the subsample, and thus entitle their paper "The best unbiased estimate of population standard deviation based on group ranges."

Recall that a partition of a positive integer $n$ is a representation of $n$ as an unordered sum of positive integers. Each summand is called a part of the partition, and the number of parts in a given partition is called its length. Since the order of the parts is irrelevant, it is often convenient to write the partition $\lambda$ of length $m$ as $\lambda = (n_1, n_2, \ldots, n_m)$, where $n_1 \ge n_2 \ge \cdots \ge n_m$ and $n_1 + n_2 + \cdots + n_m = n$. Thus, e.g., the five partitions of $4$ are as follows:

$$(4) \quad (3,1) \quad (2,2) \quad (2,1,1) \quad (1,1,1,1).$$

Alternatively, we may define $f_i = f_i(\lambda)$ to denote the frequency (or multiplicity) of the part $i$ in the partition $\lambda$, i.e. the number of times the part $i$ appears, and employ the "frequency superscript notation" $\lambda = \langle 1^{f_1}\, 2^{f_2}\, 3^{f_3} \cdots \rangle$. In this notation, the five partitions of $4$ are

$$\langle 4 \rangle \quad \langle 1\,3 \rangle \quad \langle 2^2 \rangle \quad \langle 1^2\,2 \rangle \quad \langle 1^4 \rangle,$$

where we have followed the convention that the superscript $f_i$ is omitted if $f_i = 1$, and $i^{f_i}$ is omitted entirely if $f_i = 0$.

We are interested in partitions of $n$ where all parts are at least $2$, as a subsample of size $1$ will, by definition, have a range of $0$. Let us call such a partition admissible. If $p(n)$ denotes the number of unrestricted partitions of $n$, it is well-known and easy to prove that the number $P(n)$ of admissible partitions of $n$ is equal to $p(n) - p(n-1)$. The number of admissible partitions of $n$ increases rather rapidly with $n$; e.g., $P(20) = p(20) - p(19) = 627 - 490 = 137$. To get a rough idea of the size of $P(n)$ for general $n$, we mention in passing that the asymptotic formula

$$P(n) \sim \frac{\pi}{12 \cdot 2^{1/2}\, n^{3/2}}\, e^{\pi (2n/3)^{1/2}} \quad \text{as } n \to \infty \qquad (1.2)$$

may be deduced from a theorem of Meinardus (1954). Equation (1.2) is analogous to the famous asymptotic formula of Hardy and Ramanujan (1918, p. 79, Eq. 1.41) for the unrestricted partition function

$$p(n) \sim \frac{1}{4 \cdot 3^{1/2}\, n}\, e^{\pi (2n/3)^{1/2}} \quad \text{as } n \to \infty.$$
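The identity $P(n) = p(n) - p(n-1)$ is easy to verify with a short dynamic-programming partition counter (a minimal sketch we add for illustration; the function names are ours, not from the paper):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count(n, k):
    # number of partitions of n into parts of size at most k
    if n == 0:
        return 1
    if n < 0 or k == 0:
        return 0
    return count(n - k, k) + count(n, k - 1)

def p(n):
    # unrestricted partition function p(n)
    return count(n, n)

@lru_cache(maxsize=None)
def count_min2(n, k):
    # number of partitions of n into parts of size between 2 and k
    if n == 0:
        return 1
    if n < 2 or k < 2:
        return 0
    return count_min2(n - k, k) + count_min2(n, k - 1)

def P(n):
    # number of admissible partitions of n (all parts >= 2)
    return count_min2(n, n)
```

For instance, `P(4)` returns `2` (the partitions $\langle 4 \rangle$ and $\langle 2^2 \rangle$), `P(20)` returns `137`, and `p(100)` returns `190569292`.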

Associated with each admissible partition $\lambda$ of $n$ is an estimator $\hat{\sigma} = \hat{\sigma}(\lambda)$ defined in (1.1); for a given $n$, the admissible partition that corresponds to the $\hat{\sigma}$ of minimum variance will be called the optimal partition of $n$.

Grubbs and Weaver (1947) performed extensive computations for $n \le 100$, and showed that the optimal partition of $n$, when the underlying distribution is normal, uses as many $8$'s as possible (with occasional $7$'s or $9$'s to adjust for the fact that not every $n$ is a multiple of $8$), except for sporadic exceptions that occur for small $n$. Grubbs and Weaver (1947) did not supply a rigorous proof that their assertions held for $n > 100$, perhaps owing to the lack of closed form expressions for the expected value and variance of the range when the underlying distribution is normal.

Here we investigate the analogous problem in the case where the underlying distribution is exponential.

## 2 Precise Statement of the Problem

Let

$$X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Exp}(\theta);$$

i.e. each $X_i$ has pdf

$$f(x; \theta) = \frac{1}{\theta}\, e^{-x/\theta}$$

for $x > 0$ and $f(x; \theta) = 0$ otherwise, where $\theta > 0$. Since $\sigma = \theta$, by estimating the population standard deviation $\sigma$, we are equivalently estimating the parameter $\theta$.
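As a quick sanity check that the mean and standard deviation of $\mathrm{Exp}(\theta)$ are both $\theta$, one can simulate (a sketch we add for illustration; the sample size and seed are arbitrary and not from the paper):

```python
import random

theta = 2.0
rng = random.Random(42)
# random.expovariate takes the rate 1/theta as its argument
xs = [rng.expovariate(1.0 / theta) for _ in range(200_000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
sd = var ** 0.5
# both mean and sd should be close to theta = 2.0
```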

###### Remark 2.1.

For a given partition $(n_1, n_2, \ldots, n_m)$ of $n$ with length $m$, we understand that the $m$ subsamples of the random variables $X_1, X_2, \ldots, X_n$ are to be

$$\{X_1, \ldots, X_{n_1}\},\ \{X_{n_1+1}, \ldots, X_{n_1+n_2}\},\ \ldots,\ \{X_{n_1+n_2+\cdots+n_{m-1}+1}, \ldots, X_{n_1+n_2+\cdots+n_m}\}.$$

For each admissible partition of $n$, we have $2 \le n_i \le n$, since each part must be at least $2$, and moreover $n_i \ne n-1$, because a partition of $n$ with no $1$'s clearly cannot have $n-1$ as a part.

Following the notation in Grubbs and Weaver (1947), let

$$d_n = \frac{E(R_n)}{\sigma}$$

and

$$k_n^2 = \frac{E(R_n - d_n \sigma)^2}{\sigma^2} = \frac{\mathrm{Var}(R_n)}{\sigma^2}.$$

It is well-known (see, e.g., David and Nagaraja (2003, p. 52, Ex. 3.2.1)) that for the exponential distribution $E(R_n) = \theta H_{n-1,1}$ and $\mathrm{Var}(R_n) = \theta^2 H_{n-1,2}$; since $\sigma = \theta$, this gives

$$d_n = H_{n-1,1} \quad \text{and} \quad k_n^2 = H_{n-1,2},$$

where

$$H_{n,j} = \sum_{i=1}^n \frac{1}{i^j}$$

are generalized harmonic numbers.
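These closed forms are straightforward to confirm numerically; the sketch below (with our own helper names) checks the formula $d_n = H_{n-1,1}$ against a simulated mean range with $\theta = 1$:

```python
import random

def H(n, j):
    # generalized harmonic number H_{n,j} = sum_{i=1}^n 1/i^j
    return sum(1.0 / i**j for i in range(1, n + 1))

def d(n):
    # d_n = E(R_n)/sigma = H_{n-1,1} for the exponential distribution
    return H(n - 1, 1)

def k2(n):
    # k_n^2 = Var(R_n)/sigma^2 = H_{n-1,2} for the exponential distribution
    return H(n - 1, 2)

# Monte Carlo check of E(R_5) with theta = 1 (illustrative; seed arbitrary)
rng = random.Random(1)
trials = 200_000
ranges = []
for _ in range(trials):
    xs = [rng.expovariate(1.0) for _ in range(5)]
    ranges.append(max(xs) - min(xs))
mean_range = sum(ranges) / trials
# mean_range should be close to d(5) = H_{4,1} = 25/12
```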

A linear combination of unbiased estimators which gives the minimum variance unbiased combined estimate has coefficients which are inversely proportional to the variances of the individual estimators. Consequently, let

$$a_i = \frac{d_{n_i}}{k_{n_i}^2} \left( \sum_{j=1}^m \frac{d_{n_j}^2}{k_{n_j}^2} \right)^{-1}. \qquad (2.1)$$
###### Proposition 2.2.

The estimator

$$\hat{\sigma} = \sum_{i=1}^m a_i R_{n_i},$$

where the $a_i$ are defined in (2.1), is an unbiased estimator of $\sigma$.

###### Proof.
$$E(\hat{\sigma}) = E\left( \sum_{i=1}^m a_i R_{n_i} \right) = \sum_{i=1}^m a_i E(R_{n_i}) = \sum_{i=1}^m a_i d_{n_i} \sigma = \sigma,$$

since $\sum_{i=1}^m a_i d_{n_i} = 1$ by (2.1). ∎
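The key fact behind unbiasedness is that the weights (2.1) satisfy $\sum_{i=1}^m a_i d_{n_i} = 1$. A quick numerical check for the partition $(4, 4, 3)$ (helper names are ours):

```python
def H(n, j):
    # generalized harmonic number H_{n,j}
    return sum(1.0 / i**j for i in range(1, n + 1))

def weights(parts):
    # a_i from Eq. (2.1): proportional to d_{n_i}/k^2_{n_i},
    # normalized so that sum_i a_i * d_{n_i} = 1
    d = [H(p - 1, 1) for p in parts]
    k2 = [H(p - 1, 2) for p in parts]
    total = sum(di * di / k2i for di, k2i in zip(d, k2))
    return [di / k2i / total for di, k2i in zip(d, k2)]

parts = (4, 4, 3)
a = weights(parts)
# unbiasedness identity: sum_i a_i * d_{n_i} = 1
check = sum(ai * H(p - 1, 1) for ai, p in zip(a, parts))
```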

Now we state the main theorem:

###### Theorem 2.3 (The Exponential “Rule of Fours”).

Fix an integer $n \ge 2$. Write $n = 4q + r$, where $r \in \{0, 1, 2, 3\}$ is the least nonnegative residue of $n$ modulo $4$. The optimal partition of $n$ is as follows:

• $\langle 4^q \rangle$, if $r = 0$;

• $\langle 4^{q-1}\, 5 \rangle$, if $r = 1$;

• $\langle 2 \rangle$, if $n = 2$, and $\langle 3^2 \rangle$, if $n = 6$;

• $\langle 4^{q-2}\, 5^2 \rangle$, if $r = 2$ and $n \ge 10$; and

• $\langle 3\, 4^q \rangle$, if $r = 3$.

###### Example 2.4.

Suppose we have the sample size $n = 11$, which implies $q = 2$ and $r = 3$. Thus, according to Theorem 2.3, the optimal partition of $11$ is $\langle 3\, 4^2 \rangle$, and this in turn gives

$$\hat{\sigma} = a_1 R_{n_1} + a_2 R_{n_2} + a_3 R_{n_3},$$

where $(n_1, n_2, n_3) = (4, 4, 3)$ and, for $i = 1, 2, 3$, $R_{n_i}$ denotes the range of the $i$th subsample of the sample $X_1, X_2, \ldots, X_{11}$.
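Theorem 2.3 translates into a few lines of code; the sketch below (our naming) returns the optimal part sizes, listed in decreasing order, for any $n \ge 2$:

```python
def optimal_partition(n):
    # "Rule of fours" partition of Theorem 2.3, parts in decreasing order
    assert n >= 2
    if n == 2:
        return [2]
    if n == 6:
        return [3, 3]
    q, r = divmod(n, 4)
    if r == 0:
        return [4] * q
    if r == 1:
        return [5] + [4] * (q - 1)      # one 5, rest 4's
    if r == 2:
        return [5, 5] + [4] * (q - 2)   # two 5's, rest 4's (n >= 10 here)
    return [4] * q + [3]                # r == 3: one 3, rest 4's
```

For instance, `optimal_partition(11)` returns `[4, 4, 3]` and `optimal_partition(14)` returns `[5, 5, 4]`.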

## 3 Proof of Theorem 2.3

Next, Theorem 2.3 will be reformulated into an integer linear program. Upon solving the equivalent integer linear program, we will have proved Theorem 2.3.

### 3.1 Reformulation as an integer linear program

For any given integer $j \ge 2$, let us define

$$C_j = \frac{d_j^2}{k_j^2} = \frac{(E(R_j))^2}{\mathrm{Var}(R_j)} = \frac{H_{j-1,1}^2}{H_{j-1,2}},$$

observing that $C_j$ is independent of $\theta$. Thus, the $C_j$ are known absolute constants. For a given random sample of size $n$, we seek a partition $\lambda$ of $n$ such that $\mathrm{Var}(\hat{\sigma})$ is minimized when the sample is partitioned according to $\lambda$.

Notice that

$$\mathrm{Var}(\hat{\sigma}) = \mathrm{Var}\left( \sum_{i=1}^m a_i R_{n_i} \right) = \sum_{i=1}^m a_i^2\, \mathrm{Var}(R_{n_i}) = \frac{\sigma^2}{\sum_{i=1}^m (d_{n_i}/k_{n_i})^2} = \frac{\sigma^2}{\sum_{i=1}^m C_{n_i}}. \qquad (3.1)$$

Thus we seek the partition of $n$ which causes the denominator in the rightmost expression of (3.1) to be as large as possible.

For any given $n$, an optimal partition is given by a solution of the following integer linear program:

$$\text{maximize} \quad C_2 f_2 + C_3 f_3 + \cdots + C_n f_n$$

subject to

$$2 f_2 + 3 f_3 + 4 f_4 + \cdots + n f_n = n \qquad \text{(EKP)}$$
$$f_2, f_3, f_4, \ldots, f_n \ge 0$$
$$f_2, f_3, f_4, \ldots, f_n \in \mathbb{Z}.$$

The objective function is the denominator in the far right member of (3.1). The constraints guarantee that we search only over admissible partitions of $n$.

The program (EKP) is an instance of the equality-constrained “knapsack problem.” While integer linear programming problems are in general NP-complete, we will be able to exploit the special structure to provide an optimal solution to (EKP).
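Although (EKP) will be solved analytically below, for small $n$ it can also be checked by exhaustive search over admissible partitions (a brute-force sketch with our own naming, intended only as a check):

```python
from functools import lru_cache
from math import inf

def H(n, j):
    # generalized harmonic number
    return sum(1.0 / i**j for i in range(1, n + 1))

def C(j):
    # C_j = (E(R_j))^2 / Var(R_j) = H_{j-1,1}^2 / H_{j-1,2}
    return H(j - 1, 1) ** 2 / H(j - 1, 2)

@lru_cache(maxsize=None)
def best(n, max_part):
    # maximize sum of C_j over admissible partitions of n with parts <= max_part;
    # returns (objective value, parts in non-increasing order)
    if n == 0:
        return 0.0, ()
    result = (-inf, ())
    for j in range(2, min(n, max_part) + 1):
        sub_val, sub_parts = best(n - j, j)
        cand = (sub_val + C(j), (j,) + sub_parts)
        if cand[0] > result[0]:
            result = cand
    return result

def best_partition(n):
    return best(n, n)[1]
```

For example, `best_partition(14)` returns `(5, 5, 4)`, in line with the rule of fours.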

### 3.2 Statement and proof of the key lemma

But first, our solution is contingent upon the following lemma.

###### Lemma 3.1.

The function $g(n) = C_n/n$, defined on the integers $n \ge 2$, attains its maximum value at $n = 4$.

###### Remark 3.2.

The significance of maximizing the function $g$ is as follows: given a sample of size $n$ that is partitioned into $n/j$ subsamples of size $j$ each, the denominator in the rightmost member of (3.1) is $(n/j)\, C_j = n \cdot (C_j / j)$. The sample size $n$ is fixed, but we wish to find the part size $j$ that maximizes $C_j/j$ in order to maximize the denominator of the rightmost member of (3.1). Of course, this can only be done exactly in the case where the sample size $n$ is a multiple of the part size $j$, accounting for the fact that while $j = 4$ gives the optimal part size, for sample sizes that are not multiples of $4$, the optimal partition includes some parts of size $3$ or $5$.

###### Proof of Lemma 3.1.

We need to show that for all integers $n \ge 2$ with $n \ne 4$,

$$\frac{C_n}{n} = \frac{H_{n-1,1}^2}{n H_{n-1,2}} < \frac{C_4}{4} = \frac{121}{196}.$$

We will utilize the elementary inequalities

$$H_{n,1} < 1 + \log n \qquad (3.2)$$

and

$$H_{n,2} > 1 - \frac{1}{n+1}. \qquad (3.3)$$

Inequality (3.2) may be deduced as follows: note that $\frac{1}{i} = \int_{i-1}^{i} \frac{dx}{i}$ for $i \ge 2$. Then observe that $\frac{1}{i} < \frac{1}{x}$ in the integrand, thus $\frac{1}{i} < \int_{i-1}^{i} \frac{dx}{x}$, and (3.2) follows by summation, since $\sum_{i=2}^{n} \int_{i-1}^{i} \frac{dx}{x} = \log n$. Inequality (3.3) follows from the fact that

$$H_{n,2} = \sum_{j=1}^n \frac{1}{j^2} > \int_1^{n+1} \frac{dx}{x^2} = 1 - \frac{1}{n+1}.$$

Applying (3.2) and (3.3), we obtain

$$\frac{C_n}{n} = \frac{H_{n-1,1}^2}{n H_{n-1,2}} < \frac{(1 + \log(n-1))^2}{n \left(1 - \frac{1}{n}\right)} = \frac{(1 + \log(n-1))^2}{n-1}. \qquad (3.4)$$

Let $h(n) = (1 + \log(n-1))^2/(n-1)$. Thus, $C_n/n$ is bounded above by $h(n)$. Let us now temporarily consider $n$ to be a real variable with domain $(1, \infty)$. Notice that

$$\frac{dh}{dn} = \frac{d}{dn} \left\{ \frac{(1 + \log(n-1))^2}{n-1} \right\} = \frac{[1 + \log(n-1)][1 - \log(n-1)]}{(n-1)^2}. \qquad (3.5)$$

Since $1 + \log(n-1) > 0$ and $(n-1)^2 > 0$ for $n > 1$, $\frac{dh}{dn}$ is negative whenever $\log(n-1) > 1$, i.e. when $n > 1 + e \approx 3.72$. Thus $h$ is decreasing for all integers $n \ge 4$. Now $h(34) = (1 + \log 33)^2/33 \approx 0.6127 < \frac{121}{196} \approx 0.6173$. Thus $C_n/n < h(n) \le h(34) < C_4/4$ for all $n \ge 34$. That $C_n/n < \frac{121}{196}$ for $2 \le n \le 33$ and $n \ne 4$ can be verified by direct computation. ∎
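The content of Lemma 3.1 is also easy to confirm numerically (a sketch; names ours):

```python
def H(n, j):
    # generalized harmonic number H_{n,j}
    return sum(1.0 / i**j for i in range(1, n + 1))

def g(n):
    # g(n) = C_n / n = H_{n-1,1}^2 / (n * H_{n-1,2})
    return H(n - 1, 1) ** 2 / (n * H(n - 1, 2))

values = {n: g(n) for n in range(2, 201)}
argmax = max(values, key=values.get)
# argmax == 4, with g(4) = 121/196 ~ 0.61735
```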

### 3.3 Solving (EKP) by the method of group relaxation

With Lemma 3.1 in hand, we now proceed to solve (EKP). Fix $n \ge 2$, with $n = 4q + r$ as in Theorem 2.3. We now closely follow the treatment of Lee (2004, §7.2, p. 184 ff) to solve (EKP).

In general, the first step is to seek an upper bound for

$$\max \sum_{j=2}^n C_j f_j,$$

and then, if necessary, initiate a branch-and-bound procedure. In the case of interest, this initial step will be sufficient to find an optimal solution to (EKP). By Lemma 3.1 we have that

$$\frac{C_4}{4} = \max\left\{ \frac{C_j}{j} : 2 \le j \le n \right\}.$$

Relax the nonnegativity restriction on $f_4$, and solve for $f_4$ in terms of the other $f_j$'s, namely

$$f_4 = \frac{n}{4} - \frac{1}{4} \sum_{\substack{2 \le j \le n \\ j \ne 4}} j f_j, \qquad (3.6)$$

to obtain the "group relaxation":

$$\frac{n C_4}{4} + \max \sum_{\substack{2 \le j \le n \\ j \ne 4}} \left( C_j - \frac{j C_4}{4} \right) f_j$$

subject to

$$\sum_{\substack{2 \le j \le n \\ j \ne 4}} j f_j = n - 4 f_4; \qquad \text{(GR)}$$
$$f_2, f_3, f_5, f_6, f_7, \ldots, f_n \ge 0;$$
$$f_2, f_3, f_4, \ldots, f_n \in \mathbb{Z}.$$

The group relaxation (GR) is in turn equivalent to:

$$\frac{n C_4}{4} - \min \sum_{\substack{2 \le j \le n \\ j \ne 4}} \left( \frac{j C_4}{4} - C_j \right) f_j$$

subject to

$$\sum_{\substack{2 \le j \le n \\ j \ne 4}} j f_j \equiv r \pmod{4}; \qquad \text{(GR′)}$$
$$f_2, f_3, f_5, f_6, f_7, f_8, \ldots, f_n \ge 0;$$
$$f_2, f_3, f_5, \ldots, f_n \in \mathbb{Z}.$$

We note that every feasible solution of (GR′) corresponds to a value of $f_4$ in (GR), via Equation (3.6). If we are lucky, the optimal solution we find to (GR′) yields an $f_4 \ge 0$, and then we will have also found an optimal solution to (EKP). (If we are not lucky and the corresponding $f_4$ in (GR) is negative, then we must embark on a branch-and-bound procedure with potentially many iterations.) Here, however, we will show that for all $n \ge 10$, we are indeed lucky, and the optimal solution to (EKP) will indeed be found directly with no need to initiate branch-and-bound.

In order to solve (GR′), we form a weighted directed multigraph $D$ as follows. Let the vertex set be given by $V = \{0, 1, 2, 3\}$. For each $v \in V$ and each $j \in \{2, 3, 5, 6, \ldots, n\}$, there is an edge $e_j$ of weight $\frac{j C_4}{4} - C_j$ from $v$ to $(v + j) \bmod 4$. Note well that in this notation, the "ending vertex" of edge $e_j$ is not $j$ but rather $(v + j) \bmod 4$. We seek a minimum weight directed walk from $0$ to $r$. Each time we include the edge $e_j$ in the diwalk, we increment $f_j$ by $1$. Since for each $j \ne 4$, the edge weight $\frac{j C_4}{4} - C_j > 0$ by Lemma 3.1, we may use the algorithm of Dijkstra (1959) to find a minimum weight dipath from $0$ to $r$.

We first dispense with the trivial case $r = 0$. Here we immediately have the optimal solution (via the empty path from $0$ to $0$)

$$f_j = \begin{cases} q & \text{if } j = 4 \\ 0 & \text{otherwise.} \end{cases}$$

We now move on to the remaining cases $r \in \{1, 2, 3\}$. Notice that connecting any two vertices in $D$ there are multiple directed edges. For each $d \in \{1, 2, 3\}$, connecting vertex $v$ to vertex $(v + d) \bmod 4$, we have the following edges: $e_d, e_{d+4}, e_{d+8}, \ldots, e_{d+4K}$, where $K$ is the largest integer such that $d + 4K \le n$ (for $d = 1$ the list begins with $e_5$, since there is no part of size $1$). All of these edges have different weights, and since we seek a minimum weight diwalk, for each $d$, we may safely remove all but the one of lowest weight, resulting in a much less "cluttered" digraph $D'$. Thus $D'$ is a digraph with four vertices and $12$ directed edges (as each of the four vertices now has exactly one directed edge to each of the other three vertices).

| edge | connecting vertices | weight |
|------|---------------------|--------|
| $e_3$ | $v \to (v+3) \bmod 4$ | $\frac{3C_4}{4} - C_3 \approx 0.05204$ |
| $e_5$ | $v \to (v+1) \bmod 4$ | $\frac{5C_4}{4} - C_5 \approx 0.03795$ |
| $e_6$ | $v \to (v+2) \bmod 4$ | $\frac{6C_4}{4} - C_6 \approx 0.14193$ |

Perform the algorithm of Dijkstra (1959) on the weighted directed graph $D'$ to find that the minimum weight directed paths are as follows:

| from | to | using edges | total weight | resulting nonzero values |
|------|----|-------------|--------------|--------------------------|
| $0$ | $1$ | $e_5$ | $\approx 0.03795$ | $f_5 = 1$ and $f_4 = \frac{n-5}{4}$ |
| $0$ | $2$ | $e_5, e_5$ | $\approx 0.07591$ | $f_5 = 2$ and $f_4 = \frac{n-10}{4}$ |
| $0$ | $3$ | $e_3$ | $\approx 0.05204$ | $f_3 = 1$ and $f_4 = \frac{n-3}{4}$ |

Thus we arrive at precisely the desired result.
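The shortest-path computation can be reproduced with a small Dijkstra implementation over the residue digraph (a sketch with our own naming; we truncate part sizes at 20, since within each residue class larger parts carry larger weights):

```python
import heapq

def H(n, j):
    # generalized harmonic number
    return sum(1.0 / i**j for i in range(1, n + 1))

def C(j):
    # C_j = H_{j-1,1}^2 / H_{j-1,2}
    return H(j - 1, 1) ** 2 / H(j - 1, 2)

# edge weight of using one part of size j (j != 4) in (GR')
w = {j: j * C(4) / 4 - C(j) for j in range(2, 21) if j != 4}

def cheapest_parts(r):
    # Dijkstra from residue 0 to residue r:
    # using a part j moves residue v to (v + j) mod 4 at cost w[j]
    dist = {v: float('inf') for v in range(4)}
    used = {v: None for v in range(4)}
    dist[0], used[0] = 0.0, []
    heap = [(0.0, 0)]
    while heap:
        d0, v = heapq.heappop(heap)
        if d0 > dist[v]:
            continue  # stale heap entry
        for j, wj in w.items():
            u = (v + j) % 4
            if d0 + wj < dist[u]:
                dist[u] = d0 + wj
                used[u] = used[v] + [j]
                heapq.heappush(heap, (d0 + wj, u))
    return used[r]
```

Here `cheapest_parts(1)`, `cheapest_parts(2)`, and `cheapest_parts(3)` return `[5]`, `[5, 5]`, and `[3]`, respectively, giving the nonzero $f_j$'s tabulated above.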

Since the value of $f_4$ obtained from the group relaxation (GR′) (see Eq. (3.6)) is nonnegative whenever $n \ge 10$, the group relaxation solves the original equality knapsack problem (EKP), and thus no branch-and-bound procedure need be undertaken.

With the solution of (EKP), we have proved Theorem 2.3 for $n \ge 10$. For $2 \le n \le 9$, the theorem can be established by direct calculation. ∎

## 4 Conclusion

We have proved that an unbiased estimator of $\sigma$ from an exponential population, formed by taking an appropriately weighted sum of the ranges of subsamples obtained by partitioning the original sample of size $n$, has minimal variance when the subsamples are each of size $4$ (or as close to size $4$ as possible when $n$ is not a multiple of $4$). This contrasts with the work of Grubbs and Weaver (1947), where the optimal subsample size is $8$ when the population is normal. A similar analysis could be applied to estimate the standard deviation of populations with other distributions; preliminary calculations suggest that other parent distributions, such as the Rayleigh, have their own optimal subsample sizes. Another variant perhaps worth investigating is seeking the optimal partition of the sample where, instead of the range of subsamples, we consider various quasi-ranges.

Harter and Balakrishnan (1996, pp. 9–11) compare the efficiency of estimators of $\sigma$ based on quasi-ranges with that of the Grubbs–Weaver estimator in the case of a normal parent population, and remark [p. 11] that the asymptotic efficiency of the Grubbs–Weaver estimator is 75.38 percent. In future work we would like to obtain analogous results for various non-normal parent distributions, including the exponential distribution that was studied in the present paper.

## Acknowledgments

The authors thank the Editor and the anonymous referee for helpful suggestions that improved the paper.

## References

• David and Nagaraja (2003) David, H. A. and Nagaraja, H. N., 2003. Order Statistics, 3rd ed., Wiley Interscience. Wiley Series in Probability and Statistics.

• Dijkstra (1959) Dijkstra, E. W., 1959, A note on two problems in connexion with graphs, Numerische Mathematik 1, 269–271.
• Grubbs and Weaver (1947) Grubbs, F. E. and Weaver, C. L., 1947, The best unbiased estimate of population standard deviation based on group ranges, J. Amer. Stat. Assoc. 42, 224–241.
• Hardy and Ramanujan (1918) Hardy, G. H. and Ramanujan, S., 1918, Asymptotic formulæ in combinatory analysis, Proc. London Math. Soc. 17, 75–115.
• Harter and Balakrishnan (1996) Harter, H. Leon and Balakrishnan, N., 1996, CRC Handbook of Tables for the Use of Order Statistics in Estimation, CRC Press.
• Lee (2004) Lee, J., 2004. A First Course in Combinatorial Optimization, Cambridge Texts in Applied Mathematics, Book 36. Cambridge University Press.
• Meinardus (1954) Meinardus, G., 1954, Asymptotische Aussagen über Partitionen, Math. Z. 61, 289–302.